* refine abstractions for hexagonal architecture of caches
* implement redis as a cache (#6)
* add timeout to each redis interaction
* Do not fail on missing entry from cache (#10)
* feat(metrics): monitor cache hit and miss after awaiting concurrently executed queries (#12)
* bump storage version
* doc: upadate README
* chore(README): Vertamedia to Contentsquare
Co-authored-by: Francois Milhem <[email protected]>
@@ -29,7 +29,7 @@ Chproxy, is an http proxy and load balancer for [ClickHouse](https://ClickHouse.
 - Exposes various useful [metrics](#metrics) in [prometheus text format](https://prometheus.io/docs/instrumenting/exposition_formats/).
 - Configuration may be updated without restart - just send `SIGHUP` signal to `chproxy` process.
 - Easy to manage and run - just pass config file path to a single `chproxy` binary.
-- Easy to [configure](https://github.com/Vertamedia/chproxy/blob/master/config/examples/simple.yml):
+- Easy to [configure](https://github.com/ContentSquare/chproxy/blob/master/config/examples/simple.yml):
 ```yml
 server:
   http:
@@ -52,7 +52,7 @@ clusters:

 ### Precompiled binaries

-Precompiled `chproxy` binaries are available [here](https://github.com/Vertamedia/chproxy/releases).
+Precompiled `chproxy` binaries are available [here](https://github.com/ContentSquare/chproxy/releases).
 Just download the latest stable binary, unpack and run it with the desired [config](#configuration):

 ```
@@ -64,7 +64,7 @@ Just download the latest stable binary, unpack and run it with the desired [conf
 Chproxy is written in [Go](https://golang.org/). The easiest way to install it from sources is:

 ```
-go get -u github.com/Vertamedia/chproxy
+go get -u github.com/ContentSquare/chproxy
 ```

 If you don't have Go installed on your system - follow [this guide](https://golang.org/doc/install).
@@ -89,7 +89,7 @@ All the `INSERT`s may be routed to a [distributed table](http://clickhouse-docs.

 It would be better to spread `INSERT`s among available shards and to route them directly to per-shard tables instead of distributed tables. The routing logic may be embedded either directly into applications generating `INSERT`s or may be moved to a proxy. Proxy approach is better since it allows re-configuring `ClickHouse` cluster without modification of application configs and without application downtime. Multiple identical proxies may be started on distinct servers for scalability and availability purposes.

-The following minimal `chproxy` config may be used for [this use case](https://github.com/Vertamedia/chproxy/blob/master/config/examples/spread.inserts.yml):
+The following minimal `chproxy` config may be used for [this use case](https://github.com/ContentSquare/chproxy/blob/master/config/examples/spread.inserts.yml):
 ```yml
 server:
   http:
@@ -127,7 +127,7 @@ All the `SELECT`s may be routed to a [distributed table](http://clickhouse-docs.

 It would be better to create identical distributed tables on each shard and spread `SELECT`s among all the available shards.

-The following minimal `chproxy` config may be used for [this use case](https://github.com/Vertamedia/chproxy/blob/master/config/examples/spread.selects.yml):
+The following minimal `chproxy` config may be used for [this use case](https://github.com/ContentSquare/chproxy/blob/master/config/examples/spread.selects.yml):
 ```yml
 server:
   http:
@@ -157,10 +157,10 @@ clusters:
 ### Authorize users by passwords via HTTPS

 Suppose you need to access `ClickHouse` cluster from anywhere by username/password.
-This may be used for building graphs from [ClickHouse-grafana](https://github.com/Vertamedia/ClickHouse-grafana) or [tabix](https://tabix.io/).
+This may be used for building graphs from [ClickHouse-grafana](https://github.com/ContentSquare/ClickHouse-grafana) or [tabix](https://tabix.io/).
 It is bad idea to transfer unencrypted password and data over untrusted networks.
 So HTTPS must be used for accessing the cluster in such cases.
-The following `chproxy` config may be used for [this use case](https://github.com/Vertamedia/chproxy/blob/master/config/examples/https.yml):
+The following `chproxy` config may be used for [this use case](https://github.com/ContentSquare/chproxy/blob/master/config/examples/https.yml):
 ```yml
 server:
   https:
@@ -206,16 +206,18 @@ clusters:

 caches:
   - name: "shortterm"
-    dir: "/path/to/cache/dir"
-    max_size: 150Mb
+    mode: "file_system"
+    file_system:
+      dir: "/path/to/cache/dir"
+      max_size: 150Mb

     # Cached responses will expire in 130s.
     expire: 130s
 ```

 ### All the above configs combined

-All the above cases may be combined in a single `chproxy` [config](https://github.com/Vertamedia/chproxy/blob/master/config/examples/combined.yml):
+All the above cases may be combined in a single `chproxy` [config](https://github.com/ContentSquare/chproxy/blob/master/config/examples/combined.yml):

 ```yml
 server:
@@ -278,17 +280,19 @@ clusters:

 caches:
   - name: "shortterm"
-    dir: "/path/to/cache/dir"
-    max_size: 150Mb
+    mode: "file_system"
+    file_system:
+      dir: "/path/to/cache/dir"
+      max_size: 150Mb
     expire: 130s
 ```

 ## Configuration

 ### Server
-`Chproxy`may accept requests over `HTTP` and `HTTPS` protocols. [HTTPS](https://github.com/Vertamedia/chproxy/blob/master/config#https_config) must be configured with custom certificate or with automated [Let's Encrypt](https://letsencrypt.org/) certificates.
+`Chproxy`may accept requests over `HTTP` and `HTTPS` protocols. [HTTPS](https://github.com/ContentSquare/chproxy/blob/master/config#https_config) must be configured with custom certificate or with automated [Let's Encrypt](https://letsencrypt.org/) certificates.

-Access to `chproxy` can be limitied by list of IPs or IP masks. This option can be applied to [HTTP](https://github.com/Vertamedia/chproxy/blob/master/config#http_config), [HTTPS](https://github.com/Vertamedia/chproxy/blob/master/config#https_config), [metrics](https://github.com/Vertamedia/chproxy/blob/master/config#metrics_config), [user](https://github.com/Vertamedia/chproxy/blob/master/config#user_config) or [cluster-user](https://github.com/Vertamedia/chproxy/blob/master/config#cluster_user_config).
+Access to `chproxy` can be limitied by list of IPs or IP masks. This option can be applied to [HTTP](https://github.com/ContentSquare/chproxy/blob/master/config#http_config), [HTTPS](https://github.com/ContentSquare/chproxy/blob/master/config#https_config), [metrics](https://github.com/ContentSquare/chproxy/blob/master/config#metrics_config), [user](https://github.com/ContentSquare/chproxy/blob/master/config#user_config) or [cluster-user](https://github.com/ContentSquare/chproxy/blob/master/config#cluster_user_config).

 ### Users
 There are two types of users: `in-users`(in global section) and `out-users` (in cluster section).
@@ -298,13 +302,13 @@ with overriding credentials.
 Suppose we have one ClickHouse user `web` with `read-only` permissions and `max_concurrent_queries: 4` limit.
 There are two distinct applications `reading` from ClickHouse. We may create two distinct `in-users` with `to_user: "web"` and `max_concurrent_queries: 2` each in order to avoid situation when a single application exhausts all the 4-request limit on the `web` user.

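The split of the `web` limit between two applications can be sketched in Go. This is only an illustration, not chproxy's implementation: the `limiter` type, the `app1` and `app2` user names, and the choice to reject a query instead of queueing it are all assumptions made for the example.

```go
package main

import "fmt"

// limiter caps the number of in-flight queries for one in-user.
// Hypothetical type; it only illustrates the max_concurrent_queries idea.
type limiter struct {
	slots chan struct{}
}

func newLimiter(max int) *limiter {
	return &limiter{slots: make(chan struct{}, max)}
}

// tryAcquire reserves a slot without blocking; false means the limit is reached.
func (l *limiter) tryAcquire() bool {
	select {
	case l.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a slot once the query has finished.
func (l *limiter) release() { <-l.slots }

func main() {
	// Two hypothetical in-users, both mapped to the ClickHouse user "web"
	// (limit 4), each allowed 2 concurrent queries.
	limits := map[string]*limiter{
		"app1": newLimiter(2),
		"app2": newLimiter(2),
	}
	for i := 0; i < 3; i++ {
		fmt.Println("app1 query", i+1, "accepted:", limits["app1"].tryAcquire())
	}
	fmt.Println("app2 query 1 accepted:", limits["app2"].tryAcquire())
}
```

In this sketch the third concurrent `app1` query is refused while `app2` still gets a slot, which is the isolation the paragraph above is after.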
-Requests to `chproxy` must be authorized with credentials from [user_config](https://github.com/Vertamedia/chproxy/blob/master/config#user_config). Credentials can be passed via [BasicAuth](https://en.wikipedia.org/wiki/Basic_access_authentication) or via `user` and `password` [query string](https://en.wikipedia.org/wiki/Query_string) args.
+Requests to `chproxy` must be authorized with credentials from [user_config](https://github.com/ContentSquare/chproxy/blob/master/config#user_config). Credentials can be passed via [BasicAuth](https://en.wikipedia.org/wiki/Basic_access_authentication) or via `user` and `password` [query string](https://en.wikipedia.org/wiki/Query_string) args.

 Limits for `in-users` and `out-users` are independent.

 ### Clusters
 `Chproxy`can be configured with multiple `cluster`s. Each `cluster` must have a name and either a list of nodes
-or a list of replicas with nodes. See [cluster-config](https://github.com/Vertamedia/chproxy/tree/master/config#cluster_config) for details.
+or a list of replicas with nodes. See [cluster-config](https://github.com/ContentSquare/chproxy/tree/master/config#cluster_config) for details.
 Requests to each cluster are balanced among replicas and nodes using `round-robin` + `least-loaded` approach.
 The node priority is automatically decreased for a short interval if recent requests to it were unsuccessful.
 This means that the `chproxy` will choose the next least loaded healthy node among least loaded replica
@@ -313,14 +317,14 @@ for every new request.
 Additionally each node is periodically checked for availability. Unavailable nodes are automatically excluded from the cluster until they become available again. This allows performing node maintenance without removing unavailable nodes from the cluster config.

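A rough Go sketch of the node selection described in this section follows. It is an assumption-laden illustration rather than chproxy's code: the `node` fields and the `pick` function are hypothetical, replicas are left out, and the temporary priority decrease is modelled as a simple numeric penalty added to the load.

```go
package main

import "fmt"

// node is a hypothetical view of one ClickHouse node as seen by the proxy.
type node struct {
	addr    string
	load    int  // in-flight requests
	penalty int  // temporary extra weight after recent failed requests
	healthy bool // result of the last availability check
}

// pick returns the healthy node with the smallest load+penalty. The scan
// begins at start, so equally loaded nodes are chosen in round-robin order
// when the caller advances start on each request. It returns nil when no
// node is healthy.
func pick(nodes []*node, start int) *node {
	var best *node
	for i := 0; i < len(nodes); i++ {
		n := nodes[(start+i)%len(nodes)]
		if !n.healthy {
			continue // unavailable nodes are excluded until they recover
		}
		if best == nil || n.load+n.penalty < best.load+best.penalty {
			best = n
		}
	}
	return best
}

func main() {
	nodes := []*node{
		{addr: "node1:8123", load: 1, healthy: true},
		{addr: "node2:8123", load: 1, penalty: 5, healthy: true}, // recently failed
		{addr: "node3:8123", load: 0, healthy: false},            // under maintenance
	}
	for start := 0; start < 3; start++ {
		fmt.Println(pick(nodes, start).addr)
	}
}
```

Here `node3` is skipped although it has the lowest load, and `node2` loses to `node1` while its penalty lasts.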
 `Chproxy`automatically kills queries exceeding `max_execution_time` limit. By default `chproxy` tries to kill such queries
-under `default` user. The user may be overriden with [kill_query_user](https://github.com/Vertamedia/chproxy/blob/master/config#kill_query_user_config).
+under `default` user. The user may be overriden with [kill_query_user](https://github.com/ContentSquare/chproxy/blob/master/config#kill_query_user_config).

-If `cluster`'s [users](https://github.com/Vertamedia/chproxy/blob/master/config#cluster_user_config) section isn't specified, then `default` user is used with no limits.
+If `cluster`'s [users](https://github.com/ContentSquare/chproxy/blob/master/config#cluster_user_config) section isn't specified, then `default` user is used with no limits.

 ### Caching

 `Chproxy`may be configured to cache responses. It is possible to create multiple
-[cache-configs](https://github.com/Vertamedia/chproxy/blob/master/config/#cache_config) with various settings.
+cache-configs with various settings.
 Response caching is enabled by assigning cache name to user. Multiple users may share the same cache.
 Currently only `SELECT` responses are cached.
 Caching is disabled for request with `no_cache=1` in query string.
@@ -329,8 +333,21 @@ distinct responses for the identical query under distinct cache namespaces. Addi
 an instant cache flush may be built on top of cache namespaces - just switch to new namespace in order
 to flush the cache.

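The namespace idea can be pictured with a small Go sketch. How chproxy actually derives its cache keys is not shown in this diff, so the `cacheKey` function and its SHA-256 scheme are assumptions chosen only to show why switching the namespace behaves like a flush.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// cacheKey derives a key from the namespace and the query text (hypothetical
// scheme). Changing the namespace changes every key, so entries stored under
// the old namespace are simply never looked up again.
func cacheKey(namespace, query string) string {
	h := sha256.New()
	h.Write([]byte(namespace))
	h.Write([]byte{0}) // separator so ("ab", "c") and ("a", "bc") differ
	h.Write([]byte(query))
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	q := "SELECT count() FROM events"
	fmt.Println(cacheKey("v1", q) == cacheKey("v1", q)) // true: same namespace hits the same entry
	fmt.Println(cacheKey("v1", q) == cacheKey("v2", q)) // false: a new namespace misses the old entry
}
```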
+Two types of cache configuration are supported:
+- local instance cache
+- distributed cache
+
+#### Local cache
+Local cache is stored on machine's file system. Therefore it is suitable for single replica deployments.
+Configuration template for local cache can be found [here](https://github.com/ContentSquare/chproxy/blob/master/config/#file_system_cache_config)
+
+#### Distributed cache
+Distributed cache relies on external database to share cache across multiple replicas. Therefore it is suitable for
+multiple replicas deployments. Currently only [redis](https://redis.io/) key value store is supported.
+Configuration template for distributed cache can be found [here](https://github.com/ContentSquare/chproxy/blob/master/config/#distributed_cache_config)
+
 ### Security
-`Chproxy`removes all the query params from input requests (except the user's [params](https://github.com/Vertamedia/chproxy/blob/master/config#param_groups_config) and listed [here](https://github.com/Vertamedia/chproxy/blob/master/scope.go#L292))
+`Chproxy`removes all the query params from input requests (except the user's [params](https://github.com/ContentSquare/chproxy/blob/master/config#param_groups_config) and listed [here](https://github.com/ContentSquare/chproxy/blob/master/scope.go#L292))
 before proxying them to `ClickHouse` nodes. This prevents from unsafe overriding
 of various `ClickHouse` [settings](http://clickhouse-docs.readthedocs.io/en/latest/interfaces/http_interface.html).

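The query-param stripping described under Security can be sketched with Go's standard `net/url` package. The `filterParams` helper and the allowlist contents are assumptions for illustration; the real list of permitted params lives in `scope.go` and is not reproduced here.

```go
package main

import (
	"fmt"
	"net/url"
)

// filterParams keeps only the query parameters named in allowed and drops
// everything else before the request is proxied (hypothetical helper).
func filterParams(in url.Values, allowed map[string]bool) url.Values {
	out := url.Values{}
	for name, vals := range in {
		if allowed[name] {
			out[name] = append([]string(nil), vals...)
		}
	}
	return out
}

func main() {
	in, err := url.ParseQuery("query=SELECT+1&max_memory_usage=1&database=default")
	if err != nil {
		panic(err)
	}
	// Hypothetical allowlist; the real one is defined by chproxy.
	allowed := map[string]bool{"query": true, "database": true}
	fmt.Println(filterParams(in, allowed).Encode()) // database=default&query=SELECT+1
}
```

The `max_memory_usage` setting supplied by the client never reaches the ClickHouse node, which is the kind of override this section is guarding against.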
@@ -339,7 +356,7 @@ By default `chproxy` tries detecting the most obvious configuration errors such

 Special option `hack_me_please: true` may be used for disabling all the security-related checks during config validation (if you are feeling lucky :) ).

-#### Example of [full](https://github.com/Vertamedia/chproxy/blob/master/config/testdata/full.yml) configuration:
+#### Example of [full](https://github.com/ContentSquare/chproxy/blob/master/config/testdata/full.yml) configuration:
 ```yml
 # Whether to print debug logs.
 #
@@ -354,18 +371,30 @@ hack_me_please: true
 # Optional response cache configs.
 #
 # Multiple distinct caches with different settings may be configured.
+
+name: "shortterm"
+mode: "file_system"
+file_system:
+  dir: "/path/to/cache/dir"
+  max_size: 150Mb
+expire: 130s
 caches:
   # Cache name, which may be passed into `cache` option on the `user` level.
   #
   # Multiple users may share the same cache.
   - name: "longterm"

-    # Path to directory where cached responses will be stored.
-    dir: "/path/to/longterm/cachedir"
-
-    # Maximum cache size.
-    # `Kb`, `Mb`, `Gb` and `Tb` suffixes may be used.
-    max_size: 100Gb
+    # Cache mode, either [[file_system]] or [[redis]]
+    mode: "file_system"
+
+    # Applicable for cache mode: file_system
+    file_system:
+      # Path to directory where cached responses will be stored.
+      dir: "/path/to/longterm/cachedir"
+
+      # Maximum cache size.
+      # `Kb`, `Mb`, `Gb` and `Tb` suffixes may be used.
+      max_size: 100Gb

     # Expiration time for cached responses.
     expire: 1h
@@ -381,8 +410,14 @@ caches:
     grace_time: 20s

   - name: "shortterm"
-    dir: "/path/to/shortterm/cachedir"
-    max_size: 100Mb
+    mode: "redis"
+
+    # Applicable for cache mode: redis
+    redis:
+      addresses:
+        - "localhost:6379"
+      username: "user"
+      password: "pass"
     expire: 10s

 # Optional network lists, might be used as values for `allowed_networks`.
@@ -627,7 +662,7 @@ clusters:
 allowed_networks: ["office"]
 ```

-#### Full specification is located [here](https://github.com/Vertamedia/chproxy/blob/master/config)
+#### Full specification is located [here](https://github.com/ContentSquare/chproxy/blob/master/config)

 ## Metrics
 Metrics are exposed in [prometheus text format](https://prometheus.io/docs/instrumenting/exposition_formats/) at `/metrics` path.
@@ -660,7 +695,7 @@ Metrics are exposed in [prometheus text format](https://prometheus.io/docs/instr
 | timeout_request_total | Counter | The number of timed out requests | `user`, `cluster`, `cluster_user`, `replica`, `cluster_node` |
 | user_queue_overflow_total | Counter | The number of overflows for per-user request queues | `user`, `cluster`, `cluster_user` |

-An example of [Grafana's](https://grafana.com) dashboard for `chproxy` metrics is available [here](https://github.com/Vertamedia/chproxy/blob/master/chproxy_overview.json)
+An example of [Grafana's](https://grafana.com) dashboard for `chproxy` metrics is available [here](https://github.com/ContentSquare/chproxy/blob/master/chproxy_overview.json)
// AsyncCache is a transactional cache allowing the results from concurrent queries.
+// When query A is equal to query B and A arrives no more than defined graceTime, query A will await for the results of query B for the max time equal to:
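The awaiting behaviour that the `AsyncCache` comment describes can be approximated by the Go sketch below. It is not the code from this commit: the `inflight` type, the `do` method and `errWaitTimeout` are hypothetical names, and because the comment is cut off before it states the maximum wait, the wait is passed in by the caller as `maxWait`.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

var errWaitTimeout = errors.New("gave up waiting for the in-flight query")

// call tracks one query that is currently executing.
type call struct {
	done   chan struct{} // closed when result is ready
	result string
}

// inflight lets identical concurrent queries share a single execution.
type inflight struct {
	mu    sync.Mutex
	calls map[string]*call
}

func newInflight() *inflight { return &inflight{calls: make(map[string]*call)} }

// do runs fn for the first caller of key. Callers that arrive while it is
// still running wait up to maxWait for its result instead of running fn again.
func (g *inflight) do(key string, maxWait time.Duration, fn func() string) (string, error) {
	g.mu.Lock()
	if c, ok := g.calls[key]; ok {
		g.mu.Unlock()
		select {
		case <-c.done:
			return c.result, nil
		case <-time.After(maxWait):
			return "", errWaitTimeout
		}
	}
	c := &call{done: make(chan struct{})}
	g.calls[key] = c
	g.mu.Unlock()

	c.result = fn()
	close(c.done) // publish the result to every waiter

	g.mu.Lock()
	delete(g.calls, key)
	g.mu.Unlock()
	return c.result, nil
}

func main() {
	g := newInflight()
	slow := func() string { time.Sleep(50 * time.Millisecond); return "rows" }
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			res, err := g.do("SELECT 1", time.Second, slow)
			fmt.Println(id, res, err)
		}(i)
	}
	wg.Wait()
}
```

A caller whose wait expires gets `errWaitTimeout` and can decide for itself whether to run the query directly; that fallback is left out of the sketch.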