Commit 7331357

Merge pull request #7 from g-n-a-d/sharding
add_sharding
2 parents dc00f9e + e2f4474 commit 7331357

5 files changed, with 158 additions and 0 deletions

ansible/roles/prometheus/defaults/main.yml

Lines changed: 3 additions & 0 deletions
@@ -40,6 +40,9 @@ prometheus_web_config:
   http_server_config: {}
   basic_auth_users: {}
 
+# Prometheus customized arguments
+prometheus_enable_target_sharding: false
+
 # Configuration file options
 # --------------------------
 prometheus_global:
Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+from ansible.module_utils.basic import AnsibleModule
+import yaml
+import json
+import os
+
+def load_config_file(path):
+    _, ext = os.path.splitext(path)
+    if ext in ['.yaml', '.yml']:
+        with open(path, 'r') as f:
+            return yaml.safe_load(f)
+    elif ext == '.json':
+        with open(path, 'r') as f:
+            return json.load(f)
+
+def save_config_file(data, path):
+    _, ext = os.path.splitext(path)
+    if ext in ['.yaml', '.yml']:
+        with open(path, 'w') as f:
+            yaml.dump(data, f, default_flow_style=False)
+    elif ext == '.json':
+        with open(path, 'w') as f:
+            json.dump(data, f, indent=2, ensure_ascii=False)
+
+def add_sharding(config, modulus, hash_value):
+    for job in config.get('scrape_configs', []):
+        job['relabel_configs'] = [
+            {
+                'source_labels': ['__address__'],
+                'modulus': modulus,
+                'target_label': '__tmp_hash__',
+                'action': 'hashmod'
+            },
+            {
+                'source_labels': ['__tmp_hash__'],
+                'regex': hash_value,
+                'action': 'keep'
+            }
+        ]
+    return config
+
+def main():
+    module = AnsibleModule(
+        argument_spec=dict(
+            source=dict(type='str', required=True),
+            modulus=dict(type='int', required=True),
+            hash_value=dict(type='int', required=True)
+        )
+    )
+
+    source = module.params['source']
+    modulus = module.params['modulus']
+    hash_value = module.params['hash_value']
+
+    config = load_config_file(source)
+    if not isinstance(config, dict):
+        module.exit_json(changed=False)
+        return
+    sharded_config = add_sharding(config, modulus, hash_value)
+    save_config_file(sharded_config, source)
+
+    module.exit_json(changed=True)
+
+if __name__ == '__main__':
+    main()
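
To make the effect of `add_sharding` concrete, the sketch below applies the same two relabel rules to a small in-memory config. The job names, targets, and the pre-existing `instance` rule are invented for illustration. It also includes `append_sharding`, a hypothetical variant that is not part of this commit: the committed function replaces each job's `relabel_configs` outright, whereas the variant keeps any rules a job already defines.

```python
import copy


def sharding_rules(modulus, hash_value):
    """Build the hashmod + keep rule pair that the commit's add_sharding injects."""
    return [
        {
            'source_labels': ['__address__'],
            'modulus': modulus,
            'target_label': '__tmp_hash__',
            'action': 'hashmod',
        },
        {
            'source_labels': ['__tmp_hash__'],
            'regex': hash_value,
            'action': 'keep',
        },
    ]


def add_sharding(config, modulus, hash_value):
    """Same behavior as the commit: overwrite relabel_configs on every job."""
    for job in config.get('scrape_configs', []):
        job['relabel_configs'] = sharding_rules(modulus, hash_value)
    return config


def append_sharding(config, modulus, hash_value):
    """Hypothetical variant (not in the commit): keep a job's existing rules."""
    for job in config.get('scrape_configs', []):
        existing = job.get('relabel_configs') or []
        job['relabel_configs'] = existing + sharding_rules(modulus, hash_value)
    return config


# Sample config; job names, targets, and the existing rule are made up.
sample = {
    'scrape_configs': [
        {'job_name': 'node', 'static_configs': [{'targets': ['10.0.0.1:9100']}]},
        {
            'job_name': 'custom',
            'relabel_configs': [
                {'source_labels': ['__address__'], 'target_label': 'instance'}
            ],
        },
    ]
}

overwritten = add_sharding(copy.deepcopy(sample), modulus=3, hash_value=1)
appended = append_sharding(copy.deepcopy(sample), modulus=3, hash_value=1)

for job in overwritten['scrape_configs']:
    print(job['job_name'], 'rules after overwrite:', len(job['relabel_configs']))
for job in appended['scrape_configs']:
    print(job['job_name'], 'rules after append:', len(job['relabel_configs']))
```

The append variant is not idempotent as written: running it twice on the same file would add the rule pair twice, so a real module would probably need to detect rules that are already present. The committed module avoids that problem by overwriting, at the cost of discarding user-defined relabel rules.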

ansible/roles/prometheus/tasks/config.yml

Lines changed: 36 additions & 0 deletions
@@ -85,8 +85,44 @@
     - "{{ ansitheus_custom_config_dir }}/prometheus/{{ inventory_hostname }}/prometheus.yml"
     - "{{ ansitheus_custom_config_dir }}/prometheus/prometheus.yml"
     - "{{ role_path }}/templates/prometheus.yml.j2"
+
+- name: Shard target configurations
+  inject_sharding_config:
+    source: "{{ prometheus_config_dir }}/prometheus.yml"
+    modulus: "{{ groups['prometheus'] | length }}"
+    hash_value: "{{ groups['prometheus'].index(inventory_hostname) }}"
+  when: prometheus_enable_target_sharding | bool
+
+- name: Find file_sd configurations on prometheus hosts
+  find:
+    paths: "{{ prometheus_config_dir }}/file_sd"
+    patterns: "*.yml,*.yaml,*.json"
+    use_regex: false
+  register: host_file_sd
+  when:
+    - prometheus_enable_target_sharding | bool
+    - prometheus_file_sd is defined
+    - prometheus_file_sd.files | length > 0
+
+- name: Shard target configurations file_sd
+  inject_sharding_config:
+    source: "{{ item.path }}"
+    modulus: "{{ groups['prometheus'] | length }}"
+    hash_value: "{{ groups['prometheus'].index(inventory_hostname) }}"
+  loop: "{{ host_file_sd.files }}"
+  when:
+    - prometheus_enable_target_sharding | bool
+    - host_file_sd is defined
+    - host_file_sd.files | length > 0
+
+- name: Validate prometheus config
+  meta: noop
   notify:
     - Validate prometheus config
+
+- name: Reload prometheus config
+  meta: noop
+  notify:
     - Reload prometheus config
 
 - name: Check prometheus containers
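
The two Jinja expressions passed to `inject_sharding_config` can be read as plain list operations on the `prometheus` inventory group. The sketch below mirrors them in Python; the host names and the `shard_params` helper are hypothetical and serve only to show which `modulus` and `hash_value` each host would receive.

```python
def shard_params(prometheus_group, inventory_hostname):
    """Mirror the task arguments: group size as modulus, host position as index."""
    modulus = len(prometheus_group)                          # groups['prometheus'] | length
    hash_value = prometheus_group.index(inventory_hostname)  # groups['prometheus'].index(...)
    return modulus, hash_value


# Hypothetical inventory group, in inventory order.
prometheus_hosts = ['prom-01', 'prom-02', 'prom-03']

params = {host: shard_params(prometheus_hosts, host) for host in prometheus_hosts}
for host, (modulus, hash_value) in params.items():
    print(f'{host}: modulus={modulus}, hash_value={hash_value}')
```

Because the index comes from a host's position in the group, reordering the inventory changes which shard each node serves, and adding or removing a node changes `modulus`, which reassigns most targets.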
Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+# Target sharding for multi-node Prometheus deployment
+
+## 1. Hashmod
+
+`hashmod` is a built-in relabeling action in Prometheus. In a multi-node Prometheus setup, we can use `hashmod` to distribute the target endpoints across Prometheus nodes, reducing the workload of the Prometheus instance on each node.
+
+### 1.1 How it works
+
+Each target discovered by Prometheus carries a set of labels. `hashmod` hashes the values of the configured `source_labels` and reduces the result modulo a defined `modulus`, producing a `hash_value` for each target, as described in [Hashing and Sharding on Label Values](https://training.promlabs.com/training/relabeling/writing-relabeling-rules/hashing-and-sharding-on-label-values/).
+
+`hashmod` can be configured in the `relabel_configs` field of the Prometheus configuration file `prometheus.yml` or in service discovery files, depending on the specific setup.
+
+Example configuration:
+```yaml
+relabel_configs:
+- action: hashmod
+  modulus: 2
+  source_labels:
+  - __address__
+  target_label: __tmp_hash__
+```
+
+The `hash_value` is assigned to the `__tmp_hash__` label after performing `hashmod` on the value of the `__address__` label.
+
+### 1.2 How we distribute targets
+
+During Ansible execution, the number of Prometheus nodes and the index of each node are determined from the provided inventory. We set `modulus = <number-of-prometheus-nodes>` and `regex = <index-of-current-node>` so that a target is assigned to a Prometheus node only when the target's `hash_value` matches that node's index.
+
+```yaml
+relabel_configs:
+- action: hashmod
+  modulus: <number-of-prometheus-nodes>
+  source_labels:
+  - __address__
+  target_label: __tmp_hash__
+- action: keep
+  regex: <index-of-current-node>
+  source_labels:
+  - __tmp_hash__
+```
+
+**Notes**
+We use the *special internal labels* `__address__` and `__tmp_hash__` in `relabel_configs`, which Prometheus applies to each target before scraping. A target that is not kept is dropped as a whole, so none of its metrics are scraped.
+
+## 2. Setup
+
+> Sharding should only be enabled if Prometheus nodes are running in remote-write mode. It is incompatible with Prometheus running in HA mode, as metrics are distributed across Prometheus instances, and the master Prometheus node may not always contain the desired metrics.
+
+To enable sharding, set the variable `prometheus_enable_target_sharding: true` in the Ansible configuration file.
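
The hashmod-plus-keep pair described in this document can be simulated outside Prometheus to preview how targets would spread across nodes. The sketch below assumes that Prometheus hashes the source label value with MD5 and reduces the last 8 bytes of the digest, read as a big-endian integer, modulo `modulus`; treat that detail as an assumption to verify against the Prometheus relabeling source. The target addresses and helper names are hypothetical.

```python
import hashlib


def hashmod(value, modulus):
    """Assumed Prometheus behavior: MD5, last 8 bytes as big-endian uint64, mod N."""
    digest = hashlib.md5(value.encode('utf-8')).digest()
    return int.from_bytes(digest[8:], 'big') % modulus


def targets_for_shard(targets, modulus, index):
    """Equivalent of the `keep` rule: retain targets whose hash equals this node's index."""
    return [t for t in targets if hashmod(t, modulus) == index]


# Hypothetical __address__ values.
targets = [f'10.0.0.{i}:9100' for i in range(1, 11)]
modulus = 3  # number of Prometheus nodes

shards = [targets_for_shard(targets, modulus, index) for index in range(modulus)]
for index, shard in enumerate(shards):
    print(f'node {index} scrapes {len(shard)} targets: {shard}')
```

Whatever the exact hash function, the scheme guarantees that every target lands on exactly one node, since each address maps to a single value in `0..modulus-1`. The split is only roughly even, and it shifts when `modulus` changes.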

etc/ansitheus/config.yml

Lines changed: 6 additions & 0 deletions
@@ -42,6 +42,12 @@ enable_fluentd: "no"
 enable_openstack_exporter: "no"
 enable_nginx_exporter: "no"
 
+##################
+# Prometheus options
+##################
+
+prometheus_enable_target_sharding: true
+
 ##################
 # Port mappings
 #################
