Commit 69c024e

Merge pull request openshift#2188 from suhanime/doc_update
Add Fixed Labels to metrics doc
2 parents df65e7e + b0b4cd0 commit 69c024e

1 file changed: +68 -66 lines changed

docs/hive_metrics.md

Lines changed: 68 additions & 66 deletions
@@ -39,104 +39,106 @@ Opt into them via `HiveConfig.Spec.MetricsConfig.AdditionalClusterDeploymentLabe
 For example, including `{"ocp_major_version": "hive.openshift.io/version-major"}` will cause affected metrics to include a label key ocp_major_version with the value from the `hive.openshift.io/version-major` ClusterDeployment label -- e.g. "4".
 Every metric that allows optional labels will always have all the labels mentioned present. If the corresponding Cluster Deployment label is not present then the metric label will report its value as "unspecified".

+The hive operator will panic if provided additional labels overlap with the fixed labels of the corresponding metric. Please refer to the fixed labels of each metric [here](#list-of-all-hive-metrics).
+
 Note: It is up to the cluster admins to be mindful of cardinality and ensure these labels are not too specific, like cluster id, otherwise it can negatively impact your observability system's performance

 ### List of all Hive metrics

 #### Hive Operator metrics
 These metrics are observed by the Hive Operator. None of these are optional.

-| Metric Name | Optional Label Support |
-|:-------------------------------:|:----------------------:|
-| hive_hiveconfig_conditions | N |
-| hive_operator_reconcile_seconds | N |
+| Metric Name | Optional Label Support | Fixed Labels |
+|:-------------------------------:|:----------------------:|---------------------------|
+| hive_hiveconfig_conditions | N | {"condition", "reason"} |
+| hive_operator_reconcile_seconds | N | {"controller", "outcome"} |

 #### Metrics reported by all controllers
 These metrics are observed by all Hive Controllers. None of these are optional.

-| Metric Name | Optional Label Support |
-|:-----------------------------------------:|:----------------------:|
-| hive_kube_client_requests_total | N |
-| hive_kube_client_request_seconds | N |
-| hive_kube_client_requests_cancelled_total | N |
+| Metric Name | Optional Label Support | Fixed Labels |
+|:-----------------------------------------:|:----------------------:|----------------------------------------------------------|
+| hive_kube_client_requests_total | N | {"controller", "method", "resource", "remote", "status"} |
+| hive_kube_client_request_seconds | N | {"controller", "method", "resource", "remote", "status"} |
+| hive_kube_client_requests_cancelled_total | N | {"controller", "method", "resource", "remote"} |

 #### ClusterDeployment controller metrics
 These metrics are observed while processing ClusterDeployments. None of these are optional.

-| Metric Name | Optional Label Support |
-|:--------------------------------------------------------:|:----------------------:|
-| hive_cluster_deployment_install_job_duration_seconds | N |
-| hive_cluster_deployment_install_job_delay_seconds | N |
-| hive_cluster_deployment_imageset_job_delay_seconds | N |
-| hive_cluster_deployment_dns_delay_seconds | N |
-| hive_cluster_deployment_completed_install_restart | Y |
-| hive_cluster_deployments_created_total | Y |
-| hive_cluster_deployments_installed_total | Y |
-| hive_cluster_deployments_deleted_total | Y |
-| hive_cluster_deployments_provision_failed_terminal_total | Y |
+| Metric Name | Optional Label Support | Fixed Labels |
+|:--------------------------------------------------------:|:----------------------:|--------------------------------------------------|
+| hive_cluster_deployment_install_job_duration_seconds | N | {} |
+| hive_cluster_deployment_install_job_delay_seconds | N | {} |
+| hive_cluster_deployment_imageset_job_delay_seconds | N | {} |
+| hive_cluster_deployment_dns_delay_seconds | N | {} |
+| hive_cluster_deployment_completed_install_restart | Y | {} |
+| hive_cluster_deployments_created_total | Y | {} |
+| hive_cluster_deployments_installed_total | Y | {} |
+| hive_cluster_deployments_deleted_total | Y | {} |
+| hive_cluster_deployments_provision_failed_terminal_total | Y | {"clusterpool_namespacedname", "failure_reason"} |

 #### ClusterProvision controller metrics
 These metrics are observed while processing ClusterProvisions. None of these are optional.

-| Metric Name | Optional Label Support |
-|:---------------------------------------------:|:----------------------:|
-| hive_cluster_provision_results_total | Y |
-| hive_install_errors | Y |
-| hive_cluster_deployment_install_failure_total | Y |
-| hive_cluster_deployment_install_success_total | Y |
+| Metric Name | Optional Label Support | Fixed Labels |
+|:---------------------------------------------:|:----------------------:|-------------------------------------------------------------------------|
+| hive_cluster_provision_results_total | Y | {"result"} |
+| hive_install_errors | Y | {"reason"} |
+| hive_cluster_deployment_install_failure_total | Y | {"platform", "region", "cluster_version", "workers", "install_attempt"} |
+| hive_cluster_deployment_install_success_total | Y | {"platform", "region", "cluster_version", "workers", "install_attempt"} |

 #### ClusterDeprovision controller metrics
 These metrics are observed while processing ClusterDeprovisions. None of these are optional.

-| Metric Name | Optional Label Support |
-|:------------------------------------------------------:|:----------------------:|
-| hive_cluster_deployment_uninstall_job_duration_seconds | N |
+| Metric Name | Optional Label Support | Fixed Labels |
+|:------------------------------------------------------:|:----------------------:|--------------|
+| hive_cluster_deployment_uninstall_job_duration_seconds | N | {} |

 #### ClusterPool controller metrics
 These metrics are observed while processing ClusterPools. None of these are optional.

-| Metric Name | Optional Label Support |
-|:-------------------------------------------------:|:----------------------:|
-| hive_clusterpool_clusterdeployments_assignable | N |
-| hive_clusterpool_clusterdeployments_claimed | N |
-| hive_clusterpool_clusterdeployments_deleting | N |
-| hive_clusterpool_clusterdeployments_installing | N |
-| hive_clusterpool_clusterdeployments_unclaimed | N |
-| hive_clusterpool_clusterdeployments_standby | N |
-| hive_clusterpool_clusterdeployments_stale | N |
-| hive_clusterpool_clusterdeployments_broken | N |
-| hive_clusterpool_stale_clusterdeployments_deleted | N |
-| hive_clusterclaim_assignment_delay_seconds | N |
+| Metric Name | Optional Label Support | Fixed Labels |
+|:-------------------------------------------------:|:----------------------:|-----------------------------------------------|
+| hive_clusterpool_clusterdeployments_assignable | N | {"clusterpool_namespace", "clusterpool_name"} |
+| hive_clusterpool_clusterdeployments_claimed | N | {"clusterpool_namespace", "clusterpool_name"} |
+| hive_clusterpool_clusterdeployments_deleting | N | {"clusterpool_namespace", "clusterpool_name"} |
+| hive_clusterpool_clusterdeployments_installing | N | {"clusterpool_namespace", "clusterpool_name"} |
+| hive_clusterpool_clusterdeployments_unclaimed | N | {"clusterpool_namespace", "clusterpool_name"} |
+| hive_clusterpool_clusterdeployments_standby | N | {"clusterpool_namespace", "clusterpool_name"} |
+| hive_clusterpool_clusterdeployments_stale | N | {"clusterpool_namespace", "clusterpool_name"} |
+| hive_clusterpool_clusterdeployments_broken | N | {"clusterpool_namespace", "clusterpool_name"} |
+| hive_clusterpool_stale_clusterdeployments_deleted | N | {"clusterpool_namespace", "clusterpool_name"} |
+| hive_clusterclaim_assignment_delay_seconds | N | {"clusterpool_namespace", "clusterpool_name"} |

 #### Metrics controller metrics
 These metrics are accumulated across all instance of that type.
 Some of these metrics are optional and the admin can opt for logging them via `HiveConfig.Spec.MetricsConfig.MetricsWithDuration`

-| Metric Name | Optional Label Support | Optional |
-|:--------------------------------------------------------------:|:----------------------:|:--------:|
-| hive_cluster_deployments | N | N |
-| hive_cluster_deployments_installed | N | N |
-| hive_cluster_deployments_uninstalled | N | N |
-| hive_cluster_deployments_deprovisioning | N | N |
-| hive_cluster_deployments_conditions | N | N |
-| hive_install_jobs | N | N |
-| hive_uninstall_jobs | N | N |
-| hive_imageset_jobs | N | N |
-| hive_selectorsyncset_clusters_total | N | N |
-| hive_selectorsyncset_clusters_unapplied_total | N | N |
-| hive_syncsets_total | N | N |
-| hive_syncsets_unapplied_total | N | N |
-| hive_cluster_deployment_deprovision_underway_seconds | N | N |
-| hive_clustersync_failing_seconds | N | Y |
-| hive_cluster_deployments_hibernation_transition_seconds | N | Y |
-| hive_cluster_deployments_running_transition_seconds | N | Y |
-| hive_cluster_deployments_stopping_seconds | N | Y |
-| hive_cluster_deployments_resuming_seconds | N | Y |
-| hive_cluster_deployments_waiting_for_cluster_operators_seconds | N | Y |
-| hive_controller_reconcile_seconds | N | N |
-| hive_cluster_deployment_syncset_paused | N | N |
-| hive_cluster_deployment_provision_underway_seconds | N | N |
-| hive_cluster_deployment_provision_underway_install_restarts | N | N |
+| Metric Name | Optional Label Support | Optional | Fixed Labels |
+|:--------------------------------------------------------------:|:----------------------:|:--------:|-----------------------------------------------------------------------------------------------------------------|
+| hive_cluster_deployments | N | N | {"cluster_type", "age_lt", "power_state"} |
+| hive_cluster_deployments_installed | N | N | {"cluster_type", "age_lt"} |
+| hive_cluster_deployments_uninstalled | N | N | {"cluster_type", "age_lt", "uninstalled_gt"} |
+| hive_cluster_deployments_deprovisioning | N | N | {"cluster_type", "age_lt", "deprovisioning_gt"} |
+| hive_cluster_deployments_conditions | N | N | {"cluster_type", "age_lt", "condition"} |
+| hive_install_jobs | N | N | {"cluster_type", "state"} |
+| hive_uninstall_jobs | N | N | {"cluster_type", "state"} |
+| hive_imageset_jobs | N | N | {"cluster_type", "state"} |
+| hive_selectorsyncset_clusters_total | N | N | {"name"} |
+| hive_selectorsyncset_clusters_unapplied_total | N | N | {"name"} |
+| hive_syncsets_total | N | N | {} |
+| hive_syncsets_unapplied_total | N | N | {} |
+| hive_cluster_deployment_deprovision_underway_seconds | N | N | {"cluster_deployment", "namespace", "cluster_type"} |
+| hive_clustersync_failing_seconds | Y | Y | {"namespaced_name", "unreachable"} |
+| hive_cluster_deployments_hibernation_transition_seconds | N | Y | {"cluster_version", "platform", "cluster_pool_namespace", "cluster_pool_name"} |
+| hive_cluster_deployments_running_transition_seconds | N | Y | {"cluster_version", "platform", "cluster_pool_namespace", "cluster_pool_name"} |
+| hive_cluster_deployments_stopping_seconds | N | Y | {"cluster_deployment_namespace", "cluster_deployment", "platform", "cluster_version", "cluster_pool_namespace"} |
+| hive_cluster_deployments_resuming_seconds | N | Y | {"cluster_deployment_namespace", "cluster_deployment", "platform", "cluster_version", "cluster_pool_namespace"} |
+| hive_cluster_deployments_waiting_for_cluster_operators_seconds | N | Y | {"cluster_deployment_namespace", "cluster_deployment", "platform", "cluster_version", "cluster_pool_namespace"} |
+| hive_controller_reconcile_seconds | N | N | {"controller", "outcome"} |
+| hive_cluster_deployment_syncset_paused | N | N | {"cluster_deployment", "namespace", "cluster_type"} |
+| hive_cluster_deployment_provision_underway_seconds | N | N | {"cluster_deployment", "namespace", "cluster_type", "condition", "reason", "platform", "image_set"} |
+| hive_cluster_deployment_provision_underway_install_restarts | N | N | {"cluster_deployment", "namespace", "cluster_type", "condition", "reason", "platform", "image_set"} |

 ### Example: Configure metricsConfig
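Reviewer note: the paragraph added in this diff says the operator panics when additional labels overlap a metric's fixed labels, and the surrounding context says a missing ClusterDeployment label is reported as "unspecified". Both rules could be sketched roughly as below. This is an illustrative Python sketch only, not Hive's Go implementation: the function names `overlapping_labels` and `resolve_label_values` and the label key `example.com/result` are hypothetical, while the fixed label `result` is taken from the `hive_cluster_provision_results_total` row.

```python
UNSPECIFIED = "unspecified"  # value the doc says is reported for a missing label


def overlapping_labels(fixed_labels, additional_labels):
    """Return the additional metric label keys that collide with fixed labels.

    fixed_labels: iterable of a metric's fixed label names.
    additional_labels: dict mapping metric label key -> ClusterDeployment label key.
    """
    return sorted(set(fixed_labels) & set(additional_labels))


def resolve_label_values(additional_labels, cd_labels):
    """Map every configured metric label to the ClusterDeployment label value.

    Every configured key is always present in the result; a ClusterDeployment
    label that is absent yields "unspecified".
    """
    return {
        metric_label: cd_labels.get(cd_label_key, UNSPECIFIED)
        for metric_label, cd_label_key in additional_labels.items()
    }


if __name__ == "__main__":
    fixed = ["result"]  # fixed labels of hive_cluster_provision_results_total
    additional = {"ocp_major_version": "hive.openshift.io/version-major"}

    # No overlap: this configuration would be accepted.
    print(overlapping_labels(fixed, additional))

    # Hypothetical clashing configuration: "result" is already a fixed label.
    print(overlapping_labels(fixed, {"result": "example.com/result"}))

    # Label present on the ClusterDeployment versus absent.
    print(resolve_label_values(additional, {"hive.openshift.io/version-major": "4"}))
    print(resolve_label_values(additional, {}))
```

A non-empty result from `overlapping_labels` corresponds to the case the doc warns about. Running a check like this against the Fixed Labels column before editing `HiveConfig` may help avoid the panic.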
