Component(s)
exporter/awsemf, receiver/prometheus, receiver/statsd
What happened?
Description
We have `aws-otel-collector` `0.30.0` running alongside a Java App (which exposes Prometheus metrics) and an AWS/Envoy sidecar (which exposes StatsD metrics). `aws-otel-collector` is configured to process both of those sources using separate pipelines, and to push the metrics to AWS CloudWatch using the `awsemfexporter`. We have previously used version `0.16.1` of the `aws-otel-collector` and are only now upgrading.
Previously, metrics from both sources were stored in CloudWatch "as-is". After the upgrade, however, we noticed that the Prometheus metrics gained a new dimension, `OTelLib`, with the value `otelcol/prometheusreceiver`. This, obviously, broke a few things on our end (like CloudWatch Alarms).
After digging a bit, I found these two tickets, which were supposed to bring both of these receivers to the same place in terms of populating `otel.library.name`:
- [receiver/prometheusreceiver] Send receiver name and version with metrics #20902
- [receiver/statsd] Populate library.version and library.name in scope #23563
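For context, here is a minimal sketch of what I understand those changes to boil down to on the receiver side. This is not the receivers' actual code, just the standard `pdata` API being used to attach a scope name/version to the metrics a receiver produces; the concrete values are illustrative:

```go
package main

import (
	"fmt"

	"go.opentelemetry.io/collector/pdata/pmetric"
)

func main() {
	md := pmetric.NewMetrics()
	rm := md.ResourceMetrics().AppendEmpty()

	// After #20902 / #23563 the receivers fill in the instrumentation scope
	// of the metrics they emit; the values below are examples, not the
	// receivers' real code.
	sm := rm.ScopeMetrics().AppendEmpty()
	sm.Scope().SetName("otelcol/prometheusreceiver")
	sm.Scope().SetVersion("0.78.0")

	// A metric produced by the receiver lives under that scope.
	m := sm.Metrics().AppendEmpty()
	m.SetName("kafka_consumer_consumer_fetch_manager_metrics_records_lag")
	m.SetEmptyGauge().DataPoints().AppendEmpty().SetDoubleValue(42)

	fmt.Println(sm.Scope().Name(), sm.Scope().Version())
}
```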
Unfortunately, I was not able to grasp how that translates into the `OTelLib` metric dimension set by the `awsemfexporter`, but it seems related at this point.

My understanding is that it's the de-facto standard for receivers to add the name and version of the library to the processed metrics, but I do not understand how, or why, that information is being added as a dimension. I also do not understand whether that's an expected outcome, so it's hard for me to figure out whether this is a bug in the `prometheusreceiver` (that it adds it as a dimension), in the `statsdreceiver` (that it doesn't), or in the `awsemfexporter`. I'd be grateful for any guidance on this matter.
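I have not traced the exact code path, but the behaviour we observe is consistent with something along these lines. This is purely an illustrative reconstruction, not the actual `awsemfexporter` code:

```go
package main

import "fmt"

// buildLabels is a hypothetical reconstruction of the observed behaviour:
// when the instrumentation scope name is non-empty, it appears to get
// attached to every datapoint as an extra "OTelLib" label, which then ends
// up as a CloudWatch dimension in the EMF output.
func buildLabels(scopeName string, attrs map[string]string) map[string]string {
	labels := make(map[string]string, len(attrs)+1)
	for k, v := range attrs {
		labels[k] = v
	}
	if scopeName != "" {
		labels["OTelLib"] = scopeName
	}
	return labels
}

func main() {
	// prometheusreceiver now sets a scope name -> OTelLib shows up as a dimension.
	fmt.Println(buildLabels("otelcol/prometheusreceiver", map[string]string{"TaskId": "abc123"}))
	// statsdreceiver (at least in this build) apparently doesn't -> no OTelLib dimension.
	fmt.Println(buildLabels("", map[string]string{"TaskId": "abc123"}))
}
```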
Steps to Reproduce
- Use the collector configuration below with two separate sources of metrics (StatsD and Prometheus).
- You can adjust (or disable) the metric filtering if your sources differ from mine.
Expected Result
I would expect the following:
- Either the receivers should produce metrics the same way, so that the `awsemfexporter` adds the new `OTelLib` dimension regardless of where the metrics come from, or the dimension should not be added at all. I'm not sure what is considered the "correct" behaviour here; I would expect it to be consistent across receivers, however.
- I'm not very proficient in Go, but from what I can make of the `awsemfexporter` code, it has dedicated logic to handle that `OTelLib` dimension. I think it would be a good idea to add a switch that controls whether the `OTelLib` dimension is added or not (see the sketch after this list). In our case, forcefully adding this new dimension to all collected metrics will break A LOT of things around our observability solution.
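Something along these lines is what I have in mind. The `drop_otellib_dimension` key is purely hypothetical and does not exist in the current `awsemfexporter`; it's only meant to illustrate the kind of switch I'm asking for:

```yaml
"exporters":
  "awsemf/prometheus/custom_metrics":
    "dimension_rollup_option": "NoDimensionRollup"
    "log_group_name": "/aws/ecs/staging/kafka-snowflake-connector"
    "log_stream_name": "emf/otel/prometheus/custom_metrics/{TaskId}"
    "namespace": "staging/KafkaSnowflakeConnector"
    # hypothetical option, not implemented today:
    "drop_otellib_dimension": true
```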
Actual Result
- Metrics collected by the `prometheusreceiver` are stored by the `awsemfexporter` with an additional `OTelLib` dimension set to `otelcol/prometheusreceiver`.
- Metrics collected by the `statsdreceiver` are stored by an identical `awsemfexporter` configuration without the `OTelLib` dimension.
- There's no way to configure the `awsemfexporter` so that it does not add the `OTelLib` dimension.
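The only stopgap I could think of is blanking the scope name before the exporter, e.g. with the transform processor. This is untested, and it assumes both that the ADOT build ships that processor and that `OTelLib` is only added when the scope name is non-empty:

```yaml
"processors":
  "transform/drop_scope_name":
    "metric_statements":
    - "context": "scope"
      "statements":
      # untested: resets the scope name so it (hopefully) never becomes an OTelLib dimension
      - "set(name, \"\")"
```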
Collector version
v0.78.0 (according to: https://github.com/aws-observability/aws-otel-collector/releases/tag/v0.30.0)
Environment information
Environment
OS: AWS ECS / Fargate
We're running a custom-built Docker image, based on `amazonlinux:2`, with a Dockerfile looking like the one below:
```dockerfile
FROM amazonlinux:2 as appmesh-otel-collector
ARG OTEL_VERSION=0.30.0
RUN yum install -y \
      procps \
      shadow-utils \
      https://aws-otel-collector.s3.amazonaws.com/amazon_linux/amd64/v${OTEL_VERSION}/aws-otel-collector.rpm \
    && yum clean all
RUN useradd -m --uid 1337 sidecar && \
    echo "sidecar ALL=NOPASSWD: ALL" >> /etc/sudoers && \
    chown -R sidecar /opt/aws/aws-otel-collector
USER sidecar
ENV RUN_IN_CONTAINER="True"
ENV HOME="/home/sidecar"
ENTRYPOINT ["/opt/aws/aws-otel-collector/bin/aws-otel-collector"]
```
OpenTelemetry Collector configuration
"exporters":
"awsemf/prometheus/custom_metrics":
"dimension_rollup_option": "NoDimensionRollup"
"log_group_name": "/aws/ecs/staging/kafka-snowflake-connector"
"log_stream_name": "emf/otel/prometheus/custom_metrics/{TaskId}"
"namespace": "staging/KafkaSnowflakeConnector"
"awsemf/statsd/envoy_metrics":
"dimension_rollup_option": "NoDimensionRollup"
"log_group_name": "/aws/ecs/staging/kafka-snowflake-connector"
"log_stream_name": "emf/otel/statsd/envoy_metrics/{TaskId}"
"namespace": "staging/AppMeshEnvoy"
"processors":
"batch/prometheus/custom_metrics":
"timeout": "60s"
"batch/statsd/envoy_metrics":
"timeout": "60s"
"filter/prometheus/custom_metrics":
"metrics":
"include":
"match_type": "regexp"
"metric_names":
- "^kafka_consumer_consumer_fetch_manager_metrics_bytes_consumed_rate$"
- "^kafka_consumer_consumer_fetch_manager_metrics_records_consumed_rate$"
- "^kafka_connect_connect_worker_metrics_connector_running_task_count$"
- "^kafka_connect_connect_worker_metrics_connector_failed_task_count$"
- "^kafka_consumer_consumer_fetch_manager_metrics_records_lag_max$"
- "^kafka_consumer_consumer_fetch_manager_metrics_records_lag$"
- "^snowflake_kafka_connector_.*_OneMinuteRate$"
"filter/statsd/envoy_metrics":
"metrics":
"include":
"match_type": "regexp"
"metric_names":
- "^envoy\\.http\\.rq_total$"
- "^envoy\\.http\\.downstream_rq_xx$"
- "^envoy\\.http\\.downstream_rq_total$"
- "^envoy\\.http\\.downstream_rq_time$"
- "^envoy\\.cluster\\.upstream_cx_connect_timeout$"
- "^envoy\\.cluster\\.upstream_rq_timeout$"
- "^envoy\\.appmesh\\.RequestCountPerTarget$"
- "^envoy\\.appmesh\\.TargetResponseTime$"
- "^envoy\\.appmesh\\.HTTPCode_.+$"
"resource":
"attributes":
- "action": "extract"
"key": "aws.ecs.task.arn"
"pattern": "^arn:aws:ecs:(?P<Region>.*):(?P<AccountId>.*):task/(?P<ClusterName>.*)/(?P<TaskId>.*)$"
"resourcedetection":
"detectors":
- "env"
- "ecs"
"receivers":
"prometheus/custom_metrics":
"config":
"global":
"scrape_interval": "1m"
"scrape_timeout": "10s"
"scrape_configs":
- "job_name": "staging/KafkaSnowflakeConnector"
"metrics_path": ""
"sample_limit": 10000
"static_configs":
- "targets":
- "localhost:9404"
"statsd/envoy_metrics":
"aggregation_interval": "60s"
"endpoint": "0.0.0.0:8125"
"service":
"pipelines":
"metrics/prometheus/custom_metrics":
"exporters":
- "awsemf/prometheus/custom_metrics"
"processors":
- "resourcedetection"
- "resource"
- "filter/prometheus/custom_metrics"
- "batch/prometheus/custom_metrics"
"receivers":
- "prometheus/custom_metrics"
"metrics/statsd/envoy_metrics":
"exporters":
- "awsemf/statsd/envoy_metrics"
"processors":
- "resourcedetection"
- "resource"
- "filter/statsd/envoy_metrics"
- "batch/statsd/envoy_metrics"
"receivers":
- "statsd/envoy_metrics"
Log output
N/A
Additional context
N/A