prometheusreceiver and statsdreceiver behave differently in terms of setting "OTelLib" when awsemfexporter is used #24298

@mkielar

Component(s)

exporter/awsemf, receiver/prometheus, receiver/statsd

What happened?

Description

We have aws-otel-collector 0.30.0 running alongside a Java app (which exposes Prometheus metrics) and an AWS/Envoy sidecar (which exposes StatsD metrics). aws-otel-collector is configured to process both sources in separate pipelines and to push the metrics to AWS CloudWatch using awsemfexporter. We previously used version 0.16.1 of aws-otel-collector and are only now upgrading.

Previously, metrics from both sources were stored in CloudWatch "as-is". After the upgrade, however, we noticed that the Prometheus metrics gained a new dimension, OTelLib, with the value otelcol/prometheusreceiver. This obviously broke a few things on our end (such as CloudWatch alarms).

After digging a bit, I found these two tickets, which were supposed to get both receivers to the same place in terms of populating otel.library.name:

Unfortunately, I was not able to grasp how that translates into the OTelLib metric dimension set by awsemfexporter, but at this point it seems related.

My understanding is that it's a de facto standard for receivers to attach the name and version of the instrumentation library to the metrics they process, but I do not understand how, or why, that information ends up as a dimension. I also do not know whether that is the expected outcome, so it's hard for me to tell whether this is a bug in prometheusreceiver (that it adds it as a dimension), statsdreceiver (that it doesn't), or awsemfexporter. I'd be grateful for any guidance on this matter.
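For what it's worth, one way to see where that value comes from is to temporarily add a logging exporter to both pipelines and compare the InstrumentationScope printed for each receiver. This is only a sketch, and it assumes the logging exporter is bundled with the aws-otel-collector distribution (I have not verified that):

exporters:
  logging:
    verbosity: detailed

service:
  pipelines:
    metrics/prometheus/custom_metrics:
      receivers: [prometheus/custom_metrics]
      # processors omitted for brevity
      exporters: [awsemf/prometheus/custom_metrics, logging]
    metrics/statsd/envoy_metrics:
      receivers: [statsd/envoy_metrics]
      exporters: [awsemf/statsd/envoy_metrics, logging]

If the instrumentation scope is indeed the source, metrics coming from prometheusreceiver should print a non-empty InstrumentationScope (otelcol/prometheusreceiver) while the StatsD ones should not, matching the difference we see in the OTelLib dimension.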

Steps to Reproduce

  1. Use the collector configuration below with two separate sources of metrics (StatsD and Prometheus).
  2. Adjust (or disable) the metric filtering if your sources differ from mine.

Expected Result

I would expect the following:

  1. The receivers should produce metrics the same way, so that awsemfexporter either adds the OTelLib dimension regardless of where the metrics come from, or does not add it at all. I'm not sure which is considered the "correct" behaviour here, but I would expect it to be consistent across receivers.
  2. I'm not very proficient in Go, but from what I can make of the awsemfexporter code, it has dedicated logic for the OTelLib dimension. I think it would be a good idea to add a switch that controls whether the OTelLib dimension is added (see the hypothetical snippet after this list). In our case, forcefully adding this new dimension to all collected metrics will break a lot of things in our observability solution.
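To make that concrete, the kind of switch I have in mind would look roughly like this. To be clear, this option does not exist today; the name is purely hypothetical:

exporters:
  awsemf/prometheus/custom_metrics:
    # hypothetical option, not present in the current awsemfexporter
    add_otel_lib_dimension: false
    dimension_rollup_option: NoDimensionRollup
    log_group_name: /aws/ecs/staging/kafka-snowflake-connector
    log_stream_name: emf/otel/prometheus/custom_metrics/{TaskId}
    namespace: staging/KafkaSnowflakeConnector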

Actual Result

  1. Metrics collected by prometheusreceiver are stored by awsemfexporter with an additional OTelLib dimension set to otelcol/prometheusreceiver.
  2. Metrics collected by statsdreceiver are stored by an identically configured awsemfexporter without the OTelLib dimension.
  3. There's no way to configure awsemfexporter so that it does not add the OTelLib dimension (the closest thing to a workaround I can think of is sketched below).
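The closest thing to a workaround I can think of would be to blank out the scope name before the exporter, for example with the transform processor. This is an unverified sketch: I don't know whether the transform processor ships with aws-otel-collector, nor whether awsemfexporter really derives OTelLib from the scope name rather than from something else:

processors:
  transform/drop_scope:
    metric_statements:
      - context: scope
        statements:
          # unverified: clears the instrumentation scope name so that, if OTelLib
          # is derived from it, the prometheus pipeline behaves like the statsd one
          - set(name, "")

If that assumption holds, adding transform/drop_scope to metrics/prometheus/custom_metrics before the batch processor should make both pipelines behave the same; I have not tested it.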

Collector version

v0.78.0 (according to: https://github.com/aws-observability/aws-otel-collector/releases/tag/v0.30.0)

Environment information

Environment

OS: AWS ECS / Fargate
We're running a custom-built Docker image, based on amazonlinux:2, with a Dockerfile looking like the one below:

FROM amazonlinux:2 as appmesh-otel-collector
ARG OTEL_VERSION=0.30.0
RUN yum install -y \
        procps \
        shadow-utils \
        https://aws-otel-collector.s3.amazonaws.com/amazon_linux/amd64/v${OTEL_VERSION}/aws-otel-collector.rpm \
    && yum clean all
RUN useradd -m --uid 1337 sidecar && \
    echo "sidecar ALL=NOPASSWD: ALL" >> /etc/sudoers && \
    chown -R sidecar /opt/aws/aws-otel-collector
USER sidecar
ENV RUN_IN_CONTAINER="True"
ENV HOME="/home/sidecar"
ENTRYPOINT ["/opt/aws/aws-otel-collector/bin/aws-otel-collector"]

OpenTelemetry Collector configuration

"exporters":
  "awsemf/prometheus/custom_metrics":
    "dimension_rollup_option": "NoDimensionRollup"
    "log_group_name": "/aws/ecs/staging/kafka-snowflake-connector"
    "log_stream_name": "emf/otel/prometheus/custom_metrics/{TaskId}"
    "namespace": "staging/KafkaSnowflakeConnector"
  "awsemf/statsd/envoy_metrics":
    "dimension_rollup_option": "NoDimensionRollup"
    "log_group_name": "/aws/ecs/staging/kafka-snowflake-connector"
    "log_stream_name": "emf/otel/statsd/envoy_metrics/{TaskId}"
    "namespace": "staging/AppMeshEnvoy"
"processors":
  "batch/prometheus/custom_metrics":
    "timeout": "60s"
  "batch/statsd/envoy_metrics":
    "timeout": "60s"
  "filter/prometheus/custom_metrics":
    "metrics":
      "include":
        "match_type": "regexp"
        "metric_names":
        - "^kafka_consumer_consumer_fetch_manager_metrics_bytes_consumed_rate$"
        - "^kafka_consumer_consumer_fetch_manager_metrics_records_consumed_rate$"
        - "^kafka_connect_connect_worker_metrics_connector_running_task_count$"
        - "^kafka_connect_connect_worker_metrics_connector_failed_task_count$"
        - "^kafka_consumer_consumer_fetch_manager_metrics_records_lag_max$"
        - "^kafka_consumer_consumer_fetch_manager_metrics_records_lag$"
        - "^snowflake_kafka_connector_.*_OneMinuteRate$"
  "filter/statsd/envoy_metrics":
    "metrics":
      "include":
        "match_type": "regexp"
        "metric_names":
        - "^envoy\\.http\\.rq_total$"
        - "^envoy\\.http\\.downstream_rq_xx$"
        - "^envoy\\.http\\.downstream_rq_total$"
        - "^envoy\\.http\\.downstream_rq_time$"
        - "^envoy\\.cluster\\.upstream_cx_connect_timeout$"
        - "^envoy\\.cluster\\.upstream_rq_timeout$"
        - "^envoy\\.appmesh\\.RequestCountPerTarget$"
        - "^envoy\\.appmesh\\.TargetResponseTime$"
        - "^envoy\\.appmesh\\.HTTPCode_.+$"
  "resource":
    "attributes":
    - "action": "extract"
      "key": "aws.ecs.task.arn"
      "pattern": "^arn:aws:ecs:(?P<Region>.*):(?P<AccountId>.*):task/(?P<ClusterName>.*)/(?P<TaskId>.*)$"
  "resourcedetection":
    "detectors":
    - "env"
    - "ecs"
"receivers":
  "prometheus/custom_metrics":
    "config":
      "global":
        "scrape_interval": "1m"
        "scrape_timeout": "10s"
      "scrape_configs":
      - "job_name": "staging/KafkaSnowflakeConnector"
        "metrics_path": ""
        "sample_limit": 10000
        "static_configs":
        - "targets":
          - "localhost:9404"
  "statsd/envoy_metrics":
    "aggregation_interval": "60s"
    "endpoint": "0.0.0.0:8125"
"service":
  "pipelines":
    "metrics/prometheus/custom_metrics":
      "exporters":
      - "awsemf/prometheus/custom_metrics"
      "processors":
      - "resourcedetection"
      - "resource"
      - "filter/prometheus/custom_metrics"
      - "batch/prometheus/custom_metrics"
      "receivers":
      - "prometheus/custom_metrics"
    "metrics/statsd/envoy_metrics":
      "exporters":
      - "awsemf/statsd/envoy_metrics"
      "processors":
      - "resourcedetection"
      - "resource"
      - "filter/statsd/envoy_metrics"
      - "batch/statsd/envoy_metrics"
      "receivers":
      - "statsd/envoy_metrics"

Log output

N/A

Additional context

N/A
