Incorrect metric values after importing to Cloud Monitoring in GCP #37570


Closed
sachin-wickramasinghe opened this issue Jan 29, 2025 · 5 comments

@sachin-wickramasinghe

Describe the bug

I used opentelemetry-collector-contrib:0.118.0 to collect metrics from the RabbitMQ Prometheus endpoint and export them to the Google Cloud Monitoring API. However, after the import, the values shown for some metrics in Metrics Explorer in Cloud Monitoring do not match the values that were exported.

Steps to reproduce

What did you expect to see?

| Metric Name | Value |
| --- | --- |
| rabbitmq_connections | 1 |
| rabbitmq_connections_opened_total | 1 |
| rabbitmq_global_messages_acknowledged_total | 0 |
| rabbitmq_identity_info | 1 |
| rabbitmq_process_max_tcp_sockets | 943.629k |

What did you see instead?

| Metric Name | Value |
| --- | --- |
| rabbitmq_connections | 0 |
| rabbitmq_connections_opened_total | 0 |
| rabbitmq_global_messages_acknowledged_total | 0 |
| rabbitmq_identity_info | 1 |
| rabbitmq_process_max_tcp_sockets | 943.629k |

What version did you use?

opentelemetry-collector-contrib:0.118.0

What config did you use?

otel-collector-config.yaml
```yaml
apiVersion: v1
data:
  config.yaml: |
    receivers:
      prometheus:
        config:
          scrape_configs:
            - job_name: 'rabbitmq'
              scrape_interval: 5s
              static_configs:
                - targets: ['rabbitmq-prometheus.rabbitmq.svc.cluster.local:15692']

    processors:
      resourcedetection:
        detectors: [env, gcp]
        timeout: 10s
        override: false

      filter/1:
        metrics:
          include:
            match_type: regexp
            metric_names:
              - "rabbitmq_process_max_tcp_sockets"
              - "rabbitmq_queue_consumers"
              - "rabbitmq_queue_messages"
              - "rabbitmq_queue_messages_ready"
              - "rabbitmq_queue_messages_unacked"
              - "rabbitmq_connections_opened_total"
              - "rabbitmq_queues"
              - "rabbitmq_build_info"
              - "rabbitmq_identity_info"
              - "rabbitmq_global_messages_received_total"
              - "rabbitmq_global_messages_acknowledged_total"
              - "rabbitmq_queue_messages_published_total"
              - "rabbitmq_connections"

    exporters:
      googlecloud:
        log:
          default_log_name: opentelemetry.io/collector-exported-log

    service:
      pipelines:
        metrics:
          receivers: [prometheus]
          processors: [resourcedetection, filter/1]
          exporters: [googlecloud]
kind: ConfigMap
metadata:
  name: otel-collector-config
  namespace: opentelemetry
```
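For reference, one way to cross-check what the endpoint actually exposes, independently of Cloud Monitoring, is to scrape it directly. The Go sketch below is illustrative only and not part of the deployment; it assumes the scrape target from the config above is reachable from where it runs, and it prints the raw exposition lines for a few of the filtered metrics so they can be compared against what Metrics Explorer shows.

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"net/http"
	"strings"
)

func main() {
	// Same target as the prometheus receiver's static_configs entry above.
	const target = "http://rabbitmq-prometheus.rabbitmq.svc.cluster.local:15692/metrics"

	// A few of the metric names from the filter/1 processor.
	wanted := []string{
		"rabbitmq_connections",
		"rabbitmq_connections_opened_total",
		"rabbitmq_global_messages_acknowledged_total",
		"rabbitmq_identity_info",
		"rabbitmq_process_max_tcp_sockets",
	}

	resp, err := http.Get(target)
	if err != nil {
		log.Fatalf("scrape failed: %v", err)
	}
	defer resp.Body.Close()

	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		if strings.HasPrefix(line, "#") {
			continue // skip HELP/TYPE comment lines
		}
		for _, name := range wanted {
			if strings.HasPrefix(line, name) {
				fmt.Println(line) // raw exposition line: name{labels} value
				break
			}
		}
	}
	if err := scanner.Err(); err != nil {
		log.Fatalf("reading response: %v", err)
	}
}
```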

Deployment.yaml
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
  namespace: opentelemetry
spec:
  replicas: 1
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      containers:
      - args:
        - --config=/etc/otel-collector/config.yaml
        command:
        - ./otelcol-contrib
        image: otel/opentelemetry-collector-contrib:0.118.0
        imagePullPolicy: Always
        name: otel-collector
        volumeMounts:
        - mountPath: /etc/otel-collector
          name: config-volume
      volumes:
      - configMap:
          name: otel-collector-config
        name: config-volume
```

Environment

I used Google Kubernetes Engine (GKE) to deploy the collector with the above-mentioned Docker image.

System Information

  • OS: Google Kubernetes Engine (GKE)
  • Kernel version: 6.1.100+
  • OS image: Container-Optimized OS from Google
  • Container runtime version: containerd://1.7.22
  • Kubelet version: v1.30.5-gke.1443001
  • Kube-proxy version: v1.30.5-gke.1443001
@sachin-wickramasinghe added the bug label Jan 29, 2025
@mx-psi transferred this issue from open-telemetry/opentelemetry-collector Jan 29, 2025
@crobert-1 added the receiver/prometheus, exporter/googlecloud, and needs triage labels Jan 29, 2025

Pinging code owners for receiver/prometheus: @Aneurysm9 @dashpole. See Adding Labels via Comments if you do not have permissions to add labels yourself. For example, comment '/label priority:p2 -needs-triaged' to set the priority and remove the needs-triaged label.


Pinging code owners for exporter/googlecloud: @aabmass @dashpole @jsuereth @punya @psx95. See Adding Labels via Comments if you do not have permissions to add labels yourself. For example, comment '/label priority:p2 -needs-triaged' to set the priority and remove the needs-triaged label.

@dashpole removed the needs triage label Jan 30, 2025
@dashpole self-assigned this Jan 30, 2025

dashpole commented Jan 30, 2025

I know this documentation is for Google Managed Prometheus, but in this case it applies to the googlecloud exporter as well, and it does a good job of explaining why this happens. In short, Google Cloud Monitoring requires start timestamps. The Prometheus ecosystem is in the process of adding start timestamps, but most existing Prometheus endpoints don't include them today. To address this, the googlecloud exporter drops the first point that doesn't have a start timestamp and uses that point's timestamp as the start timestamp for all future points. To keep rates correct, it also "subtracts" the value of the initial point from all future points.
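A minimal Go sketch of that behavior, applied to a single series with hypothetical types (this is not the exporter's actual code), shows why a counter such as rabbitmq_connections_opened_total that is already 1 at the first scrape and never increases afterwards shows up as 0:

```go
package main

import (
	"fmt"
	"time"
)

// point is one scraped sample of a cumulative (counter) metric.
type point struct {
	ts    time.Time
	value float64
}

// normalizedPoint is what would be sent to Cloud Monitoring.
type normalizedPoint struct {
	start, ts time.Time
	value     float64
}

// normalizer mimics the behavior described above for one series; the real
// exporter tracks a reference point per time series.
type normalizer struct {
	ref *point
}

// normalize drops the first point (no start timestamp is known for it), keeps it
// as the reference, and rewrites later points: start = ref.ts, value -= ref.value.
func (n *normalizer) normalize(p point) (normalizedPoint, bool) {
	if n.ref == nil {
		n.ref = &p
		return normalizedPoint{}, false // dropped
	}
	return normalizedPoint{start: n.ref.ts, ts: p.ts, value: p.value - n.ref.value}, true
}

func main() {
	n := &normalizer{}
	t0 := time.Now()
	// rabbitmq_connections_opened_total scraped every 5s, constant at 1:
	for i, v := range []float64{1, 1, 1} {
		p := point{ts: t0.Add(time.Duration(5*i) * time.Second), value: v}
		if out, ok := n.normalize(p); ok {
			fmt.Printf("exported: value=%v start=%s\n", out.value, out.start.Format(time.RFC3339))
		} else {
			fmt.Println("first point dropped and kept as the reference")
		}
	}
	// Prints value=0 for every exported point, which matches the 0 shown in
	// Metrics Explorer for a counter that never increased after the first scrape.
}
```

Once the counter actually increases (e.g. a second connection is opened), the exported value becomes the delta relative to the reference point, so rates computed after the first scrape remain correct.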

I'm working on factoring this logic out into its own processor to make it more transparent in the collector: #37186


This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions bot added the Stale label Mar 31, 2025

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions bot closed this as not planned May 30, 2025