Conversation

@MrAlias
Contributor

@MrAlias MrAlias commented Dec 5, 2025

Fix #7673

Issue being addressed:

  1. fn is called
  2. It returns an error
  3. The code checks whether the error is retryable; it always is
  4. Time delay is checked
  5. Wait is called
  6. The wait select statement is evaluated
    • On slow systems both cases are ready, so either one may be chosen
    • On fast systems only the context cancellation case is ready
      • The retry stops here with only 1 execution
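The non-determinism in step 6 comes from how Go evaluates `select`: when more than one case is ready, one is picked pseudo-randomly. The following is a minimal sketch of that behavior, not the exporter code. The `pickCounts` helper is hypothetical, and a closed channel stands in for a retry timer that has already elapsed.

```go
package main

import (
	"context"
	"fmt"
)

// pickCounts (hypothetical helper) runs a two-case select n times with both
// cases already ready and reports how often each case was chosen.
func pickCounts(n int) (ctxDone, timerFired int) {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // ctx.Done() is ready from now on.

	// Assumption for illustration: a closed channel simulates a retry timer
	// whose delay has already elapsed, so this case is always ready too.
	fired := make(chan struct{})
	close(fired)

	for i := 0; i < n; i++ {
		select {
		case <-ctx.Done():
			ctxDone++
		case <-fired:
			timerFired++
		}
	}
	return ctxDone, timerFired
}

func main() {
	done, fired := pickCounts(1000)
	// The split varies from run to run; neither case is preferred.
	fmt.Printf("context case: %d, timer case: %d\n", done, fired)
}
```

Because neither case is preferred, a retry loop that relies only on this `select` can sometimes take the timer branch even though the context has already ended.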

Do not rely on a non-deterministic select statement to catch an ended context prior to waiting for a retry delay. Explicitly check the context prior to entering the wait.

This resolves the flaky test and ensures that, in normal operation, requests with a canceled context end without having to wait for any additional delay.

@MrAlias MrAlias added this to the v1.40.0 milestone Dec 5, 2025
@MrAlias MrAlias added the bug (Something isn't working), Skip Changelog (PRs that do not require a CHANGELOG.md entry), and pkg:exporter:otlp (Related to the OTLP exporter package) labels Dec 5, 2025
Fix open-telemetry#7673

Do not rely on non-deterministic `select` statement to catch ended
context prior to waiting for a retry delay. Explicitly check the context
prior to entering the wait.
@codecov

codecov bot commented Dec 5, 2025

Codecov Report

❌ Patch coverage is 50.00000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.0%. Comparing base (61765e7) to head (fedaeb3).
⚠️ Report is 6 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...s/otlp/otlplog/otlploggrpc/internal/retry/retry.go | 50.0% | 0 Missing and 1 partial ⚠️ |
| ...s/otlp/otlplog/otlploghttp/internal/retry/retry.go | 50.0% | 0 Missing and 1 partial ⚠️ |
| .../otlpmetric/otlpmetricgrpc/internal/retry/retry.go | 50.0% | 0 Missing and 1 partial ⚠️ |
| .../otlpmetric/otlpmetrichttp/internal/retry/retry.go | 50.0% | 0 Missing and 1 partial ⚠️ |
| ...lp/otlptrace/otlptracegrpc/internal/retry/retry.go | 50.0% | 0 Missing and 1 partial ⚠️ |
| ...lp/otlptrace/otlptracehttp/internal/retry/retry.go | 50.0% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files

@@           Coverage Diff           @@
##            main   #7678     +/-   ##
=======================================
- Coverage   86.1%   86.0%   -0.2%     
=======================================
  Files        298     298             
  Lines      21709   21727     +18     
=======================================
- Hits       18707   18696     -11     
- Misses      2625    2630      +5     
- Partials     377     401     +24     
| Files with missing lines | Coverage Δ |
|---|---|
| ...s/otlp/otlplog/otlploggrpc/internal/retry/retry.go | 86.0% <50.0%> (-11.5%) ⬇️ |
| ...s/otlp/otlplog/otlploghttp/internal/retry/retry.go | 86.0% <50.0%> (-11.5%) ⬇️ |
| .../otlpmetric/otlpmetricgrpc/internal/retry/retry.go | 86.0% <50.0%> (-11.5%) ⬇️ |
| .../otlpmetric/otlpmetrichttp/internal/retry/retry.go | 86.0% <50.0%> (-11.5%) ⬇️ |
| ...lp/otlptrace/otlptracegrpc/internal/retry/retry.go | 86.0% <50.0%> (-11.5%) ⬇️ |
| ...lp/otlptrace/otlptracehttp/internal/retry/retry.go | 86.0% <50.0%> (-11.5%) ⬇️ |

... and 1 file with indirect coverage changes

@MrAlias MrAlias marked this pull request as ready for review December 5, 2025 20:46
Copilot AI review requested due to automatic review settings December 5, 2025 20:46
Contributor

Copilot AI left a comment

Pull request overview

This PR adds explicit context cancellation checks to OTLP exporters' retry logic to eliminate reliance on non-deterministic select statement behavior. The change ensures requests with canceled contexts fail immediately without waiting for retry delays, fixing a flaky test issue.

  • Adds a ctx.Err() check immediately after determining that an error is retryable and before attempting any retry delay
  • Uses a consistent error wrapping pattern, fmt.Errorf("%w: %w", ctx.Err(), err), to preserve both the context error and the original error
  • Applied uniformly across all OTLP exporter variants (trace/metric/log over HTTP/gRPC)

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

Show a summary per file
| File | Description |
|---|---|
| internal/shared/otlp/retry/retry.go.tmpl | Template source adding context check before retry wait logic |
| exporters/otlp/otlptrace/otlptracehttp/internal/retry/retry.go | Generated file for HTTP trace exporter with context check |
| exporters/otlp/otlptrace/otlptracegrpc/internal/retry/retry.go | Generated file for gRPC trace exporter with context check |
| exporters/otlp/otlpmetric/otlpmetrichttp/internal/retry/retry.go | Generated file for HTTP metric exporter with context check |
| exporters/otlp/otlpmetric/otlpmetricgrpc/internal/retry/retry.go | Generated file for gRPC metric exporter with context check |
| exporters/otlp/otlplog/otlploghttp/internal/retry/retry.go | Generated file for HTTP log exporter with context check |
| exporters/otlp/otlplog/otlploggrpc/internal/retry/retry.go | Generated file for gRPC log exporter with context check |

@MrAlias MrAlias modified the milestones: v1.40.0, v1.39.0 Dec 7, 2025
@MrAlias MrAlias merged commit d03b033 into open-telemetry:main Dec 7, 2025
30 of 31 checks passed
@MrAlias MrAlias deleted the fix-7673 branch December 7, 2025 18:11
Development

Successfully merging this pull request may close these issues.

Flaky test: TestBackoffRetryCanceledContext in exporters/otlp/otlplog/otlploggrpc

3 participants