Fix race condition of GlobalOpenTelemetry initialization with AutoConfiguredOpenTelemetrySdkBuilder #7365

Merged

fandreuz
Contributor

@fandreuz fandreuz commented May 23, 2025

Fixes #7354.

Not sure if reflection is the best way to achieve this. I've seen it used quite a bit in the SDK to hide things that should not be directly exposed to users. Let me know if a proper API exposed by GlobalOpenTelemetry would be more appropriate.

@fandreuz fandreuz requested a review from a team as a code owner May 23, 2025 21:42

codecov bot commented May 23, 2025

Codecov Report

Attention: Patch coverage is 94.73684% with 1 line in your changes missing coverage. Please review.

Project coverage is 89.79%. Comparing base (ada5af6) to head (9d00edd).
Report is 5 commits behind head on main.

Files with missing lines | Patch % | Lines
...nfigure/AutoConfiguredOpenTelemetrySdkBuilder.java | 92.85% | 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #7365      +/-   ##
============================================
+ Coverage     89.75%   89.79%   +0.04%     
- Complexity     6980     6985       +5     
============================================
  Files           797      797              
  Lines         21165    21174       +9     
  Branches       2057     2057              
============================================
+ Hits          18996    19013      +17     
+ Misses         1505     1501       -4     
+ Partials        664      660       -4     

Member

@jack-berg jack-berg left a comment

I don't love the idea of accessing GlobalOpenTelemetry's private mutex reflectively...

Could we achieve the same effect by adding GlobalOpenTelemetry#set(Supplier<OpenTelemetry> supplier), with an implementation that obtains the lock on mutex before calling Supplier.get()?
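
For readers following the thread, here is a minimal sketch of the idea in that suggestion: a set-once holder whose `set(Supplier)` runs the supplier and registers the result while holding the same mutex that `get()` uses. Everything here is an illustrative assumption rather than the code merged in this PR: `GlobalHolderSketch` is a hypothetical stand-in for GlobalOpenTelemetry, a `String` stands in for the OpenTelemetry instance, and the `"noop"` fallback only mimics a default instance.

```java
import java.util.Objects;
import java.util.function.Supplier;

/** Hypothetical stand-in for a set-once global holder; not the real GlobalOpenTelemetry. */
final class GlobalHolderSketch {
  private static final Object mutex = new Object();
  // Guarded by mutex. A String stands in for the OpenTelemetry instance (assumption).
  private static String globalInstance;

  private GlobalHolderSketch() {}

  /** Runs the supplier and registers its result, all while holding the mutex. */
  static void set(Supplier<String> supplier) {
    synchronized (mutex) {
      checkNotSet(); // fail fast before doing any expensive build work
      String built = Objects.requireNonNull(supplier.get(), "supplier returned null");
      checkNotSet(); // the supplier may have re-entered get() on this same thread
      globalInstance = built;
    }
  }

  /** Returns the registered instance, falling back to a placeholder if none was set. */
  static String get() {
    synchronized (mutex) {
      if (globalInstance == null) {
        globalInstance = "noop"; // illustrative fallback value
      }
      return globalInstance;
    }
  }

  private static void checkNotSet() {
    if (globalInstance != null) {
      throw new IllegalStateException("global instance already set");
    }
  }

  public static void main(String[] args) {
    set(() -> "configured-sdk");
    System.out.println(get());
    try {
      set(() -> "another-sdk");
    } catch (IllegalStateException e) {
      System.out.println("second set rejected: " + e.getMessage());
    }
  }
}
```

Because the supplier runs inside the synchronized block, a concurrent `get()` cannot slip in between "build the instance" and "register the instance" and install the fallback first. The cost is that the lock is held for the whole build, so other callers of `get()` block until it finishes.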

@fandreuz fandreuz marked this pull request as draft June 8, 2025 21:31
* OpenTelemetry} object, and finally calls {@link #set(OpenTelemetry)}, all while holding the
* {@link GlobalOpenTelemetry} mutex.
*/
public static <T> T set(Supplier<T> supplier, Function<T, OpenTelemetry> converter) {
Member

What about something like: 5f06de2

This API signature will look strange outside of the autoconfigure use case.

I like this approach in general though and would love to get it merged!

Contributor Author

Yeah, I didn't like that signature either; your commit made it look much better. I cherry-picked it into my branch.

@fandreuz fandreuz marked this pull request as ready for review June 9, 2025 18:57
@jkwatson
Contributor

Won't this lead to some difficult situations where someone has a long-lived handle to an instance that they got from GlobalOpenTelemetry.get() which is no longer the "active" instance? Have we thoroughly thought through the implications of that? Will this behavior be surprising when we support dynamically changing the behavior of GlobalOpenTelemetry via (e.g.) opamp?

@jkwatson
Contributor

Won't this lead to some difficult situations where someone has a long-lived handle to an instance that they got from GlobalOpenTelemetry.get() which is no longer the "active" instance? Have we thoroughly thought through the implications of that? Will this behavior be surprising when we support dynamically changing the behavior of GlobalOpenTelemetry via (e.g.) opamp?

Never mind. I missed that set still doesn't let you change the global instance. No problem then! Looks good!

@jkwatson
Contributor

Can we write a unit test that verifies that this fixes the race condition?

@fandreuz
Contributor Author

Can we write a unit test that verifies that this fixes the race condition?

A fully deterministic reproducer would be relatively hard to write for this issue since it's a race condition. I could do that, but I'd probably have to resort to artificial sleeps to reproduce the conditions for the race. Would such a test be acceptable @jkwatson ?

@jkwatson
Contributor

Can we write a unit test that verifies that this fixes the race condition?

A fully deterministic reproducer would be relatively hard to write for this issue since it's a race condition. I could do that, but I'd probably have to resort to artificial sleeps to reproduce the conditions for the race. Would such a test be acceptable @jkwatson ?

Yes, better than no test at all, so we don't accidentally re-introduce the issue in the future.
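
As an illustration of the kind of test discussed above, the following is a rough sketch that widens the race window with an artificial sleep and checks that a reader arriving mid-build still sees the configured instance. It is not the test added to AutoConfiguredOpenTelemetrySdkTest: `RaceRegressionSketch`, `Holder`, and the string values are all hypothetical names made up for this example, and the holder only imitates a set-once global guarded by a mutex.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

final class RaceRegressionSketch {
  /** Hypothetical set-once holder; a String stands in for the OpenTelemetry instance. */
  static final class Holder {
    private final Object mutex = new Object();
    private String instance; // guarded by mutex

    void set(Supplier<String> supplier) {
      synchronized (mutex) {
        if (instance != null) {
          throw new IllegalStateException("already set");
        }
        instance = supplier.get(); // built while the lock is held
      }
    }

    String get() {
      synchronized (mutex) {
        if (instance == null) {
          instance = "noop"; // illustrative fallback
        }
        return instance;
      }
    }
  }

  /** Calls get() from a second thread while the first thread is still building. */
  static String getWhileBuilding() {
    Holder holder = new Holder();
    CountDownLatch buildStarted = new CountDownLatch(1);
    AtomicReference<String> seen = new AtomicReference<>();

    Thread builder = new Thread(() -> holder.set(() -> {
      buildStarted.countDown(); // signalled from inside the lock
      sleepQuietly(200);        // artificial delay that widens the race window
      return "configured-sdk";
    }));
    Thread reader = new Thread(() -> seen.set(holder.get()));

    try {
      builder.start();
      if (!buildStarted.await(5, TimeUnit.SECONDS)) {
        throw new IllegalStateException("build did not start in time");
      }
      reader.start(); // the reader arrives while the build is in progress
      builder.join();
      reader.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
    return seen.get();
  }

  private static void sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    System.out.println(getWhileBuilding());
  }
}
```

In this sketch the latch is released from inside the synchronized block, so the reader cannot acquire the mutex until the build has finished, and it should observe `configured-sdk` rather than the fallback. If `set` instead built the instance outside the lock, the same reader could install `noop` first, which is the failure mode such a regression test is meant to catch.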

@fandreuz fandreuz marked this pull request as draft June 12, 2025 23:44
@fandreuz fandreuz marked this pull request as ready for review June 17, 2025 23:14
@fandreuz
Contributor Author

Hi @jkwatson, please check the new test in AutoConfiguredOpenTelemetrySdkTest

Member

@jack-berg jack-berg left a comment

Looks good. @jkwatson there's a unit test now for this. Any additional reservations?

@jkwatson
Contributor

Looks good. @jkwatson there's a unit test now for this. Any additional reservations?

🚢

Contributor

@jkwatson jkwatson left a comment

Thanks!

@jack-berg jack-berg merged commit b0a9deb into open-telemetry:main Jun 24, 2025
28 checks passed
Successfully merging this pull request may close these issues.

Race-condition of GlobalOpenTelemetry initialization with AutoConfiguredOpenTelemetrySdkBuilder
3 participants