[processor/tailsampling] Record which sampling policy was responsible for the decision #37797

djluck · 2025-02-09T22:21:17Z

Re-opening a stale PR: #36312
Resolves #35180.

We were close to getting it merged. @jpkrohling do you mind finalizing the review when you get a chance?

jpkrohling

LGTM, changing the existing benchmark to record the policy shows that the costs are OK:

This change (plus setting tsp.recordPolicy = true in BenchmarkSampling):

Running tool: /usr/bin/go test -benchmem -run=^$ -bench ^BenchmarkSampling$ github.com/open-telemetry/opentelemetry-collector-contrib/processor/tailsamplingprocessor

goos: linux
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/processor/tailsamplingprocessor
cpu: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz
BenchmarkSampling-16    	   40365	     28595 ns/op	    6282 B/op	     258 allocs/op
PASS
ok  	github.com/open-telemetry/opentelemetry-collector-contrib/processor/tailsamplingprocessor	1.478s

Baseline:

Running tool: /usr/bin/go test -benchmem -run=^$ -bench ^BenchmarkSampling$ github.com/open-telemetry/opentelemetry-collector-contrib/processor/tailsamplingprocessor

goos: linux
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/processor/tailsamplingprocessor
cpu: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz
BenchmarkSampling-16    	   47973	     24604 ns/op	    6260 B/op	     257 allocs/op
PASS
ok  	github.com/open-telemetry/opentelemetry-collector-contrib/processor/tailsamplingprocessor	1.458s
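
For readers who want the two runs side by side, here is a minimal sketch that turns the figures above into relative differences. The `benchResult` type and `relativeIncrease` helper are illustrative names, not part of the processor. Both numbers come from a single run each, so treat the result as a rough indication rather than a statistically sound comparison.

```go
package main

import "fmt"

// benchResult holds the three figures reported per run by `go test -benchmem`.
type benchResult struct {
	nsPerOp     float64
	bytesPerOp  float64
	allocsPerOp float64
}

// relativeIncrease returns how much larger candidate is than baseline, in percent.
func relativeIncrease(baseline, candidate float64) float64 {
	return (candidate - baseline) / baseline * 100
}

func main() {
	// Figures copied from the two benchmark outputs quoted in this comment.
	baseline := benchResult{nsPerOp: 24604, bytesPerOp: 6260, allocsPerOp: 257}
	withPolicy := benchResult{nsPerOp: 28595, bytesPerOp: 6282, allocsPerOp: 258}

	fmt.Printf("time:   %+.1f%%\n", relativeIncrease(baseline.nsPerOp, withPolicy.nsPerOp))
	fmt.Printf("memory: %+.2f%%\n", relativeIncrease(baseline.bytesPerOp, withPolicy.bytesPerOp))
	fmt.Printf("allocs: %+.2f%%\n", relativeIncrease(baseline.allocsPerOp, withPolicy.allocsPerOp))
}
```

With these inputs the time per operation is roughly 16% higher with the policy recorded, while bytes and allocations per operation each grow by well under 1%.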

djluck · 2025-02-10T18:01:46Z

@jpkrohling thanks for the review 🙇

jpkrohling · 2025-02-11T11:55:09Z

Once the CI is green, I think this is ready to be merged.

djluck · 2025-02-12T03:06:52Z

Apologies, I missed that check. I just pushed a change after running make generate.

…olicy) associated with an inclusive tail processor sampling decision. Resolves #35180.

- This functionality lives behind a feature flag that is disabled by default.
- The original issue described a solution where we might attach the attribute solely to the root span. I'm not sure I agree with the commenter that we can rely on this (e.g. we might decide to sample halfway through a long-running trace), so I have attached the attributes to all present scope spans. This feels like a decent trade-off between complexity and network cost, as finding the highest non-root parent would require multiple passes over the spans and keeping all span IDs in a set.
- Added automated tests to verify that enabling the flag records the expected decision without impacting existing logic.
- Built a custom version and ran it in our preprod environment to ensure it was stable over a 1h period (still evaluating, will update PR with any further observations).

Does this require a CHANGELOG entry?

- Added README entry for the feature flag
- Added missing mutex lock around reading trace data in `SetAttrOnScopeSpans`
- Added tests + benchmarks for `SetAttrOnScopeSpans`
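
The approach described in these commit messages (one pass over the trace, tagging every scope while holding the trace lock) can be sketched roughly as follows. This is a simplified illustration only: `traceData`, `scopeSpans`, `setAttrOnScopeSpans`, the `recordPolicy` variable, and the attribute key and value are hypothetical stand-ins built from plain Go types, not the processor's real types or the pdata API.

```go
package main

import (
	"fmt"
	"sync"
)

// scopeSpans is a simplified stand-in for one instrumentation-scope group of spans.
type scopeSpans struct {
	scopeAttrs map[string]string
	spanNames  []string
}

// traceData is a hypothetical stand-in for the per-trace state held by the processor.
// The outer slice models resource spans, the inner slice models scope spans.
type traceData struct {
	sync.Mutex
	batches [][]scopeSpans
}

// setAttrOnScopeSpans writes key=value onto every scope present in the trace.
// It takes the lock before reading the trace data and needs only a single pass,
// with no set of span IDs and no search for a root or highest parent span.
func setAttrOnScopeSpans(td *traceData, key, value string) {
	td.Lock()
	defer td.Unlock()
	for i := range td.batches {
		for j := range td.batches[i] {
			ss := &td.batches[i][j]
			if ss.scopeAttrs == nil {
				ss.scopeAttrs = map[string]string{}
			}
			ss.scopeAttrs[key] = value
		}
	}
}

func main() {
	td := &traceData{batches: [][]scopeSpans{
		{{spanNames: []string{"GET /checkout"}}, {spanNames: []string{"SELECT orders"}}},
		{{spanNames: []string{"publish event"}}},
	}}

	// Stand-in for the feature flag; the PR keeps the real flag disabled by default.
	recordPolicy := true
	if recordPolicy {
		// Key and policy name are made up for this example.
		setAttrOnScopeSpans(td, "example.sampling.policy", "latency-policy")
	}

	for _, rs := range td.batches {
		for _, ss := range rs {
			fmt.Println(ss.spanNames, ss.scopeAttrs)
		}
	}
}
```

Tagging every scope costs one extra attribute per scope on the wire, which is the network cost the commit message weighs against the complexity of locating a single span to annotate.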

djluck · 2025-02-12T06:16:11Z

@jpkrohling I re-ran make gci after make generate and it seems they are fighting each other 😢 I didn't commit the changes make gci wanted to make but if the pipeline fails, could you provide guidance on the command I should be running?

djluck · 2025-02-14T05:32:30Z

@jpkrohling looks like we're green 🥳 Do you mind merging when you get a moment?

jpkrohling · 2025-02-14T07:58:54Z

Merged, thanks!

djluck · 2025-02-14T08:52:46Z

Wonderful, thanks again for all your help @jpkrohling

jade-guiton-dd · 2025-02-14T13:08:09Z

It looks like this PR conflicts with #37035 merged two days ago, which removed the options variadic argument from newTracesProcessor (see current signature). The code in this PR still uses it, so the tailsampling processor now fails to build.

Update: I opened #37931 to fix this.

#### Description

Two PRs were merged recently on the tailsamplingprocessor, #37797 and #37035. #37035 changed the signature of an internal function in a way that broke #37797. The result is that the component [fails to build](https://github.com/open-telemetry/opentelemetry-collector/actions/runs/13329091378/job/37228871811?pr=12384). This PR fixes that.

This wasn't noticed before merging because:

1. there were no merge conflicts,
2. the latest rebase of #37797 was before #37035 was merged, and
3. there is no merge queue to perform final checks.

---------

Signed-off-by: Juraci Paixão Kröhling <[email protected]>
Co-authored-by: Juraci Paixão Kröhling <[email protected]>
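
To illustrate the kind of breakage described here, the following sketch models a constructor before and after its variadic options parameter is removed. All names (`processor`, `option`, `withRecordPolicy`, `newProcessorBefore`, `newProcessorAfter`) are hypothetical and do not reproduce the real `newTracesProcessor` signature or how #37797 used it. The point is only that a call site and a signature can change on different lines, merge without textual conflicts, and still fail at compile time.

```go
package main

import "fmt"

// processor is a hypothetical component with one configurable behavior.
type processor struct {
	recordPolicy bool
}

// option mutates a processor during construction (functional options pattern).
type option func(*processor)

// withRecordPolicy is a hypothetical option enabling policy recording.
func withRecordPolicy() option {
	return func(p *processor) { p.recordPolicy = true }
}

// newProcessorBefore models a constructor that still accepts variadic options.
func newProcessorBefore(opts ...option) *processor {
	p := &processor{}
	for _, apply := range opts {
		apply(p)
	}
	return p
}

// newProcessorAfter models the same constructor once the variadic parameter is gone.
func newProcessorAfter() *processor {
	return &processor{}
}

func main() {
	// A caller written against the old signature compiles and works.
	p := newProcessorBefore(withRecordPolicy())
	fmt.Println(p.recordPolicy)

	// The same call against the new signature is a compile error
	// (too many arguments), even though git sees no textual conflict:
	//   newProcessorAfter(withRecordPolicy())

	// One possible adaptation: configure the value after construction.
	q := newProcessorAfter()
	q.recordPolicy = true
	fmt.Println(q.recordPolicy)
}
```

A merge queue, or a rebase onto the latest main before merging, would compile both changes together and surface this kind of error before it lands.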

djluck requested review from jpkrohling and a team as code owners February 9, 2025 22:21

github-actions bot assigned andrzej-stencel Feb 9, 2025

github-actions bot added the processor/tailsampling Tail sampling processor label Feb 9, 2025

github-actions bot requested a review from portertech February 9, 2025 22:21

djluck mentioned this pull request Feb 9, 2025

[processor/tailsampling] record sampling policy #36312

Closed

jpkrohling approved these changes Feb 10, 2025

jpkrohling changed the title ~~Tailsampling record policy~~ [processor/tailsampling] Record which sampling policy was responsible for the decision Feb 10, 2025

djluck added 11 commits February 12, 2025 14:11

- Added changelog

a2e8396

- Added README entry for the feature flag
- Added missing mutex lock around reading trace data in `SetAttrOnScopeSpans`
- Added tests + benchmarks for `SetAttrOnScopeSpans`

Running make fmt

ee5244d

Fixing make lint errors

cf3fe19

Running make fmt

e914e42

Making suggested fixes

d080259

Running make gci

2db8adb

Formatting and fixing build errs

b06b56c

Rebasing and fixing merge conflicts

9680035

Ran make generate

6a3b105

djluck force-pushed the tailsampling-record-policy branch from 913ab9c to 6a3b105 Compare February 12, 2025 03:13

djluck requested review from dashpole, MovieStoreGuy, andrzej-stencel and crobert-1 as code owners February 12, 2025 03:13

github-actions bot added exporter/zipkin receiver/googlecloudspanner labels Feb 12, 2025

Reverting changes to untouched files

76d4267

jpkrohling merged commit 843499f into open-telemetry:main Feb 14, 2025
162 checks passed

github-actions bot added this to the next release milestone Feb 14, 2025

jade-guiton-dd added a commit to jade-guiton-dd/opentelemetry-collector-contrib that referenced this pull request Feb 14, 2025

[tailsamplingprocessor] [chore] Fix merge of open-telemetry#37797

0e25e24

jade-guiton-dd mentioned this pull request Feb 14, 2025

[tailsamplingprocessor] [chore] Fix merge of #37797 and #37035 #37931

Merged
