Skip to content

[stanza] Use buffer pool for the read buffers to limit allocations #39373

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 14, 2025

Conversation

bogdandrutu
Copy link
Member

No description provided.

@atoulme
Copy link
Contributor

atoulme commented Apr 14, 2025

ran the benchmarks to see the performance impact, got this:

pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer
                           │ /Users/interop/Downloads/before.txt │  /Users/interop/Downloads/after.txt  │
                           │               sec/op                │    sec/op     vs base                │
FileInput/Single-10                                 14.03µ ± ∞ ¹   14.03µ ± ∞ ¹       ~ (p=1.000 n=1) ²
FileInput/Glob-10                                   33.00µ ± ∞ ¹   33.70µ ± ∞ ¹       ~ (p=1.000 n=1) ²
FileInput/MultiGlob-10                              33.32µ ± ∞ ¹   32.98µ ± ∞ ¹       ~ (p=1.000 n=1) ²
FileInput/MaxConcurrent-10                          60.20µ ± ∞ ¹   61.61µ ± ∞ ¹       ~ (p=1.000 n=1) ²
FileInput/FngrPrntLarge-10                          13.04µ ± ∞ ¹   12.70µ ± ∞ ¹       ~ (p=1.000 n=1) ²
FileInput/FngrPrntSmall-10                          12.87µ ± ∞ ¹   12.83µ ± ∞ ¹       ~ (p=1.000 n=1) ²
FileInput/NoFlush-10                                12.62µ ± ∞ ¹   12.59µ ± ∞ ¹       ~ (p=1.000 n=1) ²
FileInput/Many-10                                   3.349m ± ∞ ¹   1.577m ± ∞ ¹       ~ (p=1.000 n=1) ²
geomean                                             40.02µ         36.44µ        -8.96%

Copy link
Member

@andrzej-stencel andrzej-stencel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@atoulme this change is supposed to affect allocations. Your comment only shows comparison of CPU time.

In my testing, the Fileconsumer benchmark shows no change in either CPU time or allocation size or count:

$ benchstat /tmp/fileconsumer-main-1fdf89dded.txt /tmp/fileconsumer-bufpool-51ce54a828.txt
goos: linux
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer
cpu: 13th Gen Intel(R) Core(TM) i7-13700H
                           │ /tmp/fileconsumer-main-1fdf89dded.txt │ /tmp/fileconsumer-bufpool-51ce54a828.txt │
                           │                sec/op                 │       sec/op        vs base              │
FileInput/Single-20                                    20.22µ ± 4%          20.80µ ± 7%       ~ (p=0.240 n=6)
FileInput/Glob-20                                      22.26µ ± 5%          21.63µ ± 2%       ~ (p=0.084 n=6)
FileInput/MultiGlob-20                                 21.51µ ± 5%          21.80µ ± 5%       ~ (p=0.132 n=6)
FileInput/MaxConcurrent-20                             90.51µ ± 5%          86.61µ ± 7%       ~ (p=0.065 n=6)
FileInput/FngrPrntLarge-20                             21.71µ ± 4%          21.71µ ± 9%       ~ (p=0.937 n=6)
FileInput/FngrPrntSmall-20                             22.16µ ± 5%          21.68µ ± 8%       ~ (p=0.240 n=6)
FileInput/NoFlush-20                                   21.79µ ± 4%          21.04µ ± 6%       ~ (p=0.065 n=6)
FileInput/Many-20                                      254.0µ ± 2%          254.4µ ± 2%       ~ (p=0.818 n=6)
geomean                                                35.15µ               34.78µ       -1.08%

                           │ /tmp/fileconsumer-main-1fdf89dded.txt │ /tmp/fileconsumer-bufpool-51ce54a828.txt │
                           │                 B/op                  │        B/op         vs base              │
FileInput/Single-20                                   10.00Ki ± 0%         10.00Ki ± 0%       ~ (p=1.000 n=6)
FileInput/Glob-20                                     40.01Ki ± 0%         40.00Ki ± 0%       ~ (p=0.859 n=6)
FileInput/MultiGlob-20                                40.00Ki ± 0%         40.00Ki ± 0%       ~ (p=0.870 n=6)
FileInput/MaxConcurrent-20                            40.02Ki ± 0%         40.01Ki ± 0%       ~ (p=0.076 n=6)
FileInput/FngrPrntLarge-20                            10.00Ki ± 0%         10.01Ki ± 0%       ~ (p=0.426 n=6)
FileInput/FngrPrntSmall-20                            10.00Ki ± 0%         10.00Ki ± 0%       ~ (p=1.000 n=6)
FileInput/NoFlush-20                                  10.00Ki ± 0%         10.00Ki ± 0%       ~ (p=0.864 n=6)
FileInput/Many-20                                    1001.9Ki ± 0%        1001.2Ki ± 0%  -0.07% (p=0.041 n=6)
geomean                                               29.92Ki              29.92Ki       -0.01%

                           │ /tmp/fileconsumer-main-1fdf89dded.txt │ /tmp/fileconsumer-bufpool-51ce54a828.txt │
                           │               allocs/op               │    allocs/op      vs base                │
FileInput/Single-20                                     10.00 ± 0%         10.00 ± 0%       ~ (p=1.000 n=6) ¹
FileInput/Glob-20                                       40.00 ± 0%         40.00 ± 0%       ~ (p=1.000 n=6) ¹
FileInput/MultiGlob-20                                  40.00 ± 0%         40.00 ± 0%       ~ (p=1.000 n=6) ¹
FileInput/MaxConcurrent-20                              40.00 ± 0%         40.00 ± 0%       ~ (p=1.000 n=6) ¹
FileInput/FngrPrntLarge-20                              10.00 ± 0%         10.00 ± 0%       ~ (p=1.000 n=6) ¹
FileInput/FngrPrntSmall-20                              10.00 ± 0%         10.00 ± 0%       ~ (p=1.000 n=6) ¹
FileInput/NoFlush-20                                    10.00 ± 0%         10.00 ± 0%       ~ (p=1.000 n=6) ¹
FileInput/Many-20                                      1.002k ± 0%        1.002k ± 0%       ~ (p=0.706 n=6)
geomean                                                 29.91              29.91       +0.00%
¹ all samples are equal

@bogdandrutu this change seems beneficial, but not seeing this in benchmark results seems to contradict this. Can you explain why the benchmark does not show the expected gains?

@atoulme
Copy link
Contributor

atoulme commented Apr 14, 2025

Odd. Given I ran this quickly with just one iteration, I guess I got different figures than you did.

@bogdandrutu
Copy link
Member Author

@bogdandrutu this change seems beneficial, but not seeing this in benchmark results seems to contradict this. Can you explain why the benchmark does not show the expected gains?

Let's start by saying that the benchmark is not really well written to ensure we can test for this kind of changes? A simple reason is that multiple executions produce extremely random results varying in CPU and memory allocations.

@bogdandrutu
Copy link
Member Author

Added a more deterministic benchmark, that does show a bit of improvement (not 100% all allocs yet), will need to check better on why that, but checked with logs that the path is avoiding allocs.

@bogdandrutu bogdandrutu merged commit f9909c5 into open-telemetry:main Apr 14, 2025
171 checks passed
@bogdandrutu bogdandrutu deleted the pool-bufs-factory branch April 14, 2025 20:05
@github-actions github-actions bot added this to the next release milestone Apr 14, 2025
bufferSize := r.initialBufferSize
if r.TokenLenState.MinimumLength > bufferSize {
var buf []byte
fmt.Println(r.fileName)
Copy link
Member

@andrzej-stencel andrzej-stencel Apr 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be there. EDIT: See #39414.

akshays-19 pushed a commit to akshays-19/opentelemetry-collector-contrib that referenced this pull request Apr 23, 2025
Fiery-Fenix pushed a commit to Fiery-Fenix/opentelemetry-collector-contrib that referenced this pull request Apr 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants