-
Notifications
You must be signed in to change notification settings - Fork 2.8k
Remove unnecessary copy while decoding and constructing string #37734
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove unnecessary copy while decoding and constructing string #37734
Conversation
62f1762
to
ba0e4ee
Compare
This comment was marked as resolved.
This comment was marked as resolved.
c40b78d
to
4450e14
Compare
Signed-off-by: Bogdan Drutu <[email protected]>
4450e14
to
9493b6f
Compare
…telemetry#37734) Before (including copy of the bytes needed upstream): ``` goos: darwin goarch: arm64 pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/reader cpu: Apple M2 Max BenchmarkFileRead BenchmarkFileRead-12 6890 159408 ns/op 125802 B/op 132 allocs/op PASS ``` After ``` goos: darwin goarch: arm64 pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/reader cpu: Apple M2 Max BenchmarkFileRead BenchmarkFileRead-12 7662 154923 ns/op 121656 B/op 130 allocs/op PASS ``` Signed-off-by: Bogdan Drutu <[email protected]>
@bogdandrutu I'm not sure I understand the benchmark results you posted. When I ran the Here's what I did (using Go v1.22.12): $ git checkout 0820ea0d51e2d9acd2649911bdb06ae16a558f12 # commit on main just before this change
$ # copy the BenchmarkFileRead code into `reader_test.go`
$ (cd pkg/stanza/fileconsumer/internal/reader && go test -benchmem -run=^$ -bench ^BenchmarkFileRead$ . -v -count=10 | tee /tmp/fileread-$(git rev-parse --short HEAD).txt)
goos: darwin
goarch: arm64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/reader
BenchmarkFileRead
BenchmarkFileRead-12 9499 125685 ns/op 23288 B/op 31 allocs/op
BenchmarkFileRead-12 8906 126189 ns/op 23288 B/op 31 allocs/op
BenchmarkFileRead-12 9572 127992 ns/op 23288 B/op 31 allocs/op
BenchmarkFileRead-12 8439 126469 ns/op 23288 B/op 31 allocs/op
BenchmarkFileRead-12 9093 126672 ns/op 23288 B/op 31 allocs/op
BenchmarkFileRead-12 9602 125069 ns/op 23288 B/op 31 allocs/op
BenchmarkFileRead-12 9024 125003 ns/op 23288 B/op 31 allocs/op
BenchmarkFileRead-12 9194 126363 ns/op 23288 B/op 31 allocs/op
BenchmarkFileRead-12 8574 126052 ns/op 23288 B/op 31 allocs/op
BenchmarkFileRead-12 9513 127504 ns/op 23288 B/op 31 allocs/op
PASS
ok github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/reader 12.965s
$ git reset --hard # remove the changed code before checking out next commit
$ git checkout 4244166551b2b42c824de934009a7ce535b0e656 # commit on main with changes from this PR
$ (cd pkg/stanza/fileconsumer/internal/reader && go test -benchmem -run=^$ -bench ^BenchmarkFileRead$ . -v -count=10 | tee /tmp/fileread-$(git rev-parse --short HEAD).txt)
goos: darwin
goarch: arm64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/reader
BenchmarkFileRead
BenchmarkFileRead-12 8442 138627 ns/op 121657 B/op 130 allocs/op
BenchmarkFileRead-12 8010 138947 ns/op 121656 B/op 130 allocs/op
BenchmarkFileRead-12 8406 139942 ns/op 121656 B/op 130 allocs/op
BenchmarkFileRead-12 8325 139564 ns/op 121656 B/op 130 allocs/op
BenchmarkFileRead-12 8341 139051 ns/op 121656 B/op 130 allocs/op
BenchmarkFileRead-12 8055 140335 ns/op 121656 B/op 130 allocs/op
BenchmarkFileRead-12 8577 143245 ns/op 121656 B/op 130 allocs/op
BenchmarkFileRead-12 8512 139856 ns/op 121656 B/op 130 allocs/op
BenchmarkFileRead-12 8336 138491 ns/op 121656 B/op 130 allocs/op
BenchmarkFileRead-12 8548 139459 ns/op 121656 B/op 130 allocs/op
PASS
ok github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/reader 13.213s These benchmarks differ in allocations: $ benchstat /tmp/fileread-0820ea0d51.txt /tmp/fileread-4244166551.txt
goos: darwin
goarch: arm64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/reader
│ /tmp/fileread-0820ea0d51.txt │ /tmp/fileread-4244166551.txt │
│ sec/op │ sec/op vs base │
FileRead-12 126.3µ ± 1% 139.5µ ± 1% +10.48% (p=0.000 n=10)
│ /tmp/fileread-0820ea0d51.txt │ /tmp/fileread-4244166551.txt │
│ B/op │ B/op vs base │
FileRead-12 22.74Ki ± 0% 118.80Ki ± 0% +422.40% (p=0.000 n=10)
│ /tmp/fileread-0820ea0d51.txt │ /tmp/fileread-4244166551.txt │
│ allocs/op │ allocs/op vs base │
FileRead-12 31.00 ± 0% 130.00 ± 0% +319.35% (p=0.000 n=10) |
I see nearly identical memory and allocation increases as @andrzej-stencel. I think we should revert the change unless there is an easy fix. Automatic reverting is not possible because of subsequent changes. @bogdandrutu |
In the "before" you check only the calls to "Emit" without the content of emit which does a copy. See https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/37734/files#diff-34aeecd35124f500d94093a2effc11eecf71a07c1f26476db9279b22f2200332L50 which is not part of when you run the benchmark. I could not write the benchmark is a way that is easy to replicate that. So in order for you to test equivalent changes, you need to force a copy of the bytes. |
Right, this makes sense, thanks Bogdan. We should have a benchmark that shows this. I created the following proposal #38054 for a benchmark on the level of Stanza's File input. It covers the work done in the File input's This makes sense to me. If I understand correctly, this change effectively moves the memory allocation for each entry from one place to another. Before, the allocation happened inside the File input benchmark results before and after this change before$ git checkout 0820ea0d51e2d9acd2649911bdb06ae16a558f12
$ # Copy the benchmark's code from #38054
$ (cd pkg/stanza/operator/input/file/ && go test -benchmem -run=^$ -bench Benchmark . -v -count=6 | tee /tmp/fileinput-$(git rev-parse --short HEAD).txt)
goos: linux
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/file
cpu: 13th Gen Intel(R) Core(TM) i7-13700H
BenchmarkReadExistingLogs
BenchmarkReadExistingLogs/logs-0-lines
BenchmarkReadExistingLogs/logs-0-lines-20 43616 24020 ns/op 26452 B/op 44 allocs/op
BenchmarkReadExistingLogs/logs-0-lines-20 46342 23227 ns/op 26448 B/op 44 allocs/op
BenchmarkReadExistingLogs/logs-0-lines-20 50653 23240 ns/op 26442 B/op 44 allocs/op
BenchmarkReadExistingLogs/logs-0-lines-20 47140 25323 ns/op 26455 B/op 44 allocs/op
BenchmarkReadExistingLogs/logs-0-lines-20 47074 23100 ns/op 26443 B/op 44 allocs/op
BenchmarkReadExistingLogs/logs-0-lines-20 46510 23646 ns/op 26444 B/op 44 allocs/op
BenchmarkReadExistingLogs/logs-1-lines
BenchmarkReadExistingLogs/logs-1-lines-20 2845 435859 ns/op 118955 B/op 228 allocs/op
BenchmarkReadExistingLogs/logs-1-lines-20 2617 421960 ns/op 117637 B/op 225 allocs/op
BenchmarkReadExistingLogs/logs-1-lines-20 2772 437044 ns/op 119124 B/op 228 allocs/op
BenchmarkReadExistingLogs/logs-1-lines-20 2403 426566 ns/op 118024 B/op 226 allocs/op
BenchmarkReadExistingLogs/logs-1-lines-20 2662 438006 ns/op 121655 B/op 233 allocs/op
BenchmarkReadExistingLogs/logs-1-lines-20 2760 428723 ns/op 118587 B/op 227 allocs/op
BenchmarkReadExistingLogs/logs-2-lines
BenchmarkReadExistingLogs/logs-2-lines-20 2725 430722 ns/op 118195 B/op 231 allocs/op
BenchmarkReadExistingLogs/logs-2-lines-20 2786 433269 ns/op 118858 B/op 232 allocs/op
BenchmarkReadExistingLogs/logs-2-lines-20 2466 427705 ns/op 118718 B/op 232 allocs/op
BenchmarkReadExistingLogs/logs-2-lines-20 2706 432655 ns/op 118441 B/op 232 allocs/op
BenchmarkReadExistingLogs/logs-2-lines-20 2414 443278 ns/op 119860 B/op 234 allocs/op
BenchmarkReadExistingLogs/logs-2-lines-20 2774 448268 ns/op 119632 B/op 234 allocs/op
BenchmarkReadExistingLogs/logs-10-lines
BenchmarkReadExistingLogs/logs-10-lines-20 2604 500115 ns/op 122907 B/op 277 allocs/op
BenchmarkReadExistingLogs/logs-10-lines-20 2631 468637 ns/op 124662 B/op 280 allocs/op
BenchmarkReadExistingLogs/logs-10-lines-20 2553 465892 ns/op 125267 B/op 281 allocs/op
BenchmarkReadExistingLogs/logs-10-lines-20 2281 495524 ns/op 124650 B/op 280 allocs/op
BenchmarkReadExistingLogs/logs-10-lines-20 2581 491338 ns/op 126800 B/op 284 allocs/op
BenchmarkReadExistingLogs/logs-10-lines-20 2538 473622 ns/op 124520 B/op 280 allocs/op
BenchmarkReadExistingLogs/logs-20-lines
BenchmarkReadExistingLogs/logs-20-lines-20 2307 482158 ns/op 129732 B/op 341 allocs/op
BenchmarkReadExistingLogs/logs-20-lines-20 2498 503358 ns/op 130901 B/op 343 allocs/op
BenchmarkReadExistingLogs/logs-20-lines-20 2490 495973 ns/op 133026 B/op 346 allocs/op
BenchmarkReadExistingLogs/logs-20-lines-20 2487 502535 ns/op 130124 B/op 341 allocs/op
BenchmarkReadExistingLogs/logs-20-lines-20 2480 490218 ns/op 130190 B/op 342 allocs/op
BenchmarkReadExistingLogs/logs-20-lines-20 2107 477549 ns/op 131464 B/op 344 allocs/op
BenchmarkReadExistingLogs/logs-100-lines
BenchmarkReadExistingLogs/logs-100-lines-20 1741 627211 ns/op 187097 B/op 910 allocs/op
BenchmarkReadExistingLogs/logs-100-lines-20 1966 607626 ns/op 180012 B/op 898 allocs/op
BenchmarkReadExistingLogs/logs-100-lines-20 1699 618002 ns/op 183666 B/op 904 allocs/op
BenchmarkReadExistingLogs/logs-100-lines-20 1737 634392 ns/op 182386 B/op 902 allocs/op
BenchmarkReadExistingLogs/logs-100-lines-20 1741 619593 ns/op 181393 B/op 900 allocs/op
BenchmarkReadExistingLogs/logs-100-lines-20 1760 614189 ns/op 182303 B/op 902 allocs/op
BenchmarkReadExistingLogs/logs-1000-lines
BenchmarkReadExistingLogs/logs-1000-lines-20 612 1871147 ns/op 764324 B/op 7199 allocs/op
BenchmarkReadExistingLogs/logs-1000-lines-20 594 1827584 ns/op 763163 B/op 7197 allocs/op
BenchmarkReadExistingLogs/logs-1000-lines-20 596 1924905 ns/op 768511 B/op 7206 allocs/op
BenchmarkReadExistingLogs/logs-1000-lines-20 580 1883590 ns/op 766077 B/op 7202 allocs/op
BenchmarkReadExistingLogs/logs-1000-lines-20 606 1834081 ns/op 765567 B/op 7201 allocs/op
BenchmarkReadExistingLogs/logs-1000-lines-20 609 1914972 ns/op 762934 B/op 7197 allocs/op
PASS
ok github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/file 70.221s after$ git checkout 4244166551b2b42c824de934009a7ce535b0e656
$ # Copy the benchmark's code from #38054
$ (cd pkg/stanza/operator/input/file/ && go test -benchmem -run=^$ -bench Benchmark . -v -count=6 | tee /tmp/fileinput-$(git rev-parse --short HEAD).txt)
goos: linux
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/file
cpu: 13th Gen Intel(R) Core(TM) i7-13700H
BenchmarkReadExistingLogs
BenchmarkReadExistingLogs/logs-0-lines
BenchmarkReadExistingLogs/logs-0-lines-20 53511 24559 ns/op 26445 B/op 44 allocs/op
BenchmarkReadExistingLogs/logs-0-lines-20 47437 23872 ns/op 26441 B/op 44 allocs/op
BenchmarkReadExistingLogs/logs-0-lines-20 48286 23788 ns/op 26448 B/op 44 allocs/op
BenchmarkReadExistingLogs/logs-0-lines-20 48123 25532 ns/op 26433 B/op 44 allocs/op
BenchmarkReadExistingLogs/logs-0-lines-20 47091 24595 ns/op 26438 B/op 44 allocs/op
BenchmarkReadExistingLogs/logs-0-lines-20 45786 24721 ns/op 26458 B/op 44 allocs/op
BenchmarkReadExistingLogs/logs-1-lines
BenchmarkReadExistingLogs/logs-1-lines-20 2827 413479 ns/op 105834 B/op 218 allocs/op
BenchmarkReadExistingLogs/logs-1-lines-20 2866 403773 ns/op 105807 B/op 218 allocs/op
BenchmarkReadExistingLogs/logs-1-lines-20 2834 421020 ns/op 108157 B/op 223 allocs/op
BenchmarkReadExistingLogs/logs-1-lines-20 2377 421040 ns/op 106774 B/op 220 allocs/op
BenchmarkReadExistingLogs/logs-1-lines-20 2400 422217 ns/op 107886 B/op 222 allocs/op
BenchmarkReadExistingLogs/logs-1-lines-20 2539 421018 ns/op 106846 B/op 220 allocs/op
BenchmarkReadExistingLogs/logs-2-lines
BenchmarkReadExistingLogs/logs-2-lines-20 2775 428352 ns/op 107883 B/op 226 allocs/op
BenchmarkReadExistingLogs/logs-2-lines-20 2576 424679 ns/op 107209 B/op 225 allocs/op
BenchmarkReadExistingLogs/logs-2-lines-20 2887 403452 ns/op 105917 B/op 223 allocs/op
BenchmarkReadExistingLogs/logs-2-lines-20 2815 420321 ns/op 107685 B/op 226 allocs/op
BenchmarkReadExistingLogs/logs-2-lines-20 2653 438202 ns/op 108466 B/op 228 allocs/op
BenchmarkReadExistingLogs/logs-2-lines-20 2847 428282 ns/op 107841 B/op 226 allocs/op
BenchmarkReadExistingLogs/logs-10-lines
BenchmarkReadExistingLogs/logs-10-lines-20 2167 474094 ns/op 115946 B/op 278 allocs/op
BenchmarkReadExistingLogs/logs-10-lines-20 2664 461281 ns/op 114805 B/op 276 allocs/op
BenchmarkReadExistingLogs/logs-10-lines-20 2235 451945 ns/op 113743 B/op 274 allocs/op
BenchmarkReadExistingLogs/logs-10-lines-20 2594 454700 ns/op 112852 B/op 272 allocs/op
BenchmarkReadExistingLogs/logs-10-lines-20 2601 455286 ns/op 115034 B/op 276 allocs/op
BenchmarkReadExistingLogs/logs-10-lines-20 2655 470482 ns/op 114636 B/op 276 allocs/op
BenchmarkReadExistingLogs/logs-20-lines
BenchmarkReadExistingLogs/logs-20-lines-20 2223 494120 ns/op 119360 B/op 335 allocs/op
BenchmarkReadExistingLogs/logs-20-lines-20 2164 484985 ns/op 120693 B/op 337 allocs/op
BenchmarkReadExistingLogs/logs-20-lines-20 2088 483260 ns/op 121877 B/op 339 allocs/op
BenchmarkReadExistingLogs/logs-20-lines-20 2305 493402 ns/op 120727 B/op 337 allocs/op
BenchmarkReadExistingLogs/logs-20-lines-20 2455 484570 ns/op 120075 B/op 336 allocs/op
BenchmarkReadExistingLogs/logs-20-lines-20 2168 480794 ns/op 120381 B/op 337 allocs/op
BenchmarkReadExistingLogs/logs-100-lines
BenchmarkReadExistingLogs/logs-100-lines-20 1834 605985 ns/op 171613 B/op 895 allocs/op
BenchmarkReadExistingLogs/logs-100-lines-20 2002 620702 ns/op 173700 B/op 899 allocs/op
BenchmarkReadExistingLogs/logs-100-lines-20 1978 612737 ns/op 171550 B/op 895 allocs/op
BenchmarkReadExistingLogs/logs-100-lines-20 2023 642387 ns/op 173368 B/op 898 allocs/op
BenchmarkReadExistingLogs/logs-100-lines-20 1676 617445 ns/op 173475 B/op 899 allocs/op
BenchmarkReadExistingLogs/logs-100-lines-20 1693 602219 ns/op 173949 B/op 900 allocs/op
BenchmarkReadExistingLogs/logs-1000-lines
BenchmarkReadExistingLogs/logs-1000-lines-20 604 1878948 ns/op 755451 B/op 7196 allocs/op
BenchmarkReadExistingLogs/logs-1000-lines-20 577 1858635 ns/op 752512 B/op 7190 allocs/op
BenchmarkReadExistingLogs/logs-1000-lines-20 566 1877634 ns/op 755836 B/op 7196 allocs/op
BenchmarkReadExistingLogs/logs-1000-lines-20 582 1856978 ns/op 752424 B/op 7190 allocs/op
BenchmarkReadExistingLogs/logs-1000-lines-20 601 1866527 ns/op 755582 B/op 7196 allocs/op
BenchmarkReadExistingLogs/logs-1000-lines-20 588 1950122 ns/op 754923 B/op 7195 allocs/op
PASS
ok github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/file 66.965s Comparison: $ benchstat /tmp/fileinput-0820ea0d51.txt /tmp/fileinput-4244166551.txt
goos: linux
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/file
cpu: 13th Gen Intel(R) Core(TM) i7-13700H
│ /tmp/fileinput-0820ea0d51.txt │ /tmp/fileinput-4244166551.txt │
│ sec/op │ sec/op vs base │
ReadExistingLogs/logs-0-lines-20 23.44µ ± 8% 24.58µ ± 4% ~ (p=0.093 n=6)
ReadExistingLogs/logs-1-lines-20 432.3µ ± 2% 421.0µ ± 4% -2.61% (p=0.004 n=6)
ReadExistingLogs/logs-2-lines-20 433.0µ ± 4% 426.5µ ± 5% ~ (p=0.065 n=6)
ReadExistingLogs/logs-10-lines-20 482.5µ ± 4% 458.3µ ± 3% -5.02% (p=0.041 n=6)
ReadExistingLogs/logs-20-lines-20 493.1µ ± 3% 484.8µ ± 2% ~ (p=0.485 n=6)
ReadExistingLogs/logs-100-lines-20 618.8µ ± 3% 615.1µ ± 4% ~ (p=0.485 n=6)
ReadExistingLogs/logs-1000-lines-20 1.877m ± 3% 1.872m ± 4% ~ (p=1.000 n=6)
geomean 383.2µ 379.3µ -1.02%
│ /tmp/fileinput-0820ea0d51.txt │ /tmp/fileinput-4244166551.txt │
│ B/op │ B/op vs base │
ReadExistingLogs/logs-0-lines-20 25.83Ki ± 0% 25.82Ki ± 0% ~ (p=0.420 n=6)
ReadExistingLogs/logs-1-lines-20 116.0Ki ± 2% 104.3Ki ± 1% -10.07% (p=0.002 n=6)
ReadExistingLogs/logs-2-lines-20 116.0Ki ± 1% 105.2Ki ± 2% -9.28% (p=0.002 n=6)
ReadExistingLogs/logs-10-lines-20 121.7Ki ± 2% 112.0Ki ± 2% -7.97% (p=0.002 n=6)
ReadExistingLogs/logs-20-lines-20 127.5Ki ± 2% 117.7Ki ± 1% -7.67% (p=0.002 n=6)
ReadExistingLogs/logs-100-lines-20 178.1Ki ± 3% 169.4Ki ± 1% -4.89% (p=0.002 n=6)
ReadExistingLogs/logs-1000-lines-20 747.0Ki ± 0% 737.5Ki ± 0% -1.28% (p=0.002 n=6)
geomean 132.5Ki 124.6Ki -5.95%
│ /tmp/fileinput-0820ea0d51.txt │ /tmp/fileinput-4244166551.txt │
│ allocs/op │ allocs/op vs base │
ReadExistingLogs/logs-0-lines-20 44.00 ± 0% 44.00 ± 0% ~ (p=1.000 n=6) ¹
ReadExistingLogs/logs-1-lines-20 227.5 ± 2% 220.0 ± 1% -3.30% (p=0.002 n=6)
ReadExistingLogs/logs-2-lines-20 232.0 ± 1% 226.0 ± 1% -2.59% (p=0.002 n=6)
ReadExistingLogs/logs-10-lines-20 280.0 ± 1% 276.0 ± 1% -1.43% (p=0.004 n=6)
ReadExistingLogs/logs-20-lines-20 342.5 ± 1% 337.0 ± 1% -1.61% (p=0.002 n=6)
ReadExistingLogs/logs-100-lines-20 902.0 ± 1% 898.5 ± 0% -0.39% (p=0.030 n=6)
ReadExistingLogs/logs-1000-lines-20 7.200k ± 0% 7.196k ± 0% -0.06% (p=0.002 n=6)
geomean 392.9 387.7 -1.35%
¹ all samples are equal |
This benchmark has a scope that sits between the [Stanza File consumer benchmark](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.120.0/pkg/stanza/fileconsumer/benchmark_test.go) and a File Log receiver benchmark (which currently does not exist). The major difference between the File consumer benchmark and the File input benchmark is that the File input benchmark includes measuring of the memory allocations made in the File input's [emit](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.120.0/pkg/stanza/operator/input/file/input.go#L40) function. This should allow to assess performance impact of this change: #37734, or any similar changes in the future.
Before (including copy of the bytes needed upstream):
After