Reduce GC pressure in LinesCombiner by ~90% #1144
Merged
+7
−5
The `LinesCombiner` uses the `Array#zip` method to merge both line coverage arrays A and B together. This creates a new array, immediately discarding A and B. When we're combining a lot of files, this creates significant pressure on the garbage collector because we're allocating a lot of intermediate arrays.

For example, all GitLab unit tests emit about 328 MB of minified, uncompressed `.resultset.json` files. If we use `SimpleCov.collate` to merge them (using the default configuration), we invoke the GC over 1,700 times.

If we instead reuse the longer of the two arrays, A or B, and merge coverage in place without creating a new array, we reduce the GC invocations to 55.
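To make the difference concrete, here is a minimal sketch of the two approaches. It is not the actual diff of this PR: the method names `merge_line`, `combine_with_zip` and `combine_in_place` are hypothetical, and the per-line merge rule (sum the hit counts, keep `nil` when the sum is zero and either side is `nil`) is an assumption made for illustration.

```ruby
# Hypothetical per-line merge rule (an assumption for this sketch):
# sum the hit counts, but keep nil when the sum is zero and either side is nil.
def merge_line(first, second)
  sum = first.to_i + second.to_i
  if sum.zero? && (first.nil? || second.nil?)
    nil
  else
    sum
  end
end

# Zip-based merge: allocates one pair array per line plus the result array.
def combine_with_zip(coverage_a, coverage_b)
  coverage_a.zip(coverage_b).map { |a, b| merge_line(a, b) }
end

# In-place merge: overwrites the longer input and returns it,
# so no per-line pair arrays and no new result array are allocated.
def combine_in_place(coverage_a, coverage_b)
  if coverage_a.size >= coverage_b.size
    longest = coverage_a
    shortest = coverage_b
  else
    longest = coverage_b
    shortest = coverage_a
  end

  # Indexing past the end of the shorter array yields nil, like zip's padding.
  longest.each_index do |i|
    longest[i] = merge_line(longest[i], shortest[i])
  end
  longest
end
```

Note that the in-place variant mutates one of its inputs, so it only fits when the caller no longer needs the original arrays.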
This reduces total GC time by around 90%. On my M3 Max, garbage collection time with our unit test coverage result sets drops from 2.78s to 0.32s. Overall, coverage merging in this example takes about 25% less time (9.55s -> 7.05s).
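Numbers like these can be collected with Ruby's built-in `GC.count`, `GC.stat` and `GC::Profiler`. The following is a rough sketch, not the script used for the measurements above: the helper name `measure_gc` and the synthetic workload are assumptions, and the results will differ from the GitLab result sets.

```ruby
# Hypothetical helper: runs the block and reports GC activity caused by it.
def measure_gc
  GC.start                       # start from a freshly collected heap
  GC::Profiler.enable
  GC::Profiler.clear             # drop any previously recorded profile data
  runs_before = GC.count
  allocated_before = GC.stat(:total_allocated_objects)

  yield

  {
    gc_runs: GC.count - runs_before,
    gc_time_s: GC::Profiler.total_time, # seconds spent in GC while profiling
    allocated_objects: GC.stat(:total_allocated_objects) - allocated_before
  }
ensure
  GC::Profiler.disable
end

# Synthetic workload (an assumption): many small zips, each allocating arrays.
stats = measure_gc do
  200_000.times { [1, 2, 3].zip([4, 5, 6]) }
end
puts stats
```

Wrapping the zip-based and the in-place merge in `measure_gc` one after the other gives a quick comparison without an external profiler.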
Because both approaches, using `Array#zip` or updating the array in place, are functionally identical, we can replace the former with the latter without any other changes.

How to test locally
It's a bit cumbersome to set this up without setting up the whole GitLab project. We can try though.
- Download the `.resultset.json` for our unit tests. This link might break at some point (then you'd have to look for another pipeline that runs our unit tests): https://gitlab.com/gitlab-org/gitlab/-/jobs/11891530948 (on the right, click on Download)
- `<simplecov_repo>/test`
- Run `time ruby test.rb`, both on the default branch and this branch
- Use `singed` or `ruby-prof` to flamegraph, or use `GC::Profiler` to get a GC profile

Hyperfine results
GC dump
Before
After
Flamegraph
Before
After