Description
The pprof format specifies that profiles must be gzip compressed on disk. Go implements this by unconditionally applying gzip compression (level 1) to all pprof profiles it produces.
This is problematic because gzip is no longer considered competitive in the compression space; see the accepted proposal for adding compress/zstd to the stdlib. The compression comparison below also shows that zstd-3 can produce profiles that are 18% smaller than gzip-1 while being 13% faster.
Data volumes are directly correlated to cost (egress, ingress, load balancers), so continuous profiling tools have to make an unpleasant tradeoff: they can either decompress the profiles from the runtime and recompress them as zstd, accepting increased CPU/memory overhead, or leave the gzip-1 compression as-is and accept increased network overhead.
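To make the first half of that tradeoff concrete, here is a minimal sketch of what a profiling tool has to do today just to get at the raw profile bytes. It only uses the standard library; the helper names (`heapProfile`, `isGzip`, `gunzip`) are made up for this example, and the recompression step is left out because the stdlib has no zstd encoder at the time of writing.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"runtime/pprof"
)

// heapProfile returns the heap profile as the runtime writes it with
// debug=0, i.e. a gzip-compressed protobuf.
func heapProfile() ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("heap").WriteTo(&buf, 0); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// isGzip reports whether b starts with the gzip magic number (0x1f 0x8b).
func isGzip(b []byte) bool {
	return len(b) >= 2 && b[0] == 0x1f && b[1] == 0x8b
}

// gunzip undoes the runtime's compression. This is the extra CPU/memory
// work a tool pays before it can recompress with another algorithm.
func gunzip(b []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(b))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	compressed, err := heapProfile()
	if err != nil {
		panic(err)
	}
	raw, err := gunzip(compressed)
	if err != nil {
		panic(err)
	}
	fmt.Printf("gzip: %v, compressed: %d bytes, raw: %d bytes\n",
		isGzip(compressed), len(compressed), len(raw))
}
```

With an API to disable compression, the `gunzip` step would not be needed and the tool could feed the raw bytes straight into the compressor of its choice.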
Possible Solutions
1. Provide an API to disable the compression
2. Provide an API to make the compression algorithm configurable
3. Switch to zstd compression by default (would depend on #62513 and might require pprof to support zstd as well)
Initial discussions at yesterday's runtime: performance and diagnostics meeting seemed to hint at rough consensus for option 1 (meeting notes should be available soon). This would also be aligned with `runtime/trace`, which produces uncompressed data. However, for CPU profiles this will probably depend on the implementation of #42502. For the other profile types, the `debug` argument to `Profile.WriteTo` could be used.
If that sounds roughly right, I can turn this issue into a proposal for option 1.
Compression Comparison
Below is a somewhat haphazard but illustrative comparison of a few different compression algorithms for compressing pprof data. The source code is available.
- file: A random cpu profile that is 2.4 MiB before compression (not supplied here)
- algorithm: An `algorithm-level` tuple.
  - zstd is `github.com/klauspost/compress/zstd`
  - kgzip is `github.com/klauspost/compress/gzip`
  - lz4 is `github.com/pierrec/lz4/v4`
  - gzip is `compress/gzip`
- compression_ratio: `uncompressed bytes / compressed bytes`
- speed_mb_per_sec: `uncompressed bytes / duration` (median of 10 runs)
- utility: `compression_ratio * speed_mb_per_sec` (suggested by this blog post)
file | algorithm | compression_ratio | speed_mb_per_sec | utility |
---|---|---|---|---|
cpu.pprof | zstd-1 | 2.93 | 304 | 889.06 |
cpu.pprof | zstd-2 | 3.13 | 224 | 700.85 |
cpu.pprof | lz4-0 | 2.04 | 292 | 593.92 |
cpu.pprof | kgzip-1 | 2.69 | 190 | 510.83 |
cpu.pprof | zstd-3 | 3.27 | 141 | 460.03 |
cpu.pprof | kgzip-6 | 2.92 | 121 | 351.93 |
cpu.pprof | gzip-1 | 2.68 | 123 | 328.17 |
cpu.pprof | lz4-1 | 2.53 | 56 | 141.02 |
cpu.pprof | lz4-9 | 2.53 | 51 | 127.88 |
cpu.pprof | lz4-4 | 2.53 | 51 | 127.86 |
cpu.pprof | gzip-6 | 3.02 | 39 | 117.89 |
cpu.pprof | zstd-4 | 3.43 | 26 | 90.29 |
cpu.pprof | gzip-9 | 3.03 | 16 | 48.9 |
cpu.pprof | kgzip-9 | 3.05 | 15 | 46.34 |
Conclusion: For this profile, zstd-3 produces profiles that are 18% (`1-2.68/3.27`) smaller while being 13% faster (`1-123/141`) than gzip-1.
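As a quick check of where those two expressions come from: for a fixed uncompressed size $U$, the compressed size is $U$ divided by the compression ratio, and the compression time is $U$ divided by the speed, so $U$ cancels out in both comparisons.

$$
\frac{\text{size}_{\text{zstd-3}}}{\text{size}_{\text{gzip-1}}} = \frac{U/3.27}{U/2.68} = \frac{2.68}{3.27} \approx 0.82
\qquad
\frac{\text{time}_{\text{zstd-3}}}{\text{time}_{\text{gzip-1}}} = \frac{U/141}{U/123} = \frac{123}{141} \approx 0.87
$$

In other words, "13% faster" here means about 13% less compression time. Expressed as throughput, $141/123 \approx 1.15$, so zstd-3 processes roughly 15% more bytes per second than gzip-1.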