Description
I'm on Go 1.4 and linux/amd64. I've noticed this since at least Go 1.2.
Large maps cause long GC pause times. This happens even when the map key and value types do not contain pointers, and even when the map isn't changing between GC runs. I assume it comes from the collector traversing internal pointers in the map implementation.
Roughly speaking, with a single map containing millions of entries, one will typically see GC pause times of hundreds of milliseconds.
Here's an example program that shows a few different approaches: http://play.golang.org/p/AeHSLyLz_c. In particular, it compares:
- Large map with a pointer-typed value
- Large map with a pointer-free value
- Sharded map
- Big slice
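For readers who cannot open the playground link, here is a minimal sketch of that kind of comparison. It is not the linked program: the helper names (`timeGC`, `measure`, `buildPointerMap`, `buildValueMap`, `buildSharded`, `buildSlice`), the entry count, and the shard count are all assumptions made for illustration. It times a forced `runtime.GC()` call, which is only a rough proxy for pause time, and it uses `runtime.KeepAlive`, which needs a newer Go release than the 1.4 mentioned above. Results will vary by Go version and machine.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// timeGC forces a collection and returns the wall-clock duration of the
// runtime.GC call. This is a rough proxy, not a true pause measurement.
func timeGC() time.Duration {
	start := time.Now()
	runtime.GC()
	return time.Since(start)
}

// measure builds a data structure, times one forced collection while the
// structure is still reachable, and returns that duration.
func measure(build func() interface{}) time.Duration {
	data := build()
	d := timeGC()
	runtime.KeepAlive(data) // keep data live across the collection
	return d
}

// buildPointerMap returns a map whose values are pointers.
func buildPointerMap(n int) map[int32]*int32 {
	m := make(map[int32]*int32, n)
	for i := 0; i < n; i++ {
		v := int32(i)
		m[int32(i)] = &v
	}
	return m
}

// buildValueMap returns a map whose key and value types hold no pointers.
func buildValueMap(n int) map[int32]int32 {
	m := make(map[int32]int32, n)
	for i := 0; i < n; i++ {
		m[int32(i)] = int32(i)
	}
	return m
}

// buildSharded spreads n pointer-free entries across several smaller maps.
func buildSharded(n, shards int) []map[int32]int32 {
	s := make([]map[int32]int32, shards)
	for i := range s {
		s[i] = make(map[int32]int32, n/shards)
	}
	for i := 0; i < n; i++ {
		s[i%shards][int32(i)] = int32(i)
	}
	return s
}

// buildSlice stores the same number of values in one flat slice.
func buildSlice(n int) []int32 {
	s := make([]int32, n)
	for i := range s {
		s[i] = int32(i)
	}
	return s
}

func main() {
	const n = 1 << 20 // assumed size; raise it to approach the sizes in this report
	fmt.Println("pointer-typed value:", measure(func() interface{} { return buildPointerMap(n) }))
	fmt.Println("pointer-free value: ", measure(func() interface{} { return buildValueMap(n) }))
	fmt.Println("sharded map:        ", measure(func() interface{} { return buildSharded(n, 64) }))
	fmt.Println("big slice:          ", measure(func() interface{} { return buildSlice(n) }))
}
```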
In the real program where this issue caused me problems, I have a server that uses a very large map (>10M entries) as an index into off-heap data. Even after I removed pointers from the map value and sharded the map, pause times were still >100ms, which was too long for my use case. I ended up writing a very simple hashmap implementation using only slices and indexes, which brought the pause times to under 1ms.
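To make the slice-based idea concrete, here is a rough sketch of what such a table could look like. It is not my production code: the `sliceMap` type, its fixed capacity, the linear-probing scheme, and the hash mix are all assumptions chosen to keep the example short. The point is only that every entry lives in a pointer-free slice, so there are no per-entry pointers for the collector to follow.

```go
package main

import "fmt"

// sliceMap is a hypothetical fixed-capacity open-addressing hash table.
// Keys, values, and occupancy flags are stored in flat, pointer-free slices.
type sliceMap struct {
	keys  []uint64
	vals  []uint64
	used  []bool
	mask  uint64
	count int
}

// newSliceMap allocates a table with at least minCap slots, rounded up to a
// power of two so that a bit mask can replace the modulo operation.
func newSliceMap(minCap int) *sliceMap {
	size := 1
	for size < minCap {
		size <<= 1
	}
	return &sliceMap{
		keys: make([]uint64, size),
		vals: make([]uint64, size),
		used: make([]bool, size),
		mask: uint64(size - 1),
	}
}

// hash mixes the key bits. The multiplier is the 64-bit golden-ratio
// constant; any reasonable integer mix would do for this sketch.
func hash(k uint64) uint64 {
	k ^= k >> 33
	k *= 0x9e3779b97f4a7c15
	k ^= k >> 29
	return k
}

// Put inserts or overwrites key using linear probing. It returns false if
// the table is full and the key is not already present (no resizing here).
func (m *sliceMap) Put(key, val uint64) bool {
	i := hash(key) & m.mask
	for probes := 0; probes < len(m.keys); probes++ {
		if !m.used[i] {
			m.used[i] = true
			m.keys[i] = key
			m.vals[i] = val
			m.count++
			return true
		}
		if m.keys[i] == key {
			m.vals[i] = val
			return true
		}
		i = (i + 1) & m.mask
	}
	return false
}

// Get looks up key. Because this sketch has no deletion, the first empty
// slot on the probe path means the key is absent.
func (m *sliceMap) Get(key uint64) (uint64, bool) {
	i := hash(key) & m.mask
	for probes := 0; probes < len(m.keys); probes++ {
		if !m.used[i] {
			return 0, false
		}
		if m.keys[i] == key {
			return m.vals[i], true
		}
		i = (i + 1) & m.mask
	}
	return 0, false
}

func main() {
	m := newSliceMap(1 << 10)
	for k := uint64(0); k < 500; k++ {
		m.Put(k, k*10)
	}
	v, ok := m.Get(42)
	fmt.Println(v, ok) // prints "420 true"
}
```

A real index would also need resizing and deletion. In this sketch the values are plain integers, which fits the use case of storing offsets into off-heap data.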
I wonder if the map implementation and GC can be made to work together better to mitigate this.
(It's possible that this is expected/unfortunate, or that it will get better with improved GC along with everything else, but I wanted to have a record of it to center the discussion.)