| 1 | +Custom Labels |
| 2 | +============= |
| 3 | + |
| 4 | +# Meta |
| 5 | + |
| 6 | +- **Author(s)**: Tommy Reilly |
| 7 | +- **Start Date**: 2025-05-15 |
| 8 | +- **Goal End Date**: 2025-06-15 |
| 9 | +- **Primary Reviewers**: Florian Lehner, Timo Teräs, Brennan Vincent |
| 10 | + |
| 11 | +# Problem |
| 12 | + |
| 13 | +Performance issues can be hard to understand when there is no way to break hotspots down by attributes that are not visible in the program structure. For instance, in a database that uses a generic execution path for all queries, you may want to see how many CPU cycles are spent on behalf of internal queries versus external queries, or which user is issuing the most queries. This requires attaching metadata to each sample. In Go this is typically done with pprof labels, and pprof data can be split by the values of these labels (example: https://www.polarsignals.com/blog/posts/2021/04/13/demystifying-pprof-labels-with-go). |
| 14 | + |
| 15 | +Beyond pprof labels, there are other places where custom labels could be used. Only pprof labels are supported out of the box; the following theoretical use cases are listed only to help map out the design space: |
| 16 | + |
| 17 | +- Trace IDs, to support queries for the CPU resources used by a particular trace ID |
| 18 | +- Runtime metadata like "goid", so that the CPU resources associated with a particular goroutine can be discerned |
| 19 | +- Arbitrary application/workload specific metadata like user, client or query |
| 20 | + |
| 21 | +This design doc describes how we can surface Go pprof labels in the OTel profiler and lays the groundwork for doing similar things for other languages. How languages besides Go are supported is beyond the scope of this document. |
| 22 | + |
| 23 | +# Success criteria |
| 24 | + |
| 25 | +- Any native language unwinder should be able to add custom labels to each sample, i.e. the mechanism should not be Go-specific even if Go is the initial target |
| 26 | +- Custom labels should have their own trace type for enable/disable purposes even though they are technically not an unwinder |
| 27 | +- When disabled, custom labels should have little to no impact on the performance or memory usage of the profiler |
| 28 | +- Custom labels should be limited so that even if a program has thousands of eligible labels the number supported is reasonably small (mostly enforced by eBPF itself) |
| 29 | +- Custom labels should be short and have fixed memory overhead |
| 30 | +- The custom labels should be made available to the reporter backend but otherwise it should be left up to implementors what to do with them |
| 31 | + |
| 32 | +# Scope |
| 33 | + |
| 34 | +The initial proposal deals only with Go pprof labels, which are string/string key/value pairs; more custom labels for Go or other languages may be added in the future. The initial proposal is to collect up to 10 labels on a best-effort basis: if any eBPF errors occur there may be fewer labels, and there is no proposed mechanism for deciding which labels to grab. Even though the OTel proto allows arbitrary types for the value, the initial implementation will be scoped to strings. |
| 35 | + |
| 36 | +# Proposed Solution |
| 37 | + |
| 38 | +The solution we propose is to add support for 10 custom labels of 64 bytes each associated with each sample, with 16 bytes for the label key and 48 bytes for the label value. These will be stored in the Trace struct alongside the stack frame information for each sample, so each Trace will be 640 bytes (10 × 64) larger than before. |
| 39 | + |
| 40 | +In Go 1.23 and lower, labels are stored in a map, so it is non-deterministic which labels are read from the program. In Go 1.24+ the labels are stored in a list sorted by key, so labels are extracted on a first-come, first-served basis in that order. If a label's key or value is larger than 16 or 48 bytes respectively, it will be truncated. No effort is made to validate the strings from a UTF-8 perspective. |