Fix issue #71: Optimize nested structure decoding performance #73
Open
Jecoms wants to merge 6 commits into go-playground:master from Jecoms:fix-issue-71-nested-performance
…mance
Replace O(n) linear search in findAlias() with O(1) hash map lookup. This resolves exponential performance degradation when decoding deeply nested structures with hundreds or thousands of elements.
Performance improvements:
- 100 values: 162ms → 37ms (77% faster)
- 1000 values: 162s → 4.8s (97% faster)
Changes:
- Added aliasMap field to decoder struct for O(1) alias lookups
- Modified parseMapData() to populate and clear the map
- Updated findAlias() to use map lookup instead of slice iteration
- Added comprehensive tests and benchmarks in decoder_test.go
All existing tests pass. No breaking changes.
- Add error checking in BenchmarkIssue71Nested100 and BenchmarkIssue71Nested1000
- Reorganize aliasMap initialization logic for clarity
Reorder initialization check to create map before clearing it
Uses build tags to detect if race detector is enabled during tests:
- Without -race: strict thresholds (10ms, 100ms, 10s)
- With -race: lenient thresholds (50ms, 500ms, 35s)
The race detection constants are in *_test.go files, so they're only compiled during testing and not included in production builds. This allows fast local development testing while ensuring CI tests with race detector pass on all Go versions including 1.17.x.
Verified passing on:
- Go 1.24.5 without race: 0.5ms, 41ms, 4.5s ✓
- Go 1.24.5 with race: 3.2ms, 261ms, 25.1s ✓
- Go 1.17.13 with race: 3.8ms, 279ms, 26.5s ✓
CI runners on Go 1.17.x with race detector were taking 44-45s, exceeding the 35s threshold. Increased to 50s to account for slow/loaded CI runners while still catching performance regressions. Also added debug logging to show which thresholds are active.
…ssion
Changed from (10, 100, 1000) to (10, 50, 200) values. Since the bug scales exponentially, 200 values is sufficient to prove the optimization works (without fix: 16s+, with fix: 150ms).
Benefits:
- Test runs in ~1-2s instead of 25-65s on CI with race detector
- Still catches performance regressions (10x+ speedup is obvious)
- More reliable on slow/variable CI runners
- Faster local development feedback
Verified on:
- Go 1.24.5 without race: 0.5ms, 11ms, 152ms ✓
- Go 1.24.5 with race: 3.2ms, 68ms, 916ms ✓
- Go 1.17.13 with race: 3.7ms, 71ms, 1.07s ✓
Thanks, will try to take a look at this weekend.
I'd like to help further improve the performance, but this is a start. This library is super useful :)
Problem
When decoding structs with data nested inside two or more layers of slices or maps, the decoder exhibited exponential performance degradation based on the number of values.
Example Structure
Performance Before Fix
The performance degradation was exponential, making the decoder unusable for real-world nested data.
Root Cause
The findAlias() function performed a linear O(n) search through the dataMap slice for every alias lookup. With deeply nested structures, this function was called thousands or millions of times, resulting in O(n²) or worse complexity.
For example, with 1000 nested elements, the parser would:
- Create aliases (1 for foos, 1 for foos[0].bars, 1000 for foos[0].bars[N].lookup)
- Call findAlias() many times during parsing and decoding
- Each findAlias() call would iterate through the entire dataMap linearly
Solution
Replaced the linear search with a hash map lookup (O(1)):
- Added aliasMap map[string]*recursiveData field to the decoder struct
- Modified parseMapData() to populate the map as aliases are created
- Updated findAlias() to use the map instead of iterating through the slice
Code Changes
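decoder.go:
- Added aliasMap field to decoder struct for O(1) lookups
- Modified parseMapData() to populate the map with recursiveData entries
- Updated findAlias() to use map lookup instead of linear search
decoder_test.go:
- Added tests and benchmarks for nested decoding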
Test infrastructure (test-only, not in production binary):
- race_test.go / norace_test.go: Detect race detector to adjust performance thresholds
Performance After Fix
Without race detector (local development):
With race detector (CI environment):
The optimization provides a ~100x speedup for nested structures with hundreds of elements.
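To make the change concrete, here is a minimal sketch of the two lookup strategies described under Root Cause and Solution. It is not the library's code: the decoder and recursiveData types are trimmed-down stand-ins, and addAlias, findAliasLinear, and findAliasMap are hypothetical names chosen for illustration. The linear version also counts comparisons so the cost is visible.

```go
package main

import "fmt"

// recursiveData is a stand-in for the library type of the same name;
// only the alias field matters for this sketch.
type recursiveData struct {
	alias string
}

// decoder is a hypothetical, trimmed-down decoder that keeps both the
// original slice and the alias map described in this PR.
type decoder struct {
	dataMap  []*recursiveData
	aliasMap map[string]*recursiveData
}

// addAlias records a new alias in both the slice and the map.
func (d *decoder) addAlias(alias string) *recursiveData {
	rd := &recursiveData{alias: alias}
	d.dataMap = append(d.dataMap, rd)
	if d.aliasMap == nil {
		d.aliasMap = make(map[string]*recursiveData)
	}
	d.aliasMap[alias] = rd
	return rd
}

// findAliasLinear mirrors the old approach: scan the slice, O(n) per call.
// It also returns how many comparisons were made.
func (d *decoder) findAliasLinear(alias string) (*recursiveData, int) {
	for i, rd := range d.dataMap {
		if rd.alias == alias {
			return rd, i + 1
		}
	}
	return nil, len(d.dataMap)
}

// findAliasMap mirrors the new approach: a single map lookup.
func (d *decoder) findAliasMap(alias string) *recursiveData {
	return d.aliasMap[alias]
}

func main() {
	const n = 1000
	d := &decoder{}
	for i := 0; i < n; i++ {
		d.addAlias(fmt.Sprintf("foos[0].bars[%d].lookup", i))
	}

	// Looking up every alias once with the linear scan costs
	// 1 + 2 + ... + n = n(n+1)/2 comparisons.
	total := 0
	for i := 0; i < n; i++ {
		_, c := d.findAliasLinear(fmt.Sprintf("foos[0].bars[%d].lookup", i))
		total += c
	}
	fmt.Println("linear comparisons:", total) // 500500 for n = 1000
	fmt.Println("map lookups:", n)
}
```

Looking up each of n aliases once costs n(n+1)/2 comparisons with the scan, versus n map lookups, which is the quadratic-or-worse growth the Root Cause section refers to.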
Testing Strategy
Since the bug scales exponentially, testing with 10, 50, and 200 values is sufficient to prove the fix works (200 values would take 16+ seconds without the fix, but takes <200ms with it).
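A rough sketch of how such a size-scaled check could be structured is shown below. It is an illustration only: nestedValues and withinThreshold are hypothetical helpers, the key shape is taken from the foos[0].bars[N].lookup names used in this description, and the decode function and the one-second limit are placeholders rather than the library's decoder or the PR's actual thresholds.

```go
package main

import (
	"fmt"
	"net/url"
	"time"
)

// nestedValues builds n form keys of the shape foos[0].bars[N].lookup.
// The key shape and the value "v" are assumptions for illustration.
func nestedValues(n int) url.Values {
	values := make(url.Values, n)
	for i := 0; i < n; i++ {
		values.Set(fmt.Sprintf("foos[0].bars[%d].lookup", i), "v")
	}
	return values
}

// withinThreshold runs decode once and reports the elapsed time and
// whether it stayed at or under limit.
func withinThreshold(decode func(url.Values) error, values url.Values, limit time.Duration) (time.Duration, bool, error) {
	start := time.Now()
	err := decode(values)
	elapsed := time.Since(start)
	return elapsed, elapsed <= limit, err
}

func main() {
	// Placeholder: a real test would call the decoder under test here.
	decode := func(v url.Values) error { return nil }

	for _, n := range []int{10, 50, 200} {
		elapsed, ok, err := withinThreshold(decode, nestedValues(n), time.Second)
		fmt.Printf("n=%d elapsed=%v ok=%v err=%v\n", n, elapsed, ok, err)
	}
}
```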
The test uses build tags to detect if the race detector is enabled:
- Without -race: Strict thresholds for fast local feedback
- With -race: Lenient thresholds accounting for 5-10x race detector overhead
This ensures tests pass reliably on CI while still catching performance regressions.
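The build-tag approach can be sketched roughly as follows. The constant name raceEnabled and the threshold helper are assumptions, not taken from the PR, and the constant is hard-coded so the sketch runs as a single file. The 10ms and 50ms values are the smallest strict and lenient thresholds mentioned in the commit messages.

```go
package main

import (
	"fmt"
	"time"
)

// raceEnabled is hard-coded here so the sketch compiles as one file.
// In a layout like the one described, it would instead come from two
// test-only files selected by build constraints:
//
//	race_test.go:    //go:build race     const raceEnabled = true
//	norace_test.go:  //go:build !race    const raceEnabled = false
//
// The constant name is an assumption for illustration.
const raceEnabled = false

// threshold returns the strict limit normally and the lenient limit
// when the race detector is active.
func threshold(race bool, strict, lenient time.Duration) time.Duration {
	if race {
		return lenient
	}
	return strict
}

func main() {
	limit := threshold(raceEnabled, 10*time.Millisecond, 50*time.Millisecond)
	fmt.Println("active threshold:", limit)
}
```

Because the tagged files end in _test.go, the constant is only compiled into test binaries, which matches the "test-only, not in production binary" note above.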
Impact
Verification
All CI checks pass:
Tested locally on:
Fixes #71