-
Notifications
You must be signed in to change notification settings - Fork 2.8k
[translator/azurelogs] Improve performance #39340
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[translator/azurelogs] Improve performance #39340
Conversation
@@ -227,26 +218,42 @@ func copyPropertiesAndApplySemanticConventions(category string, properties *any, | |||
return | |||
} | |||
|
|||
// TODO: check if this is a valid JSON string and parse it? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed this TODO: if we parse as JSON we will lose some fields and it would be a breaking change at this point. I think it will also be worse for performance to unmarshal instead of just iterating over the map. But having any
is an issue for performance we should look at.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cparkins I would like to come back to this as I think it is best to unmarshal instead of using any
. But this will always lead to a breaking change: if we don't explicitly handle the field, we lose it. What do you think?
…; remove redundant casting
// TODO Add unit test again once bug gets fixed. | ||
// Bug https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/39186#issuecomment-2798517892 | ||
// "log_maximum": { | ||
// logFilename: "log-maximum.json", | ||
// expectedFilename: "log-maximum-expected.yaml", | ||
// }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚨
I wouldn't mind one more review from @cparkins |
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description This PRs adds myself (@constanca-m) to the codeowners of the component. Work done: - #39176 - #39200 Work in progress: - #39340 Work planned: - #39186 The company I work at (Elastic) is relying on this component in one of our products, so I plan to keep contributing to it. I hope that this change will: - Help and improve maintenance - Prevent breaking changes without warning - Help collaboration between codeowners from different companies.
#### Description This PR is only to improve performance, it does not change any functionality or any output as you can see by the passing unit tests. These are the main changes: - Iterate over the azure logs only one time. Previously we had a slice for the keys, and a map that store all logs corresponding to the same resource id. - Remove the map `mappings`. It is expensive to look up the field this way. As an alternative, we now have a function that checks the field, and adds that to the attribute. This function is used from the beginning, as we know the category right away. - Use config fastest for `jsoniter` and borrow iterator. Results: ``` goos: linux goarch: amd64 pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/azurelogs cpu: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz │ before │ this PR │ │ sec/op │ sec/op vs base │ UnmarshalLogs/1000_record-16 2.226m ± 2% 1.590m ± 6% -28.56% (p=0.000 n=10) UnmarshalLogs/1_record-16 2.890µ ± 1% 2.155µ ± 1% -25.45% (p=0.000 n=10) UnmarshalLogs/100_record-16 217.9µ ± 2% 155.4µ ± 1% -28.69% (p=0.000 n=10) geomean 111.9µ 81.05µ -27.58% │ before │ this PR │ │ B/op │ B/op vs base │ UnmarshalLogs/1000_record-16 2.093Mi ± 0% 1.293Mi ± 0% -38.24% (p=0.000 n=10) UnmarshalLogs/1_record-16 2.484Ki ± 0% 1.506Ki ± 0% -39.39% (p=0.000 n=10) UnmarshalLogs/100_record-16 216.0Ki ± 0% 144.9Ki ± 0% -32.91% (p=0.000 n=10) geomean 104.8Ki 66.11Ki -36.91% │ before │ this PR │ │ allocs/op │ allocs/op vs base │ UnmarshalLogs/1000_record-16 38.05k ± 0% 20.03k ± 0% -47.36% (p=0.000 n=10) UnmarshalLogs/1_record-16 52.00 ± 0% 31.00 ± 0% -40.38% (p=0.000 n=10) UnmarshalLogs/100_record-16 3.835k ± 0% 2.025k ± 0% -47.20% (p=0.000 n=10) geomean 1.965k 1.079k -45.07% ``` Performance increased in all metrics. There are still improvements that can be done, but I will not add them to this PR so it won't get too big: - We should not carry an attributes map, but instead we should add them to the record as soon as possible. These issues might also get affected by open-telemetry#39186 if it goes forward. <!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. --> #### Link to tracking issue Relates open-telemetry#39119. <!--Describe what testing was performed and which tests were added.--> #### Testing Unit tests and benchmark.
…elemetry#39457) <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description This PRs adds myself (@constanca-m) to the codeowners of the component. Work done: - open-telemetry#39176 - open-telemetry#39200 Work in progress: - open-telemetry#39340 Work planned: - open-telemetry#39186 The company I work at (Elastic) is relying on this component in one of our products, so I plan to keep contributing to it. I hope that this change will: - Help and improve maintenance - Prevent breaking changes without warning - Help collaboration between codeowners from different companies.
#### Description This PR is only to improve performance, it does not change any functionality or any output as you can see by the passing unit tests. These are the main changes: - Iterate over the azure logs only one time. Previously we had a slice for the keys, and a map that store all logs corresponding to the same resource id. - Remove the map `mappings`. It is expensive to look up the field this way. As an alternative, we now have a function that checks the field, and adds that to the attribute. This function is used from the beginning, as we know the category right away. - Use config fastest for `jsoniter` and borrow iterator. Results: ``` goos: linux goarch: amd64 pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/azurelogs cpu: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz │ before │ this PR │ │ sec/op │ sec/op vs base │ UnmarshalLogs/1000_record-16 2.226m ± 2% 1.590m ± 6% -28.56% (p=0.000 n=10) UnmarshalLogs/1_record-16 2.890µ ± 1% 2.155µ ± 1% -25.45% (p=0.000 n=10) UnmarshalLogs/100_record-16 217.9µ ± 2% 155.4µ ± 1% -28.69% (p=0.000 n=10) geomean 111.9µ 81.05µ -27.58% │ before │ this PR │ │ B/op │ B/op vs base │ UnmarshalLogs/1000_record-16 2.093Mi ± 0% 1.293Mi ± 0% -38.24% (p=0.000 n=10) UnmarshalLogs/1_record-16 2.484Ki ± 0% 1.506Ki ± 0% -39.39% (p=0.000 n=10) UnmarshalLogs/100_record-16 216.0Ki ± 0% 144.9Ki ± 0% -32.91% (p=0.000 n=10) geomean 104.8Ki 66.11Ki -36.91% │ before │ this PR │ │ allocs/op │ allocs/op vs base │ UnmarshalLogs/1000_record-16 38.05k ± 0% 20.03k ± 0% -47.36% (p=0.000 n=10) UnmarshalLogs/1_record-16 52.00 ± 0% 31.00 ± 0% -40.38% (p=0.000 n=10) UnmarshalLogs/100_record-16 3.835k ± 0% 2.025k ± 0% -47.20% (p=0.000 n=10) geomean 1.965k 1.079k -45.07% ``` Performance increased in all metrics. There are still improvements that can be done, but I will not add them to this PR so it won't get too big: - We should not carry an attributes map, but instead we should add them to the record as soon as possible. These issues might also get affected by open-telemetry#39186 if it goes forward. <!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. --> #### Link to tracking issue Relates open-telemetry#39119. <!--Describe what testing was performed and which tests were added.--> #### Testing Unit tests and benchmark.
…elemetry#39457) <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description This PRs adds myself (@constanca-m) to the codeowners of the component. Work done: - open-telemetry#39176 - open-telemetry#39200 Work in progress: - open-telemetry#39340 Work planned: - open-telemetry#39186 The company I work at (Elastic) is relying on this component in one of our products, so I plan to keep contributing to it. I hope that this change will: - Help and improve maintenance - Prevent breaking changes without warning - Help collaboration between codeowners from different companies.
Description
This PR is only to improve performance, it does not change any functionality or any output as you can see by the passing unit tests.
These are the main changes:
mappings
. It is expensive to look up the field this way. As an alternative, we now have a function that checks the field, and adds that to the attribute. This function is used from the beginning, as we know the category right away.jsoniter
and borrow iterator.Results:
Performance increased in all metrics.
There are still improvements that can be done, but I will not add them to this PR so it won't get too big:
These issues might also get affected by #39186 if it goes forward.
Link to tracking issue
Relates #39119.
Testing
Unit tests and benchmark.