-
Notifications
You must be signed in to change notification settings - Fork 2.8k
[translator/azurelogs] Rethink log structure, and define resource attributes #39186
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Pinging code owners: @atoulme, @cparkins, @MikeGoldsmith (as this didn't get done automatically). |
/label pkg/translator/azurelogs |
Pinging code owners for pkg/translator/azurelogs: @atoulme @cparkins @MikeGoldsmith. See Adding Labels via Comments if you do not have permissions to add labels yourself. For example, comment '/label priority:p2 -needs-triaged' to set the priority and remove the needs-triaged label. |
I think resources without attributes should be even considered a bug at this point. Resources with the same attributes (or in this case, without any attributes) are considered the same, and this is obviously not the case. You can find this issue manifesting in our unit tests
See the logs comparison function for better understanding. |
/label bug |
@cparkins what's your take? |
@atoulme I have no reason to disagree. But how logs should be structured is not something I have much of an educated opinion on. I think what is being proposed makes sense. |
Thanks @cparkins @constanca-m take it away (and feel free to make yourself a codeowner) |
#### Description This PR is only to improve performance, it does not change any functionality or any output as you can see by the passing unit tests. These are the main changes: - Iterate over the azure logs only one time. Previously we had a slice for the keys, and a map that store all logs corresponding to the same resource id. - Remove the map `mappings`. It is expensive to look up the field this way. As an alternative, we now have a function that checks the field, and adds that to the attribute. This function is used from the beginning, as we know the category right away. - Use config fastest for `jsoniter` and borrow iterator. Results: ``` goos: linux goarch: amd64 pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/azurelogs cpu: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz │ before │ this PR │ │ sec/op │ sec/op vs base │ UnmarshalLogs/1000_record-16 2.226m ± 2% 1.590m ± 6% -28.56% (p=0.000 n=10) UnmarshalLogs/1_record-16 2.890µ ± 1% 2.155µ ± 1% -25.45% (p=0.000 n=10) UnmarshalLogs/100_record-16 217.9µ ± 2% 155.4µ ± 1% -28.69% (p=0.000 n=10) geomean 111.9µ 81.05µ -27.58% │ before │ this PR │ │ B/op │ B/op vs base │ UnmarshalLogs/1000_record-16 2.093Mi ± 0% 1.293Mi ± 0% -38.24% (p=0.000 n=10) UnmarshalLogs/1_record-16 2.484Ki ± 0% 1.506Ki ± 0% -39.39% (p=0.000 n=10) UnmarshalLogs/100_record-16 216.0Ki ± 0% 144.9Ki ± 0% -32.91% (p=0.000 n=10) geomean 104.8Ki 66.11Ki -36.91% │ before │ this PR │ │ allocs/op │ allocs/op vs base │ UnmarshalLogs/1000_record-16 38.05k ± 0% 20.03k ± 0% -47.36% (p=0.000 n=10) UnmarshalLogs/1_record-16 52.00 ± 0% 31.00 ± 0% -40.38% (p=0.000 n=10) UnmarshalLogs/100_record-16 3.835k ± 0% 2.025k ± 0% -47.20% (p=0.000 n=10) geomean 1.965k 1.079k -45.07% ``` Performance increased in all metrics. There are still improvements that can be done, but I will not add them to this PR so it won't get too big: - We should not carry an attributes map, but instead we should add them to the record as soon as possible. These issues might also get affected by #39186 if it goes forward. <!-- Issue number (e.g. #1234) or full URL to issue, if applicable. --> #### Link to tracking issue Relates #39119. <!--Describe what testing was performed and which tests were added.--> #### Testing Unit tests and benchmark.
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description This PRs adds myself (@constanca-m) to the codeowners of the component. Work done: - #39176 - #39200 Work in progress: - #39340 Work planned: - #39186 The company I work at (Elastic) is relying on this component in one of our products, so I plan to keep contributing to it. I hope that this change will: - Help and improve maintenance - Prevent breaking changes without warning - Help collaboration between codeowners from different companies.
#### Description This PR is only to improve performance, it does not change any functionality or any output as you can see by the passing unit tests. These are the main changes: - Iterate over the azure logs only one time. Previously we had a slice for the keys, and a map that store all logs corresponding to the same resource id. - Remove the map `mappings`. It is expensive to look up the field this way. As an alternative, we now have a function that checks the field, and adds that to the attribute. This function is used from the beginning, as we know the category right away. - Use config fastest for `jsoniter` and borrow iterator. Results: ``` goos: linux goarch: amd64 pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/azurelogs cpu: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz │ before │ this PR │ │ sec/op │ sec/op vs base │ UnmarshalLogs/1000_record-16 2.226m ± 2% 1.590m ± 6% -28.56% (p=0.000 n=10) UnmarshalLogs/1_record-16 2.890µ ± 1% 2.155µ ± 1% -25.45% (p=0.000 n=10) UnmarshalLogs/100_record-16 217.9µ ± 2% 155.4µ ± 1% -28.69% (p=0.000 n=10) geomean 111.9µ 81.05µ -27.58% │ before │ this PR │ │ B/op │ B/op vs base │ UnmarshalLogs/1000_record-16 2.093Mi ± 0% 1.293Mi ± 0% -38.24% (p=0.000 n=10) UnmarshalLogs/1_record-16 2.484Ki ± 0% 1.506Ki ± 0% -39.39% (p=0.000 n=10) UnmarshalLogs/100_record-16 216.0Ki ± 0% 144.9Ki ± 0% -32.91% (p=0.000 n=10) geomean 104.8Ki 66.11Ki -36.91% │ before │ this PR │ │ allocs/op │ allocs/op vs base │ UnmarshalLogs/1000_record-16 38.05k ± 0% 20.03k ± 0% -47.36% (p=0.000 n=10) UnmarshalLogs/1_record-16 52.00 ± 0% 31.00 ± 0% -40.38% (p=0.000 n=10) UnmarshalLogs/100_record-16 3.835k ± 0% 2.025k ± 0% -47.20% (p=0.000 n=10) geomean 1.965k 1.079k -45.07% ``` Performance increased in all metrics. There are still improvements that can be done, but I will not add them to this PR so it won't get too big: - We should not carry an attributes map, but instead we should add them to the record as soon as possible. These issues might also get affected by open-telemetry#39186 if it goes forward. <!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. --> #### Link to tracking issue Relates open-telemetry#39119. <!--Describe what testing was performed and which tests were added.--> #### Testing Unit tests and benchmark.
…elemetry#39457) <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description This PRs adds myself (@constanca-m) to the codeowners of the component. Work done: - open-telemetry#39176 - open-telemetry#39200 Work in progress: - open-telemetry#39340 Work planned: - open-telemetry#39186 The company I work at (Elastic) is relying on this component in one of our products, so I plan to keep contributing to it. I hope that this change will: - Help and improve maintenance - Prevent breaking changes without warning - Help collaboration between codeowners from different companies.
#### Description This PR is only to improve performance, it does not change any functionality or any output as you can see by the passing unit tests. These are the main changes: - Iterate over the azure logs only one time. Previously we had a slice for the keys, and a map that store all logs corresponding to the same resource id. - Remove the map `mappings`. It is expensive to look up the field this way. As an alternative, we now have a function that checks the field, and adds that to the attribute. This function is used from the beginning, as we know the category right away. - Use config fastest for `jsoniter` and borrow iterator. Results: ``` goos: linux goarch: amd64 pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/azurelogs cpu: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz │ before │ this PR │ │ sec/op │ sec/op vs base │ UnmarshalLogs/1000_record-16 2.226m ± 2% 1.590m ± 6% -28.56% (p=0.000 n=10) UnmarshalLogs/1_record-16 2.890µ ± 1% 2.155µ ± 1% -25.45% (p=0.000 n=10) UnmarshalLogs/100_record-16 217.9µ ± 2% 155.4µ ± 1% -28.69% (p=0.000 n=10) geomean 111.9µ 81.05µ -27.58% │ before │ this PR │ │ B/op │ B/op vs base │ UnmarshalLogs/1000_record-16 2.093Mi ± 0% 1.293Mi ± 0% -38.24% (p=0.000 n=10) UnmarshalLogs/1_record-16 2.484Ki ± 0% 1.506Ki ± 0% -39.39% (p=0.000 n=10) UnmarshalLogs/100_record-16 216.0Ki ± 0% 144.9Ki ± 0% -32.91% (p=0.000 n=10) geomean 104.8Ki 66.11Ki -36.91% │ before │ this PR │ │ allocs/op │ allocs/op vs base │ UnmarshalLogs/1000_record-16 38.05k ± 0% 20.03k ± 0% -47.36% (p=0.000 n=10) UnmarshalLogs/1_record-16 52.00 ± 0% 31.00 ± 0% -40.38% (p=0.000 n=10) UnmarshalLogs/100_record-16 3.835k ± 0% 2.025k ± 0% -47.20% (p=0.000 n=10) geomean 1.965k 1.079k -45.07% ``` Performance increased in all metrics. There are still improvements that can be done, but I will not add them to this PR so it won't get too big: - We should not carry an attributes map, but instead we should add them to the record as soon as possible. These issues might also get affected by open-telemetry#39186 if it goes forward. <!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. --> #### Link to tracking issue Relates open-telemetry#39119. <!--Describe what testing was performed and which tests were added.--> #### Testing Unit tests and benchmark.
…elemetry#39457) <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description This PRs adds myself (@constanca-m) to the codeowners of the component. Work done: - open-telemetry#39176 - open-telemetry#39200 Work in progress: - open-telemetry#39340 Work planned: - open-telemetry#39186 The company I work at (Elastic) is relying on this component in one of our products, so I plan to keep contributing to it. I hope that this change will: - Help and improve maintenance - Prevent breaking changes without warning - Help collaboration between codeowners from different companies.
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description The issue for this PR was first reported here: #39186. Currently, the resource logs have no attributes. This means that all records are supposedly the same, which is not true. This PR adds attributes to the resources. <!-- Issue number (e.g. #1234) or full URL to issue, if applicable. --> #### Link to tracking issue Relates #39186. <!--Describe what testing was performed and which tests were added.--> #### Testing Unit tests fixed.
…etry#39571) <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description The issue for this PR was first reported here: open-telemetry#39186. Currently, the resource logs have no attributes. This means that all records are supposedly the same, which is not true. This PR adds attributes to the resources. <!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. --> #### Link to tracking issue Relates open-telemetry#39186. <!--Describe what testing was performed and which tests were added.--> #### Testing Unit tests fixed.
…etry#39571) <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description The issue for this PR was first reported here: open-telemetry#39186. Currently, the resource logs have no attributes. This means that all records are supposedly the same, which is not true. This PR adds attributes to the resources. <!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. --> #### Link to tracking issue Relates open-telemetry#39186. <!--Describe what testing was performed and which tests were added.--> #### Testing Unit tests fixed.
…etry#39571) <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description The issue for this PR was first reported here: open-telemetry#39186. Currently, the resource logs have no attributes. This means that all records are supposedly the same, which is not true. This PR adds attributes to the resources. <!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. --> #### Link to tracking issue Relates open-telemetry#39186. <!--Describe what testing was performed and which tests were added.--> #### Testing Unit tests fixed.
Uh oh!
There was an error while loading. Please reload this page.
Component(s)
pkg/translator/azurelogs
Describe the issue you're reporting
Consider this example log:
The expectation for this would be to have two resources: one for resource
/firstResource
, and the other for resource/secondResource
. TheseresourceId
would fit resource attributes best:Rather, the properties of these records should be log record attributes (these are not pictured in the example file above). Category and operation name could be as well resource attributes.
These are the produced logs:
Even though it is correct that we have two resources, these resources have empty attributes, and the attributes of the log record are the resource attributes.
The properties (placed inside the body) could then be the actual log record attributes. See for example the AWS VPC flow log to get a better idea.
At the moment, inside the
body
we have akvlistValue
field:Which maybe is unconventional, and would fit best in the log record attributes.
The text was updated successfully, but these errors were encountered: