[receiver/azuremonitorreceiver] feat: Allow to not split result by dimension #36240

celian-garcia · 2024-11-06T15:29:24Z

Description

Currently there is a mechanism that splits the result by dimension, thanks to a filter param hack.
It is great that all the dimensions are collected as labels in the metrics, but for some resource types this could be unwanted. It can raise concerns about cardinality, or about the continuity of queries from one version to another (e.g. with the Prometheus exporter, if one does not do a "sum by ..." and the otelcol version is updated, the query can display different results).

To mitigate that, I propose adding an opt-out for this collection. If we want dimensions for some resource types and not for others, we can create two separate receivers, for example. Or we can opt out completely if we don't want additional labels.
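
For readers unfamiliar with the "filter param hack", here is a minimal Go sketch of the idea. It assumes the split is obtained by sending a filter clause of the form `<Dimension> eq '*'` for each dimension, which asks the Azure Monitor metrics API for one time series per dimension value. The function name `buildDimensionsFilter` and the `enabled` flag are hypothetical and only illustrate the proposed opt-out; this is not the receiver's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// buildDimensionsFilter returns a filter expression that requests every value
// of each dimension, e.g. "Reason eq '*' and Status eq '*'".
// It returns an empty string when splitting is disabled or no dimension is given,
// in which case the metric would come back as a single, unsplit series.
// Hypothetical helper for illustration only.
func buildDimensionsFilter(enabled bool, dimensions []string) string {
	if !enabled || len(dimensions) == 0 {
		return ""
	}
	clauses := make([]string, 0, len(dimensions))
	for _, d := range dimensions {
		clauses = append(clauses, d+" eq '*'")
	}
	return strings.Join(clauses, " and ")
}

func main() {
	dims := []string{"Reason", "Status"}
	fmt.Println(buildDimensionsFilter(true, dims))         // Reason eq '*' and Status eq '*'
	fmt.Printf("%q\n", buildDimensionsFilter(false, dims)) // ""
}
```

With the opt-out, the filter is simply omitted, so no dimension labels are added to the resulting metrics.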

Edit:

After a first review, we agreed that it could do a bit more, such as allowing us to specify a list of dimensions for a particular metric.
E.g. from the added documentation:

receivers:
  azuremonitor:
    dimensions:
      enabled: true
      overrides:
        "Microsoft.Network/azureFirewalls":
          # Real example of an Azure limitation here:
          # Dimensions exposed are Reason, Status, Protocol,
          # but when selecting Protocol in the filters, it returns nothing.
          # Note that the metric display name is ``Network rules hit count`` but its programmatic value is ``NetworkRuleHit``
          # Ref: https://learn.microsoft.com/en-us/azure/azure-monitor/reference/supported-metrics/microsoft-network-azurefirewalls-metrics
          "NetworkRuleHit": [Reason, Status]

Without this config, you won't have the azure_networkrulehit_average_Count metric at all in your results.
But with the config, we have it with reason and status labels:

# HELP azure_networkrulehit_average_Count
# TYPE azure_networkrulehit_average_Count gauge
azure_networkrulehit_average_Count{azuremonitor_resource_id="/subscriptions/<redacted>/resourceGroups/<redacted>/providers/Microsoft.Network/azureFirewalls/<redacted>",location="<redacted>",metadata_reason="<redacted>",metadata_status="Allow",name="<redacted>",resource_group="<redacted>",type="Microsoft.Network/azureFirewalls"} 21.875
azure_networkrulehit_average_Count{azuremonitor_resource_id="/subscriptions/<redacted>/resourceGroups/<redacted>/providers/Microsoft.Network/azureFirewalls/<redacted>",location="<redacted>",metadata_reason="<redacted>",metadata_status="Deny",name="<redacted>",resource_group="<redacted>",type="Microsoft.Network/azureFirewalls"} 10
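
The lookup implied by the `overrides` config above can be sketched in Go as follows. This is a minimal illustration under stated assumptions: the type `DimensionsConfig` and the function `resolveDimensions` are hypothetical names that mirror the YAML layout (resource type, then metric name, then list of dimensions), and they are not taken from the receiver's source.

```go
package main

import "fmt"

// DimensionsConfig mirrors the YAML shown above (hypothetical type):
// Overrides maps resource type -> metric name -> dimensions to request.
type DimensionsConfig struct {
	Enabled   bool
	Overrides map[string]map[string][]string
}

// resolveDimensions decides which dimensions to request for one metric:
// none when disabled, the override when one exists, otherwise all available ones.
func resolveDimensions(cfg DimensionsConfig, resourceType, metric string, available []string) []string {
	if !cfg.Enabled {
		return nil
	}
	// Lookups on a nil map are safe in Go and simply report "not found".
	if dims, ok := cfg.Overrides[resourceType][metric]; ok {
		return dims
	}
	return available
}

func main() {
	cfg := DimensionsConfig{
		Enabled: true,
		Overrides: map[string]map[string][]string{
			"Microsoft.Network/azureFirewalls": {
				"NetworkRuleHit": {"Reason", "Status"},
			},
		},
	}
	available := []string{"Status", "Reason", "Protocol"}

	// Overridden metric: Protocol is left out of the request.
	fmt.Println(resolveDimensions(cfg, "Microsoft.Network/azureFirewalls", "NetworkRuleHit", available))
	// Metric without override: all available dimensions are requested.
	fmt.Println(resolveDimensions(cfg, "Microsoft.Network/azureFirewalls", "ApplicationRuleHit", available))
}
```

Note that the override is keyed by the programmatic metric name (`NetworkRuleHit`), not the display name. The metric name `ApplicationRuleHit` in the second call is only a placeholder for "a metric without an override".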

Link to tracking issue

Fixes #36611

Testing

Documentation

linux-foundation-easycla · 2024-11-06T15:29:31Z

The committers listed above are authorized under a signed CLA.

✅ login: celian-garcia / name: Célian GARCIA (9709d07)

celian-garcia · 2024-11-18T13:00:16Z

Hey @codeboten, @nslaughter, @jpkrohling. Any opinion on that? I feel it's a no-brainer, so if you can run the pipeline and have a look, that would be cool!

tesharp · 2024-11-20T11:41:35Z

We have noticed that some metrics do not return any data if you include dimensions that do not have any data, are empty or nil (assumption based on results we have seen). For instance, Azure Firewall Network Rule Hits has 3 dimensions: Status, Reason and Protocol. If filtering on Status and Reason, we get data back, but as soon as you add Protocol it returns no data at all. It also returns no data if you filter only on Protocol. We have seen the same for other metrics that have "empty" dimension values.

Would it maybe make sense to not only opt in / out but also set which dimensions you would like to filter on? We would still want Reason and Status but could drop the Protocol dimension.

celian-garcia · 2024-11-20T13:07:46Z

> We have noticed that some metrics do not return any data if you include dimensions that do not have any data, are empty or nil (assumption based on results we have seen). For instance, Azure Firewall Network Rule Hits has 3 dimensions: Status, Reason and Protocol. If filtering on Status and Reason, we get data back, but as soon as you add Protocol it returns no data at all. It also returns no data if you filter only on Protocol. We have seen the same for other metrics that have "empty" dimension values.
>
> Would it maybe make sense to not only opt in / out but also set which dimensions you would like to filter on? We would still want Reason and Status but could drop the Protocol dimension.

That's interesting. Then for that particular metric, using the split_by_dimensions config field would allow you to receive data, but you would lose the Status and Reason granularity. I just checked, and indeed we get the same result even in the Azure Portal UI.

Selecting Network rules hit count and Split By = Protocol gives no result.
But for some metrics, like Application rules hits count, you get results split by http/https.
Likewise for SNAT Port utilization, you get results split by tcp/udp.

I believe this is a bug in the Network rules hit count definition, which shouldn't actually offer Protocol. I'm not sure this reason alone deserves a hack on our side, but this feature would still be interesting to me. For example, it would be useful to exclude particular dimensions that have too high a cardinality or that we aren't interested in.

Let me propose a config structure:

dimensions:
  # Defaults to true to avoid introducing a breaking change. This means that all the available dimensions are collected, unless an exclusion exists.
  enabled: true | false 
  exclusions:
    "Microsoft.Network/azureFirewalls": # service name
        "Network rules hit count": # metric name
          - "Protocol"

We can also implement it the other way around:

dimensions:
  enabled: true 
  overrides:
    "Microsoft.Network/azureFirewalls":
        "Network rules hit count": [Reason, Status]

WDYT?
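
To make the difference between the two proposed structures concrete, here is a small Go sketch. It rests on stated assumptions: `withExclusions` and `withOverride` are hypothetical helpers, and the extra `Region` dimension is invented purely to show what happens if the set of available dimensions grows.

```go
package main

import "fmt"

// withExclusions keeps every available dimension except the excluded ones
// (first proposal: "exclusions").
func withExclusions(available, excluded []string) []string {
	skip := make(map[string]bool, len(excluded))
	for _, d := range excluded {
		skip[d] = true
	}
	kept := make([]string, 0, len(available))
	for _, d := range available {
		if !skip[d] {
			kept = append(kept, d)
		}
	}
	return kept
}

// withOverride uses the explicit list as-is and falls back to all available
// dimensions when no override is set (second proposal: "overrides").
func withOverride(available, override []string) []string {
	if override == nil {
		return available
	}
	return override
}

func main() {
	available := []string{"Status", "Reason", "Protocol"}
	fmt.Println(withExclusions(available, []string{"Protocol"}))       // [Status Reason]
	fmt.Println(withOverride(available, []string{"Reason", "Status"})) // [Reason Status]

	// Hypothetical: a new dimension appears later on the same metric.
	grown := []string{"Status", "Reason", "Protocol", "Region"}
	fmt.Println(withExclusions(grown, []string{"Protocol"}))       // [Status Reason Region]
	fmt.Println(withOverride(grown, []string{"Reason", "Status"})) // [Reason Status]
}
```

Both forms give the same dimensions for the firewall example today. They differ when new dimensions appear: exclusions pick them up automatically, while overrides pin the label set, which may matter for the cardinality and query-continuity concerns raised in the description.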

tesharp · 2024-11-20T13:18:00Z

I also think this is a bug in the API, not only affecting this metric. We noticed it yesterday at 11:10, when suddenly all PostgreSQL flexible databases stopped reporting cpu, memory, storage etc. It had been working before that when filtering on ServerName, but suddenly it doesn't work anymore.

Nevertheless, it seems like a good feature for when you do not need the granularity of all dimensions. I think the last option, overriding and specifying which dimensions you need, makes more sense.

celian-garcia · 2024-11-20T14:36:12Z

> I also think this is a bug in the API, not only affecting this metric. We noticed it yesterday at 11:10, when suddenly all PostgreSQL flexible databases stopped reporting cpu, memory, storage etc. It had been working before that when filtering on ServerName, but suddenly it doesn't work anymore.
>
> Nevertheless, it seems like a good feature for when you do not need the granularity of all dimensions. I think the last option, overriding and specifying which dimensions you need, makes more sense.

Okay, I will make an update.

ahurtaud · 2024-11-21T03:56:18Z

> Nevertheless, it seems like a good feature for when you do not need the granularity of all dimensions. I think the last option, overriding and specifying which dimensions you need, makes more sense.

I agree, the second proposed config format looks better to me.

celian-garcia · 2024-11-21T14:28:03Z

ping @tesharp I finished and pushed, and I confirm that before these new overrides, the network rule hit metric was not even in the result! Now it works well 💪

Note that it's probably worth moving these dimension functions out of the scraper.go file, since if I work on scraper_batch right after, I will reuse them.

Edit: moved to a dedicated dimensions.go file + added some unit tests

tesharp · 2024-11-22T08:37:52Z

Nice :) Looks good.

MovieStoreGuy · 2024-11-27T03:24:09Z

Please make sure you run the linters :)

celian-garcia · 2024-11-27T08:56:25Z

> Please make sure you run the linters :)

Thanks @MovieStoreGuy, sorry for that. It should be fine now.

github-actions · 2024-12-25T05:21:25Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions · 2025-01-09T05:20:56Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

…mension (#37168) Recreated from a different fork. Context: #36240

celian-garcia requested review from codeboten and a team as code owners November 6, 2024 15:29

github-actions bot assigned jpkrohling Nov 6, 2024

github-actions bot added the receiver/azuremonitor label Nov 6, 2024

github-actions bot requested a review from nslaughter November 6, 2024 15:29

celian-garcia changed the title ~~feat: Allow to not split result by dimension~~ [receiver/azuremonitorreceiver] feat: Allow to not split result by dimension Nov 6, 2024

celian-garcia mentioned this pull request Nov 20, 2024

[azuremonitorreceiver] Add the ability to pull Resource Metrics using AZ Query Batch API #29593

Closed

github-actions bot added Stale and removed Stale labels Dec 25, 2024

github-actions bot added the Stale label Jan 9, 2025

[receiver/azuremonitorreceiver] feat: Allow to not split result by di…

9709d07

…mension Signed-off-by: Célian Garcia <[email protected]>

github-actions bot removed the Stale label Jan 10, 2025

celian-garcia mentioned this pull request Jan 13, 2025

REQUEST: New membership for celian-garcia open-telemetry/community#2508

Closed

6 tasks

celian-garcia closed this by deleting the head repository Jan 13, 2025

celian-garcia mentioned this pull request Jan 13, 2025

[receiver/azuremonitorreceiver] feat: Allow to not split result by dimension #37168

Merged
