Skip to content

Conversation

mauri870
Copy link
Member

@mauri870 mauri870 commented May 6, 2025

Proposed commit message

Currently, beats receivers http monitoring is started on the call to receiver Start. On the OTel collector pipeline, the receiver is first instanciated via the CreateLogs factory, and only then started. This commit moves the http monitoring to the receiver creation, ensuring the monitoring endpoints are available as soon as possible during the collector startup.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test

go test -v ./x-pack/filebeat/fbreceiver ./x-pack/metricbeat/mbreceiver

Related issues

Currently, beats receivers http monitoring is started on the call to
receiver Start. On the OTel collector pipeline, the receiver is first
instanciated via the CreateLogs factory, and only then started. This commit
moves the http monitoring to the receiver creation, ensuring the monitoring
endpoints are available as soon as possible during the collector startup.
@mauri870 mauri870 added this to the otel milestone May 6, 2025
@mauri870 mauri870 self-assigned this May 6, 2025
@mauri870 mauri870 requested a review from a team as a code owner May 6, 2025 20:48
@mauri870 mauri870 added cleanup Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team labels May 6, 2025
@mauri870 mauri870 requested review from AndersonQ and faec May 6, 2025 20:48
@mauri870 mauri870 added backport-9.0 Automated backport to the 9.0 branch backport-8.19 Automated backport to the 8.19 branch labels May 6, 2025
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label May 6, 2025
@elasticmachine
Copy link
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label May 6, 2025
Copy link
Contributor

@leehinman leehinman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. would just like to add checking /inputs as well if we can since that was what let us find this.

@mauri870 mauri870 marked this pull request as draft May 7, 2025 11:09
@mauri870 mauri870 force-pushed the factory-http-metrics branch from 6773d2f to 9dfbe3c Compare May 7, 2025 13:46
@mauri870 mauri870 requested a review from leehinman May 7, 2025 17:11
@mauri870 mauri870 marked this pull request as ready for review May 7, 2025 19:27
@mauri870
Copy link
Member Author

mauri870 commented May 8, 2025

/test

@mauri870 mauri870 force-pushed the factory-http-metrics branch from 9c4672f to 4b00271 Compare May 8, 2025 12:01
@mauri870 mauri870 force-pushed the factory-http-metrics branch from 18839da to 1c87be2 Compare May 8, 2025 12:40
@mauri870
Copy link
Member Author

mauri870 commented May 8, 2025

Okay, CI is definitely not happy with my changes. The test TestInputMetricsFromPipeline is constantly failing for agentbeat and filebeat integration tests. I don't see any relation with the code I added here, but there must be something interfering with these tests. I'll look into it. In the meantime, if anyone has an idea what that could be I appreciate any insights.

@mauri870 mauri870 force-pushed the factory-http-metrics branch from 67e9fb3 to 3544c05 Compare May 9, 2025 14:07
@mauri870
Copy link
Member Author

mauri870 commented May 9, 2025

@AndersonQ I’m seeing a failure in the TestInputMetricsFromPipeline test you added a month ago:

https://buildkite.com/elastic/beats-xpack-filebeat/builds/14895#0196b571-2baa-4c40-b1d7-26b43bb9e057/195-716

I’m having trouble understanding why it’s failing now. Sure, my changes touch input metrics but it only applies to beats receivers and not plain filebeat. Would you mind taking a quick look to see if you have any insights? Thanks!

@AndersonQ
Copy link
Member

@AndersonQ I’m seeing a failure in the TestInputMetricsFromPipeline test you added a month ago:

https://buildkite.com/elastic/beats-xpack-filebeat/builds/14895#0196b571-2baa-4c40-b1d7-26b43bb9e057/195-716

I’m having trouble understanding why it’s failing now. Sure, my changes touch input metrics but it only applies to beats receivers and not plain filebeat. Would you mind taking a quick look to see if you have any insights? Thanks!

@mauri870, I've run it several times on my machine and checked the CI runs on main. All passing. It might be flaky and I haven't caught it yet.

It might be the registry you're passing to the inputs handler. I'm fixing another thing right now. But I can try to help you later with it.

@mauri870
Copy link
Member Author

mauri870 commented May 9, 2025

It might be the registry you're passing to the inputs handler. I'm fixing another thing right now. But I can try to help you later with it.

Thanks for looking into it! I'll do some additional testing to make sure I pinpoint the reason, I suspect it is the addition of the inputs route that is breaking the test.

@mauri870 mauri870 marked this pull request as draft May 9, 2025 17:42
@mauri870 mauri870 force-pushed the factory-http-metrics branch from 6c92c49 to 58e2fe8 Compare May 12, 2025 15:48
@mauri870
Copy link
Member Author

mauri870 commented May 12, 2025

I refactored the test code to ensure it returns the correct data for the inputs and stats endpoints.

Here are a few things I encountered:

  1. The monitoring endpoint is started during receiver creation (CreateLogs), but inputs/returns [] if it's called before the receiver is started. After the receiver starts, the endpoint returns the correct data with inputs. I assume this is expected behavior, likely because the inputs haven't started yet. Just want to confirm my rationale is correct.

  2. I updated the code to use inputmon.AttachHandler, which seems to be the intended approach after the recent refactor in New input metrics API and add per-input metrics to libbeat pipeline client #42618.

  3. The tests for mbreceiver confirm that the inputs endpoint works as expected and returns the correct data, including the input name in the response. However, for fbreceiver, it doesn't work— even after starting the receiver, the endpoint always returns []. I might be missing something here, so any guidance would be appreciated.

@leehinman
Copy link
Contributor

3. The tests for mbreceiver confirm that the inputs endpoint works as expected and returns the correct data, including the input name in the response. However, for fbreceiver, it doesn't work— even after starting the receiver, the endpoint always returns []. I might be missing something here, so any guidance would be appreciated.

I'm pretty sure the problem is that the filebeat beater is created, and that already attaches metricbeat is still using the global registry. So I'm thinking metricbeat is "working" because it happens to be using the global namespace.

@mauri870
Copy link
Member Author

Closing, superseded by #44438.

@mauri870 mauri870 closed this May 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-8.19 Automated backport to the 8.19 branch backport-9.0 Automated backport to the 9.0 branch cleanup Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Beats receivers should expose input metrics
5 participants