
[exporter/elasticsearch] report connection health via componentstatus #39562


Conversation

@faec (Contributor) commented Apr 22, 2025

Description

Report component status of the Elasticsearch exporter based on the response code when making ingestion requests.

Testing

Added a unit test confirming that status is reported based on the HTTP response. Also tested manually with the collector and confirmed that the error conditions appear when querying the collector health status.

linux-foundation-easycla bot commented Apr 22, 2025

CLA Signed

The committers are authorized under a signed CLA.

@faec faec changed the title Report elasticsearch exporter health via componentstatus [exporter/elasticsearch] report connection health via componentstatus Apr 24, 2025
@faec faec marked this pull request as ready for review April 24, 2025 21:06
@faec faec requested a review from a team as a code owner April 24, 2025 21:06
@faec faec requested a review from dashpole April 24, 2025 21:06
@carsonip (Contributor) left a comment:

Thanks! Using LogRoundTrip for health reporting is a bit hacky, but I reckon it's the easiest and best way to achieve the goal, because response parsing is done in go-docappender and there are no hooks.
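
For context, go-docappender's HTTP layer exposes per-request outcomes through the elastictransport.Logger interface, whose LogRoundTrip method is invoked with each request, response, and transport error. A minimal sketch of a health-reporting logger built on that hook (the healthLogger type and its componentHost field are illustrative, not the PR's actual code):

package elasticsearchexporter // hypothetical placement

import (
	"net/http"
	"time"

	"go.opentelemetry.io/collector/component"
	"go.opentelemetry.io/collector/component/componentstatus"
)

// healthLogger sketches an elastictransport.Logger implementation that
// reports exporter health from each round trip's outcome.
type healthLogger struct {
	componentHost component.Host
}

// LogRoundTrip runs after every request: a transport error is reported as
// a recoverable error, and a 2xx response as healthy.
func (cl *healthLogger) LogRoundTrip(req *http.Request, resp *http.Response, err error, start time.Time, dur time.Duration) error {
	switch {
	case err != nil:
		componentstatus.ReportStatus(
			cl.componentHost, componentstatus.NewRecoverableErrorEvent(err))
	case resp != nil && resp.StatusCode < 300:
		componentstatus.ReportStatus(
			cl.componentHost, componentstatus.NewEvent(componentstatus.StatusOK))
	}
	return nil
}

// Request and response bodies are not needed for status reporting.
func (cl *healthLogger) RequestBodyEnabled() bool  { return false }
func (cl *healthLogger) ResponseBodyEnabled() bool { return false }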

Comment on lines 71 to 76:

} else if resp.StatusCode >= 300 {
	// Error results
	err := fmt.Errorf("Elasticsearch request failed: %v", resp.Status)
	componentstatus.ReportStatus(
		cl.componentHost, componentstatus.NewRecoverableErrorEvent(err))
}
A reviewer commented:

Are all of these recoverable? I would think a 401 would be non-recoverable without human intervention.

409s are not necessarily an error at all when documents are truly duplicates. Is it worth special-casing those? There should probably be an indicator that they are happening, but if someone is explicitly doing de-duplication via an _id in the document, this will mark the exporter as degraded unnecessarily.

@faec (Contributor, Author) replied:

"non-recoverable" in the collector is very unforgiving, to the extent that almost nothing we could encounter truly satisfies its definition (which is that the component can never again return to a healthy state no matter what for the life of the process -- so its real meaning isn't "requires human intervention" it's "requires intervention and a restart of the process"). For example I've seen some setups where ingest credentials are synced to the ingest workers before they're actually activated upstream (I don't know why, but it shouldn't break us), or similarly a 401 can be a side effect of broken or unreliable proxy settings that can be resolved without restarting the client process.

Good point about 409s though, those should be telemetry numbers rather than a component error state, I will add a special case for that.
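
A sketch of what that special case could look like (reportFromResponse is a hypothetical helper for illustration, not the PR's actual diff):

package elasticsearchexporter // hypothetical placement

import (
	"fmt"
	"net/http"

	"go.opentelemetry.io/collector/component"
	"go.opentelemetry.io/collector/component/componentstatus"
)

// reportFromResponse maps an HTTP response to a component status, treating
// 409 Conflict as telemetry-only rather than as a component error.
func reportFromResponse(host component.Host, resp *http.Response) {
	switch {
	case resp.StatusCode == http.StatusConflict:
		// 409: expected when clients deduplicate via explicit document
		// _id values; count it in metrics, but don't degrade the component.
	case resp.StatusCode >= 300:
		err := fmt.Errorf("Elasticsearch request failed: %v", resp.Status)
		componentstatus.ReportStatus(
			host, componentstatus.NewRecoverableErrorEvent(err))
	default:
		componentstatus.ReportStatus(
			host, componentstatus.NewEvent(componentstatus.StatusOK))
	}
}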

The reviewer replied:

Thanks, just special-casing 409s makes sense to me then.

The reviewer added:

The change looks good. It's probably worth a test case for 409s specifically, since they are now meant not to trigger recovery from an error state.
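
A test along the requested lines might look like this (statusHost is a hypothetical test double, reportFromResponse is the sketch above, and componentstatus.ReportStatus delivers events to any Host that implements componentstatus.Reporter):

package elasticsearchexporter

import (
	"net/http"
	"testing"

	"go.opentelemetry.io/collector/component"
	"go.opentelemetry.io/collector/component/componentstatus"
)

// statusHost records every status event reported through it.
type statusHost struct {
	component.Host
	events []*componentstatus.Event
}

func (h *statusHost) Report(e *componentstatus.Event) {
	h.events = append(h.events, e)
}

// A 409 must report neither an error nor StatusOK, since StatusOK would
// clear a previously reported recoverable error.
func TestConflictReportsNoStatus(t *testing.T) {
	host := &statusHost{}
	reportFromResponse(host, &http.Response{
		StatusCode: http.StatusConflict,
		Status:     "409 Conflict",
	})
	if len(host.events) != 0 {
		t.Fatalf("expected no status events for a 409 response, got %d", len(host.events))
	}
}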

The reviewer followed up:

Thanks for the test!

@atoulme (Contributor) commented May 2, 2025

Please address CI

@faec (Contributor, Author) commented May 2, 2025

Maybe I'm not understanding how to sync module versions: go get on go.opentelemetry.io/collector/component/componentstatus adds version v0.125.0, but check-collector-module-version.sh in CI seemed to want a pseudo-version starting with v0.125.1. I tried running check-collector-module-version.sh locally, but it fails for me, and I can't find the right way to run it or to update to the right module version (other than waiting for CI to fail and copying the value it reports, but that's slow and the version seems to change every couple of days). Presumably there's some workflow I'm missing; where should I be looking?
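
One possibly relevant detail: go get resolves to the latest release tag by default, while a pseudo-version for the newest commit can be requested by targeting a branch instead, for example:

go get go.opentelemetry.io/collector/component/componentstatus@main

Whether that produces the exact pseudo-version check-collector-module-version.sh expects depends on the repository's release workflow.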

@andrzej-stencel (Member) left a comment:

👍

@andrzej-stencel andrzej-stencel merged commit e45dbe3 into open-telemetry:main May 9, 2025
173 checks passed
@github-actions github-actions bot added this to the next release milestone May 9, 2025
dragonlord93 pushed a commit to dragonlord93/opentelemetry-collector-contrib that referenced this pull request May 23, 2025