@evan-bradley
Contributor

Description

Because Optional is a config type with private fields, it has to continue the xconfmap.Validate call chain manually using one of its private values: the reflection calls in xconfmap.Validate can't introspect private fields (and arguably shouldn't).

Link to tracking issue

Fixes #13579

@evan-bradley evan-bradley requested a review from a team as a code owner August 11, 2025 14:41
@evan-bradley evan-bradley requested a review from dmitryax August 11, 2025 14:41
@evan-bradley evan-bradley requested a review from jmacd August 11, 2025 14:55
@codecov

codecov bot commented Aug 11, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.65%. Comparing base (dd957e4) to head (8ee4bfe).
⚠️ Report is 7 commits behind head on main.

❌ Your project status has failed because the head coverage (87.65%) is below the target coverage (90.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #13611      +/-   ##
==========================================
- Coverage   87.67%   87.65%   -0.03%     
==========================================
  Files         632      632              
  Lines       39697    39709      +12     
==========================================
  Hits        34806    34806              
- Misses       3648     3657       +9     
- Partials     1243     1246       +3     

@jmacd
Contributor

jmacd commented Aug 11, 2025

This will fix the underlying issue in #13580. The test changes there would be nice to keep. @evan-bradley feel free to copy just the exporter/exporterhelper/internal/queuebatch/config_test.go portion to add testing here.

Member

@bogdandrutu bogdandrutu left a comment

Should we depend on xconfmap everywhere because of this, or should we have xconfmap depend on configoptional, and there check whether a public field is a "ConfigOptional" and, if it is, get the value?

I'd prefer to have xconfmap depend on configoptional, but I want to hear from others.

@evan-bradley
Contributor Author

Should we depend on xconfmap everywhere because of this, or should we have xconfmap depend on configoptional, and there check whether a public field is a "ConfigOptional" and, if it is, get the value?

I'd prefer to have xconfmap depend on configoptional, but I want to hear from others.

I would still prefer to have configoptional -> (x)confmap:

  1. We already have a dependency on confmap in configoptional (through Marshaler and Unmarshaler), so once the Validator interface is stabilized and moved from xconfmap to confmap, the xconfmap dependency will be eliminated.
  2. Having Validate make an exception for the Optional type will limit us to just this type. The approach in this PR could work for any similar wrapper type since all you have to do is implement Validator, which I think is a cleaner design.
  3. If configoptional -> confmap is an issue, we can move the Validator interface (or any of the confmap interfaces) to a configbase or similarly-named package that only contains the interfaces, then have both configoptional and confmap depend on that.

@github-actions github-actions bot requested a review from bogdandrutu August 12, 2025 19:34
@evan-bradley
Contributor Author

Thanks everyone for the quick reviews.

I'm okay making this a bug fix, though I think we should consider making changelogs for affected components since configoptional is probably something most users aren't familiar with. I'll add and fix tests for the exporter helper in a follow up to this PR.

I've also updated this PR to include a few tests to check a few different cases. @jade-guiton-dd following up on our conversation during the 2025-08-11 stability call, I've still elected to continue to validate the default flavor of the Optional type. I think we want to make it easier for component authors to catch that their defaults aren't considered valid configuration without requiring them to call Unmarshal on the config type. For cases where it's intentional that the default isn't valid, they can test setting the Optional value both with Default and with Some to test validation, and can go through and call Unmarshal if they want to check behavior when the user sets valid values into an invalid default. Let me know if I've missed any cases or you still think we should reconsider this behavior.

@mx-psi
Member

mx-psi commented Aug 13, 2025

I'm okay making this a bug fix, though I think we should consider making changelogs for affected components since configoptional is probably something most users aren't familiar with. I'll add and fix tests for the exporter helper in a follow up to this PR.

Makes sense to me to do this.

I've still elected to continue to validate the default flavor of the Optional type. I think we want to make it easier for component authors to catch that their defaults aren't considered valid configuration without requiring them to call Unmarshal on the config type.

I agree with this decision, also because it is easier to start validating it and stop doing it in the future than the other way around (since we would potentially break people's configs).

@jade-guiton-dd What do you think?

@mx-psi mx-psi requested a review from jade-guiton-dd August 13, 2025 08:58
@jade-guiton-dd
Contributor

jade-guiton-dd commented Aug 13, 2025

I think we want to make it easier for component authors to catch that their defaults aren't considered valid configuration without requiring them to call Unmarshal on the config type. For cases where it's intentional that the default isn't valid, they can test setting the Optional value both with Default and with Some to test validation, and can go through and call Unmarshal if they want to check behavior when the user sets valid values into an invalid default.

I'm sorry, I don't understand this paragraph at all. What does Unmarshal have to do with validation? And how would a user be able to bypass the Validate check if the default not being valid is intentional?

I agree with this decision, also because it is easier to start validating it and stop doing it in the future than the other way around (since we would potentially break people's configs).

Technically, this PR would already break people's configs compared to the previous system (pointer-based optional fields).

If you had a very simple otlpreceiver-like config like this:

type MyConfig struct {
	GRPC *configgrpc.ServerConfig `mapstructure:"grpc"`
}
func (cfg *MyConfig) Unmarshal(conf *confmap.Conf) error {
	err := conf.Unmarshal(cfg)
	if err != nil {
		return err
	}
	// Drop the default when the user did not set the grpc: key at all.
	if !conf.IsSet("grpc") {
		cfg.GRPC = nil
	}
	return nil
}
func createDefaultConfig() component.Config {
	return &MyConfig{
		GRPC: configgrpc.NewDefaultServerConfig(),
	}
}

NewDefaultServerConfig() returns a barebones config, which is invalid because it's missing required fields like transport.

If you don't set a grpc: field, you get no errors. If you set a grpc: field without filling in the required fields, you would get a validation error. I think this is intuitive behavior.
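
The pointer-based behavior described above can be sketched without the Collector libraries. Everything here is an assumption made for illustration: `ServerConfig` is a hypothetical stand-in for configgrpc.ServerConfig with a single required field, `load` imitates the custom Unmarshal using a plain map instead of confmap.Conf, and the explicit nil check in `Validate` stands in for the validation pass skipping nil pointers.

```go
package main

import (
	"errors"
	"fmt"
)

// ServerConfig is a hypothetical stand-in for configgrpc.ServerConfig.
type ServerConfig struct {
	Endpoint string
}

func (c *ServerConfig) Validate() error {
	if c.Endpoint == "" {
		return errors.New("grpc: endpoint is required")
	}
	return nil
}

// MyConfig mirrors the pointer-based config from the comment.
type MyConfig struct {
	GRPC *ServerConfig
}

// Validate skips the section entirely when the pointer is nil.
func (cfg *MyConfig) Validate() error {
	if cfg.GRPC == nil {
		return nil
	}
	return cfg.GRPC.Validate()
}

// load imitates the custom Unmarshal: start from a barebones (invalid)
// default, overlay user-provided fields, and drop the section when the
// grpc key is absent.
func load(raw map[string]map[string]string) *MyConfig {
	cfg := &MyConfig{GRPC: &ServerConfig{}}
	section, isSet := raw["grpc"]
	if !isSet {
		cfg.GRPC = nil
		return cfg
	}
	if ep, ok := section["endpoint"]; ok {
		cfg.GRPC.Endpoint = ep
	}
	return cfg
}

func main() {
	// No grpc: key, so the invalid default is never validated.
	fmt.Println(load(nil).Validate()) // <nil>
	// Empty grpc: section, so the missing required field is reported.
	fmt.Println(load(map[string]map[string]string{"grpc": {}}).Validate()) // grpc: endpoint is required
}
```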

Now updated to use configoptional:

type MyConfig struct {
	GRPC configoptional.Optional[configgrpc.ServerConfig] `mapstructure:"grpc"`
}
// No need for Unmarshal, great!
func createDefaultConfig() component.Config {
	return &MyConfig{
		GRPC: configoptional.Default(configgrpc.NewDefaultServerConfig()),
	}
}

With the current state of this PR, you would now get validation errors about GRPC subfields even if you've disabled GRPC entirely by not setting the grpc: field, which I find very counter-intuitive.

And importantly: I don't see any way with this PR that a component author could bypass these validation errors if the intent was to make those fields required.

The only reason this PR would only "technically" cause breakage is because the only known user of Default across core and contrib is the otlpreceiver, whose default config doesn't use configgrpc.NewDefaultServerConfig() directly; it fills some of the required fields so that an empty grpc: section is valid.

I think this PR only makes sense if we decide to more formally create a rule that any config struct (even substructs like configgrpc.ServerConfig or confignet.AddrConfig) must have defaults that pass validation, which I think is a pretty big constraint that the config structs in core are absolutely not fulfilling at the moment.

@evan-bradley
Contributor Author

@jade-guiton-dd appreciate your thoughts here, thanks for all the details.

With the current state of this PR, you would now get validation errors about GRPC subfields even if you've disabled GRPC entirely by not setting the grpc: field, which I find very counter-intuitive.

Argh, I forgot to check for this. You're right, we don't want this behavior. I missed this in my testing; I've added a test that covers it now. The fact that all tests passed in this PR also indicates to me that we lack tests around this elsewhere in the repo, so I will issue a few pre-req PRs to make sure we have this covered.

@mx-psi I had to go the other route on validating default values due to this behavior. It makes testing slightly harder for component authors, but is necessary since the Default and None flavors need to have identical behavior after unmarshaling, which is when Validate would be called.

I'm sorry, I don't understand this paragraph at all. What does Unmarshal have to do with validation? And how would a user be able to bypass the Validate check if the default not being valid is intentional?

Sorry, maybe I could have made that a bit clearer. Calling (*Optional).Unmarshal takes the flavor from default -> some in the case where the optional type was created with Default and the user passed a value. Therefore, to make it easier for component authors to validate configs created with Default, I wanted to have (*Optional).Validate validate default-flavored Optional values. However, as you pointed out, this will cause user-facing errors, so we can't take this route.
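
A rough sketch of the flavor transition and the resulting validation rule, under loudly stated assumptions: this `Optional` is a simplified stand-in written for this example, not the configoptional implementation, and its `Unmarshal(isSet, apply)` signature is hypothetical (the real method takes a *confmap.Conf). Only the Some flavor is validated, so Default and None behave identically when the user leaves the key unset.

```go
package main

import (
	"errors"
	"fmt"
)

type validator interface{ Validate() error }

type flavor int

const (
	noneFlavor flavor = iota
	defaultFlavor
	someFlavor
)

// Optional is a hypothetical, simplified stand-in for configoptional.Optional.
type Optional[T validator] struct {
	value  T
	flavor flavor
}

// Default wraps a value that only takes effect if the user sets the key.
func Default[T validator](v T) Optional[T] {
	return Optional[T]{value: v, flavor: defaultFlavor}
}

// Unmarshal has a hypothetical signature: isSet reports whether the user
// provided the key, and apply overlays the user's fields onto the value.
// A set key promotes the flavor to Some.
func (o *Optional[T]) Unmarshal(isSet bool, apply func(*T)) {
	if !isSet {
		return
	}
	apply(&o.value)
	o.flavor = someFlavor
}

// Validate only checks the Some flavor; None and Default are skipped.
func (o Optional[T]) Validate() error {
	if o.flavor != someFlavor {
		return nil
	}
	return o.value.Validate()
}

// ServerConfig is an illustrative config with one required field.
type ServerConfig struct{ Endpoint string }

func (c ServerConfig) Validate() error {
	if c.Endpoint == "" {
		return errors.New("endpoint is required")
	}
	return nil
}

func main() {
	// Invalid default, key not set: no validation error.
	unset := Default(ServerConfig{})
	fmt.Println(unset.Validate()) // <nil>

	// Invalid default, key set but required field left empty: error.
	empty := Default(ServerConfig{})
	empty.Unmarshal(true, func(*ServerConfig) {})
	fmt.Println(empty.Validate()) // endpoint is required

	// Invalid default, key set with a valid value: passes.
	filled := Default(ServerConfig{})
	filled.Unmarshal(true, func(c *ServerConfig) { c.Endpoint = "localhost:4317" })
	fmt.Println(filled.Validate()) // <nil>
}
```

The trade-off is the one noted above: a component author who wants to exercise validation of a Default value in a test has to go through Unmarshal (or construct the value as Some) first.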

For your second question, my intent was that the user wouldn't come into play for this consideration and it would just be a way for component authors to be explicit about requiring certain fields inside optional config objects.

@mx-psi
Member

mx-psi commented Aug 13, 2025

The only reason this PR would only "technically" cause breakage is because the only known user of Default across core and contrib is the otlpreceiver, whose default config doesn't use configgrpc.NewDefaultServerConfig() directly; it fills some of the required fields so that an empty grpc: section is valid.

That was my point: because of low adoption, this would break almost nobody today. And if we do end up facing the case you mention (e.g. in contrib), we can remove the default validation.

I think this PR only makes sense if we decide to more formally create a rule that any config struct (even substructs like configgrpc.ServerConfig or confignet.AddrConfig) must have defaults that pass validation, which I think is a pretty big constraint that the config structs in core are absolutely not fulfilling at the moment.

But I see your point here and that's fair. I don't think we need to commit to anything specific but it does feel weird to do this.

@evan-bradley
Contributor Author

Both of the contrib test failures are expected and will be fixed in a follow-up PR after this one is merged.

Contributor

@jade-guiton-dd jade-guiton-dd left a comment

The logic looks good to me. I find the tests a bit confusing, but I won't block merging on that.

return nil
}

func TestOptionalFileValidate(t *testing.T) {
Contributor

I find this test a bit confusing. The test struct is pretty convoluted and tests multiple things at once, so it's not easy to convince myself that the test is correct and covers all the cases we care about.

Contributor Author

That's fair. I've been erring on the side of testing too many things for this PR, but if it weakens the signal that we've covered the cases we want, I'll pare it down some or at least explain why we're doing what we are.

Contributor Author

I've reduced the struct size and made the test names more descriptive. The only excess now is the nested optional struct, which I left in there just to make sure we're properly calling xconfmap.Validate. Let me know if it looks better.

Contributor

@jade-guiton-dd jade-guiton-dd Aug 19, 2025

This is a lot simpler to understand, thank you.

Maybe it would be worth it to add a test case for "invalid default + explicit", just in case?

Contributor Author

Makes sense to me, done.

Comment on lines +77 to +78
cfg.Sizer = request.SizerType{}
require.EqualError(t, xconfmap.Validate(cfg), "`batch` supports only `items` or `bytes` sizer")
Contributor

🚀

@mx-psi mx-psi added this pull request to the merge queue Aug 20, 2025
Merged via the queue into open-telemetry:main with commit 74d02da Aug 20, 2025
72 of 77 checks passed
andrzej-stencel added a commit to andrzej-stencel/beats that referenced this pull request Sep 12, 2025
The property `flush_timeout` has been required since v0.134.0.
I believe this is a result of fixing config validation in open-telemetry/opentelemetry-collector#13611.
andrzej-stencel added a commit to elastic/beats that referenced this pull request Sep 18, 2025
* chore: update OTel Collector libraries to `v1.41.0`/`v0.135.0`

* fix(auditbeat): update ebpfevents to `v0.8.0`

Resolves dependency conflict on github.com/cilium/ebpf.

* make notice

* test: fix TestFilebeatOTelE2E integration test

The property `flush_timeout` has been required since v0.134.0.
I believe this is a result of fixing config validation in open-telemetry/opentelemetry-collector#13611.

* test: fix unit tests
mergify bot pushed a commit to elastic/beats that referenced this pull request Sep 25, 2025
(cherry picked from commit 16c4d9a)

# Conflicts:
#	NOTICE.txt
#	go.mod
#	go.sum
#	x-pack/filebeat/tests/integration/otel_test.go
#	x-pack/metricbeat/tests/integration/otel_test.go
mergify bot pushed a commit to elastic/beats that referenced this pull request Sep 25, 2025
(cherry picked from commit 16c4d9a)

# Conflicts:
#	NOTICE.txt
#	go.mod
#	go.sum

Development

Successfully merging this pull request may close these issues.

[configoptional] Implement Validate()?

5 participants