Skip to content

Allow specifying extra extensions to enable in Druid #416

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

soenkeliebau
Copy link
Member

Description

Currently the only way to enable extensions not enabled by default is to use config overrides.
This has the drawback that users need to specify all extensions to load, so basically anticipate what extensions the operator will enable and then add the ones they want.

This PR adds a field additionalExtensions to the clusterConfig - these extensions will be added to any extensions the operator deems are needed, so the user can just specify the extensions the care about here.

Definition of Done Checklist

  • Not all of these items are applicable to all PRs, the author should update this template to only leave the boxes in that are relevant
  • Please make sure all these things are done and tick the boxes
# Author
- [ ] Changes are OpenShift compatible
- [ ] CRD changes approved
- [ ] Helm chart can be installed and deployed operator works
- [ ] Integration tests passed (for non trivial changes)
# Reviewer
- [ ] Code contains useful comments
- [ ] (Integration-)Test cases added
- [ ] Documentation added or updated
- [ ] Changelog updated
- [ ] Cargo.toml only contains references to git tags (not specific commits or branches)
# Acceptance
- [ ] Feature Tracker has been updated
- [ ] Proper release label has been added

Once the review is done, comment bors r+ (or bors merge) to merge. Further information

Copy link
Member

@sbernauer sbernauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in principle.

Looking at e.g. https://github.com/orgs/stackabletech/discussions/15 I think we might want to add dedicated extensions in the future - in this case for google cloud extensions with bucket name and credentials secret.
The mechanism from this PR might co-exist, but I would like to model them together and not go with a Vec for now and later e.g. realizing we want a BTreeMap<String, Whatever>

Let's wait for Arch meeting to discuss CRD change

@@ -38,5 +38,18 @@ pub fn get_extension_list(
extensions.push(EXT_S3.to_string());
}

// Add user specified extensions to the list of loaded extensions if any are present
for additional_extension in &druid.spec.cluster_config.additional_extensions {
if extensions.contains(additional_extension) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

O(n^2) here we come :) Just kidding, you could use a BTreeMap - feel free to ignore

@soenkeliebau
Copy link
Member Author

Looking at e.g. https://github.com/orgs/stackabletech/discussions/15 I think we might want to add dedicated extensions in the future - in this case for google cloud extensions with bucket name and credentials secret. The mechanism from this PR might co-exist, but I would like to model them together and not go with a Vec for now and later e.g. realizing we want a BTreeMap<String, Whatever>

Agree in principle, but for this specific example, wouldn't that be better handled in the deep storage config?

@sbernauer
Copy link
Member

I think the gcs extension things includes more stuff such as ingesting data from gcs, not only deep storage

@fhennig
Copy link
Contributor

fhennig commented Nov 20, 2023

Hey @sbernauer, I'm reviving this again because we had another user as for how to add extensions, I think it was the avro extension. It doesn't have any configuration, so a list would be perfectly fine. I'd really like to make this easier for users as it is currently quite easy to make a mistake when editing the list by hand in the overrides. The avro extension doesn't have any settings, so a list would work here.

Would you maybe like to expand on your design a bit? Maybe we can get something fleshed out to implement soonish.

I have also an alternative idea, which is to just load a lot more extensions by default - just thinking out loud here, I don't really know, are there any downsides to, let's say, load the avro extensions always? It's already in the image, so it's not about space.

@sbernauer
Copy link
Member

Have not spend many thoughts on this, my only requirement is that we can add properties to extensions. Does not need to be right now, just give us the chance to add that non-breaking. For avro specifically I don't see a reason to not always enabled it.
CRD could look something like

kind: DruidCluster
spec:
  clusterConfig:
    additionalExtensions:
      google: # optional
        credentialsSecret: my-google-credentials # mandatory, gets put into GOOGLE_APPLICATION_CREDENTIALS
        bucket: my-bucket # mandatory
        prefix: / # mandatory, defaults to /
        maxListingLength: 1024 # optional
      avro: {} # optional

As a first step we could simply add all extensions that don't need any configuration options (such as avro)

@fhennig
Copy link
Contributor

fhennig commented Nov 21, 2023

Okay yes seems reasonable and also still easy to use.

@labrenbe
Copy link
Member

labrenbe commented May 10, 2024

Closing as this is already implemented by #547.

@labrenbe labrenbe closed this May 10, 2024
@razvan razvan deleted the feat/additional_extensions branch December 15, 2024 15:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants