[otel-arrow-rust] Adaptive array builders support appending nulls #534


Open
albertlockett opened this issue Jun 4, 2025 · 1 comment

@albertlockett
Member

In #473 we added basic functionality for adaptive array builders. We should extend this functionality to support appending nulls to the array.
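One possible shape for null support is sketched below. This is only an illustration under stated assumptions: `NullableStringSketch` and its methods are hypothetical names, not the otel-arrow-rust API, and plain `Vec`s stand in for Arrow buffers. The sketch counts leading nulls without allocating anything, and backfills them once the first real value arrives, so a column that only ever receives nulls can still be omitted.

```rust
/// Hypothetical builder (not the otel-arrow-rust API) showing one way
/// to accept nulls while keeping all-null columns absent.
#[derive(Default)]
pub struct NullableStringSketch {
    /// Nulls seen before any real value; nothing is allocated for them.
    leading_nulls: usize,
    /// Row storage, created lazily on the first non-null value.
    rows: Option<Vec<Option<String>>>,
}

impl NullableStringSketch {
    /// Append a null row.
    pub fn append_null(&mut self) {
        match &mut self.rows {
            Some(rows) => rows.push(None),
            None => self.leading_nulls += 1,
        }
    }

    /// Append a non-null row, backfilling any leading nulls first.
    pub fn append_value(&mut self, value: &str) {
        let leading = self.leading_nulls;
        let rows = self.rows.get_or_insert_with(|| vec![None; leading]);
        rows.push(Some(value.to_string()));
    }

    /// Returns `None` when every appended row was null (or no rows were
    /// appended), so the caller can skip the column entirely.
    pub fn finish(self) -> Option<Vec<Option<String>>> {
        self.rows
    }
}
```

With this design, appending two nulls and finishing yields `None`, while null, `"a"`, null yields a three-row column whose first and last entries are null.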

@albertlockett albertlockett self-assigned this Jun 4, 2025
@albertlockett albertlockett added enhancement New feature or request rust Pull requests that update Rust code labels Jun 4, 2025
@albertlockett albertlockett moved this to In Progress in OTel-Arrow Jun 4, 2025
@albertlockett albertlockett moved this from In Progress to Todo in OTel-Arrow Jun 4, 2025
@albertlockett
Member Author

This addition to arrow-rs will be helpful: apache/arrow-rs#7606

github-merge-queue bot pushed a commit that referenced this issue Jun 5, 2025

Part of: #533

Very rough implementation of adaptive array builders. This is my Rust
version of the builders we've implemented in Go here:
https://github.com/open-telemetry/otel-arrow/blob/main/go/pkg/otel/common/schema/builder/record.go

The idea behind these is that when we're encoding OTAP records, we often
want to dynamically create columns in a record batch that are either
omitted from the record batch (if all the values are null), dictionary
encoded with the smallest possible index type, or stored as the native
array if the dictionary index would overflow. (Some of this was alluded
to in yesterday's SIG meeting.)
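The three outcomes described above can be sketched with the standard library alone. Everything here is an assumption made for illustration: `AdaptiveStringSketch` and `FinishedColumn` are hypothetical names, plain `Vec`s stand in for Arrow arrays, and the representation is chosen once at `finish` time from the number of distinct values, whereas a real builder would likely adapt while values are appended and honor the configured cardinality limits.

```rust
use std::collections::HashMap;

/// Hypothetical result of finishing an adaptive string column.
#[derive(Debug, PartialEq)]
pub enum FinishedColumn {
    /// Dictionary encoded with 8-bit keys.
    DictU8 { keys: Vec<u8>, values: Vec<String> },
    /// Dictionary encoded with 16-bit keys.
    DictU16 { keys: Vec<u16>, values: Vec<String> },
    /// Plain (non-dictionary) array.
    Native(Vec<String>),
}

/// Hypothetical builder; not the otel-arrow-rust API.
#[derive(Default)]
pub struct AdaptiveStringSketch {
    rows: Vec<String>,
}

impl AdaptiveStringSketch {
    pub fn append_value(&mut self, value: &str) {
        self.rows.push(value.to_string());
    }

    /// Returns `None` when nothing was appended, so the column can be
    /// left out of the record batch.
    pub fn finish(self) -> Option<FinishedColumn> {
        if self.rows.is_empty() {
            return None;
        }
        // Assign each distinct value a key in first-seen order.
        let (keys, values) = {
            let mut index: HashMap<&str, usize> = HashMap::new();
            let mut values: Vec<String> = Vec::new();
            let mut keys: Vec<usize> = Vec::with_capacity(self.rows.len());
            for row in &self.rows {
                let next = values.len();
                let key = *index.entry(row.as_str()).or_insert_with(|| {
                    values.push(row.clone());
                    next
                });
                keys.push(key);
            }
            (keys, values)
        };
        // Pick the smallest key width that can address every value.
        let distinct = values.len();
        Some(if distinct <= u8::MAX as usize + 1 {
            let keys = keys.iter().map(|&k| k as u8).collect();
            FinishedColumn::DictU8 { keys, values }
        } else if distinct <= u16::MAX as usize + 1 {
            let keys = keys.iter().map(|&k| k as u16).collect();
            FinishedColumn::DictU16 { keys, values }
        } else {
            // Too many distinct values for 16-bit keys: keep plain rows.
            FinishedColumn::Native(self.rows)
        })
    }
}
```

In this sketch, a builder that receives `"a"`, `"b"`, `"a"` finishes as `DictU8` with keys `[0, 1, 0]`, and one that receives 300 distinct strings finishes as `DictU16`.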

The intended usage is something like this:
```rs
use std::sync::Arc;

use arrow::datatypes::{Field, Schema};
use arrow::record_batch::RecordBatch;
use otel_arrow_rust::encode::record::array::StringArrayBuilder;
// ArrayOptions and DictionaryOptions must also be in scope.

let mut str_builder = StringArrayBuilder::new(ArrayOptions {
    nullable: true,
    dictionary_options: Some(DictionaryOptions {
        min_cardinality: u8::MAX.into(),
        max_cardinality: u16::MAX,
    }),
});

// maybe append some values
str_builder.append_value(&"a".to_string());

// `finish` returns `None` when no column should be emitted
let result = str_builder.finish();

let mut fields = Vec::new();
let mut columns = Vec::new();

// only add the field and column if the builder produced an array
if let Some(result) = result {
    fields.push(Field::new("str", result.data_type, true));
    columns.push(result.array);
}

let record_batch = RecordBatch::try_new(
    Arc::new(Schema::new(fields)),
    columns
)
.expect("should work");
```

Followup work includes:
- null support #534
- additional datatype support:
#535
- optimize the conversion between Dict<u8> -> Dict<u16>
#536
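
For context on the last item, the baseline conversion from 8-bit to 16-bit dictionary keys can be sketched as below. `Dict` and `widen` are hypothetical names used only for illustration, with plain `Vec`s standing in for Arrow buffers. The point of the sketch is that only the keys need to be rewritten, while the dictionary values can be moved over unchanged.

```rust
/// Hypothetical dictionary-encoded column; not an Arrow or
/// otel-arrow-rust type.
#[derive(Debug, PartialEq)]
pub struct Dict<K> {
    pub keys: Vec<K>,
    pub values: Vec<String>,
}

/// Widen every key from u8 to u16 and reuse the values as they are.
pub fn widen(dict: Dict<u8>) -> Dict<u16> {
    Dict {
        // u8 -> u16 is lossless, so `u16::from` cannot fail.
        keys: dict.keys.into_iter().map(u16::from).collect(),
        // Moved, not copied: the dictionary entries do not change.
        values: dict.values,
    }
}
```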

---------

Co-authored-by: Laurent Quérel <[email protected]>