
Add state/structure to OTAP batch + sorting/delta decoding transforms #468


Merged: 26 commits merged into open-telemetry:main on May 23, 2025

Conversation

@albertlockett (Member) commented May 20, 2025

Part of #449

This PR adds:

  • A new structure called OtapBatch that encapsulates all the record batches associated with a BatchArrowRecord. It provides efficient methods for setting and getting the record batches by payload type, and it has been integrated into the decoder code (a rough sketch of the idea follows this list).
  • A new otap::transform module with methods for sorting a record batch by parent ID and removing delta encoding.
  • Helper methods for updating schema/field metadata, which are invoked to record the current state of the record batches (state such as the encoding of the parent ID column and the column sort order).
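
To make the OtapBatch idea concrete, here is a minimal, hypothetical sketch of such a structure. The payload-type names, field layout, and method signatures are illustrative assumptions, not the merged API:

use arrow::record_batch::RecordBatch;

/// Illustrative subset of OTAP payload types (assumed names, not the real enum).
#[derive(Clone, Copy)]
pub enum ArrowPayloadType {
    Logs = 0,
    LogAttrs = 1,
    ResourceAttrs = 2,
}

/// Sketch: holds every record batch belonging to one BatchArrowRecord,
/// indexed by payload type so lookups are O(1).
pub struct OtapBatch {
    batches: [Option<RecordBatch>; 3],
}

impl OtapBatch {
    pub fn new() -> Self {
        Self { batches: [None, None, None] }
    }

    /// Store the record batch for the given payload type.
    pub fn set(&mut self, payload_type: ArrowPayloadType, batch: RecordBatch) {
        self.batches[payload_type as usize] = Some(batch);
    }

    /// Fetch the record batch for the given payload type, if one was set.
    pub fn get(&self, payload_type: ArrowPayloadType) -> Option<&RecordBatch> {
        self.batches[payload_type as usize].as_ref()
    }
}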

The scope of this PR has somewhat changed. I'll leave the original description below, for posterity.


Original Description:

Restores some of the code from #447, but with a slightly different implementation of the "Version 2" attribute store.

  • This new version assumes that the record batch it receives is sorted by the parent ID.
  • This new store also lazily materializes the KeyValues, by returning an iterator from its attribute_by_delta_id method (similar to what we did in [WIP] AttributesStore optimization and/or OTAP pdata research #447).
  • Note that this new attribute store implementation does not support random access. It expects attribute_by_delta_id to be called in order with the delta ID of each parent ID (a rough sketch of this sequential-access idea follows the list).
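
A rough sketch of the sequential-access idea behind the v2 store, assuming the attribute columns have already been decoded into plain slices sorted by parent ID. All names here are illustrative, not the actual AttributeStoreV2 API:

/// Sketch of a cursor over attribute rows sorted by parent ID. It walks the
/// rows exactly once and lazily yields the key/value pairs for each parent,
/// so it supports only in-order access, not random access.
struct SortedAttrCursor<'a> {
    parent_ids: &'a [u16],
    keys: &'a [&'a str],
    values: &'a [&'a str],
    pos: usize,
}

impl<'a> SortedAttrCursor<'a> {
    /// Must be called with parent IDs in ascending order; advances the cursor
    /// past the rows for `parent_id` and returns an iterator over them.
    fn attributes_for(&mut self, parent_id: u16) -> impl Iterator<Item = (&'a str, &'a str)> + 'a {
        let start = self.pos;
        while self.pos < self.parent_ids.len() && self.parent_ids[self.pos] == parent_id {
            self.pos += 1;
        }
        let end = self.pos;
        let (keys, values) = (self.keys, self.values);
        (start..end).map(move |i| (keys[i], values[i]))
    }
}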

The benchmark results are interesting. The v2 store gets slightly worse performance for small batch sizes (128), but much better performance for larger batch sizes (8092), which is encouraging! (A brief sketch of what materializing parent IDs involves follows the numbers below.)

materialize_parent_ids/v1_attr_store/128
                        time:   [7.6742 µs 7.6967 µs 7.7190 µs]

materialize_parent_ids/v2_attr_store/128
                        time:   [10.495 µs 10.515 µs 10.534 µs]

materialize_parent_ids/v1_attr_store/1536
                        time:   [118.21 µs 118.76 µs 119.32 µs]

materialize_parent_ids/v2_attr_store/1536
                        time:   [89.421 µs 89.582 µs 89.765 µs]

materialize_parent_ids/v1_attr_store/8092
                        time:   [1.2816 ms 1.2888 ms 1.2966 ms]

materialize_parent_ids/v2_attr_store/8092
                        time:   [464.49 µs 465.22 µs 466.00 µs]
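
For context on the benchmark names above: "materializing parent IDs" means removing the delta encoding from the parent ID column, i.e. turning per-row deltas back into absolute IDs. A minimal, hypothetical sketch of that idea; the real transform operates on Arrow arrays inside the new otap::transform module and follows the OTAP spec's delta rules, which this ignores:

/// Sketch only: remove simple delta encoding from a parent ID column by
/// keeping a running sum over the per-row deltas.
fn materialize_parent_ids(deltas: &[u16]) -> Vec<u16> {
    let mut current: u16 = 0;
    deltas
        .iter()
        .map(|&delta| {
            current = current.wrapping_add(delta);
            current
        })
        .collect()
}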

Still very much WIP. One thing that needs to be sorted out is where/when we apply this transform to materialize parent IDs and sort the record batch, and how we track the state of the batch. This PR currently does it in decode::decoder::Consumer::consume_bar, which may not be the right place.

We might consider not merging all the code in this PR straight away, as some of this work is exploratory/proof of concept.

@albertlockett changed the title from "Attribute store optimizer" to "[WIP] Optimized attribute store using sorted record batch" on May 20, 2025

codecov bot commented May 20, 2025

Codecov Report

Attention: Patch coverage is 93.91144% with 33 lines in your changes missing coverage. Please review.

Project coverage is 62.91%. Comparing base (a629271) to head (de776c9).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #468      +/-   ##
==========================================
+ Coverage   62.27%   62.91%   +0.63%     
==========================================
  Files         190      193       +3     
  Lines       27291    27696     +405     
==========================================
+ Hits        16995    17424     +429     
+ Misses       9760     9737      -23     
+ Partials      536      535       -1     
Components Coverage Δ
otap-dataflow 73.76% <ø> (ø)
beaubourg 67.19% <ø> (ø)
otel-arrow-rust 67.37% <93.91%> (+3.07%) ⬆️
query_abstraction 81.42% <ø> (ø)
syslog_cef_receivers 99.17% <ø> (ø)
otel-arrow-go 52.93% <ø> (+0.02%) ⬆️

@albertlockett (Member, Author) commented:

@jmacd please take a look when you get a chance? This is some follow-up from last week's SIG meeting.

@albertlockett (Member, Author) commented:

One thing that needs to be sorted out is where/when we apply this transform to materialize parent IDs and sort the record batch, and how we track the state of the batch.

Link to thread about answering some of these questions: https://cloud-native.slack.com/archives/C08RRSJR7FD/p1747847973678329

@albertlockett force-pushed the attribute-store-optimizer branch from 487e6ff to 15c033c on May 22, 2025 18:11
@albertlockett force-pushed the attribute-store-optimizer branch from 15c033c to 1709fa5 on May 22, 2025 21:21
@albertlockett (Member, Author) commented:

Update 05/22 -- added some structure around OtapBatch and refactored the decoder to use it. Started adding helper methods to update schema metadata when transforming record batches.

/// keys for arrow schema/field metadata
pub mod metadata {
    /// schema metadata for which columns the record batch is sorted by
    pub const SORT_COLUMNS: &str = "sort_columns";
Contributor


How are multiple columns supported? I've seen this set to a single column, so far. Would it make sense to have SORT_COL0, SORT_COL1, ... to indicate the successive sort orders?

Member Author


I was thinking maybe we could have some delimiter. E.g., imagining how we'd represent the default sorting of attributes from the golang exporter, it would be type,key,value. Do you think that'd work?
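
As an illustration of the delimiter idea (the metadata key is reused from the excerpt above; the helper functions are hypothetical and not part of this PR):

use std::collections::HashMap;

/// Schema metadata key from the excerpt above.
pub const SORT_COLUMNS: &str = "sort_columns";

/// Record a multi-column sort order as one comma-delimited metadata value,
/// e.g. ["type", "key", "value"] becomes "type,key,value".
fn set_sort_columns(metadata: &mut HashMap<String, String>, columns: &[&str]) {
    metadata.insert(SORT_COLUMNS.to_string(), columns.join(","));
}

/// Read the metadata value back into the ordered list of sort columns.
fn get_sort_columns(metadata: &HashMap<String, String>) -> Vec<String> {
    metadata
        .get(SORT_COLUMNS)
        .map(|value| value.split(',').map(str::to_string).collect())
        .unwrap_or_default()
}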

@albertlockett (Member, Author) commented:

Update 05/23 - after discussion w/ @jmacd, decided it's probably best to merge only what we are pretty sure we'll need in this PR, which does not include the AttributeStoreV2 or the benchmark for it. I'm going to stash that code on a branch called attribute-store-optimization-2 on my fork: https://github.com/open-telemetry/otel-arrow/compare/main...albertlockett:otel-arrow:attribute-store-optimizer-2?expand=1

This PR doesn't need to include AttributeStoreV2. It's still unclear whether or not we need Attribute Store at all to optimize encoding OTAP -> OTLP bytes. Rather than include code we don't need, we'll delete this code for now.
@albertlockett changed the title from "[WIP] Optimized attribute store using sorted record batch" to "[WIP] Add state/structure to OTAP batch + sorting/delta decoding transforms" on May 23, 2025
@albertlockett marked this pull request as ready for review on May 23, 2025 18:27
@albertlockett requested a review from a team as a code owner on May 23, 2025 18:27
@albertlockett changed the title from "[WIP] Add state/structure to OTAP batch + sorting/delta decoding transforms" to "Add state/structure to OTAP batch + sorting/delta decoding transforms" on May 23, 2025
@jmacd merged commit ece6333 into open-telemetry:main on May 23, 2025
41 checks passed