[otap-dataflow] parquet exporter flush open writers after timeout #499
Labels: enhancement, parquet-exporter, pipeline, rust
After what was implemented in #488, the parquet exporter only flushes files if:
If neither of these things ever happens, the open writer will never flush the file. This means that data may not become visible for quite a long time after it was received. We should probably have some capability to periodically flush writers in order to prevent this.
There are multiple ways to handle the timing, and we can pick the best approach or even use a combination. For example, from the perspective of some unflushed file, we could flush on a timeout computed from either:
The batch processor currently uses the time since the last batch: https://github.com/open-telemetry/otel-arrow/pull/347/files
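To make the discussion concrete, here is a minimal sketch of how a combined timeout could be tracked per open writer. It is not the exporter's actual code: `WriterTimes`, `FlushPolicy`, `OpenWriters`, and the two limits (`max_idle`, measured from the last write in the spirit of the batch processor's time since last batch, and `max_age`, measured from when the writer was opened) are all hypothetical names and assumptions chosen for illustration. The current time is passed in explicitly so the logic can be driven by whatever timer mechanism the pipeline ends up using.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-writer bookkeeping; not the exporter's real type.
#[derive(Debug, Clone, Copy)]
struct WriterTimes {
    opened_at: Instant,
    last_write_at: Instant,
}

/// Hypothetical flush policy combining two optional timeouts.
#[derive(Debug, Clone, Copy)]
struct FlushPolicy {
    /// Flush when no data has been written for at least this long.
    max_idle: Option<Duration>,
    /// Flush when the writer has been open for at least this long.
    max_age: Option<Duration>,
}

impl FlushPolicy {
    /// Returns true if either configured limit has been reached at `now`.
    fn is_due(&self, times: &WriterTimes, now: Instant) -> bool {
        let idle = now.saturating_duration_since(times.last_write_at);
        let age = now.saturating_duration_since(times.opened_at);
        self.max_idle.map_or(false, |limit| idle >= limit)
            || self.max_age.map_or(false, |limit| age >= limit)
    }
}

/// Tracks timing for writers that hold unflushed data, keyed by an
/// illustrative writer id (for example, a partition or table name).
#[derive(Default)]
struct OpenWriters {
    writers: HashMap<String, WriterTimes>,
}

impl OpenWriters {
    /// Records a write, registering the writer on its first write.
    fn record_write(&mut self, id: &str, now: Instant) {
        self.writers
            .entry(id.to_string())
            .and_modify(|t| t.last_write_at = now)
            .or_insert(WriterTimes { opened_at: now, last_write_at: now });
    }

    /// Removes and returns (sorted) the ids of writers that are due for a
    /// flush. The caller is expected to flush and close those writers.
    fn take_due(&mut self, policy: &FlushPolicy, now: Instant) -> Vec<String> {
        let mut due: Vec<String> = self
            .writers
            .iter()
            .filter(|(_, t)| policy.is_due(t, now))
            .map(|(id, _)| id.clone())
            .collect();
        due.sort();
        for id in &due {
            self.writers.remove(id);
        }
        due
    }
}
```

In this sketch the exporter would call `record_write` whenever it appends to a writer and call `take_due` on each periodic tick, flushing whatever ids come back. Keeping both limits optional lets a deployment use only the idle timeout, only the age timeout, or both.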
This should be implemented after prerequisites: