Open
Description
Search before asking
- I searched in the issues and found nothing similar.
Motivation
For Flink lookup joins that use FullCacheLookupTable, each task within the same TaskManager process currently creates its own RocksDB instance to cache the table data.
I observed that each RocksDB instance caches the entire dataset of the table, so the same data is stored once per task. This wastes memory and disk, especially when the lookup table is large.
I believe that for a given table, a single TaskManager only needs one shared RocksDB instance to cache the data. All tasks in that TaskManager could reuse the shared instance, saving substantial memory and disk overhead.
Solution
I suggest implementing a TaskManager-level RocksDB cache that is shared by all tasks looking up the same table.
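One possible shape for this is a reference-counted registry keyed by table, sketched below. This is only an illustration of the idea, not existing code: `SharedLookupCacheRegistry`, `acquire`, `release`, and the string table key are hypothetical names, and the cached resource is modeled as a generic `AutoCloseable` rather than a real RocksDB handle. It also assumes that all tasks in the TaskManager can reach the same registry instance, which depends on how the classes are loaded.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/**
 * Hypothetical registry: at most one cached resource per table key,
 * shared between tasks through reference counting.
 */
final class SharedLookupCacheRegistry<T extends AutoCloseable> {

    /** A shared resource together with the number of tasks currently using it. */
    private static final class Entry<T> {
        final T resource;
        int refCount;

        Entry(T resource) {
            this.resource = resource;
        }
    }

    private final Map<String, Entry<T>> entries = new HashMap<>();

    /** Returns the shared resource for tableKey, creating it on first use only. */
    public synchronized T acquire(String tableKey, Supplier<T> factory) {
        Entry<T> entry = entries.get(tableKey);
        if (entry == null) {
            entry = new Entry<>(factory.get());
            entries.put(tableKey, entry);
        }
        entry.refCount++;
        return entry.resource;
    }

    /** Drops one reference; the resource is closed when the last task releases it. */
    public synchronized void release(String tableKey) {
        Entry<T> entry = entries.get(tableKey);
        if (entry == null) {
            throw new IllegalStateException("No shared cache registered for " + tableKey);
        }
        entry.refCount--;
        if (entry.refCount == 0) {
            entries.remove(tableKey);
            try {
                entry.resource.close();
            } catch (Exception e) {
                throw new IllegalStateException("Failed to close shared cache for " + tableKey, e);
            }
        }
    }

    /** Number of tasks currently holding the resource for tableKey (0 if none). */
    public synchronized int referenceCount(String tableKey) {
        Entry<T> entry = entries.get(tableKey);
        return entry == null ? 0 : entry.refCount;
    }
}
```

With something like this, each task would call `acquire` when it opens the lookup table and `release` when it closes, so the RocksDB instance is loaded once and removed only after the last task on that TaskManager finishes. How the shared instance is refreshed, and which fields make up the table key, would still need to be decided.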
Anything else?
No response
Are you willing to submit a PR?
- I'm willing to submit a PR!