We need memory-efficient, CPU-efficient ways of caching large numbers of event keys so that we (ideally) never transmit the same event more than once. However, as mentioned, we have limited memory, so we'd like to evict keys that haven't been seen recently.

A Cuckoo filter would be the ideal candidate (it supports deletions, unlike a standard Bloom filter), but what we really need is Cuckoo + LRU eviction. Any suggestions for a Cuckoo Rust library for starters? I see only one paper on Cuckoo + LRU, and no Rust code for it, and really I see only one Rust Cuckoo implementation.

I don't know how many bits would have to be added to each object for an LRU timer, but it seems to me that this could be done very efficiently with just a few bits if you divide the memory space into "buckets" and just evict all the members of a bucket when capacity is reached, instead of keeping per-object timers/counts. Each object would only need enough bits to identify its bucket, i.e. log2 of the number of buckets: 5 bits for 32 buckets, etc. I'm sure someone more clever has reduced this even more significantly.

We would be open to developing this feature (and probably will), but guidance on easier/better paths is welcome.

Counter-point: we could just use a hash + LRU model, but that doesn't seem to be quite as memory-efficient as Cuckoo.

References:
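To make the bucket idea above a bit more concrete, here is a rough sketch of just the eviction part, under some loud assumptions. It is not a Cuckoo filter: a plain `std::collections::HashMap` stands in for the filter's slot storage, so it saves no memory and only illustrates the small per-entry tag plus bulk eviction. The names `GenerationalDedup`, `check_and_insert`, `gen_bits`, and `per_gen_capacity` are all made up for illustration and come from no existing crate.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical generation-tagged dedup cache. Each entry carries a small
/// generation tag instead of a per-entry timestamp or counter.
struct GenerationalDedup {
    /// 64-bit key hash -> generation tag (a stand-in for filter slots).
    entries: HashMap<u64, u8>,
    /// Number of generations ("buckets"), e.g. 32 for a 5-bit tag.
    num_gens: u8,
    /// Tag given to newly inserted or refreshed entries.
    current_gen: u8,
    /// New keys added since the current generation started.
    inserted_in_gen: usize,
    /// New keys allowed per generation before rotating.
    per_gen_capacity: usize,
}

impl GenerationalDedup {
    fn new(gen_bits: u32, per_gen_capacity: usize) -> Self {
        assert!((1..=7).contains(&gen_bits), "tag must fit in a u8");
        assert!(per_gen_capacity > 0);
        Self {
            entries: HashMap::new(),
            num_gens: 1u8 << gen_bits,
            current_gen: 0,
            inserted_in_gen: 0,
            per_gen_capacity,
        }
    }

    /// Returns true if the key was already present (a duplicate).
    /// A hit re-tags the entry with the current generation, which
    /// approximates "recently used" without any timers.
    fn check_and_insert<K: Hash + ?Sized>(&mut self, key: &K) -> bool {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let fingerprint = hasher.finish();

        let seen = self.entries.insert(fingerprint, self.current_gen).is_some();
        if !seen {
            self.inserted_in_gen += 1;
            if self.inserted_in_gen >= self.per_gen_capacity {
                self.rotate();
            }
        }
        seen
    }

    /// Moves to the next generation tag and drops every entry that still
    /// carries that tag, i.e. the oldest generation, in one pass.
    fn rotate(&mut self) {
        self.current_gen = (self.current_gen + 1) % self.num_gens;
        self.inserted_in_gen = 0;
        let stale = self.current_gen;
        self.entries.retain(|_, gen| *gen != stale);
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}
```

With `gen_bits = 5` this gives the 32 buckets mentioned above, and the table never holds more than `32 * per_gen_capacity` entries. The obvious trade-offs are that eviction is coarse (a whole generation goes at once) and that each rotation scans the whole table; in a real filter the tag would presumably sit next to the fingerprint in each slot.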
Replies: 1 comment 1 reply
Thanks for bringing this up! The combination of Cuckoo filters with LRU eviction is an interesting approach for memory-efficient deduplication. Given the complexity and impact of this proposal, I think it would benefit from an RFC. This would help the community and maintainers evaluate the design more thoroughly and ensure it fits well with Vector's architecture.
Some things to cover: