Skip to content

Conversation

@tilakraj94
Copy link

@tilakraj94 tilakraj94 commented Dec 22, 2025

This PR adds Sync() that ensures tombstones are durably persisted after stream purge operations. After purge completes, the message block and index files may become 0 bytes (whole directory moves to new dir). New changes might reflect partially in disk, In this state, the tombstone becomes the sole persistent record of the stream's last sequence number.

Without an explicit sync, the tombstone remains buffered in the OS page cache and is lost on power failure, leaving the stream unable to recover its sequence history and resulting in 0,0 state.

Signed-off-by: Tilak Raj [email protected]

@tilakraj94 tilakraj94 requested a review from a team as a code owner December 22, 2025 19:27
@tilakraj94 tilakraj94 changed the title fsync while writting last tombstone in purge fsync while writing last tombstone in purge Dec 22, 2025
@MauriceVanVeen
Copy link
Member

Do you have a test or reproduction for this case? What is your setup, is this a single-node R1 stream or replicated?

Don't think the explicit sync is needed there though. If the setup is single-node R1 and you're more vulnerable to power failure, you'd probably want to configure sync: always in the server config instead. That would then already perform the sync as part of fs.writeTombstone itself.

@tilakraj9560
Copy link

tilakraj9560 commented Dec 23, 2025

Do you have a test or reproduction for this case? What is your setup, is this a single-node R1 stream or replicated?

Don't think the explicit sync is needed there though. If the setup is single-node R1 and you're more vulnerable to power failure, you'd probably want to configure sync: always in the server config instead. That would then already perform the sync as part of fs.writeTombstone itself.

Yes, this is a single R1 setup with the default sync interval. I’ll try to share a working example, but I’ve observed this behavior in older versions as well.

Using sync: always would address the issue, but that doesn’t mean state should be lost when sync: always is not enabled. The real problem occurs when the OS crashes immediately after a purge call has already returned success to the client (This only makes changes to OS Page Cache) unless sync: always. This case would lose OS caches, also why this happens with Purge, because whole directory moves to __msgs__ (older blks and index.db), so new directory/file changes (new blk and new index.db) could reflect partially.

This will not happen for replicated stream because of RAFT.

@tilakraj9560
Copy link

@MauriceVanVeen !lmb.syncAlways && fs.cfg.Replicas == 1 && lmb.mfd != nil I have added few conditions to the explicit sync.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants