feat(sds): persistent storage for history #2741
base: master
 *
 * If no storage backend is available, this behaves like {@link MemLocalHistory}.
 */
export class PersistentHistory extends MemLocalHistory {
Why are you extending another class instead of implementing the interface? Avoid abstractions as a rule of thumb.
Extended so that we get the implementations of the following functions without needing to reimplement them:
push(...items: ContentMessage[]): number;
some(
predicate: (
value: ContentMessage,
index: number,
array: ContentMessage[]
) => unknown,
thisArg?: any
): boolean;
slice(start?: number, end?: number): ContentMessage[];
find(
predicate: (
value: ContentMessage,
index: number,
obj: ContentMessage[]
) => unknown,
thisArg?: any
): ContentMessage | undefined;
findIndex(
predicate: (
value: ContentMessage,
index: number,
obj: ContentMessage[]
) => unknown,
thisArg?: any
): number;
Yes but you already know you want to get rid of some of that with the optimisation so I would avoid the abstraction usage.
You can also extract that logic if necessary. Saving code is not always the best way forward. See my new comments where the naming is confusing.
    this.restore();
  }

  public override push(...items: ContentMessage[]): number {
Yeah, don't use abstraction. It's not worth the indirection you are bringing in.
switched to composition!
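For readers following the thread, the switch to composition might look roughly like the sketch below. It is not the code from the PR: the simplified `ContentMessage` shape, the `KeyValueStore` interface, and the single-key JSON layout are all assumptions made for illustration. The history keeps a private array and forwards only the methods it needs instead of inheriting them.

```typescript
// Hypothetical, simplified message shape; the real ContentMessage has more fields.
type ContentMessage = { messageId: string; content: string };

// Assumed key-value storage shape, modelled on getItem/setItem.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

class PersistentHistory {
  // Composition: the array is a private member, not a base class.
  private items: ContentMessage[] = [];
  private readonly store: KeyValueStore;
  private readonly key: string;

  constructor(store: KeyValueStore, key: string) {
    this.store = store;
    this.key = key;
    this.restore();
  }

  get length(): number {
    return this.items.length;
  }

  // Forward to the array, then write the full history back to storage.
  push(...items: ContentMessage[]): number {
    const newLength = this.items.push(...items);
    this.persist();
    return newLength;
  }

  slice(start?: number, end?: number): ContentMessage[] {
    return this.items.slice(start, end);
  }

  find(
    predicate: (value: ContentMessage, index: number, obj: ContentMessage[]) => unknown
  ): ContentMessage | undefined {
    return this.items.find(predicate);
  }

  private persist(): void {
    this.store.setItem(this.key, JSON.stringify(this.items));
  }

  private restore(): void {
    const raw = this.store.getItem(this.key);
    if (raw !== null) {
      this.items = JSON.parse(raw) as ContentMessage[];
    }
  }
}
```

Only the forwarded methods are part of the public surface, so dropping one later (for example as part of the optimisation mentioned above) does not require touching a base class.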
  }

  private persist(): void {
    if (!this.storage) {
Hum, does it make sense for it to be constructed without storage?
we want the class to behave like MemLocalHistory if no storage is provided, right?
or perhaps, that should be handled one level above where MessageChannel chooses between MemLocalHistory and PersistentHistory -- suggestions?
> we want the class to behave like MemLocalHistory if no storage is provided, right?
So you are saying that if we instantiate a persistent local history, but there is no persistence to it, then we want it to behave like memory local history? Think about the footgun you are setting up for developers.
If there is no way to persist the history, then the class handling persisting history should not be instantiable.
Indeed:
if (storage instanceof PersistentStorage) {
this.storage = storage;
log.info("Using explicit persistent storage");
} else if (typeof storage === "string") {
this.storage = PersistentStorage.create(storage);
log.info("Creating persistent storage for channel", storage);
} else {
this.storage = undefined;
log.info("Using in-memory storage");
}
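One way to avoid the footgun discussed above is to make storage a required constructor argument and move the choice between implementations one level up. The sketch below is only an illustration under assumptions: `MessageHistory`, `InMemoryHistory`, `StoredHistory`, `KeyValueStore`, and `createHistory` are hypothetical names, and message ids stand in for full messages.

```typescript
// Assumed key-value storage shape.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical minimal history contract (ids only, for brevity).
interface MessageHistory {
  readonly length: number;
  push(...ids: string[]): number;
}

class InMemoryHistory implements MessageHistory {
  private readonly items: string[] = [];

  get length(): number {
    return this.items.length;
  }

  push(...ids: string[]): number {
    return this.items.push(...ids);
  }
}

class StoredHistory implements MessageHistory {
  private readonly items: string[];
  private readonly store: KeyValueStore;
  private readonly key: string;

  // Storage is required: a persistent history cannot exist without it.
  constructor(store: KeyValueStore, key: string) {
    this.store = store;
    this.key = key;
    const raw = store.getItem(key);
    this.items = raw === null ? [] : (JSON.parse(raw) as string[]);
  }

  get length(): number {
    return this.items.length;
  }

  push(...ids: string[]): number {
    const newLength = this.items.push(...ids);
    this.store.setItem(this.key, JSON.stringify(this.items));
    return newLength;
  }
}

// The caller (e.g. the channel) decides explicitly which implementation to use.
function createHistory(key: string, store?: KeyValueStore): MessageHistory {
  return store ? new StoredHistory(store, key) : new InMemoryHistory();
}
```

With this shape, the in-memory fallback is visible at the call site rather than hidden inside a class whose name promises persistence.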
Restructured the PR: |
export interface ILocalHistory {
  length: number;
  push(...items: ContentMessage[]): number;
  some(
    predicate: (
      value: ContentMessage,
      index: number,
      array: ContentMessage[]
    ) => unknown,
    thisArg?: any
  ): boolean;
  slice(start?: number, end?: number): ContentMessage[];
  find(
    predicate: (
      value: ContentMessage,
      index: number,
      obj: ContentMessage[]
    ) => unknown,
    thisArg?: any
  ): ContentMessage | undefined;
  findIndex(
    predicate: (
      value: ContentMessage,
      index: number,
      obj: ContentMessage[]
    ) => unknown,
    thisArg?: any
  ): number;
}
Does not seem to be the right place to define this interface. Also, it is the same as before (Pick<Array...).
tackled here: #2745
fryorcraken
left a comment
The naming is very confusing. It is unclear what is actually related to local history, vs the local storage interfaces.
Not sure how the message channel is supposed to use the persistent storage when there is no persistent storage implementing ILocalHistory.
 * at next push.
 */
export class MemLocalHistory {
export interface ILocalHistory {
Define the interface alongside the class that needs it, i.e. the message channel.
It is odd to define the interface alongside one of the implementations.
ILocalHistory is being implemented by LocalHistory (prev MemLocalHistory), so it needs it as well. Would you suggest moving it to MessageChannel?
Since the concept for LocalHistory belongs in this file, I believed it's good design to keep it close to the implementation. The interface exists because of the class
Based on your observation that it's only being used once, removed it.
}

export type MemLocalHistoryOptions = {
  storage?: ChannelId | PersistentStorage;
What are you doing here? Why would you use persistent storage for the Memory implementation?
Ok, no I got it, see my recent comment on the PR.
I would not have this kind of type switching here.
You can have a storagePrefix string that is applied in any case (whether you use browser localStorage or fs). It is relevant to both.
Then, for PersistentStorage, could we instead use the package.json browser feature and have 2 files:
browser-localStorage.ts
localStorage.ts
Both of them would export a LocalStorage (what you did with the PersistentStorage class), except that for the browser one, the LocalStorage class is just a thin wrapper around the browser localStorage.
Okay, interesting, that's probably a neater solution.
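A minimal sketch of the browser-side file under this proposal could look like the following. It is an assumption-laden illustration, not the PR's code: the `StorageBackend` interface, the `:` key separator, and the injected backend are all hypothetical. In a real `browser-localStorage.ts` the backend would presumably be `globalThis.localStorage`, and the Node.js file would export a class with the same name and methods backed by the file system, with the `browser` field in package.json telling bundlers which file to pick.

```typescript
// Minimal subset of the Web Storage API that the wrapper relies on.
interface StorageBackend {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical thin wrapper: it only namespaces keys with storagePrefix
// and forwards every call to the backend.
class LocalStorage {
  private readonly storagePrefix: string;
  private readonly backend: StorageBackend;

  constructor(storagePrefix: string, backend: StorageBackend) {
    this.storagePrefix = storagePrefix;
    this.backend = backend;
  }

  // Assumed key layout: "<prefix>:<name>".
  private fullKey(name: string): string {
    return `${this.storagePrefix}:${name}`;
  }

  getItem(name: string): string | null {
    return this.backend.getItem(this.fullKey(name));
  }

  setItem(name: string, value: string): void {
    this.backend.setItem(this.fullKey(name), value);
  }

  removeItem(name: string): void {
    this.backend.removeItem(this.fullKey(name));
  }
}

// In-memory stand-in so the sketch also runs outside a browser.
class MemoryBackend implements StorageBackend {
  private readonly map = new Map<string, string>();

  getItem(key: string): string | null {
    return this.map.get(key) ?? null;
  }

  setItem(key: string, value: string): void {
    this.map.set(key, value);
  }

  removeItem(key: string): void {
    this.map.delete(key);
  }
}
```

Because the prefix is applied in the wrapper, the same `storagePrefix` string is meaningful for both the browser and the file-system variant, which is the point made in the comment above.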
  this.incomingBuffer = [];
- this.localHistory = localHistory;
+ this.localHistory =
+   localHistory ?? new MemLocalHistory({ storage: channelId });
Considering that this is not an API we actually want to expose (ReliableChannel is), it's fine to not have a default.
export interface HistoryStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
what is that? it does not seem to be "History" right? It's supposed to be the interface for local storage, right? Call it ILocalStorage then
|
After more review, I now better understand what is done here: This is not the architecture I originally had in mind, which is fine. It just means that I did some pre-optimisation that now needs to be trashed. In this proposed architecture, there is only one implementation of
Then the last part |
Ah, I see where the confusion came from. Thanks for the diagram (note to self to include it wherever possible), I can see how it may have been confusing since the structure changed.
Agreed, see my comment here: #2741 (comment)
Valid point, thanks! One caveat: the class is more than just "local storage" as it allows users to pass their custom storage providers, so |
…istentHistory` to use composition over inheritance for history management.
…e key generation.
…ifying history storage initialization to default to `localStorage`
…mproved compatibility.
…in message channels
…d refactor `MemLocalHistory` to support optional persistent storage.
… persistence, updating LocalHistory to utilize a unified Storage interface and enhancing tests for localStorage functionality
Pull request overview
This PR introduces persistent message history storage for SDS (Scalable Data Sync), allowing message history to survive application restarts. The implementation provides automatic localStorage usage in browsers and file-based storage in Node.js, with support for custom storage providers. The changes maintain backward compatibility by making persistent storage optional - applications can opt out or continue using the in-memory-only mode.
Key Changes:
- Renamed `MemLocalHistory` to `LocalHistory` with optional persistent storage support via a new Storage abstraction
- Added platform-specific storage implementations: browser (localStorage) and Node.js (file system)
- Updated `MessageChannel` to use persistent storage by default when a channelId is provided
- Added message serialization/deserialization utilities to convert between `ContentMessage` objects and storable JSON format
Reviewed changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| packages/sds/src/message_channel/storage/node.ts | New file: Node.js file-based storage implementation using fs module |
| packages/sds/src/message_channel/storage/browser.ts | New file: Browser localStorage wrapper for persistent storage |
| packages/sds/src/message_channel/storage/message_serializer.ts | New file: Serialization utilities for ContentMessage and HistoryEntry objects |
| packages/sds/src/message_channel/storage/index.ts | New file: Export configuration with browser field mapping for platform selection |
| packages/sds/src/message_channel/local_history.ts | Renamed from MemLocalHistory, added storage integration with save/load methods |
| packages/sds/src/message_channel/local_history.spec.ts | Updated tests to use new LocalHistory class and constructor API |
| packages/sds/src/message_channel/message_channel.ts | Updated to create LocalHistory with persistent storage by default using channelId as prefix |
| packages/sds/src/message_channel/message_channel.spec.ts | Added createTestChannel helper for in-memory testing, added localStorage persistence tests |
| packages/sds/src/message_channel/persistent_storage.spec.ts | New test suite for storage persistence and corruption handling |
| packages/sds/src/message_channel/repair/repair.ts | Updated type references from ILocalHistory to LocalHistory |
| packages/sds/package.json | Added browser field mapping to swap Node.js storage for browser storage |
| packages/sds/karma.conf.cjs | Updated to configure webpack aliases for browser storage in test environment |
| .gitignore | Added entries for allure-results directories |
Comments suppressed due to low confidence (3)
packages/sds/src/message_channel/local_history.ts:87
- Calling save() on every push operation could be a performance bottleneck for high-frequency message scenarios, especially since it serializes and writes the entire message history to storage each time. Consider implementing a debounced or batched save mechanism to reduce I/O operations while still maintaining data persistence. This is particularly important when multiple messages are pushed in rapid succession.
packages/sds/src/message_channel/local_history.ts:57
- When both customInstance and prefix are provided in the storage options, the prefix is silently ignored. Consider either: (1) adding validation to throw an error when both are provided, (2) documenting this precedence clearly in the LocalHistoryOptions type documentation, or (3) logging a warning when prefix is provided but ignored due to customInstance being present.
packages/sds/src/message_channel/local_history.ts:28
- The class documentation still says "In-Memory implementation" but the class now supports optional persistent storage via localStorage (browser) or file system (Node.js). Update the documentation to reflect that this class can use persistent storage when configured with a storage prefix or custom storage instance, and only falls back to in-memory storage when no storage configuration is provided.
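The debounced save suggested in the first suppressed comment could be sketched as below. This is a hypothetical helper, not part of the PR: `DebouncedSaver`, its `request`/`flush` methods, the 100 ms default, and the injectable `Schedule` function (added so the behaviour can be tested without real timers) are all assumptions.

```typescript
type Cancel = () => void;
// Schedules fn to run after delayMs and returns a function that cancels it.
type Schedule = (fn: () => void, delayMs: number) => Cancel;

// Default scheduler based on setTimeout/clearTimeout.
const timerSchedule: Schedule = (fn, delayMs) => {
  const handle = setTimeout(fn, delayMs);
  return () => clearTimeout(handle);
};

class DebouncedSaver {
  private cancelPending: Cancel | undefined;
  private readonly save: () => void;
  private readonly delayMs: number;
  private readonly schedule: Schedule;

  constructor(
    save: () => void,
    delayMs: number = 100,
    schedule: Schedule = timerSchedule
  ) {
    this.save = save;
    this.delayMs = delayMs;
    this.schedule = schedule;
  }

  // Call after every push: restarts the timer so a burst of pushes
  // results in a single save once the burst is over.
  request(): void {
    this.cancelPending?.();
    this.cancelPending = this.schedule(() => {
      this.cancelPending = undefined;
      this.save();
    }, this.delayMs);
  }

  // Save immediately if a save is pending, e.g. before shutdown.
  flush(): void {
    if (this.cancelPending) {
      this.cancelPending();
      this.cancelPending = undefined;
      this.save();
    }
  }
}
```

The trade-off is that messages pushed within the delay window are lost if the process dies before the timer fires, so a `flush()` on shutdown or page unload would likely be needed.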
  private static deserializeCausalEntry(
    entry: StoredCausalEntry
  ): HistoryEntry {
    return {
      messageId: entry.messageId,
      retrievalHint: entry.retrievalHint
        ? hexToBytes(entry.retrievalHint)
        : undefined
    };
Copilot
AI
Dec 12, 2025
The deserializeCausalEntry method is missing the senderId field from the deserialization. This field needs to be included to properly reconstruct HistoryEntry objects when loading from storage. Without it, deserialized causal history entries will be incomplete.
  private readonly filePath: string;

  public constructor(storagePrefix: string, basePath: string = ".waku") {
    this.filePath = join(basePath, `${storagePrefix}.json`);
Copilot
AI
Dec 12, 2025
The storagePrefix parameter is used directly in file path construction without validation or sanitization. If user-controlled input is used as the storagePrefix (e.g., from a channelId), it could contain path traversal sequences like "../" that allow writing files outside the intended basePath directory. Consider validating or sanitizing the storagePrefix to ensure it doesn't contain path separators or other potentially dangerous characters.
- this.filePath = join(basePath, `${storagePrefix}.json`);
+ // Sanitize storagePrefix to prevent path traversal and invalid characters
+ const safePrefix = storagePrefix.replace(/[^a-zA-Z0-9_-]/g, "_");
+ this.filePath = join(basePath, `${safePrefix}.json`);
      );
      localStorage.setItem(this.storageKey, payload);
    } catch (error) {
      log.error("Failed to save messages to storage:", error);
Copilot
AI
Dec 12, 2025
The error logging for localStorage.setItem should provide more context about the specific error type, particularly for QuotaExceededError which is common when localStorage is full. Consider logging a more specific message when this error occurs (e.g., "localStorage quota exceeded - consider reducing message history size") to help users understand and resolve the issue.
- log.error("Failed to save messages to storage:", error);
+ if (
+   error &&
+   (error.name === "QuotaExceededError" ||
+     error.name === "NS_ERROR_DOM_QUOTA_REACHED" || // Firefox
+     error.code === 22 || // Chrome, Safari
+     error.code === 1014) // Firefox
+ ) {
+   log.error(
+     "localStorage quota exceeded - consider reducing message history size.",
+     error
+   );
+ } else {
+   log.error("Failed to save messages to storage:", error);
+ }
import path from "path";

module.exports = config;
import baseConfig from "../../karma.conf.cjs";

export default function (config) {
  baseConfig(config);

  const storageDir = path.resolve(__dirname, "src/message_channel/storage");

  // Swap node storage for browser storage in webpack builds
  config.webpack.resolve.alias = {
    ...config.webpack.resolve.alias,
    [path.join(storageDir, "node.ts")]: path.join(storageDir, "browser.ts"),
    [path.join(storageDir, "node.js")]: path.join(storageDir, "browser.ts")
  };
}
Copilot
AI
Dec 12, 2025
This file uses ES module syntax (import/export default) but has a .cjs extension which indicates CommonJS. Additionally, it uses __dirname which is not available in ES modules without importing it from a special module.
Either change the file extension to .mjs or use CommonJS syntax with require() and module.exports. If using ES modules, you'll need to derive __dirname from import.meta.url.
  describe("localStorage persistence", function () {
    // LocalStorage specific tests (browser)
    before(function () {
      if (typeof localStorage === "undefined") {
        this.skip();
      }
    });

    it("should restore messages from localStorage on channel recreation", async () => {
      const persistentChannelId = "persistent-channel";

      const channel1 = new MessageChannel(persistentChannelId, "alice");

      await sendMessage(channel1, utf8ToBytes("msg-1"), callback);
      await sendMessage(channel1, utf8ToBytes("msg-2"), callback);

      expect(channel1["localHistory"].length).to.equal(2);

      // Recreate channel with same storage - should load history
      const channel2 = new MessageChannel(persistentChannelId, "alice");

      expect(channel2["localHistory"].length).to.equal(2);
      expect(
        channel2["localHistory"].slice(0).map((m) => m.messageId)
      ).to.deep.equal([
        MessageChannel.getMessageId(utf8ToBytes("msg-1")),
        MessageChannel.getMessageId(utf8ToBytes("msg-2"))
      ]);
    });

    it("should include persisted messages in causal history after restart", async () => {
      const persistentChannelId = "persistent-causal";

      const channel1 = new MessageChannel(persistentChannelId, "alice", {
        causalHistorySize: 2
      });

      await sendMessage(channel1, utf8ToBytes("msg-1"), callback);
      await sendMessage(channel1, utf8ToBytes("msg-2"), callback);
      await sendMessage(channel1, utf8ToBytes("msg-3"), callback);

      const channel2 = new MessageChannel(persistentChannelId, "alice", {
        causalHistorySize: 2
      });

      let capturedMessage: ContentMessage | null = null;
      await sendMessage(channel2, utf8ToBytes("msg-4"), async (message) => {
        capturedMessage = message;
        return { success: true };
      });

      expect(capturedMessage).to.not.be.null;
      expect(capturedMessage!.causalHistory).to.have.lengthOf(2);
      // Should reference the last 2 messages (msg-2 and msg-3)
      expect(capturedMessage!.causalHistory[0].messageId).to.equal(
        MessageChannel.getMessageId(utf8ToBytes("msg-2"))
      );
      expect(capturedMessage!.causalHistory[1].messageId).to.equal(
        MessageChannel.getMessageId(utf8ToBytes("msg-3"))
      );
    });
  });
Copilot
AI
Dec 12, 2025
Missing localStorage cleanup after tests. The test suite creates localStorage entries with keys like "waku:sds:storage:persistent-channel" and "waku:sds:storage:persistent-causal" but doesn't clean them up in an afterEach hook. This could lead to test pollution affecting other tests. Add an afterEach hook to remove these localStorage entries similar to what's done in persistent_storage.spec.ts.
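The cleanup could be done with a small helper along these lines. This is a sketch under assumptions: `removeKeysWithPrefix`, `StorageLike`, and `MemoryStorage` are hypothetical names, and the `waku:sds:storage:` prefix is taken from the key names quoted in the comment above rather than verified against the code.

```typescript
// Subset of the Web Storage API needed to enumerate and delete keys.
interface StorageLike {
  readonly length: number;
  key(index: number): string | null;
  removeItem(key: string): void;
}

// Removes every entry whose key starts with prefix; returns how many were removed.
function removeKeysWithPrefix(storage: StorageLike, prefix: string): number {
  // Collect first: removing while iterating would shift the indices.
  const matching: string[] = [];
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    if (key !== null && key.startsWith(prefix)) {
      matching.push(key);
    }
  }
  for (const key of matching) {
    storage.removeItem(key);
  }
  return matching.length;
}

// In-memory stand-in for localStorage so the helper can run anywhere.
class MemoryStorage implements StorageLike {
  private readonly map = new Map<string, string>();

  get length(): number {
    return this.map.size;
  }

  key(index: number): string | null {
    return Array.from(this.map.keys())[index] ?? null;
  }

  setItem(key: string, value: string): void {
    this.map.set(key, value);
  }

  removeItem(key: string): void {
    this.map.delete(key);
  }
}

// In the spec file this might be wired up as (assuming mocha's afterEach):
//   afterEach(() => removeKeysWithPrefix(localStorage, "waku:sds:storage:"));
```

Removing by prefix rather than by exact key means new tests that add further channel ids are cleaned up without having to update the hook.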
export type StoredCausalEntry = {
  messageId: string;
  retrievalHint?: string;
};
Copilot
AI
Dec 12, 2025
The StoredCausalEntry type is missing the senderId field. Based on the code in message_channel.ts (lines 456-457, 678-679), HistoryEntry objects contain messageId, retrievalHint, and senderId fields. The senderId field needs to be included in the serialization and deserialization logic to properly preserve causal history entries. This will cause data loss when messages are persisted and restored from storage.
  private static serializeCausalEntry(entry: HistoryEntry): StoredCausalEntry {
    return {
      messageId: entry.messageId,
      retrievalHint: entry.retrievalHint
        ? bytesToHex(entry.retrievalHint)
        : undefined
    };
Copilot
AI
Dec 12, 2025
The serializeCausalEntry method is missing the senderId field from the serialization. This field needs to be included to match the HistoryEntry structure used throughout the codebase. Without it, the senderId information will be lost when messages are persisted.
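A round trip that keeps `senderId` could look like the sketch below. The types here are simplified assumptions: whether `senderId` is optional on the real `HistoryEntry` is not shown in this thread, and the local `bytesToHex`/`hexToBytes` helpers merely stand in for whatever utilities the PR imports.

```typescript
// Assumed shapes; senderId is modelled as optional, which may not match the real type.
type HistoryEntry = {
  messageId: string;
  retrievalHint?: Uint8Array;
  senderId?: string;
};

type StoredCausalEntry = {
  messageId: string;
  retrievalHint?: string; // hex-encoded bytes
  senderId?: string;
};

// Local stand-ins for the hex helpers used in the PR.
function bytesToHex(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

function hexToBytes(hex: string): Uint8Array {
  const out = new Uint8Array(hex.length / 2);
  for (let i = 0; i < out.length; i++) {
    out[i] = parseInt(hex.slice(2 * i, 2 * i + 2), 16);
  }
  return out;
}

function serializeCausalEntry(entry: HistoryEntry): StoredCausalEntry {
  return {
    messageId: entry.messageId,
    retrievalHint: entry.retrievalHint
      ? bytesToHex(entry.retrievalHint)
      : undefined,
    senderId: entry.senderId // carried through unchanged
  };
}

function deserializeCausalEntry(entry: StoredCausalEntry): HistoryEntry {
  return {
    messageId: entry.messageId,
    retrievalHint: entry.retrievalHint
      ? hexToBytes(entry.retrievalHint)
      : undefined,
    senderId: entry.senderId
  };
}
```

Since `senderId` is a plain string it needs no encoding step, unlike `retrievalHint`, which has to be converted because JSON cannot represent a `Uint8Array` directly.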
  it("persists and restores messages", () => {
    const history1 = new LocalHistory({ storage: { prefix: channelId } });
    history1.push(createMessage("msg-1", 1));
    history1.push(createMessage("msg-2", 2));

    const history2 = new LocalHistory({ storage: { prefix: channelId } });

    expect(history2.length).to.equal(2);
    expect(history2.slice(0).map((msg) => msg.messageId)).to.deep.equal([
      "msg-1",
      "msg-2"
    ]);
  });
Copilot
AI
Dec 12, 2025
The test "persists and restores messages" doesn't verify that causal history entries with their senderId fields are properly persisted and restored. Add test coverage to ensure that messages with non-empty causal history (including senderId fields) are correctly serialized and deserialized. This is important because the HistoryEntry type includes senderId, retrievalHint, and messageId fields.
Problem / Description
Message history is currently stored only in memory, causing SDS to start recovery afresh every time.
Solution
This PR introduces persistent message history that survives application restarts using localStorage (browser), with an option to provide custom storage providers.
Changes
- `MemLocalHistory` now optionally sets up `Storage`
- localStorage usage in browsers with serialisation support
- `MessageChannel` with persistent storage support, by default
Notes
Checklist