Description
The current file-based storage implementation is based on `rusqlite`. It seems to be rather slow in practice, and we don't really depend on the SQL features. @bluskript suggested replacing it with something else, such as RocksDB (#302). I think this is worth investigating, and making the change if it improves performance.
Similar to the binary encoding we use, the database itself should be mature, have a stable storage format, and perform well. These are explicit goals of the RocksDB project, making it a good candidate.
Changes
I imagine the following changes to be part of this transition:
- One of the main challenges will be maintaining the index of file paths from the root node. The paths coming from root are indexed by partial symbol stacks, and originate from multiple files. Therefore, it should be possible to add paths belonging to a particular file to this shared index, and also to remove the paths of a particular file from the index while keeping the rest.
  - Perhaps the problem is slightly simplified if we add an intermediate step: a per-file index of partial symbol stacks for root paths, and one global index that maps partial symbol stacks to the files contributing paths with that stack.
- See if we can exploit RocksDB's range-based indexing to efficiently load/remove data from the database (e.g., all paths belonging to a single file). This will require careful key design.
- Drop the support for having an in-memory database. RocksDB doesn't support this, and once "Allow path stitcher to work on `SQLiteReader`" (#291) is merged, switching between a database and in-memory data structures should have little impact on code.
- Find more neutral naming for the storage classes. Currently, they explicitly mention `SQL`, but we should use something that suggests disk, file-based, or persistent storage, without exposing the implementation.
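
To make the index and key-design points above more concrete, here is a minimal sketch of how the per-file and global indexes could share one ordered key-value store. Everything in it is an assumption for illustration: a `BTreeMap` stands in for RocksDB, partial symbol stacks are represented as plain strings, neither file names nor stacks may contain a NUL byte, and all names (`Store`, `add_file`, `remove_file`, `files_for_stack`) are hypothetical.

```rust
use std::collections::BTreeMap;

/// Stand-in for an ordered key-value store such as RocksDB.
/// Assumption: only sorted keys and range scans are needed.
type Store = BTreeMap<Vec<u8>, Vec<u8>>;

/// Key prefix for the per-file index: `F | file | 0x00 | stack`.
const FILE_TO_STACK: u8 = b'F';
/// Key prefix for the global index: `S | stack | 0x00 | file`.
const STACK_TO_FILE: u8 = b'S';

/// Builds `tag | first | 0x00 | second`.
/// Assumes `first` and `second` contain no NUL bytes.
fn key(tag: u8, first: &str, second: &str) -> Vec<u8> {
    let mut k = vec![tag];
    k.extend_from_slice(first.as_bytes());
    k.push(0);
    k.extend_from_slice(second.as_bytes());
    k
}

/// Returns the second component of every key that starts with
/// `tag | first | 0x00`, using a single range scan.
fn scan(store: &Store, tag: u8, first: &str) -> Vec<String> {
    let start = key(tag, first, "");
    let mut end = start.clone();
    // Replace the trailing separator with 0x01: the exclusive upper bound.
    *end.last_mut().unwrap() = 1;
    store
        .range(start.clone()..end)
        .map(|(k, _)| String::from_utf8_lossy(&k[start.len()..]).into_owned())
        .collect()
}

/// Removes all index entries contributed by `file`, keeping the rest.
fn remove_file(store: &mut Store, file: &str) {
    for stack in scan(store, FILE_TO_STACK, file) {
        store.remove(&key(FILE_TO_STACK, file, &stack));
        store.remove(&key(STACK_TO_FILE, &stack, file));
    }
}

/// Registers `file` as contributing root paths for each stack,
/// replacing whatever the file contributed before.
fn add_file(store: &mut Store, file: &str, stacks: &[&str]) {
    remove_file(store, file);
    for stack in stacks {
        store.insert(key(FILE_TO_STACK, file, stack), Vec::new());
        store.insert(key(STACK_TO_FILE, stack, file), Vec::new());
    }
}

/// Looks up the files contributing root paths with the given stack.
fn files_for_stack(store: &Store, stack: &str) -> Vec<String> {
    scan(store, STACK_TO_FILE, stack)
}
```

The per-file entries tell `remove_file` exactly which global entries to delete, so removing one file never requires scanning the whole shared index. In a real RocksDB-backed version the `scan` would presumably map onto a prefix or range iterator, and the per-file deletion onto a range delete, but that mapping is not worked out here.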