@danielaskdd
Collaborator

Refactor: Enhance KG Editing with Chunk Tracking

Overview

Currently, knowledge graph editing does not update the chunk tracking storage in step with graph changes. The two stores drift apart, which breaks entity and relation reconstruction after a document is deleted. This PR adds chunk tracking synchronization to the knowledge graph editing operations.

Key Changes

1. Chunk Tracking Synchronization (Commits: 3fbd704, a3370b0)

  • Added chunk storage synchronization to all entity/relation edit operations
  • Implemented incremental chunk ID updates during entity operations
  • Added chunk cleanup on entity/relation deletion
  • Tracked chunks in entity/relation creation operations
  • Preserved chunk references when editing entities/relations
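
To illustrate the kind of bookkeeping these bullets describe, here is a minimal in-memory sketch. It is not the code from this PR: the `ChunkTracker` class, its method names, and the dict-based storage are all hypothetical stand-ins for the real chunk tracking storage.

```python
from typing import Dict, Iterable, List


class ChunkTracker:
    """Hypothetical in-memory stand-in for a chunk tracking store."""

    def __init__(self) -> None:
        # Maps an entity name to the ordered list of chunk IDs that mention it.
        self.entity_chunks: Dict[str, List[str]] = {}

    def add_chunks(self, entity: str, chunk_ids: Iterable[str]) -> None:
        """Incrementally merge chunk IDs, keeping order and skipping duplicates."""
        existing = self.entity_chunks.setdefault(entity, [])
        for chunk_id in chunk_ids:
            if chunk_id not in existing:
                existing.append(chunk_id)

    def rename(self, old: str, new: str) -> None:
        """Move chunk references from the old name to the new one."""
        chunk_ids = self.entity_chunks.pop(old, [])
        self.add_chunks(new, chunk_ids)

    def delete(self, entity: str) -> None:
        """Drop the tracking record when the entity is deleted."""
        self.entity_chunks.pop(entity, None)


tracker = ChunkTracker()
tracker.add_chunks("Alice", ["c1", "c2"])
tracker.add_chunks("Alice", ["c2", "c3"])  # "c2" is not added twice
tracker.rename("Alice", "Alice Smith")     # references follow the rename
```

The same pattern would apply to relations, keyed by a normalized entity pair instead of a single entity name.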

2. Graph Consistency Improvements (Commits: bf1897a, a3370b0)

  • Normalized entity pair ordering for undirected graph consistency
  • Ensured relation keys are consistently normalized across all operations
  • Updated API documentation to reflect undirected edge handling
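
One common way to get this kind of consistency is to sort the two endpoint names before building a storage key, so that (A, B) and (B, A) resolve to the same record. The sketch below assumes that approach; `normalize_pair`, `relation_key`, and the `"|"` separator are hypothetical and may differ from what the PR actually does.

```python
from typing import Tuple


def normalize_pair(src: str, tgt: str) -> Tuple[str, str]:
    """Return the endpoints in a canonical (sorted) order."""
    return (src, tgt) if src <= tgt else (tgt, src)


def relation_key(src: str, tgt: str, sep: str = "|") -> str:
    """Build a storage key that is identical for both edge directions."""
    first, second = normalize_pair(src, tgt)
    return f"{first}{sep}{second}"


# Both orderings of the same undirected edge map to one key.
same = relation_key("Bob", "Alice") == relation_key("Alice", "Bob")
```

Applying the normalization at every read and write path is what prevents a second, reversed copy of an edge from being stored.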

3. Code Refactoring (Commit: 6015e8b)

  • Introduced unified _persist_graph_updates() function to eliminate code duplication
  • Removed duplicate persistence callback functions
  • Improved code maintainability and reduced complexity
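
The refactoring idea can be sketched as a single helper that flushes every storage touched by an edit, in place of one callback per operation. The example below is an assumption-laden illustration: the `Persistable` protocol, its `flush()` method, and `persist_all()` are hypothetical and are not the signature of `_persist_graph_updates()`.

```python
import asyncio
from typing import Optional, Protocol


class Persistable(Protocol):
    """Hypothetical interface: anything with an async flush() method."""

    async def flush(self) -> None: ...


class InMemoryStore:
    """Toy storage that only counts how often it was flushed."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.flushed = 0

    async def flush(self) -> None:
        self.flushed += 1


async def persist_all(*stores: Optional[Persistable]) -> int:
    """Flush every provided store concurrently, skipping any that are None."""
    active = [store for store in stores if store is not None]
    await asyncio.gather(*(store.flush() for store in active))
    return len(active)


graph_store = InMemoryStore("graph")
chunk_store = InMemoryStore("chunks")
# An edit that has no vector store to update can simply pass None.
flushed_count = asyncio.run(persist_all(graph_store, None, chunk_store))
```

Accepting optional stores lets every edit operation call the same helper regardless of which storages it modified.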

Technical Details

Modified Files:

  • lightrag/utils_graph.py - Core graph utility functions with chunk tracking
  • lightrag/utils.py - Added utility functions for chunk tracking
  • lightrag/lightrag.py - Integration with chunk storage
  • lightrag/api/routers/graph_routes.py - API endpoint updates

Statistics:

  • Total additions: ~534 lines
  • Total deletions: ~196 lines
  • Net change: ~338 lines

Impact

  • Data Integrity: Chunk tracking ensures that chunk-entity/relation mappings remain synchronized
  • Graph Consistency: Normalized entity ordering prevents duplicate edges in undirected graphs
  • Code Quality: Unified persistence callback reduces maintenance overhead
  • API Stability: Better handling of edge cases in graph editing operations

Testing Recommendations

  • Test entity renaming with existing chunk references
  • Verify chunk cleanup on entity/relation deletion
  • Validate undirected graph consistency with various entity pair orders
  • Test concurrent editing operations
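
For the concurrent editing case, a test could fire many edits at the same entity and check that no chunk reference is lost. The sketch below shows the shape of such a test against a plain dict guarded by an `asyncio.Lock`; `append_chunk` and `run_concurrent_edits` are hypothetical helpers, not functions from this repository.

```python
import asyncio
from typing import Dict, List


async def append_chunk(
    store: Dict[str, List[str]], lock: asyncio.Lock, entity: str, chunk_id: str
) -> None:
    """Read-modify-write of an entity's chunk list, serialized by a lock."""
    async with lock:
        current = list(store.get(entity, []))
        # Yield to the event loop mid-update; without the lock, another task
        # could read the stale list here and overwrite this write.
        await asyncio.sleep(0)
        if chunk_id not in current:
            current.append(chunk_id)
        store[entity] = current


async def run_concurrent_edits(count: int) -> Dict[str, List[str]]:
    """Run `count` edits against the same entity at once and return the store."""
    store: Dict[str, List[str]] = {}
    lock = asyncio.Lock()
    await asyncio.gather(
        *(append_chunk(store, lock, "Alice", f"chunk-{i}") for i in range(count))
    )
    return store


result = asyncio.run(run_concurrent_edits(20))
```

A passing run leaves all 20 chunk IDs on the entity; a lost update would show up as a shorter list.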

Outstanding issue

Entity merging operations still fail to update the corresponding text chunk tracking information.

• Add chunk storage sync to edit ops
• Implement incremental chunk ID updates
• Support entity renaming migrations
• Normalize relation keys consistently
• Preserve chunk references on edits
• Normalize entity pairs for storage
• Update API docs for undirected edges
• Clean up chunk storage on delete
• Track chunks in create operations
• Normalize relation keys consistently
• Add _persist_graph_updates function
• Remove duplicate callback functions
@danielaskdd danielaskdd merged commit 69b4cda into HKUDS:main Oct 26, 2025
1 check passed
@danielaskdd danielaskdd deleted the edit-kg-new branch October 27, 2025 18:46