alexander-belikov commented Nov 5, 2025


Related Issues

This PR adds TigerGraph as a new graph storage backend option for LightRAG, expanding the available graph database integrations alongside existing options (Neo4j, Memgraph, PostgreSQL, MongoDB, NetworkX).

Changes Made

1. New TigerGraph Storage Implementation (lightrag/kg/tigergraph_impl.py)

  • New file: Complete implementation of TigerGraphStorage class extending BaseGraphStorage

  • All abstract methods implemented:

    • Node operations: has_node, get_node, get_nodes_batch, upsert_node, delete_node, remove_nodes
    • Edge operations: has_edge, get_edge, get_edges_batch, upsert_edge, remove_edges
    • Graph traversal: get_node_edges, get_nodes_edges_batch, get_knowledge_graph
    • Query operations: node_degree, node_degrees_batch, edge_degree, edge_degrees_batch
    • Label operations: get_all_labels, get_popular_labels, search_labels
    • Chunk operations: get_nodes_by_chunk_ids, get_edges_by_chunk_ids
    • Bulk operations: get_all_nodes, get_all_edges
    • Lifecycle: initialize, finalize, index_done_callback, drop
  • Key features:

    • Uses pyTigerGraph Python driver with async wrappers (asyncio.to_thread)
    • URI-based connection pattern (similar to Neo4j): TIGERGRAPH_URI
    • Workspace isolation using workspace label as vertex type name
    • Automatic schema creation on initialization
    • Undirected edge support (a single edge type named DIRECTED, kept for compatibility)
    • Chinese text support in label search
    • Retry logic with tenacity for write operations
    • Error handling and logging consistent with other backends

2. Storage Registry Updates (lightrag/kg/__init__.py)

  • Added TigerGraphStorage to GRAPH_STORAGE implementations list
  • Added environment variable requirements: TIGERGRAPH_URI, TIGERGRAPH_USERNAME, TIGERGRAPH_PASSWORD
  • Added module mapping: "TigerGraphStorage": ".kg.tigergraph_impl"
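As a rough illustration of how registry entries like these can be used, the sketch below checks that the environment variables listed for a backend are set before that backend is selected. `REQUIRED_ENV` and `missing_env_vars` are hypothetical names and are not taken from lightrag/kg/__init__.py; only the three variable names come from this PR.

```python
from typing import Mapping

# Hypothetical registry fragment; only the variable names come from this PR.
REQUIRED_ENV: dict[str, list[str]] = {
    "TigerGraphStorage": [
        "TIGERGRAPH_URI",
        "TIGERGRAPH_USERNAME",
        "TIGERGRAPH_PASSWORD",
    ],
}


def missing_env_vars(storage_name: str, env: Mapping[str, str]) -> list[str]:
    """Return the required variables that are unset or empty in `env`."""
    required = REQUIRED_ENV.get(storage_name, [])
    return [name for name in required if not env.get(name)]
```

Calling `missing_env_vars("TigerGraphStorage", os.environ)` at startup would make it possible to report every missing variable at once instead of failing on the first connection attempt.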

3. Configuration Examples

  • env.example: Added TigerGraph configuration section with:

    • TIGERGRAPH_URI (default: http://localhost:9000)
    • TIGERGRAPH_USERNAME (default: tigergraph)
    • TIGERGRAPH_PASSWORD (required)
    • TIGERGRAPH_GRAPH_NAME (default: lightrag)
    • TIGERGRAPH_WORKSPACE (optional, for workspace override)
  • config.ini.example: Added [tigergraph] section with corresponding configuration options
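The defaults listed above can be read as follows. This is a minimal sketch, not the code in tigergraph_impl.py: `TigerGraphSettings` and `load_settings` are hypothetical names, and treating an empty TIGERGRAPH_WORKSPACE as "no override" is an assumption.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class TigerGraphSettings:
    """Hypothetical container for the options listed in env.example."""

    uri: str
    username: str
    password: str
    graph_name: str
    workspace: Optional[str]


def load_settings(env: Mapping[str, str]) -> TigerGraphSettings:
    """Resolve settings from `env`, applying the documented defaults."""
    password = env.get("TIGERGRAPH_PASSWORD")
    if not password:
        # The password has no default, so fail early with a clear message.
        raise ValueError("TIGERGRAPH_PASSWORD is required")
    return TigerGraphSettings(
        uri=env.get("TIGERGRAPH_URI", "http://localhost:9000"),
        username=env.get("TIGERGRAPH_USERNAME", "tigergraph"),
        password=password,
        graph_name=env.get("TIGERGRAPH_GRAPH_NAME", "lightrag"),
        # Assumption: an empty value means no workspace override.
        workspace=env.get("TIGERGRAPH_WORKSPACE") or None,
    )
```

Passing a mapping rather than reading `os.environ` directly keeps the function easy to test; in normal use it would be called as `load_settings(os.environ)`.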

Implementation Details

  • Schema Design: Follows Neo4j pattern with workspace-based vertex types
  • Primary Key: Uses entity_id as vertex primary key (STRING type)
  • Properties: Supports entity_type, description, keywords, source_id with dynamic attribute support
  • Edge Type: A single undirected edge type, named "DIRECTED", is used for all relationships
  • Async Compatibility: All synchronous pyTigerGraph calls are wrapped in asyncio.to_thread() so they do not block the event loop
  • Connection Management: Uses get_data_init_lock() and get_graph_db_lock() for thread-safe initialization
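The async wrapping described above follows a common pattern, sketched here with a stand-in client. `BlockingClient` and `AsyncGraphStore` are hypothetical and do not reproduce the pyTigerGraph API or the TigerGraphStorage class; only `asyncio.to_thread` is taken from the PR description.

```python
import asyncio
import threading


class BlockingClient:
    """Stand-in for a synchronous driver; not the pyTigerGraph API."""

    def fetch_vertex(self, vertex_id: str) -> dict:
        # A real driver would perform blocking network I/O here.
        return {
            "entity_id": vertex_id,
            "thread": threading.current_thread().name,
        }


class AsyncGraphStore:
    """Hypothetical async facade over a blocking client."""

    def __init__(self, client: BlockingClient) -> None:
        self._client = client

    async def get_node(self, node_id: str) -> dict:
        # Run the blocking call in a worker thread so the event loop stays free.
        return await asyncio.to_thread(self._client.fetch_vertex, node_id)
```

The recorded thread name is only there to show that the call runs off the main thread; `asyncio.run(AsyncGraphStore(BlockingClient()).get_node("alice"))` returns the stand-in vertex.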

Checklist

  • Changes tested locally
  • Code reviewed
  • Documentation updated (if necessary)
  • Unit tests added (if applicable)

Additional Notes

Testing Requirements

  • Requires a running TigerGraph instance for testing
  • Default connection: http://localhost:9000
  • The graph must already exist; if no graph name is set, the default graph name from the configuration is used
  • Schema is automatically created on first initialization

Dependencies

  • pyTigerGraph: Install via pip install -e ".[offline-storage]", or let pipmaster install it automatically
  • No changes to existing dependencies or requirements files

Compatibility

  • Follows the same patterns as Neo4j implementation for consistency
  • Compatible with existing LightRAG workflows
  • Maintains workspace isolation similar to other backends
  • Supports all query modes (local, global, hybrid, naive, mix)

Potential Future Enhancements

  • Batch query optimization for better performance with large datasets
  • TigerVector integration for native vector search capabilities
  • Custom GSQL queries for complex graph traversals
  • Connection pooling optimization for high-throughput scenarios

Known Limitations

  • Some operations (like get_nodes_batch) iterate through nodes sequentially due to TigerGraph API limitations
  • Schema creation uses GSQL commands which require appropriate permissions
  • Large graph queries may need optimization for production use
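One possible mitigation for the sequential per-node lookups, sketched under assumptions: the lookups are issued concurrently through worker threads with a cap on how many run at once. `get_nodes_concurrently` and `fetch_one` are hypothetical names, and whether a single pyTigerGraph connection can safely be shared across threads is not established by this PR and would need to be checked first.

```python
import asyncio
from typing import Callable, Optional


async def get_nodes_concurrently(
    fetch_one: Callable[[str], Optional[dict]],
    node_ids: list[str],
    max_in_flight: int = 8,
) -> dict[str, dict]:
    """Fetch nodes with a blocking `fetch_one`, at most `max_in_flight` at a time."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def fetch(node_id: str) -> Optional[dict]:
        async with semaphore:
            # Each blocking lookup runs in its own worker thread.
            return await asyncio.to_thread(fetch_one, node_id)

    # gather() returns results in the same order as node_ids.
    results = await asyncio.gather(*(fetch(n) for n in node_ids))
    # Nodes that were not found (None) are left out of the result.
    return {n: r for n, r in zip(node_ids, results) if r is not None}
```

With an in-memory dict standing in for the database, `asyncio.run(get_nodes_concurrently(store.get, ids))` returns only the ids present in `store`.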

def _search_labels():
    try:
        # Get all vertices and filter
        vertices = self._conn.getVertices(workspace_label, limit=100000)
Collaborator

Querying via traversal is inefficient and not recommended.

Author

done

def _get_popular_labels():
    try:
        # Get all vertices and calculate degrees
        vertices = self._conn.getVertices(workspace_label, limit=100000)
Collaborator

Querying via traversal is inefficient and not recommended.

Author

done

edges_dict[node_id] = edges if edges is not None else []
return edges_dict

async def get_nodes_by_chunk_ids(self, chunk_ids: list[str]) -> list[dict]:
Collaborator

get_nodes_by_chunk_ids is deprecated

Author

done


return await asyncio.to_thread(_get_nodes_by_chunk_ids)

async def get_edges_by_chunk_ids(self, chunk_ids: list[str]) -> list[dict]:
Collaborator

get_edges_by_chunk_ids is deprecated

Author

done
