Releases: HKUDS/LightRAG
v1.4.9.8
What's New
- Feat: Add PDF Decryption Support for Password-Protected Files by @danielaskdd in #2296
- Feat: Add optional Langfuse observability integration by @anouar-bm in #2298
- Feat: Add RAGAS evaluation framework for RAG quality assessment by @anouar-bm in #2297
- Feat: Add native gemini LLM support by @Humphryshikunzi in #2305
What's Changed
- Refact: Auto-refresh of Popular Labels When Pipeline Completes by @danielaskdd in #2291
- Fix empty context validation bug and improve naming consistency in query context building by @danielaskdd in #2295
- Refact: Enhanced RAG Evaluation CLI with Two-Stage Pipeline and Improved UX by @danielaskdd in #2311
- Refact: Separate Configuration of RAGAS for LLM and Embeddings by @danielaskdd in #2314
- Refactor: Remove Deprecated Chunk-Based Query Methods and Improve Graph Unit Test by @danielaskdd in #2319
- Fix node retrieval fail with special characters in IDs for Postgres AGE GraphStorage by @danielaskdd in #2320
- Fix performance bottleneck in document deletion by @danielaskdd in #2321
New Contributors
- @anouar-bm made their first contribution in #2298
- @Humphryshikunzi made their first contribution in #2305
Full Changelog: v1.4.9.7...v1.4.9.8
v1.4.9.7
Important Notes
This update requires qdrant-client version 1.11.0 or later (due to the use of tenant indexing) when using Qdrant. Data migration may take a significant amount of time for large datasets.
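The version requirement above can be checked before upgrading. The following is a minimal sketch, not part of LightRAG: the helper names `parse_release`, `meets_minimum`, and `check_qdrant_client` are hypothetical, and the parser only compares the leading numeric components, so a pre-release such as `1.11.0rc1` is treated as `1.11.0`.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum qdrant-client release stated in the note above.
MIN_QDRANT_CLIENT = (1, 11, 0)


def parse_release(text: str) -> tuple:
    """Parse the leading numeric components of a version string, padded to three."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that "1.11" compares equal to "1.11.0".
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed: str, minimum: tuple = MIN_QDRANT_CLIENT) -> bool:
    """Return True when the installed version is at least the minimum."""
    return parse_release(installed) >= minimum


def check_qdrant_client() -> bool:
    """Return True when qdrant-client is installed and new enough."""
    try:
        return meets_minimum(version("qdrant-client"))
    except PackageNotFoundError:
        return False


if __name__ == "__main__":
    print("qdrant-client OK:", check_qdrant_client())
```

Running the check before starting the migration avoids discovering an outdated client partway through a long data migration.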
What's Changed
- Refactor: Qdrant Multi-tenancy with Payload-Based Partitioning by @Anush008 in #2247
- Refact: Enhance Property editing UI for KG Nodes by @danielaskdd in #2287
- Fix: Add PyCryptodome dependency for encrypted PDF processing by @danielaskdd in #2289
- Fix: Clean Residual Edges from VDB During Entity Deletion by @danielaskdd in #2290
New Contributors
Full Changelog: v1.4.9.6...v1.4.9.7
v1.4.9.6 Hotfix
What's Changed
- Restore query generation example and fix README path reference by @danielaskdd in #2281
- Refact: Graceful shutdown and signal handling in Gunicorn Mode by @danielaskdd in #2280
- HotFix: Include swagger-docs static files in package distribution by @danielaskdd in #2284
Full Changelog: v1.4.9.5...v1.4.9.6
v1.4.9.5
Important Notes
- 🚀 Fixes the PostgreSQL migration performance problem for large datasets present in v1.4.9.4
- 🛑 Introduces a graceful pipeline cancellation mechanism for document processing operations
- ✏️ Enables entity merging when renaming an entity to a target entity that already exists
What's New
- Feat: Add Pipeline Cancellation Feature with Enhanced Reliability by @danielaskdd in #2258
- Allow users to provide keywords with QueryRequest by @Mobious in #2253
- Refact: Add offline Swagger UI support with custom static file serving by @danielaskdd in
- Refactor: Enhanced Entity Merging with Chunk Tracking by @danielaskdd in #2266
What's Fixed
- Fix: PostgreSQL Data Migration Performance Problem by @danielaskdd in #2259
- Fix: Ensure Storage Consistency When Creating Implicit Nodes from Relationships by @danielaskdd in #2262
- Refactor: Enhance KG Editing with Chunk Tracking by @danielaskdd in #2265
#2273
- Update redis requirement from <7.0.0,>=5.0.0 to >=5.0.0,<8.0.0 by @dependabot[bot] in #2272
- Fix Entity Source IDs Tracking Problem During Relationship Processing by @danielaskdd in #2279
New Contributors
Full Changelog: v1.4.9.4...v1.4.9.5
v1.4.9.4
Important Notes: Eliminate Bottlenecks in Processing Large-scale Datasets
In production deployments, entity and relation metadata can grow unbounded as documents are continuously ingested. The source_id (chunk IDs) and file_path fields in entities and relations can accumulate thousands of entries, leading to:
- Performance degradation in vector database operations
- Increased storage costs
- Memory pressure during query operations
- Slower merge operations when processing new documents
LightRAG implements a configurable metadata size control system with two key features:
- Source ID limiting: Controls the maximum number of chunk IDs stored per entity/relation
- File path limiting: Controls the maximum number of file paths displayed in metadata (display-only, doesn't affect query performance)
Both features support two strategies:
- FIFO (First In First Out): Removes oldest entries when limit is reached. Best for evolving knowledge bases, keeps most recent information.
- KEEP: Keeps oldest entries, skips new ones when limit is reached. Best for stable knowledge bases, faster (fewer merge operations)
New environment variables with default values:
# Source ID limits (affects query performance)
MAX_SOURCE_IDS_PER_ENTITY=300
MAX_SOURCE_IDS_PER_RELATION=300
SOURCE_IDS_LIMIT_METHOD=FIFO
# File path limits (display only)
MAX_FILE_PATHS=100
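The two limiting strategies described above can be illustrated with a short sketch. This is not LightRAG's implementation: the function names `limit_source_ids` and `load_limits` are hypothetical, and the only details taken from the notes are the environment variable names, their defaults, and the FIFO/KEEP semantics.

```python
import os


def load_limits() -> dict:
    """Read the limits from the environment, using the defaults listed above."""
    return {
        "entity": int(os.getenv("MAX_SOURCE_IDS_PER_ENTITY", "300")),
        "relation": int(os.getenv("MAX_SOURCE_IDS_PER_RELATION", "300")),
        "method": os.getenv("SOURCE_IDS_LIMIT_METHOD", "FIFO"),
    }


def limit_source_ids(existing, incoming, max_ids, method="FIFO"):
    """Merge chunk IDs and cap the result at max_ids entries.

    FIFO drops the oldest entries once the limit is reached.
    KEEP retains the oldest entries and skips new ones.
    """
    # Deduplicate while preserving insertion order (oldest first).
    merged = list(dict.fromkeys([*existing, *incoming]))
    if len(merged) <= max_ids:
        return merged
    if method == "FIFO":
        return merged[-max_ids:]
    if method == "KEEP":
        return merged[:max_ids]
    raise ValueError(f"unknown limit method: {method!r}")


if __name__ == "__main__":
    old = ["chunk-1", "chunk-2", "chunk-3"]
    new = ["chunk-4", "chunk-5"]
    print(limit_source_ids(old, new, 3, "FIFO"))
    print(limit_source_ids(old, new, 3, "KEEP"))
```

With KEEP, an entity that is already at its limit can skip the merge entirely, which is consistent with the note that KEEP involves fewer merge operations.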
Auto Data Migration
Upgrading to this version requires data migration. If your current system contains a large number of entity relationships, the upgrade process may take an extended period of time.
What's New
- Feat: Add offline Docker build support with embedded models and cache by @danielaskdd in #2222
- Refact: Limit Vector Database Metadata Size to Support Large Scale Dataset by @danielaskdd in #2240
- Feat: Add Optional LLM Cache Deletion for Document Deletion by @danielaskdd in #2244
- Refact: Add Entity Identifier Length Truncation to Prevent Storage Failures by @danielaskdd in #2245
- Refact: Add Multimodal Processing Status Support to DocProcessingStatus for RayAnything Compatibility by @danielaskdd in #2248
What's Changed
- Refact: Improve query result with semantic null returns by @danielaskdd in #2218
- remove deprecated dotenv package. by @wkpark in #2229
- Refact: Frontend UI Fixes and Performance Improvements by @danielaskdd in #2234
- Security: Fix SQL injection vulnerabilities in PostgreSQL storage by @lucky-verma in #2235
- Update openai requirement from <2.0.0,>=1.0.0 to >=1.0.0,<3.0.0 by @dependabot[bot] in #2238
- Update pandas requirement from <2.3.0,>=2.0.0 to >=2.0.0,<2.4.0 by @dependabot[bot] in #2239
- Optimize PostgreSQL initialization performance by @yrangana in #2237
- fix(docs): correct typo "acivate" → "activate" by @xiaojunxiang2023 in #2243
New Contributors
- @wkpark made their first contribution in #2229
- @lucky-verma made their first contribution in #2235
- @dependabot[bot] made their first contribution in #2238
- @xiaojunxiang2023 made their first contribution in #2243
Full Changelog: v1.4.9.3...v1.4.9.4
v1.4.9.3
Important Notes
- A temporary solution has been implemented to ensure compatibility with the newly introduced document status in RayAnything.
- Frontend build artifacts have been removed from the Git repository; a manual frontend build is now required after cloning or pulling the repo.
What's Changed
- Refactor: WebUI Optimization and Simplification by @danielaskdd in #2198
- i18n: fix mustache brackets by @zl7261 in #2196
- fix: advise excluding dev dependencies in prod build by @kevinnkansah in #2201
- Exclude Frontend Build Artifacts from Git Repository by @danielaskdd in #2208
- Add PREPROCESSED (multimodal_processed) status for multimodal document processing by @danielaskdd in #2211
Full Changelog: v1.4.9.2...v1.4.9.3
v1.4.9.2
What's New
- feat: Add endpoint and UI to retry failed documents by @RooseveltAdvisors in #2168
- Refactor(webui): Improve document tooltip display with track ID and better formatting by @danielaskdd in #2170
- feat: add options for Postgres connection by @kevinnkansah in #2172
- feat: Add token tracking support to openai_embed function by @yrangana in #2181
- Add knowledge graph manipulation endpoints by @NeelM0906 in #2183
- Feat: Add Comprehensive Offline Deployment Solution by @danielaskdd in #2194
What's Fixed
- Fix: Add file_path field to full_docs storage by @danielaskdd in #2171
- Fixed typo in log message when creating new graph file by @aleksvujic in #2178
- Fixed: Add PostgreSQL Connection Retry Mechanism with Network Robustness by @danielaskdd in #2192
- Adding support for imagePullSecrets, envFrom, and deployment strategy in Helm chart by @tcyran in #2175
- Hotfix: Preserve ordering in get_by_ids methods across all storage implementations by @danielaskdd in #2195
- Update Web Dependencies by @kevinnkansah in #2193
New Contributors
- @RooseveltAdvisors made their first contribution in #2168
- @aleksvujic made their first contribution in #2178
- @kevinnkansah made their first contribution in #2172
- @yrangana made their first contribution in #2181
- @NeelM0906 made their first contribution in #2183
- @tcyran made their first contribution in #2175
Full Changelog: v1.4.9.1...v1.4.9.2
v1.4.9.1
What's Changed
- Feature(webui): Add KaTeX chemical formula rendering by supporting mhchem extension by @danielaskdd in #2154
- Feat(webui): Prevent LaTeX Parsing Errors Show-up During Streaming by @danielaskdd in #2155
- Fix dark mode graph labels for system theme and improve colors by @roman-marchuk in #2163
- web_ui: check node source and target by @zl7261 in #2156
Full Changelog: v1.4.9...v1.4.9.1
v1.4.9
Important Notes
v1.4.9 introduces key enhancements focused on refining the reference output format and incorporating structured references into query results. All query API endpoints now include a references field, enabling frontend applications to retrieve cited documents and their corresponding identifiers associated with LightRAG query results.
The context format sent to the LLM has been updated. The streaming response from the LLM now includes a references field (ignored by the frontend by default). If your application relies on context data returned by the LightRAG query API, you may need to update your code accordingly.
By leveraging the user_prompt parameter, users can instruct the LLM to generate responses with footnote annotations. The footnote numbers in the LLM output can be seamlessly mapped to the document IDs returned in the references field. This integration enables tighter alignment between LightRAG and your business system, empowering users to access original source materials directly.
user_prompt acts as an additional output instruction for the LLM. Here are two examples:
user_prompt for gpt-4.1-mini or Qwen3:
For inline citations, employ the footnote reference format `[^1]`, where the `^` following the opening square bracket denotes a superscript link. When multiple citations are required at a single location, enclose each reference ID within separate footnote markers (e.g., `[^1][^2][^3]`).
user_prompt for DeepSeek:
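For inline citations, use the Markdown footnote format `[^1]`. When multiple citations occur at one location, place each reference ID in its own pair of square brackets (e.g., `[^1][^2][^3]`). Annotate only the key facts and supporting evidence in the answer. Every reference ID that appears in a citation marker must be listed in the References section generated at the end. Do not generate a footnotes section after the References section.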
[Screenshot: the user_prompt parameter in the WebUI]
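The mapping from footnote numbers in the LLM output to entries in the references field can be sketched as follows. This is an illustrative example only: the function name `cited_sources` is hypothetical, and the shape of each reference entry (a `reference_id` and a `file_path` key) is an assumption, so check the actual query API response before relying on it.

```python
import re

# Matches Markdown footnote markers such as [^1] and captures the number.
FOOTNOTE = re.compile(r"\[\^(\d+)\]")


def cited_sources(answer: str, references: list) -> dict:
    """Map each footnote number used in the answer to its source document.

    Assumes each reference is a dict with "reference_id" and "file_path"
    keys; these key names are an assumption, not confirmed by the notes.
    """
    by_id = {str(ref["reference_id"]): ref["file_path"] for ref in references}
    cited = {}
    for match in FOOTNOTE.finditer(answer):
        ref_id = match.group(1)
        # Skip markers that have no matching entry in the references list.
        if ref_id in by_id:
            cited.setdefault(ref_id, by_id[ref_id])
    return cited


if __name__ == "__main__":
    answer = "The index is graph-based[^1][^3]. Streaming is supported[^2]."
    references = [
        {"reference_id": "1", "file_path": "design.pdf"},
        {"reference_id": "2", "file_path": "api.md"},
    ]
    print(cited_sources(answer, references))
```

A frontend could use the resulting mapping to turn each footnote marker into a link to the original source document.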
What's New
- Refactor: Provide Citation Context to LLM and Improve Reference Section Generation Quality by @danielaskdd in #2140
- Feature: Add Reference List Support for All Query Endpoints by @danielaskdd in #2147
- Refactor(WebUI): Change Client-side Search Logic with Server-driven Entity Name Search by @danielaskdd in #2124
- Feature(webui): Force sending history messages in bypass mode by @danielaskdd in #2132
- Feature(webui): Add footnotes support to markdown rendering in chat messages by @danielaskdd in #2145
- Feature(webui): Add user prompt history dropdown to query settings by @danielaskdd in #2146
What's Fixed
- Add path traversal security validation for file deletion operations by @danielaskdd in #2113
- Fix WebUI: Enhance tooltip readability by fix tooltip text wrapping of error message by @danielaskdd in #2114
- Fix Retrieval Page Parameter Options: Enforce Mutual Exclusivity Between "Only Need Context" and "Only Need Prompt" by @Saravanakumar26 in #2118
- Refactor: Optimize Query Prompts and User Prompt Handling by @danielaskdd in #2127
- WebUI Bugfix and Improvement by @danielaskdd in #2129
- Fix: Restore browser autocomplete functionality in message input box by @danielaskdd in #2131
- feat: Implement Comprehensive Document Duplication Prevention System by @danielaskdd in #2135
- Refactor node type legend and color mapping by @danielaskdd in #2137
- Fix typo: "Oputput" -> output by @SeungAhSon in #2139
- Feature: Add Enhanced Markdown Support for WebUI by @danielaskdd in #2143
- Fix: Robust clipboard functionality with fallback strategies by @danielaskdd in #2144
- Optimize Footnote Marker Display in WebUI by @danielaskdd in #2151
- Fix double query problem by add aquery_llm function for consistent response handling by @danielaskdd in #2152
- Web UI - center the loading icon and adjust GraphSearch width by @zl7261 in #2150
New Contributors
- @Saravanakumar26 made their first contribution in #2118
- @SeungAhSon made their first contribution in #2139
- @zl7261 made their first contribution in #2150
Full Changelog: v1.4.8.2...v1.4.9
v1.4.8.2
What's Changed
- Refactor: Add error handling with chunk ID prefixing in entity extraction by @danielaskdd in #2107
- fix: resolve dark mode text visibility issue in knowledge graph view by @roman-marchuk in #2106
- Fix: Resolve DocumentManager UI freezing and enhance error handling by @danielaskdd in #2109
New Contributors
- @roman-marchuk made their first contribution in #2106
Full Changelog: v1.4.8.1...v1.4.8.2