@danielaskdd
Collaborator

Add Offline Deployment Support With Cache Management and Layered Dependencies

🎯 Problem Statement

LightRAG installs packages for optional features at runtime (via pipmaster), depending on the file types and configuration in use. In offline or air-gapped environments these runtime installations fail, so LightRAG cannot be deployed on restricted networks.

Additionally, tiktoken downloads its BPE encoding files from OpenAI's servers on first use, which also fails in offline environments.

💡 Solution Overview

Implemented a comprehensive offline deployment solution with:

  1. Layered Optional Dependencies - Flexible dependency groups for different use cases
  2. Tiktoken Cache Management - CLI tool to pre-download tiktoken models
  3. Complete Documentation - Step-by-step offline deployment guide
  4. Python Compatibility Fixes - Removed lmdeploy from the offline dependency groups because of Python version compatibility issues

📦 Changes Made

1. Layered Dependency System (pyproject.toml)

Added flexible optional dependency groups:

[project.optional-dependencies]
offline-docs = [...]      # Document processing (PDF, DOCX, PPTX, XLSX)
offline-storage = [...]   # Storage backends (Redis, Neo4j, MongoDB, etc.)
offline-llm = [...]       # LLM providers (OpenAI, Anthropic, Ollama, etc.)
offline = [...]           # Complete package (all of the above)

Usage:

# Install only what you need (the quotes stop shells such as zsh from globbing the brackets)
pip install "lightrag-hku[offline-docs]"

# Or install everything
pip install "lightrag-hku[offline]"

2. Tiktoken Cache Downloader

Created CLI command to pre-download tiktoken models:

New Files:

  • lightrag/tools/download_cache.py - CLI implementation

CLI Command:

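# Use the defaults
lightrag-download-cache
# Write the cache to a custom directory
lightrag-download-cache --cache-dir ./tiktoken_cache
# Download only the encodings for the listed models
lightrag-download-cache --models gpt-4o-mini gpt-4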

Entry Point in pyproject.toml:

[project.scripts]
lightrag-download-cache = "lightrag.tools.download_cache:main"

3. Requirements Files for pip

Created modular requirements files:

  • requirements-offline-docs.txt - Document processing only
  • requirements-offline-storage.txt - Storage backends only
  • requirements-offline-llm.txt - LLM providers only
  • requirements-offline.txt - Complete offline package

4. Comprehensive Documentation

New File: docs/OfflineDeployment.md

Includes:

  • Quick start guide
  • Layered dependencies explanation
  • Tiktoken cache management
  • Complete offline deployment workflow
  • Troubleshooting section
  • Best practices

Updated: README.md - Added prominent link to offline deployment guide

5. Cleanup and Compatibility Fixes

Removed:

  • scripts/download_tiktoken_cache.py (duplicate of CLI command)
  • lmdeploy>=0.2.0 from offline dependencies (Python version compatibility issues)

Note: lightrag/llm/lmdeploy.py is preserved - it uses pipmaster for dynamic installation when needed.
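
Because a runtime installation cannot succeed without network access, it can help to check before going offline whether a module that would otherwise be installed on demand is already importable. The following is a small sketch using only the standard library. The helper name `missing_modules` is hypothetical, and `lmdeploy` is assumed to be the import name of the lmdeploy package.

```python
from importlib.util import find_spec


def missing_modules(module_names: list[str]) -> list[str]:
    """Return the names from module_names that cannot be imported here."""
    missing = []
    for name in module_names:
        try:
            spec = find_spec(name)
        except (ImportError, ValueError):
            # find_spec can raise when a parent package of a dotted name is absent.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumption: "lmdeploy" is the module name the lmdeploy binding imports.
    print("Not installed:", missing_modules(["lmdeploy"]))
```

Any module reported as missing would need to be downloaded and added to the package archive while the machine is still online.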

🚀 How to Use

Online Environment (Preparation)

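# 1. Install with offline dependencies (the quotes stop shells such as zsh from globbing the brackets)
pip install "lightrag-hku[offline]"

# 2. Download tiktoken cache
lightrag-download-cache

# 3. Download all packages (run this on the same OS and Python version as the offline machine)
pip download "lightrag-hku[offline]" -d ./packages

# 4. Create archive (-C stores the cache as .tiktoken_cache rather than under its full home path)
tar -czf lightrag-offline.tar.gz ./packages -C "$HOME" .tiktoken_cache

Offline Environment (Deployment)

# 1. Extract archive (creates ./packages and ./.tiktoken_cache)
tar -xzf lightrag-offline.tar.gz

# 2. Install packages
pip install --no-index --find-links=./packages "lightrag-hku[offline]"

# 3. Point tiktoken at the extracted cache
export TIKTOKEN_CACHE_DIR="$PWD/.tiktoken_cache"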
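
After deployment, one way to confirm that the cache will be used is to check that the expected files exist in the directory named by `TIKTOKEN_CACHE_DIR`. The sketch below assumes tiktoken's current caching scheme, in which each cached file is named after the SHA-1 hex digest of the URL it was downloaded from. It also assumes the two encoding URLs listed, which correspond to the encodings used by gpt-4 and gpt-4o-mini. Both assumptions should be verified against the installed tiktoken version. The helper names are hypothetical.

```python
import hashlib
import os
from pathlib import Path

# Assumed download URLs for two common encodings (verify against tiktoken itself).
ENCODING_URLS = {
    "cl100k_base": "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken",
    "o200k_base": "https://openaipublic.blob.core.windows.net/encodings/o200k_base.tiktoken",
}


def cache_file_name(url: str) -> str:
    """Name under which tiktoken is assumed to cache the file for this URL."""
    return hashlib.sha1(url.encode()).hexdigest()


def missing_encodings(cache_dir, urls=ENCODING_URLS) -> list[str]:
    """Return the encoding names whose cache file is absent from cache_dir."""
    cache_dir = Path(cache_dir)
    return [
        name
        for name, url in urls.items()
        if not (cache_dir / cache_file_name(url)).is_file()
    ]


if __name__ == "__main__":
    # Falls back to ~/.tiktoken_cache when the variable is unset (assumed default).
    directory = os.environ.get("TIKTOKEN_CACHE_DIR") or str(Path.home() / ".tiktoken_cache")
    print("Missing encodings:", missing_encodings(directory))
```

An empty result suggests the first tokenizer call should not need the network. A non-empty result usually means the cache was extracted to a different directory than the one the variable points to.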

📊 Benefits

  • Flexible Installation - Install only what you need
  • Reduced Package Size - Modular approach saves bandwidth
  • Air-Gapped Support - Complete offline deployment capability
  • Better Compatibility - Removed lmdeploy, which caused Python version compatibility issues
  • Comprehensive Documentation - Clear step-by-step guide

📚 Documentation

  • Added: docs/OfflineDeployment.md - Complete offline deployment guide
  • Updated: README.md - Added offline deployment reference
  • Updated: pyproject.toml - Added layered optional dependencies and CLI command

Type: Enhancement
Category: Deployment, Infrastructure
Backward Compatibility: Full (no breaking changes)

• Add tiktoken cache downloader CLI
• Add layered offline dependencies
• Add offline requirements files
• Add offline deployment guide
@danielaskdd danielaskdd reopened this Oct 11, 2025
@danielaskdd danielaskdd merged commit 49326f2 into HKUDS:main Oct 11, 2025
2 checks passed
@danielaskdd danielaskdd deleted the offline branch October 14, 2025 07:05