A multi-model AI chat interface that lets you create, organize, and inject structured context into conversations with different AI models (OpenAI, Anthropic, Ollama). Maintain your project knowledge, preferences, and decisions in organized contexts that can be selectively applied to enhance AI interactions.
Most AI tools are stateless—they forget context between sessions. ContextPilot solves this by:
- Storing structured context (preferences, decisions, goals, facts) with versioning
- Ranking context by relevance using semantic embeddings (see the sketch below)
- Generating optimized prompts that automatically include the right context
- Providing a clean UI to manage your personal knowledge base
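To make the ranking step concrete, here is a minimal sketch of the idea, assuming the `all-MiniLM-L6-v2` sentence-transformers model and a simple confidence-weighted cosine score; the actual logic lives in `backend/relevance.py` and may differ:

```python
# Illustrative only: rank stored contexts against a task by cosine
# similarity, blended with each unit's confidence score. The model
# choice and weighting here are assumptions, not relevance.py itself.
from typing import Dict, List

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def rank_contexts(task: str, contexts: List[Dict], top_k: int = 5) -> List[Dict]:
    """Return the top_k context units most relevant to the task."""
    task_vec = model.encode(task)
    scored = []
    for ctx in contexts:
        ctx_vec = model.encode(ctx["content"])
        cosine = float(np.dot(task_vec, ctx_vec)
                       / (np.linalg.norm(task_vec) * np.linalg.norm(ctx_vec)))
        scored.append((cosine * ctx.get("confidence", 1.0), ctx))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [ctx for _, ctx in scored[:top_k]]
```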
- ✅ CRUD operations for context units with versioning
- ✅ Persistent storage with SQLite or PostgreSQL + pgvector
- ✅ AI integration with OpenAI (GPT-4, GPT-4o), Anthropic (Claude), and Ollama (local models)
- ✅ Dynamic model discovery - Automatically detects available models from each provider
- ✅ Model validation - Only shows working, current models in the UI
- ✅ Semantic search using sentence-transformers embeddings
- ✅ Embedding caching for faster similarity searches
- ✅ Response caching for improved API performance
- ✅ Confidence scoring and context versioning
- ✅ Relevance engine that ranks contexts by task relevance
- ✅ Prompt composer that generates LLM-ready prompts
- ✅ Chat-style interface with message bubbles and timestamps
- ✅ Conversation history with automatic persistence
- ✅ Smart context management - sends contexts once per conversation
- ✅ Auto-scroll to latest messages
- ✅ Typing indicators for AI responses
- ✅ Context refresh control for explicit context reloading
- ✅ New conversation button to start fresh chats
- ✅ Markdown image support - Automatically renders images written in `![alt](url)` syntax
- ✅ Image error handling - Displays helpful warnings when images fail to load
- ✅ Immediate message display - Shows user messages before API response
- ✅ Concurrent request prevention - Disables send button during API calls
- ✅ Smart truncation handling - Shows detailed messages for truncated responses with token counts
- ✅ Model attribution - Shows which AI model generated each response for transparency
- ✅ Modern Interface Design - Clean, professional layout with brand identity
- ✅ Simplified Navigation - Streamlined interface with intuitive return navigation
- ✅ 2-Column Manage Layout - Organized context management with responsive grid design
- ✅ Custom Branding - Distinctive "by B" signature with custom fuzzy B logo
- ✅ Mobile Responsive - Optimized for all screen sizes with proper breakpoints
- ✅ Enhanced UX - Loading states, smooth transitions, and improved interactions
- ✅ Full-width layout utilizing entire browser window
- ✅ Input clearing - Automatically clears text box on message send
- ✅ Settings Management - Configure API keys and AI parameters directly in the UI
- ✅ Flexible token limits - Supports up to 16,000 tokens (default: 4000)
- ✅ Context Import/Export - JSON/CSV export and JSON import functionality
- ✅ Advanced Filtering - Search by type, tags, content, and status
- ✅ Context Templates - Quick creation with 6 pre-defined templates
- ✅ Dynamic Model Management - Automatic model discovery and caching for optimal performance
- ✅ RESTful API with FastAPI and OpenAPI documentation
- ✅ Security features - API key auth, input validation, CORS, rate limiting
- ✅ Request tracking with unique IDs and timing
- ✅ Structured logging with JSON output option
- ✅ Database migrations with Alembic
- ✅ No external dependencies for embeddings (uses local models)
```
┌─────────────────┐
│    React UI     │  ← User Interface
│  (TypeScript)   │
└────────┬────────┘
         │
    HTTP REST API
         │
┌────────▼────────┐
│  FastAPI Server │  ← Backend
│                 │
│  ┌───────────┐  │
│  │  Storage  │  │  ← SQLite/PostgreSQL or in-memory
│  │ (Database)│  │
│  └───────────┘  │
│                 │
│  ┌───────────┐  │
│  │ Relevance │  │  ← Embedding-based ranking
│  │  Engine   │  │
│  └───────────┘  │
│                 │
│  ┌───────────┐  │
│  │  Prompt   │  │  ← Context composition
│  │ Composer  │  │
│  └───────────┘  │
│                 │
│  ┌───────────┐  │
│  │AI Service │  │  ← OpenAI / Anthropic / Ollama
│  └───────────┘  │
└─────────────────┘
```
```
ContextPilot/
├── backend/                  # FastAPI backend server
│   ├── main.py               # FastAPI application entry point
│   ├── models.py             # Pydantic data models
│   ├── db_models.py          # SQLAlchemy database models
│   ├── storage.py            # In-memory context store
│   ├── db_storage.py         # Database storage implementation
│   ├── storage_interface.py  # Storage abstraction layer
│   ├── relevance.py          # Semantic search & ranking
│   ├── composer.py           # Prompt composition engine
│   ├── ai_service.py         # OpenAI/Anthropic integration
│   ├── config.py             # Configuration management
│   ├── security.py           # Authentication & validation
│   ├── validators.py         # Dynamic model validation
│   ├── database.py           # Database session management
│   ├── valid_models.json     # Dynamic model validation rules
│   ├── alembic/              # Database migration scripts
│   ├── test_*.py             # Comprehensive test suite (107 tests)
│   ├── requirements.txt      # Python dependencies
│   └── README.md             # Backend documentation
├── frontend/                 # React TypeScript frontend
│   ├── src/
│   │   ├── App.tsx                # Main application component
│   │   ├── AppContext.tsx         # React context & state management
│   │   ├── ContextTemplates.tsx   # Template creation component
│   │   ├── ContextTools.tsx       # Import/export & filtering tools
│   │   ├── api.ts                 # API client with all endpoints
│   │   ├── types.ts               # TypeScript type definitions
│   │   ├── model_options.json     # Dynamic model configuration
│   │   └── index.tsx              # React entry point
│   ├── public/
│   │   └── index.html        # HTML template
│   ├── package.json          # Node.js dependencies
│   └── tsconfig.json         # TypeScript configuration
├── LICENSE                   # MIT License
├── THIRD_PARTY_NOTICES       # Third-party dependency licenses
├── QUICKSTART.md             # Quick reference guide
├── ARCHITECTURE.md           # System architecture documentation
├── SECURITY.md               # Security guidelines
├── DEPLOYMENT.md             # Production deployment guide
├── MODEL_DISCOVERY.md        # Dynamic model discovery documentation
├── discover_models.py        # Model discovery script
├── refresh_models.py         # Startup model refresh integration
├── update_models.sh          # Manual model update script
├── test_dynamic_models.py    # Model discovery test suite
├── demo_dynamic_models.py    # Model discovery demo script
├── setup.sh                  # Automated environment setup
├── start.sh                  # Start both backend & frontend (with auto-setup)
├── stop.sh                   # Stop all services
├── start-backend.sh          # Start backend only
├── start-frontend.sh         # Start frontend only
├── demo.sh                   # Demo with sample data
├── CONCEPT.txt               # Original concept document
└── README.md                 # This file
```
- Python 3.8+
- Node.js 16+
- npm or yarn
The easiest way to get started is using the automated setup script:
```bash
./setup.sh
```

This script will:
- ✅ Check all prerequisites (Python 3, Node.js)
- ✅ Create and configure virtual environment
- ✅ Install all backend dependencies
- ✅ Initialize the database
- ✅ Install all frontend dependencies
- ✅ Detect and fix common issues (broken symlinks, missing packages)
Then start the application:
```bash
./start.sh
```

This will:
- ✅ Automatically run setup if needed
- ✅ Start the backend server on http://localhost:8000
- ✅ Start the frontend development server on http://localhost:3000
- ✅ Open the application in your browser
- ✅ Provide health check verification
To stop all services:
```bash
./stop.sh
```

Alternatively, set up the backend manually:

- Navigate to the backend directory:

```bash
cd backend
```

- Create and activate a virtual environment:

```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

- Configure environment variables:

```bash
cp .env.example .env
# Edit .env with your API keys and database URL
```

- Initialize the database:

```bash
python init_db.py
```

- Run the server with example data (optional):

```bash
python -c "from example_data import load_example_data; load_example_data()" && python main.py
```

Or run without example data:

```bash
python main.py
```

The backend will be available at http://localhost:8000
API documentation: http://localhost:8000/docs
Note: For production deployment with PostgreSQL and pgvector, see DATABASE.md
- Navigate to the frontend directory:

```bash
cd frontend
```

- Install dependencies:

```bash
npm install
```

- Start the development server:

```bash
npm start
```

The frontend will be available at http://localhost:3000
- Open http://localhost:3000
- Configure API Keys: Click the ⚙️ settings button to configure your OpenAI or Anthropic API keys for AI chat functionality
- Add Context: Add context units (preferences, decisions, facts, goals) using templates or manual entry in the left sidebar
- Start a Chat:
- Select a previous conversation from the list, or
- Click "New Conversation" to start fresh
- Chat with AI: Enter your question/task in the chat interface
- Relevant contexts are automatically included in the first message
- Follow-up messages continue the conversation without re-sending contexts
- Click the "Refresh Contexts" toggle if you want to reload contexts in a follow-up
- View History: All conversations are saved and can be accessed from the left sidebar
- Context Management:
- View all your contexts in the right sidebar
- Use filters to find specific contexts
- Export/import contexts as JSON or CSV
- Generate Standalone Prompts: Use the middle column's "Generate Prompt" section to create prompts for use in other tools
Create a context:
```bash
curl -X POST http://localhost:8000/contexts \
  -H "Content-Type: application/json" \
  -d '{
    "type": "preference",
    "content": "I prefer functional programming style",
    "confidence": 0.9,
    "tags": ["programming", "style"]
  }'
```

List contexts:

```bash
curl http://localhost:8000/contexts
```

Generate a prompt:

```bash
curl -X POST http://localhost:8000/generate-prompt \
  -H "Content-Type: application/json" \
  -d '{
    "task": "Write a function to sort a list",
    "max_context_units": 5
  }'
```

Ask AI with context:

```bash
curl -X POST http://localhost:8000/ai/chat \
  -H "Content-Type: application/json" \
  -d '{
    "task": "Explain the main purpose of this codebase",
    "max_context_units": 5,
    "provider": "openai",
    "model": "gpt-4-turbo-preview"
  }'
```

Continue a conversation (with conversation_id):

```bash
curl -X POST http://localhost:8000/ai/chat \
  -H "Content-Type: application/json" \
  -d '{
    "task": "Can you elaborate on the architecture?",
    "max_context_units": 0,
    "provider": "openai",
    "model": "gpt-4-turbo-preview",
    "conversation_id": "abc123"
  }'
```

View conversation history:

```bash
curl http://localhost:8000/ai/conversations
```

Get specific conversation:

```bash
curl http://localhost:8000/ai/conversations/{conversation_id}
```

Task: "Write a Python function to calculate fibonacci numbers"
Generated Prompt:
```
# Context

## Preferences
- [✓] I prefer concise, technical explanations without excessive verbosity
  (Tags: communication, style, technical)
- [✓] I like code examples in Python and TypeScript
  (Tags: programming, languages)

## Goals
- [✓] Building an AI-powered context management system called ContextPilot
  (Tags: project, ai, context)

## Decisions
- [✓] Using FastAPI for backend instead of Django for better async support
  (Tags: architecture, backend, fastapi)

## Facts
- [✓] I have experience with vector databases and embeddings
  (Tags: skills, ai, embeddings)

# Task
Write a Python function to calculate fibonacci numbers

# Instructions
Please complete the task above, taking into account the provided context.
Align your response with the stated preferences, goals, and decisions.
```
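For reference, here is a simplified sketch of how a composer can produce the layout above by grouping ranked contexts by type. The real implementation is `backend/composer.py`; the section order and `[✓]` markers below are taken from the example output, everything else is an assumption:

```python
# Illustrative sketch of prompt composition: group ranked context units
# by type and emit the markdown layout shown in the example above.
SECTION_TITLES = {
    "preference": "Preferences",
    "goal": "Goals",
    "decision": "Decisions",
    "fact": "Facts",
}

def compose_prompt(task: str, contexts: list) -> str:
    lines = ["# Context"]
    for ctx_type, title in SECTION_TITLES.items():
        units = [c for c in contexts if c["type"] == ctx_type]
        if not units:
            continue
        lines.append(f"## {title}")
        for c in units:
            lines.append(f"- [✓] {c['content']}")
            lines.append(f"  (Tags: {', '.join(c['tags'])})")
    lines += ["# Task", task, "# Instructions",
              "Please complete the task above, taking into account the provided context.",
              "Align your response with the stated preferences, goals, and decisions."]
    return "\n".join(lines)
```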
ContextPilot provides a settings UI (⚙️ button) where you can configure:
- OpenAI API Key: Required for using GPT models
- Anthropic API Key: Required for using Claude models
- Ollama Base URL: Local Ollama server endpoint (default: http://localhost:11434)
- Default AI Provider: Choose between `openai`, `anthropic`, or `ollama`
- Default AI Model: Set the default model
  - OpenAI: `gpt-4o`, `gpt-4o-mini`, `gpt-4-turbo`, `gpt-4`, etc.
  - Anthropic: `claude-3-5-sonnet-20241022`, `claude-3-5-haiku-20241022`, etc.
  - Ollama: `llama3.2`, `mistral`, `codellama`, `phi3`, etc.
- Temperature: Control randomness (0.0-2.0, default: 1.0)
- Max Tokens: Maximum response length (1-16000, default: 4000)
  - Increase this if you're getting truncated responses
  - For image-heavy responses, consider 8000+ tokens
ContextPilot supports running AI models locally using Ollama:
- Install Ollama: Download from https://ollama.ai
- Pull a model: `ollama pull llama3.2` (or `mistral`, `codellama`, etc.)
- Start Ollama: `ollama serve` (runs on http://localhost:11434 by default)
- Configure ContextPilot:
  - Open Settings (⚙️ button)
  - Set Ollama Base URL (default works if Ollama is running locally)
  - Select "Ollama (Local)" as provider
  - Choose your downloaded model
Benefits of Local Models:
- ✅ No API keys required
- ✅ Complete privacy - no data sent to external services
- ✅ No API costs
- ✅ Works offline
- ✅ No network latency
Supported Ollama Models:
- `llama3.2` - Meta's latest Llama model
- `llama3.1` - Previous Llama version
- `mistral` - Mistral AI's model
- `codellama` - Specialized for code generation
- `phi3` - Microsoft's efficient model
Automatic Model Download: If you select a model that hasn't been downloaded yet, ContextPilot will automatically pull it for you. The first request may take 1-5 minutes depending on model size, but subsequent requests will be instant.
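If you want to check what the local server already has, here is a small sketch using Ollama's standard `/api/tags` endpoint and the same default base URL ContextPilot assumes:

```python
# Lists the models the local Ollama server has already pulled.
import requests

def list_local_ollama_models(base_url: str = "http://localhost:11434") -> list:
    resp = requests.get(f"{base_url}/api/tags", timeout=5)
    resp.raise_for_status()
    # Response shape: {"models": [{"name": "llama3.2:latest", ...}, ...]}
    return [m["name"] for m in resp.json().get("models", [])]

if __name__ == "__main__":
    print(list_local_ollama_models())
```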
You can also configure settings via API:
```bash
# Get current settings
curl http://localhost:8000/settings

# Update settings
curl -X POST http://localhost:8000/settings \
  -H "Content-Type: application/json" \
  -d '{
    "openai_api_key": "sk-...",
    "anthropic_api_key": "sk-ant-...",
    "ollama_base_url": "http://localhost:11434",
    "default_ai_provider": "ollama",
    "default_ai_model": "llama3.2",
    "ai_temperature": 0.7,
    "ai_max_tokens": 8000
  }'
```

ContextPilot features a dynamic model discovery system that automatically detects available AI models from each provider, ensuring you always have access to the latest working models.
- OpenAI: Fetches available chat models via API when an API key is configured (see the sketch below)
- Anthropic: Maintains current model list (Claude 3.5 Sonnet, Opus, etc.)
- Ollama: Automatically detects locally installed models
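As a rough illustration of the OpenAI step, here is a sketch using the official `openai` Python client; the `gpt-` prefix filter is a simplifying assumption, and the real rules live in `discover_models.py` and `validators.py`:

```python
# Sketch only: list models via the OpenAI API and keep GPT-family
# chat models. The prefix filter is an assumed heuristic.
from openai import OpenAI

def discover_openai_chat_models(api_key: str) -> list:
    client = OpenAI(api_key=api_key)
    return sorted(m.id for m in client.models.list() if m.id.startswith("gpt-"))
```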
- ✅ Always Current: Shows only working, available models
- ✅ Automatic Updates: Refreshes model lists daily
- ✅ Local Model Detection: Finds Ollama models automatically
- ✅ Performance Optimized: 24-hour caching to minimize API calls
- ✅ Robust Fallbacks: Works even when APIs are unavailable
Refresh available models manually:
```bash
# Discover and cache all available models
python3 discover_models.py

# Force refresh (ignore cache)
python3 refresh_models.py --force

# Quick status check
python3 demo_dynamic_models.py
```

Set up automatic daily model discovery:
```bash
# Edit crontab
crontab -e

# Add this line (runs daily at 6 AM)
0 6 * * * /path/to/ContextPilot/update_models.sh
```

As of the last discovery run, ContextPilot supports:
- OpenAI: GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-4, GPT-3.5-turbo
- Anthropic: Claude 3.5 Sonnet (latest), Claude 3 Opus, Claude 3 Sonnet, Claude 3 Haiku
- Ollama: Automatically detected local models (e.g., llama3.2:latest)
Note: Model availability depends on your API access and local Ollama installations. The system automatically updates these lists to match your actual capabilities.
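To illustrate the 24-hour cache mentioned above, here is a minimal sketch; the real cache format in `valid_models.json` and `discover_models.py` may differ, and the `fetched_at` field and file layout here are assumptions:

```python
# Illustrative 24-hour cache: refresh model lists only when the cached
# copy is stale or a refresh is forced.
import json
import time
from pathlib import Path

CACHE_FILE = Path("backend/valid_models.json")
CACHE_TTL_SECONDS = 24 * 60 * 60

def load_cached_models(refresh_fn, force: bool = False) -> dict:
    """Return cached model lists, refreshing if stale or forced."""
    if not force and CACHE_FILE.exists():
        cached = json.loads(CACHE_FILE.read_text())
        if time.time() - cached.get("fetched_at", 0) < CACHE_TTL_SECONDS:
            return cached["models"]
    models = refresh_fn()  # e.g. queries OpenAI/Anthropic/Ollama
    CACHE_FILE.write_text(json.dumps({"fetched_at": time.time(), "models": models}))
    return models
```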
The chat interface supports markdown images using the standard syntax: `![alt text](image-url)`
When AI responses include image markdown:
- Images are automatically rendered inline
- Failed image loads show a helpful warning with a link to the image URL
- This is useful for asking AI to generate or reference images
To prevent accidental data loss, use the provided backup scripts:
```bash
# Create a backup (stored in backend/backups/)
cd backend
./backup_db.sh

# Restore from a backup
./restore_db.sh
```

Automatic backup retention: The backup script keeps the last 10 backups automatically.
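Conceptually, the backup-and-rotate behavior looks like the following sketch; the database filename and paths are illustrative, not the script's actual values:

```python
# Conceptual sketch of backup_db.sh: copy the SQLite file to a
# timestamped backup and keep only the 10 most recent.
import shutil
import time
from pathlib import Path

DB_FILE = Path("backend/contextpilot.db")   # hypothetical filename
BACKUP_DIR = Path("backend/backups")
KEEP = 10

def backup_database() -> Path:
    BACKUP_DIR.mkdir(parents=True, exist_ok=True)
    dest = BACKUP_DIR / f"{DB_FILE.stem}-{time.strftime('%Y%m%d-%H%M%S')}.db"
    shutil.copy2(DB_FILE, dest)
    # Rotate: timestamped names sort chronologically, so newest first.
    backups = sorted(BACKUP_DIR.glob("*.db"), reverse=True)
    for old in backups[KEEP:]:
        old.unlink()
    return dest
```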
Before database maintenance: Always create a backup before:
- Running database migrations
- Reinitializing the database
- Upgrading the application
- Testing database-related changes
```bash
cd backend
python -m pytest                                # Run all tests
python -m pytest --ignore=test_integration.py  # Skip integration tests
```

Test Coverage:
- ✅ 135+ unit tests passing
- ✅ Mock-based testing for AI services (sketched below)
- ✅ Database storage tests
- ✅ API validation tests
- ✅ Security and authentication tests
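The mock-based approach looks roughly like this sketch; `AIService` and its `chat` method are hypothetical names, so adapt them to the real modules exercised in `backend/test_*.py`:

```python
# Hedged sketch: the provider call is patched so tests run without
# network access or API keys. Names below are hypothetical.
from unittest.mock import patch

def test_chat_returns_stubbed_reply():
    with patch("ai_service.AIService") as MockService:
        MockService.return_value.chat.return_value = "stubbed reply"
        service = MockService()  # application code would build the real service
        assert service.chat(task="ping") == "stubbed reply"
        service.chat.assert_called_once_with(task="ping")
```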
```bash
chmod +x demo.sh
./demo.sh
```

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/` | Root endpoint |
| GET | `/health` | Health check |
| GET | `/stats` | Get statistics |
| POST | `/contexts` | Create context unit |
| GET | `/contexts` | List all contexts |
| GET | `/contexts/{id}` | Get specific context |
| PUT | `/contexts/{id}` | Update context |
| DELETE | `/contexts/{id}` | Delete context |
| POST | `/generate-prompt` | Generate contextualized prompt |
| POST | `/generate-prompt/compact` | Generate compact prompt |
| POST | `/ai/chat` | Generate AI response with context |
| GET | `/ai/conversations` | List conversation history |
| GET | `/ai/conversations/{id}` | Get specific conversation with messages |
| DELETE | `/ai/conversations/{id}` | Delete conversation |
For detailed API documentation, see the interactive docs at /docs when the server is running.
```typescript
{
  id: string;                    // Unique ID
  type: "preference" | "decision" | "fact" | "goal";
  content: string;               // Natural language description
  confidence: number;            // 0.0 - 1.0
  created_at: string;            // ISO timestamp
  last_used: string | null;      // ISO timestamp
  source: string;                // "manual" or "extracted"
  tags: string[];                // Array of tags
  status: "active" | "superseded";
  superseded_by: string | null;  // ID of replacing context
}
```
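The same schema expressed as a Pydantic model, roughly mirroring `backend/models.py` (field names follow the JSON above; the exact definitions may differ):

```python
# Sketch of the context unit schema as a Pydantic model.
from datetime import datetime
from typing import List, Literal, Optional

from pydantic import BaseModel, Field

class ContextUnit(BaseModel):
    id: str
    type: Literal["preference", "decision", "fact", "goal"]
    content: str
    confidence: float = Field(ge=0.0, le=1.0)
    created_at: datetime
    last_used: Optional[datetime] = None
    source: str = "manual"            # "manual" or "extracted"
    tags: List[str] = []
    status: Literal["active", "superseded"] = "active"
    superseded_by: Optional[str] = None  # ID of replacing context
```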
Backend:
- FastAPI (web framework)
- SQLAlchemy 2.0 (ORM and database toolkit)
- PostgreSQL / SQLite (persistent storage)
- pgvector (vector similarity search)
- OpenAI API (GPT-4 integration)
- Anthropic API (Claude integration)
- sentence-transformers (embeddings)
- Pydantic (data validation)
- NumPy (vector operations)
Frontend:
- React 18
- TypeScript
- Axios (HTTP client)
- CSS3 (styling)
- Database Setup: SQLite and PostgreSQL configuration
- AI Integration: OpenAI and Anthropic setup
- Model Discovery: Dynamic model discovery system
- API Reference: Interactive API documentation (when server is running)
- Persistent storage (PostgreSQL + pgvector) ✅
- ChatGPT/Claude API integration ✅
- Export/import functionality ✅
- Advanced search and filtering ✅
- Chat-style interface with conversation history ✅
- Smart context management (one-time sending per conversation) ✅
- Automatic context extraction from documents
- Context decay and reinforcement learning
- Conflict detection between contexts
- Browser extension for automatic context capture
- IDE plugin integration
- Analytics dashboard
- Streaming AI responses
- Multi-user support with authentication
This is an MVP. Contributions are welcome! Areas for improvement:
- Storage: Add support for additional storage backends beyond SQLite/PostgreSQL
- Embeddings: Add support for other embedding models
- UI: Enhance the interface with better visualizations
- Testing: Expand integration and end-to-end test coverage
- Documentation: Improve API documentation
See LICENSE file for details.
- Built with FastAPI and React
- Embeddings powered by sentence-transformers
- Inspired by the need for context-aware AI interactions
Built with ❤️ for better AI conversations