DocuMentor - AI-Powered Study Assistant

An offline, privacy-first AI tutor that helps you learn from your study materials through intelligent summarization, Q&A, and quiz generation.


Features

  • PDF Upload & Processing: Extract and chunk text from study materials
  • Intelligent Summarization: Generate bullet-point, paragraph, or exam-style summaries
  • Interactive Q&A: Ask questions and get context-aware answers using RAG
  • Quiz Generation (Coming Soon): Auto-generate MCQs from your documents
  • 100% Offline: All processing happens locally - zero API keys, zero data leakage
  • GPU Accelerated: Optimized for consumer GPUs with 6GB+ VRAM (RTX 3060 and above)

Tech Stack

Backend

  • Framework: FastAPI
  • Models:
    • microsoft/Phi-3-mini-4k-instruct (3.8B) - Summarization
    • google/flan-t5-xl (3B) - MCQ Generation
    • BAAI/bge-small-en-v1.5 (33M) - Embeddings
  • Vector DB: FAISS
  • PDF Processing: pdfplumber, PyMuPDF
  • Quantization: 4-bit for efficient VRAM usage
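
The 4-bit setup above is typically done through transformers + bitsandbytes. A minimal sketch (the project's actual loader lives in backend/models/phi3_summarizer.py and may differ):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "microsoft/Phi-3-mini-4k-instruct"

# 4-bit NF4 weights with fp16 compute - this is what brings Phi-3 down to ~2-3GB VRAM
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",        # place layers on the available GPU
    trust_remote_code=True,
)

prompt = "Summarize the following notes as three bullet points:\n..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))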

Frontend

  • Framework: Next.js 14 + React
  • Styling: TailwindCSS
  • UI Components: shadcn/ui
  • State Management: React Hooks

Quick Start

1. Clone Repository

git clone <repository-url>
cd DocuMentor

2. Setup Backend

cd backend
./setup_environment.sh

3. Setup Frontend

cd website/client   # from the project root
npm install

4. Start Both Servers

Terminal 1 - Backend (from the project root)

./start_backend.sh

Terminal 2 - Frontend (from the project root)

./start_frontend.sh

5. Open Browser

Navigate to: http://localhost:3000

For detailed setup instructions, see SETUP_GUIDE.md

Usage

Upload a Document

  1. Click "Choose File" or drag & drop a PDF
  2. Wait for processing (~10-30 seconds)
  3. Document appears in sidebar

Get a Summary

Type in chat:

Summarize this document

Ask Questions

What is the main topic of this document?
Explain the concept of neural networks
List all the formulas mentioned

Generate Quiz (Coming Soon)

Generate 5 MCQs from chapter 2

Project Structure

DocuMentor/
├── backend/                    # FastAPI backend
│   ├── main.py                # Entry point
│   ├── models/                # ML model wrappers
│   │   ├── embeddings.py
│   │   ├── phi3_summarizer.py
│   │   └── t5_quiz_generator.py
│   ├── services/              # Business logic
│   │   ├── pdf_processor.py
│   │   ├── vector_store.py
│   │   └── rag_pipeline.py
│   ├── api/                   # API routes & schemas
│   ├── utils/                 # Configuration & utilities
│   └── requirements.txt
├── website/
│   ├── client/                # React frontend
│   │   ├── components/        # UI components
│   │   ├── lib/              # API client
│   │   └── pages/            # Next.js pages
│   └── shared/                # Shared TypeScript types
├── data/                      # Runtime data (gitignored)
│   ├── uploads/              # Uploaded PDFs
│   ├── vectors/              # FAISS indices
│   └── processed/            # Processed documents
├── models/                    # Model cache (gitignored)
├── claude.md                  # Architecture documentation
├── SETUP_GUIDE.md            # Detailed setup guide
├── start_backend.sh          # Backend startup script
└── start_frontend.sh         # Frontend startup script
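
The services above implement a standard retrieve-then-generate flow: document chunks are embedded with bge-small, indexed in FAISS, and the closest matches are placed into the Phi-3 prompt. A minimal sketch of that flow (function names are illustrative rather than the project's actual API, and it assumes sentence-transformers is used for the embedder):

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("BAAI/bge-small-en-v1.5")

def build_index(chunks):
    # Embed document chunks and store them in a cosine-similarity FAISS index
    vectors = embedder.encode(chunks, normalize_embeddings=True)
    index = faiss.IndexFlatIP(vectors.shape[1])
    index.add(np.asarray(vectors, dtype=np.float32))
    return index

def retrieve(question, index, chunks, k=4):
    # Return the k chunks most similar to the question
    query = embedder.encode([question], normalize_embeddings=True)
    _, ids = index.search(np.asarray(query, dtype=np.float32), k)
    return [chunks[i] for i in ids[0]]

def build_prompt(question, index, chunks):
    # Ground the answer: the model is told to use only the retrieved context
    context = "\n\n".join(retrieve(question, index, chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

# In the real pipeline this prompt is passed to the quantized Phi-3 model shown earlier.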

API Endpoints

Document Management

  • POST /api/v1/upload - Upload PDF
  • GET /api/v1/documents - List documents
  • DELETE /api/v1/documents/{doc_id} - Delete document

AI Operations

  • POST /api/v1/summarize - Summarize document
  • POST /api/v1/ask - Ask question (RAG)
  • POST /api/v1/generate-quiz - Generate MCQs (Coming Soon)

System

  • GET /api/v1/health - Health check

Full API documentation available at: http://localhost:8000/docs
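
The snippet below shows illustrative calls against these endpoints from Python; the multipart field name ("file") and the JSON keys ("doc_id", "question") are assumptions, so check the interactive docs above for the real request/response schemas.

import requests

BASE = "http://localhost:8000/api/v1"

# Health check
print(requests.get(f"{BASE}/health").json())

# Upload a PDF (assumes a multipart field named "file")
with open("notes.pdf", "rb") as f:
    doc = requests.post(f"{BASE}/upload", files={"file": f}).json()

# Ask a question about the uploaded document (assumed JSON body)
answer = requests.post(
    f"{BASE}/ask",
    json={"doc_id": doc.get("doc_id"), "question": "What is the main topic of this document?"},
)
print(answer.json())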

Development Roadmap

Month 1: Foundation (Current)

  • PDF ingestion & chunking
  • Embeddings & FAISS vector store
  • Local LLM (Phi-3) integration
  • RAG pipeline for Q&A
  • FastAPI backend
  • React frontend
  • Document upload & management

Month 2: Intelligence

  • Chunk-level & full-doc summarization
  • Multiple summary styles
  • MCQ generation (Flan-T5-XL)
  • Open-ended question generation
  • Flashcard creation
  • Spaced repetition system

Month 3: Polish & Extensions

  • Multi-document support
  • Topic-based search
  • Session management
  • Export (PDF/Markdown/Anki)
  • Performance optimizations
  • Analytics dashboard

System Requirements

  • OS: Linux (recommended), macOS, or Windows with WSL
  • Python: 3.10+
  • Node.js: 18+
  • GPU: NVIDIA GPU with 6GB+ VRAM (RTX 3060+, RTX 4060+)
  • CUDA: 12.1+
  • RAM: 16GB+
  • Disk: 15GB+ free space

Note: 4-bit quantization is now enabled by default, so the full pipeline runs on GPUs with as little as 6GB of VRAM.

Performance

Model Sizes (with 4-bit quantization enabled)

  • Embedding model: ~150MB VRAM
  • Phi-3: ~2-3GB VRAM (reduced from 7GB!)
  • Flan-T5-XL: ~2-3GB VRAM (when loaded)
  • Total: ~5-6GB VRAM (fits comfortably on 8GB GPUs)
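
If you are unsure whether the quantized models will fit, a quick project-independent check of free VRAM before starting the backend:

import torch

if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info()  # bytes on the current device
    print(f"{torch.cuda.get_device_name(0)}: {free / 1e9:.1f} GB free of {total / 1e9:.1f} GB")
else:
    print("No CUDA GPU detected - set DEVICE = 'cpu' in backend/utils/config.py to run on CPU")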

Processing Times

  • PDF Upload & Processing: 10-30 seconds
  • First Query (model loading): 30-60 seconds
  • Subsequent Queries: 3-10 seconds
  • Summarization: 20-60 seconds (depending on length)

Troubleshooting

See SETUP_GUIDE.md for detailed troubleshooting steps.

Common Issues

CUDA Out of Memory

FIXED! The application now uses 4-bit quantization by default, reducing Phi-3's VRAM usage from ~7GB to ~2-3GB.

For a detailed memory optimization guide, see MEMORY_OPTIMIZATION_GUIDE.md

Quick fixes:

  1. Automatic: Just restart the backend - memory optimizations are now enabled
  2. Still having issues? Check which other processes are using the GPU with nvidia-smi
  3. Last resort: Use CPU mode (slower but works):
# Edit backend/utils/config.py
DEVICE = "cpu"  # Fallback to CPU

Models Not Loading

rm -rf models/  # Clear cache
# Models will re-download on next use

API Connection Error

Make sure the backend is running and reachable at http://localhost:8000 (try GET /api/v1/health), then reload the frontend at http://localhost:3000.

Contributing

This is an educational project. Feel free to:

  • Report bugs
  • Suggest features
  • Submit pull requests
  • Use this architecture for learning

Privacy & Security

  • ✅ 100% offline - no external API calls
  • ✅ All data stays on your machine
  • ✅ No telemetry or tracking
  • ✅ Documents never leave your device
  • ✅ No API keys required

License

MIT License - See LICENSE file for details

Version: 1.0.0 | Last Updated: 2025-11-06 | Status: Active Development | Maintainer: Educational Project

Made with ❤️ for students who want to own their learning tools
