David Nguyen's Personal AI Assistant - Lumina is a full-stack web application that allows users to ask questions about David Nguyen, as well as any other topics, and receive instant, personalized responses powered by state-of-the-art AI & RAG. Users can log in to save their conversation history or continue as guests. The app uses modern technologies and provides a sleek, responsive user interface with intuitive UX and lots of animations.
- Live App
- Features
- Architecture
- Detailed Architecture Documentation
- Setup & Installation
- Deployment
- Usage
- Streaming Responses
- User Interface
- API Endpoints
- Project Structure
- Agentic AI Pipeline
- Dockerization
- OpenAPI Specification
- CI / CD with GitHub Actions
- Testing
- Contributing
- License
Important
Currently, the app is deployed live on Vercel at: https://lumina-david.vercel.app/. Feel free to check it out!
For the backend (with Swagger docs), it is deployed live also on Vercel at: https://ai-assistant-chatbot-server.vercel.app/.
Alternatively, the backup app is deployed live on Netlify at: https://lumina-ai-chatbot.netlify.app/.
Tip
Go straight to https://lumina-david.vercel.app/chat if you want to chat with the AI right away!
- AI Chatbot: Ask questions about David Nguyen and general topics; receive responses from an AI.
- User Authentication: Sign up, log in, and log out using JWT authentication.
- Conversation History: Save, retrieve, rename, and search past conversations (only for authenticated users).
- Auto-Generated Titles: AI automatically generates concise, descriptive titles for new conversations based on the first message.
- Updated & Vast Knowledge Base: Uses RAG (Retrieval-Augmented Generation) and LangChain to ground AI responses in an up-to-date knowledge base.
- Dynamic Responses: AI-generated responses with markdown formatting for rich text.
- Interactive Chat: Real-time chat interface with smooth animations and transitions.
- Reset Password: Verify email and reset a user's password.
- Lightning-Fast Development: Built with Vite for instant HMR and optimized production builds.
- Responsive UI: A fully responsive, modern, and animated interface built with React and Material-UI (MUI).
- Landing Page: A dynamic landing page with animations, feature cards, and call-to-action buttons.
- Guest Mode: Users may interact with the AI assistant as a guest, though conversations will not be saved.
- Conversation Search: Search through conversation titles and messages to find relevant discussions.
- Collapsible Sidebar: A sidebar that displays conversation history, allowing users to switch between conversations easily.
- Reinforcement Learning from Human Feedback (RLHF): A feedback loop that continuously improves the AI's responses based on user interactions.
- Dark/Light Mode: Users can toggle between dark and light themes, with the preference stored in local storage.
- Enterprise-Grade Deployment: Blue/green and canary deployment strategies on AWS, provisioned with Terraform, for zero-downtime updates.
- Comprehensive Testing: Unit and integration tests for both frontend and backend using Jest and React Testing Library.
- CI/CD Pipeline: Automated testing and deployment using GitHub Actions.
The project follows a modern, full-stack architecture with clear separation of concerns across three main layers:
- Frontend Layer: A React application built with TypeScript and Material-UI (MUI) that provides:
- Modern, animated user interface with responsive design
- Client-side routing with React Router
- JWT-based authentication and authorization
- Real-time chat interface with markdown support
- Theme toggling (dark/light mode)
- Collapsible sidebar for conversation history
- Backend Layer: An Express.js server written in TypeScript that handles:
- RESTful API endpoints for authentication and data management
- JWT token generation and validation
- User authentication (signup, login, password reset)
- Conversation management (CRUD operations)
- Integration with AI services
- Request validation and error handling
- AI/ML Layer: A RAG (Retrieval-Augmented Generation) implementation (sketched in code after this list) that includes:
- Retrieval: Vector similarity search using Pinecone
- Augmentation: Context building with conversation history
- Generation: Response generation using Google Gemini AI
- Knowledge Storage: Document embeddings in Pinecone vector database
- LangChain: Orchestration of the entire RAG pipeline
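To make the retrieval → augmentation → generation flow concrete, here is a minimal TypeScript sketch of how the pieces could fit together. The model names (text-embedding-004, gemini-1.5-flash) and the helper name answerWithRag are illustrative assumptions; the actual logic lives in server/src/services/geminiService.ts and the related scripts.

```typescript
// Minimal RAG sketch. Model names and prompt layout are assumptions,
// not the project's exact configuration.
import { Pinecone } from "@pinecone-database/pinecone";
import { GoogleGenerativeAI } from "@google/generative-ai";

const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const genAI = new GoogleGenerativeAI(process.env.GOOGLE_AI_API_KEY!);

export async function answerWithRag(question: string, history: string[]): Promise<string> {
  // Retrieval: embed the question and fetch the closest documents from Pinecone.
  const embedder = genAI.getGenerativeModel({ model: "text-embedding-004" });
  const { embedding } = await embedder.embedContent(question);
  const index = pinecone.index(process.env.PINECONE_INDEX_NAME!);
  const results = await index.query({
    vector: embedding.values,
    topK: 3,
    includeMetadata: true,
  });
  const context = results.matches
    .map((m) => String(m.metadata?.text ?? ""))
    .join("\n---\n");

  // Augmentation: combine retrieved context with recent conversation history.
  const prompt = [
    process.env.AI_INSTRUCTIONS ?? "",
    `Context:\n${context}`,
    `History:\n${history.join("\n")}`,
    `Question: ${question}`,
  ].join("\n\n");

  // Generation: ask Gemini for the final answer.
  const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });
  const result = await model.generateContent(prompt);
  return result.response.text();
}
```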
For detailed architecture documentation, including component diagrams, data flows, and deployment strategies, see ARCHITECTURE.md.
graph TB
subgraph "Client Layer"
Browser[Web Browser]
React[React Application]
end
subgraph "API Gateway"
LB[Load Balancer / CDN]
end
subgraph "Application Layer"
API[Express.js API Server]
Auth[Authentication Service]
Chat[Chat Service]
Conv[Conversation Service]
end
subgraph "AI/ML Layer"
RAG[RAG Pipeline]
Gemini[Google Gemini AI]
Embed[Embedding Service]
end
subgraph "Data Layer"
MongoDB[(MongoDB)]
Pinecone[(Pinecone Vector DB)]
end
Browser --> React
React --> LB
LB --> API
API --> Auth
API --> Chat
API --> Conv
Chat --> RAG
RAG --> Embed
RAG --> Gemini
RAG --> Pinecone
Auth --> MongoDB
Conv --> MongoDB
Chat --> MongoDB
style React fill:#4285F4
style API fill:#339933
style MongoDB fill:#47A248
style Pinecone fill:#FF6F61
style Gemini fill:#4285F4
sequenceDiagram
participant User
participant Frontend
participant Backend
participant Pinecone
participant Gemini
participant MongoDB
User->>Frontend: Send chat message
Frontend->>Backend: POST /api/chat/auth
Backend->>MongoDB: Fetch conversation history
MongoDB-->>Backend: Previous messages
Note over Backend,Pinecone: Retrieval Phase
Backend->>Pinecone: Generate embedding & search
Pinecone-->>Backend: Top-3 relevant documents
Note over Backend,Gemini: Augmentation Phase
Backend->>Backend: Build augmented context
Backend->>Gemini: Send enriched prompt
Note over Gemini: Generation Phase
Gemini->>Gemini: Generate response
Gemini-->>Backend: AI response
Backend->>MongoDB: Save message & response
MongoDB-->>Backend: Saved
Backend-->>Frontend: Return AI response
Frontend-->>User: Display response
flowchart LR
subgraph "Frontend"
UI[User Interface]
State[State Management]
API_Client[API Client]
end
subgraph "Backend API"
Routes[Route Handlers]
Middleware[Auth Middleware]
Services[Business Logic]
end
subgraph "Data Sources"
MongoDB[(MongoDB)]
Pinecone[(Pinecone)]
Gemini[Gemini API]
end
UI --> State
State --> API_Client
API_Client -.HTTP/REST.-> Routes
Routes --> Middleware
Middleware --> Services
Services --> MongoDB
Services --> Pinecone
Services --> Gemini
MongoDB -.Data.-> Services
Pinecone -.Vectors.-> Services
Gemini -.AI Response.-> Services
Services -.JSON.-> Routes
Routes -.Response.-> API_Client
API_Client --> State
State --> UI
style UI fill:#4285F4
style Services fill:#339933
style MongoDB fill:#47A248
style Pinecone fill:#FF6F61
style Gemini fill:#4285F4
Note
These diagrams provide a high-level overview of the system architecture. For detailed component interactions, database schemas, deployment strategies, and security architecture, please refer to ARCHITECTURE.md.
For comprehensive architecture documentation including:
- Detailed component diagrams and interactions
- Database schema and data models
- Security architecture and authentication flows
- Deployment strategies (Docker, AWS, Terraform)
- Performance optimization and scalability
- Monitoring and observability
- Disaster recovery and backup strategies
Please see ARCHITECTURE.md
- Clone the repository:
  git clone https://github.com/hoangsonww/AI-Assistant-Chatbot.git
  cd AI-Assistant-Chatbot/server
- Install dependencies:
  npm install
- Environment Variables:
  Create a .env file in the server folder with the following (adjust values as needed):
  PORT=5000
  MONGODB_URI=mongodb://localhost:27017/ai-assistant
  JWT_SECRET=your_jwt_secret_here
  GOOGLE_AI_API_KEY=your_google_ai_api_key_here
  AI_INSTRUCTIONS=Your system instructions for the AI assistant
  PINECONE_API_KEY=your_pinecone_api_key_here
  PINECONE_INDEX_NAME=your_pinecone_index_name_here
- Run the server in development mode:
  npm run dev
  This uses nodemon with ts-node to watch for file changes.
- Navigate to the client folder:
  cd ../client
- Install dependencies:
  npm install
- Run the frontend development server:
  npm start
  The app will run on http://localhost:3000 (or any other port you've specified in the .env file's PORT key).
- Install necessary Node.js packages:
  npm install
- Store knowledge data in the Pinecone vector database:
  npm run store
  Or:
  ts-node server/src/scripts/storeKnowledge.ts
- Ensure you run this command before starting the backend server so the knowledge data is available in Pinecone. A rough sketch of what this ingestion step does is shown below.
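For reference, here is a rough sketch of what an ingestion script like storeKnowledge.ts might do. The document list, embedding model, and metadata fields are illustrative assumptions, not the repository's exact implementation.

```typescript
// Hypothetical knowledge-ingestion sketch: embed documents and upsert them
// into Pinecone so the chat service can retrieve them by vector similarity.
import { Pinecone } from "@pinecone-database/pinecone";
import { GoogleGenerativeAI } from "@google/generative-ai";

const docs = [
  { id: "bio-1", text: "David Nguyen is a full-stack developer ..." },
  // ...more knowledge-base entries (assumed shape)
];

async function storeKnowledge(): Promise<void> {
  const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
  const genAI = new GoogleGenerativeAI(process.env.GOOGLE_AI_API_KEY!);
  const embedder = genAI.getGenerativeModel({ model: "text-embedding-004" });
  const index = pinecone.index(process.env.PINECONE_INDEX_NAME!);

  for (const doc of docs) {
    // Embed each document and store its text as metadata for later retrieval.
    const { embedding } = await embedder.embedContent(doc.text);
    await index.upsert([
      { id: doc.id, values: embedding.values, metadata: { text: doc.text } },
    ]);
  }
}

storeKnowledge().catch(console.error);
```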
The application is currently deployed on Vercel with the following setup:
- Frontend: Deployed at https://lumina-david.vercel.app/
- Backend: Deployed at https://ai-assistant-chatbot-server.vercel.app/
- Database: MongoDB Atlas (cloud-hosted)
- Vector Database: Pinecone (cloud-hosted)
graph TB
subgraph "Client Devices"
Browser[Web Browser]
Mobile[Mobile Browser]
end
subgraph "CDN Layer"
Vercel[Vercel Edge Network]
Netlify[Netlify CDN - Backup]
end
subgraph "Frontend Deployment"
FrontendVercel[React App on Vercel]
FrontendNetlify[React App on Netlify]
StaticAssets[Static Assets]
end
subgraph "Backend Deployment"
BackendVercel[Express API on Vercel]
ServerlessFunctions[Serverless Functions]
end
subgraph "External Services"
MongoDB[(MongoDB Atlas)]
Pinecone[(Pinecone Vector DB)]
GeminiAPI[Google Gemini AI API]
end
subgraph "CI/CD Pipeline"
GitHub[GitHub Repository]
GitHubActions[GitHub Actions]
AutoDeploy[Auto Deploy on Push]
end
subgraph "Monitoring & Analytics"
VercelAnalytics[Vercel Analytics]
Logs[Application Logs]
end
Browser --> Vercel
Mobile --> Vercel
Vercel --> FrontendVercel
Netlify --> FrontendNetlify
FrontendVercel --> StaticAssets
FrontendVercel --> BackendVercel
FrontendNetlify --> BackendVercel
BackendVercel --> ServerlessFunctions
ServerlessFunctions --> MongoDB
ServerlessFunctions --> Pinecone
ServerlessFunctions --> GeminiAPI
GitHub --> GitHubActions
GitHubActions --> AutoDeploy
AutoDeploy --> Vercel
AutoDeploy --> Netlify
BackendVercel --> VercelAnalytics
BackendVercel --> Logs
FrontendVercel --> VercelAnalytics
style Browser fill:#4285F4
style Vercel fill:#000000
style FrontendVercel fill:#61DAFB
style BackendVercel fill:#339933
style MongoDB fill:#47A248
style Pinecone fill:#FF6F61
style GeminiAPI fill:#4285F4
style GitHub fill:#181717
Run the entire application stack locally using Docker:
# Build and start all services
docker-compose up --build
# Or run in detached mode
docker-compose up -d
# Stop all services
docker-compose down
This will start:
- Frontend on http://localhost:3000
- Backend on http://localhost:5000
- MongoDB on localhost:27017
For production-grade AWS deployment with high availability and scalability:
# Navigate to infrastructure directory
cd terraform/
# Initialize Terraform
terraform init
# Review deployment plan
terraform plan
# Deploy infrastructure
terraform apply
# Or use provided scripts
cd ../aws/scripts/
./deploy-production.sh
AWS Infrastructure includes:
- ECS/Fargate for container orchestration
- Application Load Balancer for traffic distribution
- DocumentDB (MongoDB-compatible) for database
- ElastiCache (Redis) for caching
- CloudFront CDN for static asset delivery
- CloudWatch for monitoring and logging
- Auto-scaling groups for high availability
- Multi-AZ deployment for fault tolerance
See aws/README.md and terraform/README.md for detailed deployment instructions.
- Landing Page:
  The landing page provides an overview of the app's features and two main actions: Create Account (for new users) and Continue as Guest.
- Authentication:
  Users can sign up, log in, and reset their password. Authenticated users can save and manage their conversation history.
- Chatting:
  The main chat area allows users to interact with the AI assistant. The sidebar displays saved conversations (for logged-in users) and allows renaming and searching.
- Theme:
  Toggle between dark and light mode via the navbar. The chosen theme is saved in local storage and persists across sessions (a minimal hook sketch follows below).
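As a rough illustration of that theme persistence, here is a minimal React hook sketch. The hook name and the "themeMode" storage key are assumptions; the app's theme.ts and navbar component may do this differently.

```typescript
// Hypothetical theme-persistence hook (names and storage key are assumptions).
import { useEffect, useState } from "react";

export function useThemeMode(): ["light" | "dark", () => void] {
  const [mode, setMode] = useState<"light" | "dark">(
    () => (localStorage.getItem("themeMode") as "light" | "dark") ?? "light"
  );

  // Persist the preference so it survives page reloads and new sessions.
  useEffect(() => {
    localStorage.setItem("themeMode", mode);
  }, [mode]);

  const toggle = () => setMode((m) => (m === "light" ? "dark" : "light"));
  return [mode, toggle];
}
```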
Lumina features real-time streaming responses that make conversations feel more natural and engaging. Instead of waiting for the complete response, you'll see the AI's thoughts appear word-by-word as they're generated.
The streaming implementation uses Server-Sent Events (SSE) to deliver AI responses in real-time:
- User sends a message → Frontend displays "Processing Message..."
- Backend processes → Shows "Thinking & Reasoning..."
- Connection established → Displays "Connecting..."
- Streaming begins → Text appears word-by-word with a blinking cursor
- Response complete β Message is saved to conversation history
sequenceDiagram
participant User
participant Frontend
participant Backend
participant Gemini AI
User->>Frontend: Send message
Frontend->>Frontend: Show "Processing..."
Frontend->>Backend: POST /api/chat/auth/stream
Backend->>Gemini AI: Request streaming response
loop For each chunk
Gemini AI-->>Backend: Stream text chunk
Backend-->>Frontend: SSE: chunk data
Frontend->>Frontend: Append to message bubble
Frontend->>User: Display growing text + cursor
end
Gemini AI-->>Backend: Stream complete
Backend->>Backend: Save to database
Backend-->>Frontend: SSE: done event
Frontend->>Frontend: Finalize message
- Live Text Rendering: See responses appear in real-time with markdown formatting
- Visual Feedback: Multiple loading states (Processing → Thinking → Connecting → Streaming)
- Blinking Cursor: Animated cursor indicates active streaming
- Automatic Retries: Up to 3 retry attempts with exponential backoff (1s, 2s, 4s)
- Error Handling: Graceful degradation with user-friendly error messages
- Works Everywhere: Available for both authenticated and guest users
Authenticated Streaming:
POST /api/chat/auth/stream
Content-Type: application/json
Authorization: Bearer <token>
{
"message": "Your question here",
"conversationId": "optional-conversation-id"
}
Guest Streaming:
POST /api/chat/guest/stream
Content-Type: application/json
{
"message": "Your question here",
"guestId": "optional-guest-id"
}
The SSE stream sends different event types:
- conversationId/guestId: Sent at the start with the conversation identifier
- chunk: Each piece of text as it's generated by the AI
- done: Signals that streaming is complete
- error: Indicates an error occurred during streaming
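As an illustration, a client can consume the stream with fetch and a ReadableStream reader along these lines. The exact SSE payload shape (JSON objects with type and data fields) is an assumption; the real client code lives in client/src/services/api.ts.

```typescript
// Illustrative client-side reader for the streaming endpoint.
export async function streamChat(
  message: string,
  token: string,
  onChunk: (text: string) => void
): Promise<void> {
  const res = await fetch("/api/chat/auth/stream", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify({ message }),
  });
  if (!res.ok || !res.body) throw new Error(`Stream failed: ${res.status}`);

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // SSE frames are separated by a blank line; each frame carries a "data:" payload.
    const frames = buffer.split("\n\n");
    buffer = frames.pop() ?? "";
    for (const frame of frames) {
      const payload = frame.replace(/^data:\s*/, "");
      const event = JSON.parse(payload); // assumed { type, data } shape
      if (event.type === "chunk") onChunk(event.data);
      if (event.type === "error") throw new Error(event.data);
      if (event.type === "done") return;
    }
  }
}
```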
If a connection fails during streaming:
- First retry: Wait 1 second, then retry
- Second retry: Wait 2 seconds, then retry
- Third retry: Wait 4 seconds, then retry
- All failed: Display error message to user
The retry logic uses exponential backoff to avoid overwhelming the server while providing a smooth user experience.
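A minimal sketch of that retry policy, assuming a streamChat helper like the one sketched above:

```typescript
// Sketch of the retry policy described above: up to 3 retries with 1s, 2s, 4s delays.
async function withRetries<T>(task: () => Promise<T>, maxRetries = 3): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (attempt >= maxRetries) throw err; // all retries exhausted
      const delayMs = 1000 * 2 ** attempt;  // 1s, 2s, 4s
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Usage (streamChat and appendChunk are from the earlier sketch):
// await withRetries(() => streamChat("Hello!", token, appendChunk));
```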
- POST /api/auth/signup: Create a new user.
- POST /api/auth/login: Authenticate a user and return a JWT.
- GET /api/auth/verify-email?email=[email protected]: Check if an email exists.
- POST /api/auth/reset-password: Reset a user's password.
- GET /api/auth/validate-token: Validate the current JWT token.
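For example, a login request against these endpoints might look like the following. The response field name token is an assumption based on typical JWT flows.

```typescript
// Illustrative login call; response shape is an assumption.
async function login(email: string, password: string): Promise<string> {
  const res = await fetch("/api/auth/login", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ email, password }),
  });
  if (!res.ok) throw new Error(`Login failed: ${res.status}`);

  const { token } = await res.json();
  localStorage.setItem("token", token); // later sent as `Authorization: Bearer <token>`
  return token;
}
```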
flowchart TB
Start([User Visits App]) --> CheckAuth{Has Valid<br/>Token?}
CheckAuth -->|Yes| Dashboard[Access Dashboard]
CheckAuth -->|No| Landing[Landing Page]
Landing --> Choice{User Choice}
Choice -->|Sign Up| SignupForm[Signup Form]
Choice -->|Login| LoginForm[Login Form]
Choice -->|Guest| GuestChat[Guest Chat Mode]
SignupForm --> ValidateSignup{Valid<br/>Credentials?}
ValidateSignup -->|No| SignupError[Show Error]
SignupError --> SignupForm
ValidateSignup -->|Yes| CreateUser[Create User in MongoDB]
CreateUser --> GenerateToken[Generate JWT Token]
LoginForm --> ValidateLogin{Valid<br/>Credentials?}
ValidateLogin -->|No| LoginError[Show Error]
LoginError --> LoginForm
ValidateLogin -->|Yes| VerifyPassword[Verify Password with bcrypt]
VerifyPassword -->|Invalid| LoginError
VerifyPassword -->|Valid| GenerateToken
GenerateToken --> StoreToken[Store Token in LocalStorage]
StoreToken --> Dashboard
Dashboard --> Protected[Protected Routes]
Protected --> ConvHistory[Conversation History]
Protected --> SavedChats[Saved Chats]
Protected --> Settings[User Settings]
GuestChat --> TempStorage[Temporary Storage]
TempStorage --> LimitedFeatures[Limited Features]
Dashboard --> Logout{Logout?}
Logout -->|Yes| ClearToken[Clear Token]
ClearToken --> Landing
style Start fill:#4285F4
style Dashboard fill:#34A853
style GuestChat fill:#FBBC04
style GenerateToken fill:#EA4335
style CreateUser fill:#34A853
- POST /api/conversations: Create a new conversation.
- GET /api/conversations: Get all conversations for a user.
- GET /api/conversations/:id: Retrieve a conversation by ID.
- PUT /api/conversations/:id: Rename a conversation.
- GET /api/conversations/search/:query: Search for conversations by title or message content.
- DELETE /api/conversations/:id: Delete a conversation.
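A couple of illustrative calls against these endpoints; the request and response shapes shown here are assumptions.

```typescript
// Hypothetical helpers for the conversation endpoints.
const authHeaders = (token: string) => ({
  "Content-Type": "application/json",
  Authorization: `Bearer ${token}`,
});

async function listConversations(token: string) {
  const res = await fetch("/api/conversations", { headers: authHeaders(token) });
  return res.json();
}

async function renameConversation(token: string, id: string, title: string) {
  const res = await fetch(`/api/conversations/${id}`, {
    method: "PUT",
    headers: authHeaders(token),
    body: JSON.stringify({ title }), // body field name is an assumption
  });
  return res.json();
}
```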
flowchart LR
subgraph User["User Actions"]
NewChat[Start New Chat]
LoadChat[Load Existing Chat]
SearchChat[Search Conversations]
RenameChat[Rename Conversation]
DeleteChat[Delete Conversation]
end
subgraph Frontend["React Frontend"]
ChatUI[Chat Interface]
Sidebar[Conversation Sidebar]
SearchBar[Search Bar]
end
subgraph API["Express API"]
ConvRoutes[api/conversations Route]
AuthMiddleware{JWT Auth}
end
subgraph Database["MongoDB"]
ConvCollection[(Conversations Collection)]
UserCollection[(Users Collection)]
end
subgraph Operations["CRUD Operations"]
Create[Create]
Read[Read]
Update[Update]
Delete[Delete]
end
NewChat --> ChatUI
LoadChat --> Sidebar
SearchChat --> SearchBar
RenameChat --> Sidebar
DeleteChat --> Sidebar
ChatUI --> ConvRoutes
Sidebar --> ConvRoutes
SearchBar --> ConvRoutes
ConvRoutes --> AuthMiddleware
AuthMiddleware -->|Valid Token| Operations
AuthMiddleware -->|Invalid Token| ErrorAuth[401 Unauthorized]
Create --> ConvCollection
Read --> ConvCollection
Update --> ConvCollection
Delete --> ConvCollection
ConvCollection -.User Reference.-> UserCollection
ConvCollection --> ConvRoutes
ConvRoutes --> Frontend
style ChatUI fill:#4285F4
style ConvCollection fill:#47A248
style AuthMiddleware fill:#EA4335
style Operations fill:#34A853
- POST /api/chat/auth: Process a chat query for authenticated users and return an AI-generated response.
- POST /api/chat/auth/stream: Stream AI responses in real-time for authenticated users using Server-Sent Events (SSE).
- POST /api/chat/guest: Process a chat query for guest users and return an AI-generated response.
- POST /api/chat/guest/stream: Stream AI responses in real-time for guest users using Server-Sent Events (SSE).
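Below is a hedged sketch of how a streaming route can emit SSE events with Express and Gemini's streaming API. The actual handlers in server/src/routes/chat.ts and guest.ts likely differ in structure, event names, and how they persist messages.

```typescript
// Hypothetical SSE route sketch (assumes express.json() is applied and the
// router is mounted under /api/chat); not the repo's exact implementation.
import { Router, Request, Response } from "express";
import { GoogleGenerativeAI } from "@google/generative-ai";

const router = Router();
const genAI = new GoogleGenerativeAI(process.env.GOOGLE_AI_API_KEY!);

router.post("/guest/stream", async (req: Request, res: Response) => {
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");

  try {
    const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });
    const result = await model.generateContentStream(req.body.message);

    // Forward each generated chunk to the client as an SSE frame.
    for await (const chunk of result.stream) {
      res.write(`data: ${JSON.stringify({ type: "chunk", data: chunk.text() })}\n\n`);
    }
    res.write(`data: ${JSON.stringify({ type: "done" })}\n\n`);
  } catch {
    res.write(`data: ${JSON.stringify({ type: "error", data: "Generation failed" })}\n\n`);
  } finally {
    res.end();
  }
});

export default router;
```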
AI-Assistant-Chatbot/
├── docker-compose.yml
├── openapi.yaml
├── README.md
├── ARCHITECTURE.md
├── LICENSE
├── Jenkinsfile
├── package.json
├── tsconfig.json
├── .env
├── shell/                     # Shell scripts for app setups
├── terraform/                 # Infrastructure as Code (Terraform)
├── aws/                       # AWS deployment configurations
├── img/                       # Images and screenshots
├── agentic_ai/                # Agentic AI pipeline in Python
├── client/                    # Frontend React application
│   ├── package.json
│   ├── tsconfig.json
│   ├── docker-compose.yml
│   ├── Dockerfile
│   └── src/
│       ├── App.tsx
│       ├── index.tsx
│       ├── theme.ts
│       ├── globals.css
│       ├── index.css
│       ├── dev/
│       │   ├── palette.tsx
│       │   ├── previews.tsx
│       │   ├── index.ts
│       │   └── useInitial.ts
│       ├── services/
│       │   └── api.ts         # API client with streaming support
│       ├── types/
│       │   ├── conversation.d.ts
│       │   └── user.d.ts
│       ├── components/
│       │   ├── Navbar.tsx
│       │   ├── Sidebar.tsx
│       │   ├── ChatArea.tsx   # Main chat interface with streaming
│       │   └── CopyIcon.tsx
│       ├── styles/
│       │   └── (various style files)
│       └── pages/
│           ├── LandingPage.tsx
│           ├── Home.tsx
│           ├── Login.tsx
│           ├── Signup.tsx
│           ├── NotFoundPage.tsx
│           ├── ForgotPassword.tsx
│           └── Terms.tsx
└── server/                    # Backend Express application
    ├── package.json
    ├── tsconfig.json
    ├── Dockerfile
    ├── docker-compose.yml
    └── src/
        ├── server.ts
        ├── models/
        │   ├── Conversation.ts
        │   ├── GuestConversation.ts
        │   └── User.ts
        ├── routes/
        │   ├── auth.ts
        │   ├── conversations.ts
        │   ├── chat.ts        # Authenticated chat with streaming
        │   └── guest.ts       # Guest chat with streaming
        ├── services/
        │   ├── geminiService.ts  # AI service with streaming support
        │   └── pineconeClient.ts
        ├── scripts/
        │   ├── storeKnowledge.ts
        │   ├── queryKnowledge.ts
        │   └── langchainPinecone.ts
        ├── utils/
        │   └── (utility functions)
        ├── middleware/
        │   └── auth.ts
        └── public/
            └── favicon.ico
There is also an Agentic AI pipeline implemented in Python using LangChain. This pipeline demonstrates how to create an autonomous agent that can perform tasks using tools and interact with the AI model.
The pipeline is located in the agentic_ai/ directory. It was developed to complement the main RAG-based AI assistant by showcasing advanced AI capabilities, as well as enhancing the RAG responses with agentic reasoning when needed (e.g. for complex queries).
Tip
For more information on the Agentic AI pipeline, please refer to the agentic_ai/README.md file.
To run the application using Docker, simply run docker-compose up in the root directory of the project. This will start both the backend and frontend services as defined in the docker-compose.yml file.
Why Dockerize?
- Consistency: Ensures the application runs the same way in different environments.
- Isolation: Keeps dependencies and configurations contained.
- Scalability: Makes it easier to scale services independently.
- Simplified Deployment: Streamlines the deployment process.
- Easier Collaboration: Provides a consistent environment for all developers.
There is an OpenAPI specification file (openapi.yaml) in the root directory that describes the API endpoints, request/response formats, and authentication methods. This can be used to generate client SDKs or documentation.
To view the API documentation, you can use tools like Swagger UI or Postman to import the openapi.yaml file. Or just go to the /docs endpoint of the deployed backend.
This project includes a GitHub Actions workflow for continuous integration and deployment. The workflow is defined in the .github/workflows/workflow.yml file and includes steps to:
- Install dependencies for both the frontend and backend.
- Run tests for both the frontend and backend.
- Build the frontend and backend applications.
- Deploy the applications to Vercel and Netlify.
- Notify the team via email on successful deployments.
- Notify the team via email on failed builds or tests.
- Generate and upload artifacts for the frontend and backend builds.
- Run linting checks for both the frontend and backend code.
- and more...
This workflow ensures that every commit and pull request is tested and deployed automatically, providing a robust CI/CD pipeline.
Please ensure you have the necessary secrets configured in your GitHub repository for deployment (e.g., Vercel and Netlify tokens). Also, feel free to customize the workflow under .github/workflows/workflow.yml to suit your needs.
This project includes unit and integration tests with Jest for both the frontend and backend. To run the tests:
- Frontend:
  Navigate to the client directory and run:
  npm test
- Backend:
  Navigate to the server directory and run:
  npm test
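For a flavor of what such a test might look like, here is an illustrative Jest + supertest case. The import path and suite organization are assumptions about this repo, including the assumption that server.ts exports the Express app.

```typescript
// Hypothetical backend integration test sketch.
import request from "supertest";
import app from "../src/server"; // assumes the Express app is exported from server.ts

describe("POST /api/auth/login", () => {
  it("rejects requests with missing credentials", async () => {
    const res = await request(app).post("/api/auth/login").send({});
    expect(res.status).toBeGreaterThanOrEqual(400);
  });
});
```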
- Fork the repository.
- Create your feature branch:
  git checkout -b feature/your-feature-name
- Commit your changes:
  git commit -m 'Add some feature'
- Push to the branch:
  git push origin feature/your-feature-name
- Open a Pull Request.
This project is licensed under the MIT License.
Thank you for checking out the AI Assistant Project! If you have any questions, suggestions, or feedback, feel free to reach out. Happy coding!













