This project demonstrates how to create a real-time voice agent using the Pipecat framework in Python with FastAPI, integrated with Attendee's meeting bot API.
- Real-time voice-to-voice conversational AI using Pipecat
- Integration with Deepgram for speech-to-text and text-to-speech
- Integration with OpenAI for language processing
- Integration with Attendee API for meeting bot functionality
- WebSocket-based communication between browser and server
- Configurable agent personality and voice
- Python 3.10 or higher
- UV for dependency management
- Pipecat framework
- API keys for:
  - Deepgram (for STT/TTS)
  - OpenAI (for LLM)
  - Attendee (for meeting bot API)
- Ngrok or similar tunneling service for WebSocket connections
- Clone the repository.
- Install dependencies using UV:

      uv sync

- Copy .env.example to .env and fill in your API keys:

      DEEPGRAM_API_KEY=your_deepgram_api_key
      OPENAI_API_KEY=your_openai_api_key
      ATTENDEE_API_KEY=your_attendee_api_key
      ATTENDEE_API_HOST=https://app.attendee.dev
      NGROK_URL=wss://your-ngrok-url.ngrok-free.app
      PORT=8000

- Run the application:

      uv run app/main.py

  Alternatively, activate the virtual environment and run the script directly:

      source .venv/bin/activate  # On Windows: .venv\Scripts\activate
      python app/main.py

  If the configured port (8080 by default) is already in use, specify a different one:

      PORT=8081 uv run app/main.py
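The NGROK_URL value in the setup steps uses a wss:// scheme, while ngrok typically prints its forwarding address as https://. The following is a minimal sketch of converting one into the other with the standard library. The helper name `to_websocket_url` is hypothetical and not part of this project.

```python
from urllib.parse import urlsplit, urlunsplit

# Map HTTP(S) schemes to their WebSocket equivalents; WS(S) URLs pass through.
_WS_SCHEMES = {"http": "ws", "https": "wss", "ws": "ws", "wss": "wss"}


def to_websocket_url(url: str) -> str:
    """Return `url` with its scheme swapped for the matching WebSocket scheme.

    Hypothetical helper for illustration; the project may handle this differently.
    """
    parts = urlsplit(url.strip())
    scheme = _WS_SCHEMES.get(parts.scheme.lower())
    if scheme is None or not parts.netloc:
        raise ValueError(f"not an HTTP(S) or WS(S) URL: {url!r}")
    return urlunsplit((scheme, parts.netloc, parts.path, parts.query, parts.fragment))


print(to_websocket_url("https://your-ngrok-url.ngrok-free.app"))
```

The result can be pasted into .env as NGROK_URL. Input without a recognized scheme raises ValueError rather than being guessed at.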
- Start ngrok or your preferred tunneling service to expose port 8000
- Open http://localhost:8000 in your browser
- Configure the voice agent:
  - Meeting URL: The URL of the meeting the bot should join
  - WebSocket Tunnel URL: Your ngrok WebSocket URL
  - Agent Prompt: Customize the AI assistant's personality
  - Greeting Message: Set what the agent says when it joins
  - Voice Model: Choose from available Deepgram voice models
- Click "Launch Voice Agent" to start the bot
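As an illustration of the five settings above, here is a rough sketch of how they might be grouped, validated, and serialized before being sent to the server. The class name `AgentConfig`, its field names, and the JSON shape are all assumptions made for this example. The actual message format is defined in app/main.py and static/index.html.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical container for the form fields (all names are assumptions)."""

    meeting_url: str
    websocket_url: str
    prompt: str
    greeting: str
    voice_model: str

    def __post_init__(self) -> None:
        # Basic sanity checks so obviously wrong input fails before launch.
        if not self.meeting_url.startswith(("http://", "https://")):
            raise ValueError("meeting_url must be an http(s) URL")
        if not self.websocket_url.startswith(("ws://", "wss://")):
            raise ValueError("websocket_url must be a ws(s) URL")
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")


config = AgentConfig(
    meeting_url="https://meet.example.com/abc-defg-hij",
    websocket_url="wss://your-ngrok-url.ngrok-free.app",
    prompt="You are a concise, friendly meeting assistant.",
    greeting="Hi, I'm here to help.",
    voice_model="example-voice",  # placeholder, not a real Deepgram model name
)

# Serialize to JSON, e.g. for a text message over a WebSocket.
payload = json.dumps(asdict(config))
print(payload)
```

Validating in `__post_init__` means a mistyped tunnel URL (for example https:// instead of wss://) is rejected immediately instead of surfacing later as a failed connection.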
- app/main.py: Main FastAPI application with WebSocket endpoint
- static/index.html: Frontend interface
- pyproject.toml: Project dependencies and metadata
- .env.example: Environment variable template
- GET /: Serve the web interface
- WebSocket /ws: Handle real-time audio streaming and bot configuration
The application can be configured using environment variables:
- DEEPGRAM_API_KEY: Deepgram API key for speech processing
- OPENAI_API_KEY: OpenAI API key for language processing
- ATTENDEE_API_KEY: Attendee API key for meeting bot functionality
- ATTENDEE_API_HOST: Attendee API host URL (default: https://app.attendee.dev)
- NGROK_URL: Ngrok URL for WebSocket connections (default: ws://localhost:8080)
- PORT: Port to run the server on (default: 8080)
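The variables and defaults listed above could be read along the following lines. This is a sketch only: the `Settings` class and `load_settings` function are hypothetical names, treating the three API keys as required is an assumption, and the snippet reads the process environment only (it does not parse a .env file).

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumption: the three API keys have no usable default and must be set.
REQUIRED_KEYS = ("DEEPGRAM_API_KEY", "OPENAI_API_KEY", "ATTENDEE_API_KEY")


@dataclass(frozen=True)
class Settings:
    deepgram_api_key: str
    openai_api_key: str
    attendee_api_key: str
    attendee_api_host: str
    ngrok_url: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from `env`, applying the documented defaults."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("missing required environment variables: " + ", ".join(missing))
    return Settings(
        deepgram_api_key=env["DEEPGRAM_API_KEY"],
        openai_api_key=env["OPENAI_API_KEY"],
        attendee_api_key=env["ATTENDEE_API_KEY"],
        attendee_api_host=env.get("ATTENDEE_API_HOST", "https://app.attendee.dev"),
        ngrok_url=env.get("NGROK_URL", "ws://localhost:8080"),
        port=int(env.get("PORT", "8080")),
    )


# Usage with an explicit mapping instead of the real environment.
example = load_settings(
    {"DEEPGRAM_API_KEY": "dg", "OPENAI_API_KEY": "oa", "ATTENDEE_API_KEY": "at", "PORT": "8081"}
)
print(example.port, example.attendee_api_host)
```

Passing the mapping as a parameter keeps the function easy to test and makes the override behaviour explicit: any variable that is set wins over its default.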