Flex on cloud paid-API approaches with a flexible, modular, easily upgradeable voice chat framework: swap the VAD, STT, LLM, and TTS components of a fully local pipeline for real-time voice assistants.
🔥 Fully local real-time voice chat via a modular WebRTC > VAD > STT > LLM > TTS pipeline in a Gradio interface:
def process_pipeline(audio, conversation):
    # 1. Voice Activity Detection: audio array -> check_vad() -> bool (speech detected)
    if not check_vad(audio):
        return None
    # 2. Speech-to-Text: load_stt_models() once, then transcribe_with_whisper(audio_array) -> transcribed text
    text = transcribe_with_whisper(audio)
    # 3. LLM: text prompt -> send_to_llm() -> text response
    response = send_to_llm(text, conversation)
    # 4. Text-to-Speech: initialize_tts() once, then text_to_speech(text) -> (audio_array, sample_rate)
    return text_to_speech(response)
flowchart TD
A(["`🎤 Start: Receive Audio Input (WebRTC)`"]) --> B
B["`🔍 Voice Activity Detection (VAD): Silero VAD (check_vad)`"] --> C{"`❓ Speech Detected?`"}
C -- Yes --> D["`🗣️ Speech-to-Text (STT): Whisper (transcribe_with_whisper)`"]
D --> E["`🤖 Language Model (LLM): OpenAI API (send_to_llm)`"]
E --> F["`🔊 Text-to-Speech (TTS): Kokoro (text_to_speech)`"]
F --> G(["`🎧 End: Output Audio Response (Gradio Interface)`"])
C -- No --> H(["`🚫 End: No Action`"])
class A,G,H output
class C decision
class B,D,E,F process
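Each stage can also be exercised on its own. Below is a minimal, hedged sketch of the VAD and STT steps using Silero VAD via torch.hub and the Hugging Face transformers Whisper pipeline (insanely-fast-whisper wraps the same pipeline); the repo's own check_vad / transcribe_with_whisper helpers may differ in details such as model size and chunking.

```python
# Hedged sketch: run VAD + STT outside the app (assumes a 16 kHz mono WAV file).
import torch
from transformers import pipeline

# Silero VAD via torch.hub (downloads the model on first use).
vad_model, vad_utils = torch.hub.load("snakers4/silero-vad", "silero_vad")
get_speech_timestamps, _, read_audio, *_ = vad_utils

# Whisper via the transformers ASR pipeline (stand-in for insanely-fast-whisper).
stt = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-base",
    device=0 if torch.cuda.is_available() else -1,
)

wav = read_audio("sample.wav", sampling_rate=16000)  # torch.Tensor, 16 kHz mono
if get_speech_timestamps(wav, vad_model, sampling_rate=16000):  # any speech found?
    print(stt("sample.wav")["text"])  # transcribed text
```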
- 🎤 Voice Activity Detection: Silero VAD
- 🗣️ Speech-to-Text: insanely-fast-whisper
- 🤖 Language Model: OpenAI-compatible API, local server (see the sketch after this list)
- 🔊 Text-to-Speech: Kokoro
- 🤝 Contributing: Easy to modify - share your implementations!
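For the last two stages, a hedged sketch: the LLM call goes through the openai client pointed at a local OpenAI-compatible endpoint (the base_url below assumes oobabooga's default port 5000 and may need adjusting), and TTS uses the kokoro package's KPipeline as an assumed backend for the repo's text_to_speech helper.

```python
# Hedged sketch: LLM via a local OpenAI-compatible server, then Kokoro TTS.
import soundfile as sf
from openai import OpenAI
from kokoro import KPipeline

# Any server speaking the OpenAI API format works; URL and model name are placeholders.
client = OpenAI(base_url="http://127.0.0.1:5000/v1", api_key="not-needed")
reply = client.chat.completions.create(
    model="local-model",  # most local servers ignore or remap this name
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
).choices[0].message.content

# Kokoro: KPipeline yields (graphemes, phonemes, audio) chunks at 24 kHz.
tts = KPipeline(lang_code="a")  # 'a' = American English
for i, (_, _, audio) in enumerate(tts(reply, voice="af_heart")):
    sf.write(f"reply_{i}.wav", audio, 24000)
```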
# 📜 Requirements: Python >= 3.8, CUDA 11.8+ (CUDA Toolkit, for Whisper), and an OpenAI-compatible API server (default: oobabooga text-generation-webui)
# 1. 🔧 Clone & Install dependencies:
git clone https://github.com/Katehuuh/FlexVoice.git && cd FlexVoice
python -m venv venv && venv\Scripts\activate
pip install -r requirements.txt
# 2. Run! Opens http://localhost:7861
python app.py
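Optionally, a quick sanity check (a hedged suggestion, not part of the repo's scripts) that the CUDA build of PyTorch is active inside the venv before chatting:

```python
# Should print a torch version, a CUDA version string (e.g. 11.8+), and True.
import torch
print(torch.__version__, torch.version.cuda, torch.cuda.is_available())
```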