NetworkMonitorKokoro is a Flask-based service that provides advanced text-to-speech (T2S) and speech-to-text (S2T) functionality using state-of-the-art machine learning models:
- Text-to-Speech (T2S): Converts text input into high-quality synthesized speech using the Kokoro model.
- Speech-to-Text (S2T): Transcribes audio files into text using OpenAI's Whisper model.
This repository leverages ONNX for efficient inference and Hugging Face's model hub for seamless model downloads.
You can see the script in action with the Quantum Network Monitor Assistant at https://freenetworkmonitor.click.
- **T2S (Text-to-Speech)**
  - High-quality voice synthesis using the Kokoro ONNX model.
  - Configurable voice styles via preloaded voicepacks.
- **S2T (Speech-to-Text)**
  - Accurate audio transcription with OpenAI Whisper.
  - Handles a wide range of audio inputs.
- **Automatic Model Management**
  - Models are automatically downloaded from the Hugging Face Hub if not present locally.
- **Flask API Endpoints**
  - `/generate_audio`: Convert text into speech.
  - `/transcribe_audio`: Transcribe audio into text.
Ensure you have the following installed:
- Python 3.8+ (check with `python3 --version` or `python --version`)
- pip (check with `pip --version`)
- A CUDA-enabled GPU (optional, for faster inference)
- System dependencies:
  - Debian/Ubuntu: `sudo apt-get install libsndfile1 espeak-ng`
  - Windows: `choco install libsndfile espeak-ng -y`
  - macOS: `brew install libsndfile espeak-ng`
Clone the repository:

```bash
git clone https://github.com/yourusername/NetworkMonitorKokoro.git
cd NetworkMonitorKokoro
```
Create and activate a virtual environment:

- On Linux/macOS:

  ```bash
  python3 -m venv venv
  source venv/bin/activate
  ```

- On Windows:

  ```bash
  python3 -m venv venv
  venv\Scripts\activate
  ```

Once activated, you should see `(venv)` at the start of your command prompt, indicating the virtual environment is active.
Install the required dependencies by running the installation script (cross-platform):

```bash
python3 install_dependencies.py
```

This script detects your operating system and installs the dependencies accordingly on Linux, Windows, and macOS.
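For reference, the script's OS-detection behavior can be sketched roughly as follows. This is a hedged illustration only: the actual `install_dependencies.py` in the repository is authoritative, and the function name `system_packages_command` is invented here.

```python
"""Rough sketch of an OS-detecting dependency installer (illustrative only)."""
import platform
import subprocess
import sys


def system_packages_command():
    """Return the native package-manager command for the current OS."""
    os_name = platform.system()
    if os_name == "Linux":      # assumes a Debian/Ubuntu host with apt-get
        return ["sudo", "apt-get", "install", "-y", "libsndfile1", "espeak-ng"]
    if os_name == "Darwin":     # macOS with Homebrew
        return ["brew", "install", "libsndfile", "espeak-ng"]
    if os_name == "Windows":    # Windows with Chocolatey
        return ["choco", "install", "libsndfile", "espeak-ng", "-y"]
    raise RuntimeError(f"Unsupported OS: {os_name}")


def main():
    # Install system libraries first, then the Python requirements.
    subprocess.check_call(system_packages_command())
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"]
    )
```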
Set up the models:

- The Kokoro T2S model and OpenAI Whisper S2T model will be downloaded automatically at runtime.
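The download-on-first-use pattern looks roughly like this. This is a minimal sketch, not the repository's actual code (which likely uses `huggingface_hub` for the download itself); `ensure_model` is an invented name for illustration.

```python
"""Minimal sketch of the download-if-missing pattern (illustrative only)."""
import os


def ensure_model(local_path, download):
    """Return local_path, calling download(local_path) only if the file is absent."""
    if not os.path.exists(local_path):
        # Create the parent directory, then fetch the model once.
        os.makedirs(os.path.dirname(local_path) or ".", exist_ok=True)
        download(local_path)  # e.g. a huggingface_hub download in the real app
    return local_path
```

On subsequent runs the file already exists locally, so no network access is needed.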
Start the Flask server:

```bash
python3 app.py
```

Deactivate the virtual environment when you are done (optional):

```bash
deactivate
```
To run NetworkMonitorKokoro as a systemd service on Linux, follow these steps:
Clone the repository:

```bash
git clone https://github.com/yourusername/NetworkMonitorKokoro.git
cd NetworkMonitorKokoro
```

Create and activate a virtual environment:

```bash
python3 -m venv venv
source venv/bin/activate
```

Install dependencies:

```bash
python3 install_dependencies.py
```

Create a systemd service file:

```bash
sudo nano /etc/systemd/system/networkmonitor-kokoro.service
```
Add the following content:
```ini
[Unit]
Description=NetworkMonitorKokoro Service
After=network.target

[Service]
User=yourusername
WorkingDirectory=/path/to/NetworkMonitorKokoro
ExecStart=/path/to/NetworkMonitorKokoro/venv/bin/python3 /path/to/NetworkMonitorKokoro/app.py
Restart=always
Environment=PYTHONUNBUFFERED=1

[Install]
WantedBy=multi-user.target
```

Replace `/path/to/NetworkMonitorKokoro` with the full path to the directory where the repository was cloned and the virtual environment was created, and replace `yourusername` with your Linux username.
Set proper permissions:

```bash
sudo chmod 644 /etc/systemd/system/networkmonitor-kokoro.service
```

Reload systemd:

```bash
sudo systemctl daemon-reload
```

Start the service:

```bash
sudo systemctl start networkmonitor-kokoro
```

Enable the service to start on boot:

```bash
sudo systemctl enable networkmonitor-kokoro
```

Check the service status:

```bash
sudo systemctl status networkmonitor-kokoro
```
- Endpoint: `/generate_audio`
- Method: `POST`
- Request body:

  ```json
  {
    "text": "Your text here",
    "output_dir": "/absolute/path/to/save/file/to/"
  }
  ```

- Response:

  ```json
  {
    "status": "success",
    "output_path": "/absolute/path/to/save/file/to/<hash>.wav"
  }
  ```
- Endpoint: `/transcribe_audio`
- Method: `POST`
- Request body: a form-data request with an audio file.
- Response:

  ```json
  {
    "status": "success",
    "transcription": "Your transcription here"
  }
  ```
Generate speech:

```bash
curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello, world!", "output_dir": "/tmp"}' \
  http://127.0.0.1:5000/generate_audio
```
Transcribe an audio file:

```bash
curl -X POST \
  -F "file=@sample_audio.wav" \
  http://127.0.0.1:5000/transcribe_audio
```
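The same calls can be made from Python using only the standard library. This is a sketch that assumes the server is running on Flask's default host and port; the hand-rolled multipart helper `encode_multipart` is for illustration (a real client might use the `requests` library instead).

```python
"""Call the NetworkMonitorKokoro endpoints from Python (stdlib only)."""
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # assumed default Flask host/port


def generate_audio(text, output_dir):
    """POST JSON to /generate_audio and return the parsed response."""
    payload = json.dumps({"text": text, "output_dir": output_dir}).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/generate_audio",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def encode_multipart(field, filename, data):
    """Build a multipart/form-data body and its Content-Type header value."""
    boundary = "----kokoro-upload"
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8") + data + f"\r\n--{boundary}--\r\n".encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def transcribe_audio(path):
    """POST an audio file to /transcribe_audio and return the parsed response."""
    with open(path, "rb") as f:
        body, content_type = encode_multipart("file", path, f.read())
    req = urllib.request.Request(
        f"{BASE_URL}/transcribe_audio",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For example, `generate_audio("Hello, world!", "/tmp")` mirrors the first curl command above.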
Contributions are welcome! Please follow these steps:
- Fork the repository.
- Create a new branch for your feature/bugfix.
- Submit a pull request with a detailed description of your changes.
This project is licensed under the MIT License. See the LICENSE file for details.
- Hugging Face for providing pre-trained models.
- OpenAI for the Whisper model.
- ONNX for efficient inference.
For questions or support, please open an issue or contact [email protected].