This Flask application performs:
- Text sentiment analysis using the CardiffNLP Twitter RoBERTa model.
- Audio sentiment analysis by transcribing uploaded audio and analyzing the text sentiment.
- Tone analysis of customer feedback.
- Text-to-speech generation using the ElevenLabs API.
- Clone the repository and navigate to the project directory.
- Create and activate a Python virtual environment:
python3 -m venv venv
source venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
- Make sure ffmpeg is installed on your system:
ffmpeg -version
If it is not installed, install it with your package manager, for example on Debian/Ubuntu:
sudo apt install ffmpeg
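The same check can be done from Python before the app tries to convert any audio. This is a minimal sketch using only the standard library; the helper name `ffmpeg_available` is hypothetical and not part of the app.

```python
import shutil


def ffmpeg_available(binary="ffmpeg"):
    """Return True if the given executable can be found on PATH."""
    # shutil.which returns the full path to the executable, or None.
    return shutil.which(binary) is not None
```

Calling `ffmpeg_available()` at startup and failing early with a clear message is usually friendlier than letting the first audio upload fail.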
Run the Flask app:
python app.py
- POST /analyze/text: Sentiment analysis for raw text.
- POST /analyze/audio: Sentiment analysis for audio files.
- POST /generate/audio: Generate speech audio from text using the ElevenLabs API.
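A client call to the text endpoint might look roughly like the sketch below. Several details here are assumptions rather than facts about this app: the base URL (Flask's default development address), the JSON body with a `text` field, and the helper names `build_text_request` and `analyze_text`. Check app.py for the actual request schema.

```python
import json
import urllib.request

# Assumption: Flask's default development address; adjust to your setup.
BASE_URL = "http://127.0.0.1:5000"


def build_text_request(text, base_url=BASE_URL):
    """Build (but do not send) a POST request for /analyze/text.

    Assumption: the endpoint accepts a JSON body like {"text": "..."}.
    """
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/analyze/text",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_text(text, base_url=BASE_URL):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_text_request(text, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the app running, `analyze_text("The support team was very helpful")` would return whatever JSON the endpoint produces; the response fields are not documented here, so inspect the result before relying on specific keys.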
- For audio files, the app converts audio to WAV format before transcription.
- When using ElevenLabs text-to-speech output, you may need to convert the audio sample rate and channels for compatibility, e.g.:
ffmpeg -i elevenlabs_output.wav -ar 16000 -ac 1 -c:a pcm_s16le elevenlabs_output1.wav
This resamples the audio to 16 kHz, downmixes it to mono, and encodes it as signed 16-bit little-endian PCM, a format often required by ASR and other audio processing tools.
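If you prefer to run this conversion from Python, the ffmpeg command above can be wrapped with `subprocess`. This is a sketch, not the app's own conversion code; the function names `build_ffmpeg_command` and `convert_to_asr_wav` are hypothetical.

```python
import subprocess


def build_ffmpeg_command(src, dst, sample_rate=16000, channels=1):
    """Return the ffmpeg argument list for a 16-bit PCM WAV conversion."""
    return [
        "ffmpeg",
        "-i", str(src),            # input file
        "-ar", str(sample_rate),   # output sample rate in Hz
        "-ac", str(channels),      # number of output channels
        "-c:a", "pcm_s16le",       # signed 16-bit little-endian PCM
        str(dst),                  # output file
    ]


def convert_to_asr_wav(src, dst, **kwargs):
    """Run ffmpeg and raise CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(src, dst, **kwargs), check=True)
    return dst
```

For example, `convert_to_asr_wav("elevenlabs_output.wav", "elevenlabs_output1.wav")` runs the same command as shown above. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.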
CORS is enabled in the app for local Swagger UI and cross-origin requests.
Set your ElevenLabs API key in the app (ELEVENLABS_API_KEY).
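One common way to avoid hard-coding the key is to read it from an environment variable of the same name. The sketch below assumes that approach; the app itself may define ELEVENLABS_API_KEY differently, and the helper `get_elevenlabs_api_key` is hypothetical.

```python
import os


def get_elevenlabs_api_key(environ=None):
    """Return the ElevenLabs API key, or raise if it is missing.

    Assumption: the key is supplied through the ELEVENLABS_API_KEY
    environment variable rather than written into the source code.
    """
    env = os.environ if environ is None else environ
    key = env.get("ELEVENLABS_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ELEVENLABS_API_KEY is not set")
    return key
```

With this in place, you would export the variable in your shell before starting the app and call `get_elevenlabs_api_key()` where the key is needed, which keeps the secret out of version control.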