AI-Powered Software Development Life Cycle Automation Platform
Transform your software development process with AI-powered automation. From meeting transcriptions to complete technical documentation, streamline your entire SDLC workflow.
The SDLC Agent Workflow is a production-ready AI platform that automates key aspects of software development, starting with audio transcription and document generation, with a comprehensive roadmap to become a complete SDLC automation solution.
- 🎤 Audio Transcription: High-quality transcription using OpenAI Whisper models
- 🤖 AI Meeting Analysis: Generate key meeting points and summaries with OpenAI GPT
- 📋 PRD Generation: Transform discussions into industry-standard Product Requirements Documents
- 🔧 Android TRD Generation: Convert PRDs into comprehensive Android Technical Requirements Documents
- 🎨 Figma MCP Integration: Model Context Protocol server for comprehensive Figma design data extraction
- 📱 Android MCP Integration: AI-powered Android device automation with LLM integration for intelligent mobile testing and interaction
- 📁 Multi-Format Support: MP3, WAV, M4A, FLAC, AAC, OGG, WMA, MP4, MOV, AVI
- ⚙️ Configurable Settings: Extensive customization through environment variables
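As an illustration of the multi-format support, an upload gate could be sketched like this (the constant and function names here are hypothetical, not the platform's actual API — the real app keeps its constants in `config/constants.py`):

```python
# Hypothetical upload check; names are illustrative, not the project's API.
SUPPORTED_EXTENSIONS = {
    "mp3", "wav", "m4a", "flac", "aac", "ogg", "wma",  # audio
    "mp4", "mov", "avi",                               # video
}

def is_supported(filename: str) -> bool:
    """Return True if the file's extension is one the platform can transcribe."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return ext in SUPPORTED_EXTENSIONS
```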
Complete SDLC automation platform covering:
- Requirements & Planning → Design & Architecture → Development Support → Testing & Quality → Deployment & Operations → Documentation & Knowledge
- Python 3.10 or higher
- OpenAI API key
- `uv` package manager (recommended) or `pip`
1. Clone the repository

   ```bash
   git clone [email protected]:tomdwipo/agent.git
   cd agent
   ```

2. Install dependencies

   ```bash
   # Using uv (recommended)
   uv sync

   # Or using pip
   pip install -r requirements.txt
   ```

3. Configure environment

   ```bash
   # Create .env file
   cp .env.example .env

   # Add your OpenAI API key
   echo "OPENAI_API_KEY=your_api_key_here" >> .env
   ```

4. Launch the application

   ```bash
   # Using uv
   uv run python transcribe_gradio.py

   # Or using python directly
   python transcribe_gradio.py
   ```

5. Access the interface: open your browser to `http://localhost:7860`
| Feature | Status | Description | Documentation |
|---|---|---|---|
| Audio Transcription | ✅ Complete | OpenAI Whisper integration with multi-format support | API Docs |
| AI Meeting Analysis | ✅ Complete | Key points extraction and meeting summaries | API Docs |
| PRD Generation v1.0 | ✅ Complete | 8-section industry-standard Product Requirements Documents | Feature Docs |
| Android TRD Generation v1.0 | ✅ Complete | 7-section Android Technical Requirements Documents | Feature Docs |
| Figma MCP Integration v1.0 | ✅ Complete | Model Context Protocol server for Figma design data extraction | Feature Docs |
| Android MCP Integration v1.0 | ✅ Complete | AI-powered Android device automation with LLM integration for intelligent mobile testing | Setup Guide |
| Phase | Timeline | Key Components | Expected Impact |
|---|---|---|---|
| Phase 1: Requirements & Planning | Q3 2025 | Enhanced PRD + Project Planning Agent | 50% planning time reduction |
| Phase 2: Design & Architecture | Q4 2025 | System Design + UI/UX Design Agents | 60% faster architecture documentation |
| Phase 3: Development Support | Q1 2026 | Code Generation + Development Standards | 70% boilerplate code reduction |
| Phase 4: Testing & Quality | Q2 2026 | Test Planning + Quality Assurance Agents | 80% test coverage automation |
| Phase 5: Deployment & Operations | Q3 2026 | DevOps + Infrastructure Management | 90% deployment automation |
| Phase 6: Documentation & Knowledge | Q4 2026 | Documentation + Knowledge Management | 75% documentation automation |
```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│    UI Layer     │    │  Service Layer   │    │  Configuration  │
│                 │    │                  │    │                 │
│ • Gradio UI     │◄──►│ • OpenAI Service │◄──►│ • Settings      │
│ • Components    │    │ • Whisper Service│    │ • Constants     │
│ • Interface     │    │ • File Service   │    │ • Environment   │
└─────────────────┘    └──────────────────┘    └─────────────────┘
```
- Backend: Python 3.10+, OpenAI API, Whisper
- Frontend: Gradio (Web UI)
- Package Management: `uv` with `pyproject.toml`
- Configuration: Environment variables with `.env` support
- Testing: Comprehensive test suite with pytest
```
agent/
├── main.py                 # Main application entry point
├── transcribe_gradio.py    # Gradio interface launcher
├── pyproject.toml          # Project configuration
├── requirements.txt        # Dependencies
├── config/                 # Configuration management
│   ├── settings.py         # Application settings
│   ├── constants.py        # System constants
│   └── __init__.py
├── services/               # Core business logic
│   ├── openai_service.py   # OpenAI API integration
│   ├── whisper_service.py  # Audio transcription
│   ├── file_service.py     # File operations
│   └── __init__.py
├── ui/                     # User interface components
│   ├── gradio_interface.py # Main UI interface
│   ├── components.py       # UI components
│   └── __init__.py
├── tests/                  # Test suite
├── demos/                  # Demo applications
└── docs/                   # Comprehensive documentation
```
- Quick Start Guide - Get up and running quickly
- Features Overview - Complete feature documentation
- User Manual - Comprehensive user guide
- Architecture Overview - Technical system design
- API Reference - Complete API documentation
- Contributing Guide - Development workflow
- Testing Guide - Testing procedures
- Complete Project Proposal - Full business case and roadmap
- Architecture Evolution - Technical progress history
- Feature Status Tracking - Development progress
```python
from services.whisper_service import WhisperService

# Initialize service
whisper = WhisperService()

# Transcribe audio file
result = whisper.transcribe("meeting.mp3")
print(result["text"])
```
```python
from services.openai_service import OpenAIService

# Initialize service
openai_service = OpenAIService()

# Generate PRD from meeting transcript
prd = openai_service.generate_prd(transcript_text)
print(prd)
```
- Upload Audio → Transcribe meeting recording
- Generate Analysis → Extract key points and action items
- Create PRD → Transform discussion into structured requirements
- Generate TRD → Convert PRD into technical specifications
- Download Documents → Export all generated documents
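The steps above can be sketched as a single pipeline. This is a simplified orchestration with the services passed in; `WhisperService` and `OpenAIService` are the project's classes, but the exact method names beyond `transcribe` and `generate_prd` are assumptions, so treat this as a sketch rather than the actual implementation:

```python
def run_pipeline(whisper, openai_service, audio_path):
    """Audio -> transcript -> analysis -> PRD -> TRD, returning all artifacts.

    `generate_key_points` and `generate_android_trd` are assumed method
    names for the analysis and TRD steps; the real service API may differ.
    """
    transcript = whisper.transcribe(audio_path)["text"]
    analysis = openai_service.generate_key_points(transcript)
    prd = openai_service.generate_prd(transcript)
    trd = openai_service.generate_android_trd(prd)
    return {"transcript": transcript, "analysis": analysis, "prd": prd, "trd": trd}
```

In the real app, the `whisper` and `openai_service` arguments would be `WhisperService()` and `OpenAIService()` instances, and the returned documents would feed the download step.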
```bash
# OpenAI Configuration
OPENAI_API_KEY=your_api_key_here
OPENAI_MODEL=gpt-4
OPENAI_MAX_TOKENS=4000

# Whisper Configuration
WHISPER_MODEL=base
WHISPER_LANGUAGE=auto

# Application Settings
DEBUG=false
LOG_LEVEL=INFO
```
See Configuration API Documentation for complete configuration options.
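A settings loader in the spirit of `config/settings.py` might read these variables as follows. This is a minimal sketch assuming plain `os.getenv` lookups with the defaults shown above; the actual module may be structured differently:

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    """Illustrative settings container; field names are assumptions."""
    openai_api_key: str
    openai_model: str
    openai_max_tokens: int
    whisper_model: str
    whisper_language: str
    debug: bool
    log_level: str

def load_settings() -> Settings:
    """Build Settings from the environment, using the README's defaults."""
    return Settings(
        openai_api_key=os.getenv("OPENAI_API_KEY", ""),
        openai_model=os.getenv("OPENAI_MODEL", "gpt-4"),
        openai_max_tokens=int(os.getenv("OPENAI_MAX_TOKENS", "4000")),
        whisper_model=os.getenv("WHISPER_MODEL", "base"),
        whisper_language=os.getenv("WHISPER_LANGUAGE", "auto"),
        debug=os.getenv("DEBUG", "false").lower() == "true",
        log_level=os.getenv("LOG_LEVEL", "INFO"),
    )
```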
```bash
# Clone repository
git clone [email protected]:tomdwipo/agent.git
cd agent

# Install development dependencies
uv sync --dev

# Run tests
uv run pytest

# Run with development settings
uv run python transcribe_gradio.py
```
```bash
# Run all tests
uv run pytest

# Run specific test file
uv run pytest tests/test_prd_services.py

# Run with coverage
uv run pytest --cov=services
```
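As a sketch of the plain-assert style pytest discovers, a new unit test might look like this; the helper and the test are illustrative stand-ins, not taken from the actual suite in `tests/`:

```python
# Illustrative test file shape; real tests exercise the actual services.

def split_key_points(summary: str) -> list[str]:
    """Toy stand-in for a meeting-analysis helper under test."""
    return [line.strip("- ").strip() for line in summary.splitlines() if line.strip()]

def test_split_key_points():
    summary = "- Decide launch date\n- Assign PRD owner\n"
    assert split_key_points(summary) == ["Decide launch date", "Assign PRD owner"]
```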
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes
- Add tests for new functionality
- Run the test suite (`uv run pytest`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
See Contributing Guidelines for detailed information.
- Core Foundation: Fully functional audio transcription and document generation
- Production Features: PRD and Android TRD generation complete
- Architecture: Modular, scalable design ready for expansion
- Documentation: Comprehensive documentation and testing
- Phase 1: 50% planning time reduction
- Phase 2: 60% faster architecture documentation
- Phase 3: 70% boilerplate code reduction
- Phase 4: 80% test coverage automation
- Phase 5: 90% deployment automation
- Phase 6: 75% documentation automation
Meeting/Discussion → Transcription → PRD → TRD → Architecture → Code → Tests → Deployment → Documentation
- Documentation: Comprehensive guides in docs/
- Issues: Report bugs and request features via GitHub Issues
- Discussions: Join community discussions
We welcome contributions! See our Contributing Guide for:
- Code contribution guidelines
- Development setup instructions
- Testing requirements
- Documentation standards
- Features Implemented: 5/5 core features (100%)
- Architecture Phases: 3/3 complete (Service Layer, Configuration, UI Components)
- Test Coverage: Comprehensive test suite
- Production Readiness: ✅ Ready for deployment
- Transcription Speed: Real-time processing for most audio formats
- PRD Generation: ~30 seconds for typical meeting transcript
- TRD Generation: ~45 seconds from PRD input
- Multi-format Support: 9 audio/video formats supported
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for Whisper and GPT API
- Gradio for the excellent web UI framework
- Python Community for the amazing ecosystem
- Contributors who help make this project better
- Repository: github.com/tomdwipo/agent
- Documentation: Complete Documentation Hub
- Project Proposal: SDLC Agent Workflow Proposal
- Issues: GitHub Issues
🚀 Ready to transform your SDLC workflow? Get started with the Quick Start guide above!