
🎯 SDLC Agent Workflow

AI-Powered Software Development Life Cycle Automation Platform


Transform your software development process with AI-powered automation. From meeting transcriptions to complete technical documentation, streamline your entire SDLC workflow.


🚀 What is SDLC Agent Workflow?

The SDLC Agent Workflow is a production-ready AI platform that automates key aspects of software development, starting with audio transcription and document generation, with a comprehensive roadmap to become a complete SDLC automation solution.

🎯 Current Capabilities (Production Ready ✅)

  • 🎤 Audio Transcription: High-quality transcription using OpenAI Whisper models
  • 🤖 AI Meeting Analysis: Generate key meeting points and summaries with OpenAI GPT
  • 📋 PRD Generation: Transform discussions into industry-standard Product Requirements Documents
  • 🔧 Android TRD Generation: Convert PRDs into comprehensive Android Technical Requirements Documents
  • 🎨 Figma MCP Integration: Model Context Protocol server for comprehensive Figma design data extraction
  • 📱 Android MCP Integration: LLM-driven Android device automation for intelligent mobile testing and interaction
  • 📁 Multi-Format Support: MP3, WAV, M4A, FLAC, AAC, OGG, WMA, MP4, MOV, AVI
  • ⚙️ Configurable Settings: Extensive customization through environment variables
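
As a quick illustration of the multi-format support above, the check could look like the sketch below. This is hypothetical: config/constants.py exists in the project structure, but this exact constant name and helper are assumptions.

from pathlib import Path

# Assumed constant; the real name in config/constants.py may differ.
SUPPORTED_FORMATS = {
    ".mp3", ".wav", ".m4a", ".flac", ".aac",
    ".ogg", ".wma", ".mp4", ".mov", ".avi",
}

def is_supported(filename: str) -> bool:
    # Case-insensitive extension check against the ten supported formats
    return Path(filename).suffix.lower() in SUPPORTED_FORMATS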

🔮 Future Vision (2025-2026 Roadmap)

Complete SDLC automation platform covering:

  • Requirements & Planning
  • Design & Architecture
  • Development Support
  • Testing & Quality
  • Deployment & Operations
  • Documentation & Knowledge

⚡ Quick Start

Prerequisites

  • Python 3.10 or higher
  • OpenAI API key
  • uv package manager (recommended) or pip

Installation

  1. Clone the repository

    git clone git@github.com:tomdwipo/agent.git
    cd agent
  2. Install dependencies

    # Using uv (recommended)
    uv sync
    
    # Or using pip
    pip install -r requirements.txt
  3. Configure environment

    # Create .env file
    cp .env.example .env
    
    # Add your OpenAI API key
    echo "OPENAI_API_KEY=your_api_key_here" >> .env
  4. Launch the application

    # Using uv
    uv run python transcribe_gradio.py
    
    # Or using python directly
    python transcribe_gradio.py
  5. Access the interface: open your browser to http://localhost:7860


🎯 Features Overview

✅ Production Features

| Feature | Status | Description | Documentation |
|---------|--------|-------------|---------------|
| Audio Transcription | ✅ Complete | OpenAI Whisper integration with multi-format support | API Docs |
| AI Meeting Analysis | ✅ Complete | Key-point extraction and meeting summaries | API Docs |
| PRD Generation v1.0 | ✅ Complete | 8-section industry-standard Product Requirements Documents | Feature Docs |
| Android TRD Generation v1.0 | ✅ Complete | 7-section Android Technical Requirements Documents | Feature Docs |
| Figma MCP Integration v1.0 | ✅ Complete | Model Context Protocol server for Figma design data extraction | Feature Docs |
| Android MCP Integration v1.0 | ✅ Complete | LLM-driven Android device automation for intelligent mobile testing | Setup Guide |

📋 Planned Features (2025-2026)

| Phase | Timeline | Key Components | Expected Impact |
|-------|----------|----------------|-----------------|
| Phase 1: Requirements & Planning | Q3 2025 | Enhanced PRD + Project Planning Agent | 50% planning time reduction |
| Phase 2: Design & Architecture | Q4 2025 | System Design + UI/UX Design Agents | 60% faster architecture documentation |
| Phase 3: Development Support | Q1 2026 | Code Generation + Development Standards | 70% boilerplate code reduction |
| Phase 4: Testing & Quality | Q2 2026 | Test Planning + Quality Assurance Agents | 80% test coverage automation |
| Phase 5: Deployment & Operations | Q3 2026 | DevOps + Infrastructure Management | 90% deployment automation |
| Phase 6: Documentation & Knowledge | Q4 2026 | Documentation + Knowledge Management | 75% documentation automation |

🏗️ Architecture

System Overview

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   UI Layer      │    │  Service Layer  │    │ Configuration   │
│                 │    │                 │    │                 │
│ • Gradio UI     │◄──►│ • OpenAI Service│◄──►│ • Settings      │
│ • Components    │    │ • Whisper Service│   │ • Constants     │
│ • Interface     │    │ • File Service  │    │ • Environment   │
└─────────────────┘    └─────────────────┘    └─────────────────┘
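
To make the arrows concrete, here is a minimal sketch of how the layers could wire together. The WhisperService API is taken from the usage examples below and the Gradio calls are standard, but the actual code in ui/gradio_interface.py may differ:

import gradio as gr
from services.whisper_service import WhisperService  # service layer

whisper = WhisperService()

def transcribe(audio_path: str) -> str:
    # The UI layer delegates to the service layer; services read their
    # settings from config/ (the Configuration box above).
    return whisper.transcribe(audio_path)["text"]

demo = gr.Interface(
    fn=transcribe,
    inputs=gr.Audio(type="filepath"),
    outputs="text",
)
demo.launch()  # serves on http://localhost:7860 by default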

Technology Stack

  • Backend: Python 3.10+, OpenAI API, Whisper
  • Frontend: Gradio (Web UI)
  • Package Management: uv with pyproject.toml
  • Configuration: Environment variables with .env support
  • Testing: Comprehensive test suite with pytest

Project Structure

agent/
├── main.py                 # Main application entry point
├── transcribe_gradio.py    # Gradio interface launcher
├── pyproject.toml         # Project configuration
├── requirements.txt       # Dependencies
├── config/               # Configuration management
│   ├── settings.py       # Application settings
│   ├── constants.py      # System constants
│   └── __init__.py
├── services/             # Core business logic
│   ├── openai_service.py # OpenAI API integration
│   ├── whisper_service.py # Audio transcription
│   ├── file_service.py   # File operations
│   └── __init__.py
├── ui/                   # User interface components
│   ├── gradio_interface.py # Main UI interface
│   ├── components.py     # UI components
│   └── __init__.py
├── tests/                # Test suite
├── demos/                # Demo applications
└── docs/                 # Comprehensive documentation

📚 Documentation

🎯 For Users

🛠️ For Developers

📋 For Project Managers & Stakeholders


🚀 Usage Examples

Basic Audio Transcription

from services.whisper_service import WhisperService

# Initialize service
whisper = WhisperService()

# Transcribe audio file
result = whisper.transcribe("meeting.mp3")
print(result["text"])

PRD Generation

from services.openai_service import OpenAIService

# Initialize service
openai_service = OpenAIService()

# Generate PRD from meeting transcript
prd = openai_service.generate_prd(transcript_text)
print(prd)

Complete Workflow

  1. Upload Audio → Transcribe meeting recording
  2. Generate Analysis → Extract key points and action items
  3. Create PRD → Transform discussion into structured requirements
  4. Generate TRD → Convert PRD into technical specifications
  5. Download Documents → Export all generated documents
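
Stitched together in code, the workflow might look like the following sketch. transcribe and generate_prd appear in the usage examples above; generate_meeting_summary, generate_trd, and save_document are assumed names for illustration only:

from services.whisper_service import WhisperService
from services.openai_service import OpenAIService
from services.file_service import FileService

whisper = WhisperService()
llm = OpenAIService()
files = FileService()

transcript = whisper.transcribe("meeting.mp3")["text"]  # 1. transcribe
summary = llm.generate_meeting_summary(transcript)      # 2. analysis (assumed name)
prd = llm.generate_prd(transcript)                      # 3. PRD
trd = llm.generate_trd(prd)                             # 4. TRD (assumed name)
files.save_document(prd, "prd.md")                      # 5. export (assumed name)
files.save_document(trd, "trd.md")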

🔧 Configuration

Environment Variables

# OpenAI Configuration
OPENAI_API_KEY=your_api_key_here
OPENAI_MODEL=gpt-4
OPENAI_MAX_TOKENS=4000

# Whisper Configuration
WHISPER_MODEL=base
WHISPER_LANGUAGE=auto

# Application Settings
DEBUG=false
LOG_LEVEL=INFO
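
For reference, a hypothetical sketch of how config/settings.py might read these variables; python-dotenv is an assumption here, not a confirmed dependency:

import os
from dotenv import load_dotenv  # assumed; any .env loader would do

load_dotenv()  # read .env from the working directory

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")                     # required
OPENAI_MODEL = os.getenv("OPENAI_MODEL", "gpt-4")                # defaults mirror the block above
OPENAI_MAX_TOKENS = int(os.getenv("OPENAI_MAX_TOKENS", "4000"))
WHISPER_MODEL = os.getenv("WHISPER_MODEL", "base")
WHISPER_LANGUAGE = os.getenv("WHISPER_LANGUAGE", "auto")
DEBUG = os.getenv("DEBUG", "false").lower() == "true"
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")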

Advanced Configuration

See Configuration API Documentation for complete configuration options.


🧪 Development

Setup Development Environment

# Clone repository
git clone git@github.com:tomdwipo/agent.git
cd agent

# Install development dependencies
uv sync --dev

# Run tests
uv run pytest

# Run with development settings
uv run python transcribe_gradio.py

Running Tests

# Run all tests
uv run pytest

# Run specific test file
uv run pytest tests/test_prd_services.py

# Run with coverage
uv run pytest --cov=services

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Add tests for new functionality
  5. Run the test suite (uv run pytest)
  6. Commit your changes (git commit -m 'Add amazing feature')
  7. Push to the branch (git push origin feature/amazing-feature)
  8. Open a Pull Request

See Contributing Guidelines for detailed information.


📈 Project Status & Roadmap

Current Status: Production Ready v1.0

  • Core Foundation: Fully functional audio transcription and document generation
  • Production Features: PRD and Android TRD generation complete
  • Architecture: Modular, scalable design ready for expansion
  • Documentation: Comprehensive documentation and testing

Success Metrics by Phase

  • Phase 1: 50% planning time reduction
  • Phase 2: 60% faster architecture documentation
  • Phase 3: 70% boilerplate code reduction
  • Phase 4: 80% test coverage automation
  • Phase 5: 90% deployment automation
  • Phase 6: 75% documentation automation

Complete Workflow Vision

Meeting/Discussion → Transcription → PRD → TRD → Architecture → Code → Tests → Deployment → Documentation

🤝 Community & Support

Getting Help

  • Documentation: Comprehensive guides in docs/
  • Issues: Report bugs and request features via GitHub Issues
  • Discussions: Join community discussions

Contributing

We welcome contributions! See our Contributing Guide for:

  • Code contribution guidelines
  • Development setup instructions
  • Testing requirements
  • Documentation standards

📊 Metrics & Performance

Current Application Metrics

  • Features Implemented: 6/6 core features (100%)
  • Architecture Phases: 3/3 complete (Service Layer, Configuration, UI Components)
  • Test Coverage: Comprehensive test suite
  • Production Readiness: ✅ Ready for deployment

Performance Benchmarks

  • Transcription Speed: Real-time processing for most audio formats
  • PRD Generation: ~30 seconds for typical meeting transcript
  • TRD Generation: ~45 seconds from PRD input
  • Multi-format Support: 10 audio/video formats supported

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🎉 Acknowledgments

  • OpenAI for Whisper and GPT API
  • Gradio for the excellent web UI framework
  • Python Community for the amazing ecosystem
  • Contributors who help make this project better

📞 Contact & Links


🚀 Ready to transform your SDLC workflow? Get started with the Quick Start guide above!
