A comprehensive platform combining LLM cost tracking with quantum-inspired task planning. This self-hostable solution captures token usage, latency, and cost data from LangChain and LiteLLM applications, while providing advanced task scheduling built on quantum computing concepts such as superposition, entanglement, and quantum annealing.
- OpenTelemetry collector and rules engine for LLM operations
- Real-time cost monitoring with Postgres storage and Grafana visualization
- Budget-aware model switching using Vellum's price catalog
- Quantum-inspired scheduling with superposition, entanglement, and interference patterns
- High-performance optimization using quantum annealing algorithms
- Enterprise-grade scalability with auto-scaling and load balancing
| Feature | Details |
|---|---|
| Real-time Metering | Asynchronous Python middleware hooks into LangChain's `AsyncIteratorCallbackHandler` to capture token usage, latency, prompts, and the specific model used. |
| Budget Rules Engine | YAML rules (`monthly_budget`, `swap_threshold`) trigger automatic model routing via the LiteLLM router or Vellum API, based on up-to-date model prices. |
| Dashboards | Ships with a pre-built Grafana dashboard at `/dashboards/llm-cost-dashboard.json` (UID: `llm-cost-dashboard`) to visualize costs by application, model, and user. |
| Alerting | Integrates with Prometheus to send alerts to Slack or OpsGenie whenever predefined cost thresholds are exceeded. |
| Pluggable Storage | Defaults to Postgres, with adapters available for ClickHouse and BigQuery for flexibility in data storage. |
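The table names two YAML rule keys but never shows a rules file. A minimal sketch of what one might look like — only `monthly_budget` and `swap_threshold` come from the table above; the file name, layout, model names, and remaining fields are illustrative assumptions, not the project's actual schema:

```yaml
# budget-rules.yml — hypothetical layout; only monthly_budget and
# swap_threshold are documented, the rest is illustrative.
rules:
  - name: default-project
    monthly_budget: 500.00        # USD cap for the billing month
    swap_threshold: 0.80          # route to the fallback at 80% of budget
    primary_model: gpt-4o         # example model names
    fallback_model: gpt-4o-mini   # cheaper model once the threshold trips
    alert_channels: [slack]
```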
| Feature | Details |
|---|---|
| Quantum Scheduling | Tasks exist in superposition states, allowing probabilistic execution planning and optimal resource allocation via quantum annealing algorithms. |
| Task Entanglement | Related tasks can be quantum-entangled, ensuring coordinated execution and maintaining dependencies through quantum interference patterns. |
| Performance Optimization | High-performance caching with LRU eviction, load balancing with circuit breakers, and auto-scaling based on queue utilization and resource metrics. |
| Global Compliance | Built-in GDPR/CCPA compliance with PII detection, data anonymization, consent management, and data subject rights (access, deletion, portability). |
| Multilingual Support | Native internationalization (i18n) for six languages: English, Spanish, French, German, Japanese, and Chinese (Simplified). |
| Production Ready | Zero-downtime deployments, comprehensive monitoring with Grafana/Prometheus, automated backups, security scanning, and enterprise-grade reliability. |
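The superposition idea above can be made concrete with a small, self-contained sketch (not the project's actual algorithm): each pending task gets a probability amplitude derived from its priority, and "measurement" collapses the set to a single next task. The softmax weighting and the field names are illustrative assumptions:

```python
import math
import random

def sample_next_task(tasks, temperature=1.0):
    """Pick the next task probabilistically: higher-priority tasks are
    more likely, but lower-priority ones keep a nonzero amplitude,
    loosely mirroring a superposition that collapses on measurement."""
    weights = [math.exp(t["priority"] / temperature) for t in tasks]
    total = sum(weights)
    return random.choices(tasks, weights=[w / total for w in weights], k=1)[0]

tasks = [
    {"id": "analyze_data", "priority": 9.0},
    {"id": "build_report", "priority": 4.0},
]
picked = sample_next_task(tasks)
print(picked["id"])  # usually "analyze_data", occasionally "build_report"
```

Raising `temperature` flattens the distribution (more exploration); lowering it approaches a deterministic priority queue.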
```
LangChain → Cost-Middleware → OpenTelemetry SDK → OTLP Collector → Postgres → Grafana
                                                        ↘ Prometheus/Alertmanager

Tasks → Quantum Planner → Annealing Optimizer → Load Balancer → Execution
  ↓           ↓                  ↓                    ↓             ↓
Cache     Monitoring        Auto-Scaler       Circuit Breakers  Results
```
```
            ┌────── LLM Cost Tracker ──────┐
            │                              │
LangChain ──┼── OpenTelemetry ─────────────┼──── Postgres
            │        │                     │
            │   Prometheus ────────────────┼──── Grafana
            │                              │
Tasks ──────┼── Quantum Planner ───────────┼──── Execution
            │        │                     │
            │   Monitoring ────────────────┼──── Results
            │                              │
            └──────────────────────────────┘
```
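The metering pipeline above can be sketched end-to-end in a few lines. This is a toy stand-in, not the project's middleware: the class name, the price table, and the list used in place of the OTLP exporter are all illustrative assumptions:

```python
import time
from dataclasses import dataclass

# Illustrative per-1K-token prices; a real deployment would pull these
# from a live catalog such as Vellum's.
PRICE_PER_1K = {"gpt-4o": 0.0050, "gpt-4o-mini": 0.0006}

@dataclass
class UsageRecord:
    model: str
    prompt_tokens: int
    completion_tokens: int
    latency_s: float

    @property
    def cost_usd(self) -> float:
        tokens = self.prompt_tokens + self.completion_tokens
        return tokens / 1000 * PRICE_PER_1K[self.model]

class CostMiddleware:
    """Toy stand-in for the async callback middleware: wraps an LLM
    call, times it, and emits a usage record to a sink (here, a plain
    list standing in for the OTLP exporter)."""
    def __init__(self, sink):
        self.sink = sink

    def track(self, model, prompt_tokens, completion_tokens, call):
        start = time.perf_counter()
        result = call()
        latency = time.perf_counter() - start
        self.sink.append(
            UsageRecord(model, prompt_tokens, completion_tokens, latency))
        return result

records = []
mw = CostMiddleware(records)
mw.track("gpt-4o", 120, 80, lambda: "fake completion")
print(f"${records[0].cost_usd:.4f}")  # 200 tokens → $0.0010
```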
```bash
# Clone repository
git clone https://github.com/terragon-labs/llm-cost-tracker
cd llm-cost-tracker

# Configure production environment
cp .env.production.example .env.production
# Edit .env.production with your settings

# Deploy with zero downtime
chmod +x scripts/deploy.sh
./scripts/deploy.sh deploy

# Access services
# API: https://api.your-domain.com
# Grafana: https://grafana.your-domain.com
# Quantum Dashboard: https://api.your-domain.com/api/v1/quantum/system/state
```
```bash
# Clone and set up
git clone https://github.com/terragon-labs/llm-cost-tracker
cd llm-cost-tracker

# Start services
docker compose up -d

# Install dependencies
poetry install

# Run LLM cost tracking demo
python examples/streamlit_demo.py

# Test quantum task planner
curl -X GET http://localhost:8000/api/v1/quantum/demo

# Access Grafana and import dashboard
# http://localhost:3000 (admin/admin)
# Import: /dashboards/llm-cost-dashboard.json
```
This tool handles sensitive API keys. To safeguard these credentials, we follow an encrypted proxy pattern, and all keys should be stored in environment variables or a secure vault. To report a vulnerability, please refer to our organization's SECURITY.md file.
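One concrete way to keep keys out of source files, assuming keys live in environment variables as recommended above (the variable name is illustrative, not one the project requires):

```python
import os

def get_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Read an API key from the environment and fail fast if it is
    missing, so keys never need to appear in code or config files."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it or load it from your vault."
        )
    return key
```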
- API Reference - Complete API documentation with examples
- Quantum Architecture - Deep dive into quantum-inspired concepts
- Deployment Guide - Production deployment instructions
- Examples - Sample implementations and use cases
```python
import asyncio

from llm_cost_tracker import QuantumTaskPlanner, QuantumTask

async def main():
    # Initialize planner
    planner = QuantumTaskPlanner()

    # Create a task
    task1 = QuantumTask(
        id="analyze_data",
        name="Data Analysis",
        priority=9.0,
        estimated_duration_minutes=30,
    )

    # Add to planner
    planner.add_task(task1)

    # Generate optimal schedule
    schedule = planner.generate_schedule()
    print(f"Optimal execution order: {schedule}")

    # Execute tasks (await must run inside an async context)
    results = await planner.execute_schedule_async(schedule)
    return results

asyncio.run(main())
```
```bash
# Create a quantum task
curl -X POST http://localhost:8000/api/v1/quantum/tasks \
  -H "Content-Type: application/json" \
  -d '{
    "id": "task_001",
    "name": "Machine Learning Pipeline",
    "priority": 8.5,
    "estimated_duration_minutes": 45
  }'

# Generate optimal schedule
curl -X GET http://localhost:8000/api/v1/quantum/schedule

# Monitor system state
curl -X GET http://localhost:8000/api/v1/quantum/system/state
```
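The same create-task call can be driven from Python with only the standard library. This mirrors the curl example above; the helper name is our own, and it builds the request without sending it (sending requires the service to be running locally):

```python
import json
import urllib.request

BASE = "http://localhost:8000/api/v1/quantum"

def build_create_task_request(task: dict) -> urllib.request.Request:
    """Build (but do not send) the POST used to create a quantum task,
    matching the curl call shown above."""
    return urllib.request.Request(
        f"{BASE}/tasks",
        data=json.dumps(task).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_create_task_request({
    "id": "task_001",
    "name": "Machine Learning Pipeline",
    "priority": 8.5,
    "estimated_duration_minutes": 45,
})
print(req.method, req.full_url)
```

To actually send it, pass `req` to `urllib.request.urlopen` once the service is up.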
- v0.1.0: ✅ Core tracing, Grafana dashboard, and Prometheus alerts
- v0.2.0: Implementation of the budget-aware model swapper with Slack alerts
- v0.3.0: Introduction of multi-tenant RBAC and per-project budgets
- v0.1.0: ✅ Quantum-inspired scheduling with superposition and entanglement
- v0.1.0: ✅ Performance optimization with caching and load balancing
- v0.1.0: ✅ Global compliance (GDPR/CCPA) and i18n support
- v0.2.0: Advanced quantum algorithms and machine learning integration
- v0.3.0: Distributed quantum planning across multiple nodes
We welcome contributions! Please see our organization-wide CONTRIBUTING.md for guidelines and our CODE_OF_CONDUCT.md. A CHANGELOG.md is maintained for version history.
- lang-observatory: Integrates these cost metrics into a unified observability stack.
- eval-genius-agent-bench: Uses this tracker to overlay cost data on performance evaluations.
This project is licensed under the Apache-2.0 License. It incorporates functionality inspired by Helicone, which is licensed under the MIT License. Copies of the relevant downstream licenses can be found in the LICENSES/ directory.
- LangChain Callbacks: AsyncIteratorCallbackHandler Docs
- Vellum LLM Cost Comparison: Vellum AI Blog
- Helicone: Official Site