HF Eco2AI Plugin


A comprehensive Hugging Face Trainer callback that logs CO₂, kWh, and regional grid intensity for every epoch. Built on Eco2AI's energy tracking with enterprise-grade monitoring, automation, and sustainability features.

🌱 Overview

Eco2AI gained attention on Hacker News for its accurate energy tracking, but it lacks integrations with popular ML frameworks. This plugin provides:

  • Seamless HF integration - Just add one callback
  • Real-time carbon tracking - CO₂ emissions per epoch/step
  • Regional grid data - Accurate carbon intensity by location
  • Enterprise monitoring - Prometheus/Grafana stack with alerting
  • Automation & CI/CD - Comprehensive workflows and maintenance
  • Security & compliance - Multi-tool security scanning and reporting
  • Sustainability optimization - Carbon budget enforcement and optimization recommendations

🏗️ Enterprise Features

Comprehensive SDLC Implementation

  • Project Foundation: ADRs, roadmaps, charters, and comprehensive documentation
  • Development Environment: Pre-commit hooks, version consistency, configuration validation
  • Testing Infrastructure: Unit, integration, performance, and E2E test suites
  • Build & Containerization: Multi-stage builds, security scanning, SBOM generation
  • Monitoring & Observability: Prometheus/Grafana stack with custom dashboards and alerts
  • Workflow Automation: GitHub Actions templates for CI/CD, security, and carbon tracking
  • Metrics & Automation: Real-time metrics collection, automated maintenance, and reporting

⚡ Key Metrics

  • Energy Consumption (kWh)
  • CO₂ Emissions (kg CO₂eq)
  • Grid Carbon Intensity (g CO₂/kWh)
  • GPU Power Draw (Watts)
  • Training Efficiency (samples/kWh)
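The metrics above are related by two standard formulas: CO₂ equals energy consumed times the grid's carbon intensity, and training efficiency is samples processed per kWh. A minimal sketch of those relationships (illustrative only, not the plugin's internal code; the class and field names are assumptions):

```python
from dataclasses import dataclass

@dataclass
class CarbonMetrics:
    energy_kwh: float          # total energy consumed
    grid_g_per_kwh: float      # grid carbon intensity (g CO2/kWh)
    samples_processed: int

    @property
    def co2_kg(self) -> float:
        # kWh * (g CO2 / kWh) gives grams; divide by 1000 for kg
        return self.energy_kwh * self.grid_g_per_kwh / 1000.0

    @property
    def samples_per_kwh(self) -> float:
        return self.samples_processed / self.energy_kwh

# Numbers chosen to match the sample report later in this README
m = CarbonMetrics(energy_kwh=45.3, grid_g_per_kwh=412, samples_processed=83_669)
```

With these inputs, `m.co2_kg` is about 18.7 kg and `m.samples_per_kwh` about 1,847, matching the example report shown in the Reports section.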

📋 Requirements

# Core dependencies
python>=3.10
transformers>=4.40.0
pytorch-lightning>=2.2.0  # Optional
eco2ai>=2.0.0
pynvml>=11.5.0  # NVIDIA GPU monitoring

# Monitoring
prometheus-client>=0.20.0
grafana-api>=1.0.3
pandas>=2.0.0
plotly>=5.20.0

# Cloud carbon data
carbontracker>=1.5.0
codecarbon>=2.3.0

🛠️ Installation

Quick Setup (Recommended)

# Clone repository
git clone https://github.com/danieleschmidt/hf-eco2ai-plugin.git
cd hf-eco2ai-plugin

# Run automated setup (includes environment, monitoring, and validation)
python scripts/final-integration.py --task all

# Verify installation
python scripts/validate-setup.py --category all

Manual Installation

# From PyPI (when published)
pip install hf-eco2ai-plugin

# From source
git clone https://github.com/danieleschmidt/hf-eco2ai-plugin.git
cd hf-eco2ai-plugin
pip install -e .[dev,all]

# Set up development environment
pre-commit install
cp .env.example .env

🚀 Quick Start

Basic Usage

from transformers import Trainer, TrainingArguments
from hf_eco2ai import Eco2AICallback

# Add callback to trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    callbacks=[Eco2AICallback()]  # That's it!
)

# Train as normal - carbon tracking happens automatically
trainer.train()

# View carbon report
print(trainer.carbon_report)

Detailed Configuration

from hf_eco2ai import Eco2AICallback, CarbonConfig

# Configure tracking
carbon_config = CarbonConfig(
    project_name="llama-finetune",
    country="USA",
    region="CA",  # California
    gpu_ids=[0, 1, 2, 3],
    log_level="STEP",  # or "EPOCH"
    export_prometheus=True,
    prometheus_port=9091,
    save_report=True,
    report_path="carbon_impact.json"
)

# Initialize callback
eco_callback = Eco2AICallback(config=carbon_config)

# Use with trainer
trainer = Trainer(
    model=model,
    args=args,
    callbacks=[eco_callback]
)

📊 Features

Real-time Monitoring

# Access metrics during training
def on_epoch_end(trainer, eco_callback):
    metrics = eco_callback.get_current_metrics()
    print(f"Epoch {trainer.epoch}")
    print(f"Energy used: {metrics.energy_kwh:.2f} kWh")
    print(f"CO₂ emitted: {metrics.co2_kg:.2f} kg")
    print(f"Efficiency: {metrics.samples_per_kwh:.0f} samples/kWh")

PyTorch Lightning Integration

from pytorch_lightning import Trainer
from hf_eco2ai.lightning import Eco2AILightningCallback

# Works with Lightning too!
trainer = Trainer(
    callbacks=[Eco2AILightningCallback()],
    accelerator="gpu",
    devices=4
)

trainer.fit(model, datamodule)

Multi-GPU Tracking

# Automatically tracks all GPUs
eco_callback = Eco2AICallback(
    track_gpu_energy=True,
    gpu_ids="auto",  # Detects all available GPUs
    aggregate_gpus=True  # Sum energy across GPUs
)

# Or track specific GPUs
eco_callback = Eco2AICallback(
    gpu_ids=[0, 2, 4, 6],  # Track only these GPUs
    per_gpu_metrics=True   # Report per-GPU metrics
)

🏗️ Architecture

┌─────────────────┐     ┌──────────────┐     ┌─────────────────┐
│  HF Trainer     │────▶│ Eco2AI       │────▶│ Energy Monitor  │
│                 │     │ Callback     │     │ (CPU/GPU)       │
└─────────────────┘     └──────────────┘     └─────────────────┘
         │                      │                      │
         ▼                      ▼                      ▼
┌─────────────────┐     ┌──────────────┐     ┌─────────────────┐
│ Training Loop   │     │ Grid Carbon  │     │ Metrics Export  │
│                 │     │ Intensity    │     │ (Prometheus)    │
└─────────────────┘     └──────────────┘     └─────────────────┘
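The flow in the diagram can be sketched as a callback that polls an energy monitor at epoch boundaries and converts the delta to CO₂ via grid intensity. All names below are illustrative assumptions, not the plugin's actual API; a real monitor would read hardware counters (e.g. via pynvml):

```python
class EnergyMonitor:
    """Stub monitor; a real one would poll CPU/GPU power counters."""
    def __init__(self):
        self._kwh = 0.0

    def tick(self, kwh: float):
        # Simulate energy drawn since the last poll
        self._kwh += kwh

    def total_kwh(self) -> float:
        return self._kwh


class CarbonCallback:
    def __init__(self, monitor: EnergyMonitor, grid_g_per_kwh: float):
        self.monitor = monitor
        self.grid = grid_g_per_kwh       # g CO2 per kWh
        self._epoch_start_kwh = 0.0
        self.epoch_co2_kg = []

    def on_epoch_begin(self):
        self._epoch_start_kwh = self.monitor.total_kwh()

    def on_epoch_end(self):
        used = self.monitor.total_kwh() - self._epoch_start_kwh
        self.epoch_co2_kg.append(used * self.grid / 1000.0)  # g -> kg


monitor = EnergyMonitor()
cb = CarbonCallback(monitor, grid_g_per_kwh=412)
cb.on_epoch_begin()
monitor.tick(2.0)        # 2 kWh consumed during the epoch
cb.on_epoch_end()        # 2 kWh * 412 g/kWh = 0.824 kg CO2
```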

📈 Dashboards

Grafana Setup

# Import dashboard
curl -X POST http://localhost:3000/api/dashboards/db \
  -H "Content-Type: application/json" \
  -d @dashboards/hf-carbon-tracking.json

# Or use Docker
docker run -d \
  -p 3000:3000 \
  -v $(pwd)/dashboards:/var/lib/grafana/dashboards \
  hf-eco2ai/grafana-carbon

Dashboard Features

  • Real-time power consumption
  • Cumulative CO₂ emissions
  • Training efficiency trends
  • Regional grid intensity
  • Cost estimation (optional)

🌍 Regional Carbon Data

Automatic Region Detection

# Auto-detect location and grid carbon intensity
eco_callback = Eco2AICallback(
    auto_detect_location=True,
    use_real_time_carbon=True  # Live grid data
)

Manual Region Configuration

# Specify exact location for accurate carbon data
from hf_eco2ai import regions

eco_callback = Eco2AICallback(
    country="Germany",
    region="Bavaria",
    grid_carbon_intensity=regions.GERMANY.BAVARIA  # 411 g CO₂/kWh
)

# Or use coordinates
eco_callback = Eco2AICallback(
    latitude=48.1351,
    longitude=11.5820,
    use_real_time_carbon=True
)

🔧 Advanced Features

Custom Metrics

from hf_eco2ai import Eco2AICallback

class CustomEco2AICallback(Eco2AICallback):
    def compute_additional_metrics(self, logs):
        # Add custom efficiency metrics
        if "loss" in logs and self.current_energy > 0:
            logs["loss_per_kwh"] = logs["loss"] / self.current_energy
            logs["carbon_per_sample"] = self.total_co2 / self.samples_seen
        
        return logs

Experiment Comparison

from hf_eco2ai.analysis import CarbonComparison

# Compare different training runs
comparison = CarbonComparison()

# Add experiments
comparison.add_experiment("baseline", "carbon_reports/baseline.json")
comparison.add_experiment("efficient", "carbon_reports/efficient.json")
comparison.add_experiment("quantized", "carbon_reports/quantized.json")

# Generate comparison report
comparison.plot_comparison(
    metrics=["total_co2", "samples_per_kwh", "cost"],
    save_path="comparison.html"
)

Carbon Budget Enforcement

from hf_eco2ai import CarbonBudgetCallback

# Stop training if carbon budget exceeded
budget_callback = CarbonBudgetCallback(
    max_co2_kg=10.0,  # 10kg CO₂ budget
    action="stop",    # or "warn"
    check_frequency=100  # Check every 100 steps
)

trainer = Trainer(
    model=model,
    args=args,
    callbacks=[budget_callback]
)

📊 Reports

Generate Carbon Report

# After training
carbon_report = eco_callback.generate_report()

print(carbon_report.summary())
"""
Training Carbon Impact Report
============================
Total Energy: 45.3 kWh
Total CO₂: 18.7 kg CO₂eq
Grid Intensity: 412 g CO₂/kWh
Duration: 3h 24m
Efficiency: 1,847 samples/kWh

Equivalent to:
- 78 km driven by car
- 2.3 trees needed to offset
- $4.53 in carbon credits
"""

# Export detailed report
carbon_report.to_json("carbon_impact.json")
carbon_report.to_pdf("carbon_impact.pdf")
carbon_report.to_csv("carbon_metrics.csv")
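The "Equivalent to" figures in the summary above come from simple unit conversions. A sketch of how such figures can be derived; the conversion factors here are illustrative assumptions (they vary by car, tree species, and carbon-credit market), not values taken from the plugin:

```python
# Illustrative conversion factors (assumptions, not authoritative values)
CAR_KG_PER_KM = 0.24        # average passenger-car emissions per km
TREE_KG_PER_YEAR = 8.0      # rough annual CO2 sequestration per tree
CREDIT_USD_PER_KG = 0.24    # example carbon-credit price

def equivalents(co2_kg: float) -> dict:
    """Translate a CO2 total into everyday equivalents."""
    return {
        "km_driven": co2_kg / CAR_KG_PER_KM,
        "trees_to_offset": co2_kg / TREE_KG_PER_YEAR,
        "credit_cost_usd": co2_kg * CREDIT_USD_PER_KG,
    }

eq = equivalents(18.7)      # the CO2 total from the sample report
```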

MLflow Integration

import mlflow
from hf_eco2ai import Eco2AICallback

# Automatically log to MLflow
with mlflow.start_run():
    trainer = Trainer(
        model=model,
        args=args,
        callbacks=[
            Eco2AICallback(mlflow_tracking=True)
        ]
    )
    
    trainer.train()
    
    # Metrics automatically logged to MLflow

🚦 CI/CD Integration

Comprehensive GitHub Actions Workflows

The project includes enterprise-grade workflow templates:

# Set up workflows (requires repository admin)
cp docs/workflows/comprehensive-ci.yml.template .github/workflows/ci.yml
cp docs/workflows/security.yml.template .github/workflows/security.yml
cp docs/workflows/carbon-tracking.yml.template .github/workflows/carbon-tracking.yml
cp docs/workflows/dependency-update.yml.template .github/workflows/dependency-update.yml

# Configure repository secrets (in GitHub Settings → Secrets)
# CODECOV_TOKEN, PYPI_API_TOKEN, SLACK_WEBHOOK_URL

Features

  • Multi-OS Testing: Ubuntu, Windows, macOS with Python 3.10-3.12
  • Security Scanning: Bandit, Safety, pip-audit, CodeQL, TruffleHog
  • Carbon Tracking: Automated CI/CD footprint monitoring with budget enforcement
  • Dependency Management: Automated security updates and dependency maintenance
  • Quality Gates: Pre-commit hooks, test coverage, security validation
  • Automated Releases: Semantic versioning, changelog generation, PyPI publishing

Example Training with Carbon Tracking

# .github/workflows/train.yml
name: Train with Carbon Tracking

on: [push]

jobs:
  train:
    runs-on: gpu-runner
    steps:
    - uses: actions/checkout@v4
    
    - name: Train model
      run: |
        python train.py --carbon-tracking
    
    - name: Check carbon budget
      run: |
        python scripts/collect-metrics.py --update
        python -m hf_eco2ai check-budget \
          --report carbon_impact.json \
          --max-co2 5.0
    
    - name: Upload carbon report
      uses: actions/upload-artifact@v4
      with:
        name: carbon-report
        path: carbon_impact.json
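The `check-budget` step above reads the exported report and fails the job when the budget is exceeded. A minimal sketch of that logic (the report field name is an assumption):

```python
import json

def check_budget(report_path: str, max_co2_kg: float) -> int:
    """Return 0 if the report is within budget, 1 otherwise (exit-code style)."""
    with open(report_path) as f:
        report = json.load(f)
    co2 = report["total_co2_kg"]   # assumed field name in the report
    if co2 > max_co2_kg:
        print(f"Carbon budget exceeded: {co2:.2f} kg > {max_co2_kg:.2f} kg")
        return 1
    print(f"Within carbon budget: {co2:.2f} kg <= {max_co2_kg:.2f} kg")
    return 0
```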

🌱 Best Practices

Reduce Carbon Impact

from hf_eco2ai.optimization import CarbonOptimizer

optimizer = CarbonOptimizer()

# Get recommendations
recommendations = optimizer.analyze_training(
    model=model,
    dataset_size=len(train_dataset),
    target_accuracy=0.95
)

print(recommendations)
"""
Recommendations to reduce carbon impact:
1. Use mixed precision training (-40% energy)
2. Enable gradient checkpointing (-25% memory)
3. Train during low-carbon hours (23:00-06:00)
4. Use renewable energy regions (Norway: 20g CO₂/kWh)
5. Consider model pruning after training
"""

Schedule Low-Carbon Training

from datetime import datetime

from hf_eco2ai.scheduling import LowCarbonScheduler

scheduler = LowCarbonScheduler(
    region="California",
    flexibility_hours=12  # Can wait up to 12 hours
)

# Find optimal training window
best_time = scheduler.find_low_carbon_window(
    estimated_duration_hours=4,
    start_after=datetime.now()
)

print(f"Train at {best_time} for {scheduler.carbon_reduction:.1%} less CO₂")
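Window-finding like this amounts to sliding a window of the job's duration over an hourly grid-intensity forecast and picking the start with the lowest average. A self-contained sketch (the forecast data is made up for illustration):

```python
def find_low_carbon_window(hourly_intensity, duration_hours):
    """Return (start_hour, avg_intensity) of the cheapest contiguous window."""
    best_start, best_avg = 0, float("inf")
    for start in range(len(hourly_intensity) - duration_hours + 1):
        window = hourly_intensity[start:start + duration_hours]
        avg = sum(window) / duration_hours
        if avg < best_avg:
            best_start, best_avg = start, avg
    return best_start, best_avg

# Hypothetical 12-hour forecast (g CO2/kWh); night hours are cleanest here
forecast = [420, 410, 400, 380, 300, 250, 240, 260, 350, 400, 430, 450]
start, avg = find_low_carbon_window(forecast, duration_hours=4)
# Best 4-hour window starts at hour 4, averaging 262.5 g CO2/kWh
```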

🤝 Contributing

We welcome contributions! Priority areas:

  • Additional cloud provider regions
  • More efficient energy measurement
  • Integration with other frameworks
  • Carbon offset integrations
  • Visualization improvements

See CONTRIBUTING.md for guidelines.

📄 Citation

@software{hf_eco2ai_plugin,
  title={HF Eco2AI Plugin: Carbon Tracking for Transformers Training},
  author={Daniel Schmidt},
  year={2025},
  url={https://github.com/danieleschmidt/hf-eco2ai-plugin}
}

@article{eco2ai,
  title={Eco2AI: Carbon Emissions Tracking for AI},
  author={Eco2AI Team},
  year={2024}
}

📝 License

MIT License - See LICENSE for details.

📊 Enterprise Management

Automated Metrics Collection

# Real-time metrics collection
python scripts/collect-metrics.py --update --format summary

# Maintenance automation
python scripts/maintenance.py --task full --dry-run

# Integration management
python scripts/integration-tools.py --action all

Monitoring Stack

# Start monitoring services
docker-compose up -d

# Access dashboards
open http://localhost:3000  # Grafana
open http://localhost:9090  # Prometheus
open http://localhost:9093  # Alertmanager

Validation and Setup

# Comprehensive validation
python scripts/validate-setup.py --category all

# Setup verification
python scripts/final-integration.py --task validation

🔗 Resources

  • Documentation
  • Architecture Documents
  • Monitoring & Automation
