CMBCluster

Multi-tenant Streamlit Platform for Research

CMBCluster is a multi-tenant Streamlit platform designed specifically for research teams. Built on the proven JupyterHub architecture pattern, it provides isolated, persistent research environments for each user while maintaining enterprise-grade security and scalability.

Architecture

The platform follows the JupyterHub architecture pattern with these key components:

graph TB
    Users[👥 Users] --> Proxy[🌐 Proxy]
    Proxy --> Hub[🎯 Hub - Authenticate user]
    
    Hub --> PV[💾 Pods + Volumes / USER SESSION]
    
    CloudVolumes[☁️ Cloud Volumes] --> PV
    ImageRegistry[📦 Image Registry] --> PV
    
    Proxy -.->|ROUTE INFO SEND| Hub
    Hub -.->|SIGNED OUT USER REDIRECT| Proxy
    Proxy -.->|SIGNED IN USER REDIRECT| PV
    Hub -.->|VOLUME PROVIDE/POD CREATE/USER REDIRECT| PV
    Hub -.->|CULL PODS IF STALE| PV
    
    subgraph "Kubernetes Cluster"
        Proxy
        Hub
        PV
    end
    
    style Users fill:#f9f,stroke:#333,stroke-width:2px
    style Proxy fill:#9f9,stroke:#333,stroke-width:2px
    style Hub fill:#9f9,stroke:#333,stroke-width:2px
    style PV fill:#99f,stroke:#333,stroke-width:2px
    style CloudVolumes fill:#bbf,stroke:#333,stroke-width:2px
    style ImageRegistry fill:#bbf,stroke:#333,stroke-width:2px

Features

🔐 Enterprise Security

  • Google OAuth Integration: Seamless single sign-on
  • RBAC: Role-based access control with admin/user permissions
  • Network Policies: Kubernetes-native network isolation
  • Pod Security: Non-root containers with security contexts

🚀 Scalable Infrastructure

  • Auto-scaling: Horizontal pod autoscaling based on demand
  • Resource Management: Configurable CPU/memory limits per user
  • Load Balancing: NGINX ingress with SSL termination
  • High Availability: Multi-replica deployments with health checks

🔬 Research-Focused

  • Scientific Computing: Pre-installed research libraries (pandas, numpy, scipy, matplotlib)
  • Interactive Analysis: Streamlit-based data exploration interface
  • Persistent Workspaces: User data persists across sessions
  • Collaborative Tools: Shared data access and project management

☁️ Cloud-Native

  • Kubernetes-Native: Built for modern container orchestration
  • GKE Optimized: Tested and optimized for Google Kubernetes Engine
  • Helm Charts: Easy deployment and configuration management
  • CI/CD Ready: GitHub Actions integration for automated deployments

Quick Start

Prerequisites

  • Google Cloud Platform Account with billing enabled
  • Local Development Tools:
    # macOS
    brew install google-cloud-sdk kubectl helm docker
    
    # Ubuntu/Debian
    sudo apt-get install google-cloud-sdk kubectl helm docker.io
    
    # CentOS/RHEL
    sudo yum install google-cloud-sdk kubectl helm docker
  • Domain Name (for production deployment)

1. Clone Repository

git clone https://github.com/archetana/cmbcluster.git
cd cmbcluster

2. Setup Environment

# Copy environment template
cp .env.example .env

# Edit configuration
vim .env

Required environment variables:

PROJECT_ID=cambridge-infosys
BASE_DOMAIN=cmbcluster.yourdomain.com
GOOGLE_CLIENT_ID=your-oauth-client-id
GOOGLE_CLIENT_SECRET=your-oauth-secret
SECRET_KEY=your-generated-secret-key
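
A setup script could check these values before continuing. The following is a minimal sketch, not code from the repository: the `parse_env` and `missing_required` helpers are hypothetical, and the parser only handles plain `KEY=value` lines of the kind shown above.

```python
# Names of the variables listed as required above.
REQUIRED_VARS = (
    "PROJECT_ID",
    "BASE_DOMAIN",
    "GOOGLE_CLIENT_ID",
    "GOOGLE_CLIENT_SECRET",
    "SECRET_KEY",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_required(env: dict[str, str]) -> list[str]:
    """Return required variable names that are absent or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Illustrative input only; in practice the text would come from the .env file.
sample = """
# Copied from .env.example
PROJECT_ID=your-project
BASE_DOMAIN=cmbcluster.yourdomain.com
GOOGLE_CLIENT_ID=
"""
print(missing_required(parse_env(sample)))
```

An empty list means every required variable has a value; anything else names what still needs to be filled in.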

3. Choose Deployment Method

Option A: Local Development

# Start local environment
make dev

# Or using docker-compose directly
docker-compose up --build

Access at: http://localhost:8501

Option B: Production Deployment

# Setup GKE cluster
make setup PROJECT_ID=your-project DOMAIN=your-domain.com

# Deploy application
make deploy

Local Development

Quick Start

# Start all services
docker-compose up --build

# Access points:
# Frontend: http://localhost:8501
# Backend API: http://localhost:8000
# API Docs: http://localhost:8000/docs
# User Environment: http://localhost:8502

Development Commands

# Make the scripts executable
chmod +x scripts/build-images.sh
chmod +x scripts/setup-cluster.sh
chmod +x scripts/deploy.sh
chmod +x scripts/cleanup.sh
chmod +x scripts/local-dev.sh

# Build images
make build

# Run tests
make test

# View logs
make logs

# Stop services
docker-compose down

# Clean up everything
make clean

File Structure

cmbcluster/
├── README.md                # This file
├── .env.example            # Environment template
├── .gitignore              # Git ignore rules
├── docker-compose.yml      # Local development setup
├── Makefile               # Build and deployment commands
├── backend/               # FastAPI backend service
│   ├── Dockerfile
│   ├── requirements.txt
│   ├── main.py           # Application entry point
│   ├── auth.py           # OAuth authentication
│   ├── pod_manager.py    # Kubernetes pod management
│   ├── config.py         # Configuration settings
│   └── models.py         # Data models
├── frontend/              # Streamlit frontend application
│   ├── Dockerfile
│   ├── requirements.txt
│   ├── main.py           # Main UI application
│   ├── config.py         # Frontend configuration
│   ├── components/       # Reusable UI components
│   │   ├── auth.py
│   │   └── api_client.py
│   └── pages/           # Multi-page UI
│       ├── Dashboard.py
│       ├── Environment.py
│       └── Settings.py
├── user-environment/      # User research environment container
│   ├── Dockerfile
│   ├── requirements.txt
│   └── app.py           # Streamlit research application
├── k8s/                  # Kubernetes manifests
│   ├── namespace.yaml
│   ├── rbac.yaml
│   ├── backend-deployment.yaml
│   ├── frontend-deployment.yaml
│   └── ingress.yaml
├── helm/                 # Helm chart templates
│   ├── Chart.yaml
│   ├── values.yaml
│   └── templates/
│       ├── backend.yaml
│       ├── frontend.yaml
│       └── ingress.yaml
├── scripts/              # Deployment and utility scripts
│   ├── setup-cluster.sh
│   ├── build-images.sh
│   ├── deploy.sh
│   ├── cleanup.sh
│   └── local-dev.sh
└── terraform/            # Infrastructure as code (optional)
    ├── main.tf
    ├── variables.tf
    └── outputs.tf

Production Deployment

1. Setup Google Cloud Infrastructure

# Authenticate with Google Cloud
gcloud auth login
gcloud config set project YOUR_PROJECT_ID

# Setup cluster and infrastructure
./scripts/setup-cluster.sh YOUR_PROJECT_ID

This creates:

  • GKE cluster with autoscaling (1-10 nodes)
  • NGINX Ingress Controller with LoadBalancer
  • cert-manager for automated SSL certificates
  • Required service accounts and RBAC policies
  • Storage classes for persistent volumes

2. Configure OAuth

  1. Go to Google Cloud Console
  2. Navigate to APIs & Services > Credentials
  3. Create OAuth 2.0 Client ID
  4. Add authorized redirect URIs:
    https://api.yourdomain.com/auth/callback
    
  5. Update .env with client ID and secret

3. Deploy Application

# Build and push container images
./scripts/build-images.sh YOUR_PROJECT_ID

# Deploy with Helm
./scripts/deploy.sh YOUR_PROJECT_ID yourdomain.com

4. Configure DNS

Point your domain to the ingress IP:

# Get ingress IP
kubectl get ingress -n cmbcluster

# Create DNS records:
# A record: yourdomain.com -> INGRESS_IP
# A record: *.yourdomain.com -> INGRESS_IP  
# A record: api.yourdomain.com -> INGRESS_IP

Usage

For Users

  1. Access Platform: Navigate to https://yourdomain.com
  2. Login: Click "🔐 Login with Google" and authenticate
  3. Launch Environment: Click "Launch Environment" to create your research pod
  4. Start Research: Access your isolated Streamlit environment with:
    • Pre-installed research libraries (pandas, numpy, scipy, matplotlib)
    • Persistent workspace storage (/workspace)
    • Cosmology packages (astropy, healpy, camb)
    • Data visualization capabilities (Plotly, Seaborn)

User Environment Features

# Pre-installed libraries available in user environments
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
import pandas as pd
import plotly.express as px
import astropy
import healpy as hp
import camb

# Persistent workspace
workspace_dir = "/workspace"  # Your files persist here
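
As an illustration of working with the persistent workspace, the sketch below writes a small JSON file and reads it back. The `save_result` and `load_result` helpers and the `results/` subdirectory are hypothetical conventions, not part of the platform; only the `/workspace` path comes from this README.

```python
import json
import tempfile
from pathlib import Path


def save_result(name: str, data: dict, root: str = "/workspace") -> Path:
    """Write data as JSON under <root>/results and return the file path."""
    target = Path(root) / "results" / f"{name}.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(data, indent=2))
    return target


def load_result(name: str, root: str = "/workspace") -> dict:
    """Read back a result previously written by save_result."""
    return json.loads((Path(root) / "results" / f"{name}.json").read_text())


# Outside a user pod /workspace may not exist, so this demo uses a temp directory.
demo_root = tempfile.mkdtemp()
save_result("demo", {"n_samples": 100, "label": "example"}, root=demo_root)
print(load_result("demo", root=demo_root))
```

Inside a user environment the default `root` would apply, so files written this way should still be there after the pod restarts.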

For Administrators

# Monitor deployments
kubectl get pods -n cmbcluster

# View logs
kubectl logs -f deployment/cmbcluster-backend -n cmbcluster
kubectl logs -f deployment/cmbcluster-frontend -n cmbcluster

# Scale deployments
kubectl scale deployment cmbcluster-backend --replicas=5 -n cmbcluster

# List user environments
kubectl get pods -l app=cmbcluster-user-env -n cmbcluster

# Delete user environment pods that have already completed (phase Succeeded)
kubectl delete pods -l app=cmbcluster-user-env --field-selector status.phase=Succeeded -n cmbcluster

Configuration

Environment Variables

| Variable | Description | Default | Required |
|---|---|---|---|
| PROJECT_ID | GCP Project ID | - | ✅ |
| BASE_DOMAIN | Platform domain | cmbcluster.local | ✅ |
| GOOGLE_CLIENT_ID | OAuth Client ID | - | ✅ |
| GOOGLE_CLIENT_SECRET | OAuth Secret | - | ✅ |
| SECRET_KEY | JWT signing key | - | ✅ |
| MAX_INACTIVE_HOURS | Auto-cleanup time | 1H | ✅ |
| MAX_USER_PODS | Pods per user | 1 | ✅ |
| TOKEN_EXPIRE_HOURS | JWT expiration | 24 | ✅ |
| NAMESPACE | Kubernetes namespace | cmbcluster | ✅ |
| FILE_ENCRYPTION_KEY | Environment file encryption key | - | ⚠️ |

⚠️ Important: The FILE_ENCRYPTION_KEY is required for production deployments to encrypt uploaded environment files. See ENCRYPTION.md for details.

Resource Limits

Default user environment resources:

userEnvironment:
  defaultResources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 2000m
      memory: 4Gi
  storage:
    size: 10Gi
    storageClass: standard-rwo
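
To make the units above concrete: `100m` is 100 millicores (0.1 CPU core) and `4Gi` is 4 × 2^30 bytes. The sketch below converts these Kubernetes quantity strings and checks that requests do not exceed limits. It is a simplified parser covering only the suffixes used here, and the helper names are hypothetical.

```python
# Binary memory suffixes used by Kubernetes quantities (subset).
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_cpu(quantity: str) -> float:
    """Return CPU cores; '100m' means 100 millicores, i.e. 0.1 core."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity: str) -> int:
    """Return bytes for quantities with a Ki/Mi/Gi/Ti suffix or no suffix."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def requests_within_limits(resources: dict) -> bool:
    """Check that CPU and memory requests do not exceed their limits."""
    req, lim = resources["requests"], resources["limits"]
    return (
        parse_cpu(req["cpu"]) <= parse_cpu(lim["cpu"])
        and parse_memory(req["memory"]) <= parse_memory(lim["memory"])
    )


# The default values from the YAML above.
defaults = {
    "requests": {"cpu": "100m", "memory": "256Mi"},
    "limits": {"cpu": "2000m", "memory": "4Gi"},
}
print(parse_cpu("2000m"), parse_memory("4Gi"), requests_within_limits(defaults))
```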

Helm Configuration

Customize deployment in helm/values.yaml:

# Backend scaling
backend:
  replicaCount: 2
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi

# Frontend scaling  
frontend:
  replicaCount: 2
  
# Auto-scaling
autoscaling:
  enabled: true
  minReplicas: 1
  maxReplicas: 10
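
For intuition about how `minReplicas` and `maxReplicas` bound scaling, the sketch below applies the basic rule documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured range. It ignores the autoscaler's tolerance band and stabilization window, and the utilization numbers are illustrative since the values.yaml excerpt does not show a target.

```python
import math


def desired_replicas(current: int, current_util: float, target_util: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Apply the basic HPA scaling rule, then clamp to [min, max]."""
    raw = math.ceil(current * current_util / target_util)
    return max(min_replicas, min(max_replicas, raw))


# Two replicas running at 90% against a 60% target scale up.
print(desired_replicas(2, current_util=90, target_util=60))
```

With the defaults above, a load spike can never push the deployment past 10 replicas, and an idle system still keeps 1 running.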

Architecture Details

Component Responsibilities

🌐 Proxy (NGINX Ingress)

  • Route Info Send: Forward user requests to Hub
  • User Routing: Direct authenticated users to their pods
  • SSL Termination: Handle HTTPS certificates
  • Load Balancing: Distribute traffic across replicas

🎯 Hub (Backend Service)

  • Authenticate User: Google OAuth integration
  • Signed Out User Redirect: Send unauthenticated users to login
  • Volume Provide/Pod Create: Provision user resources
  • User Redirect: Route users to their environments
  • Cull Pods If Stale: Clean up inactive environments
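
The culling responsibility can be sketched as a comparison between each pod's last recorded activity and an inactivity cutoff. This is an assumption about how such a check might look, not the backend's actual code: the `stale_pods` helper, the pod names, and the way activity timestamps are tracked are all hypothetical. The one-hour default mirrors `MAX_INACTIVE_HOURS`.

```python
from datetime import datetime, timedelta, timezone


def stale_pods(last_activity: dict[str, datetime], now: datetime,
               max_inactive_hours: float = 1.0) -> list[str]:
    """Return names of pods whose last activity is older than the cutoff."""
    cutoff = now - timedelta(hours=max_inactive_hours)
    return sorted(name for name, seen in last_activity.items() if seen < cutoff)


# Illustrative data: one recently used pod and one idle for three hours.
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
activity = {
    "user-env-alice": now - timedelta(minutes=10),
    "user-env-bob": now - timedelta(hours=3),
}
print(stale_pods(activity, now))
```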

🔬 Pods + Volumes

  • Image Pull: Download user environment containers
  • User Session: Maintain persistent workspace state
  • Resource Isolation: Dedicated CPU/memory/storage per user
  • Data Persistence: User files survive pod restarts

Data Flow

1. User → Proxy → Hub (check authentication)
2. Hub → Google OAuth (if not authenticated)  
3. Hub → Create/Find User Pod
4. Proxy → User Pod (direct traffic)
5. User Pod → Streamlit App (research environment)
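
The five steps above amount to a small routing decision. The sketch below models it as a pure function under simplifying assumptions: `user` is `None` when no valid session exists, `pods` is an in-memory stand-in for the Hub's record of user pods, and the `user-env-<name>` naming is hypothetical.

```python
from typing import Optional


def route(user: Optional[str], pods: dict[str, str]) -> str:
    """Return the next hop for a request, following the data flow steps."""
    if user is None:
        return "redirect:google-oauth"       # step 2: not authenticated
    if user not in pods:
        pods[user] = f"user-env-{user}"      # step 3: create the user's pod
    return f"proxy:{pods[user]}"             # steps 4-5: send traffic to the pod


registry: dict[str, str] = {}
print(route(None, registry))
print(route("alice", registry))
```

A second request from the same user finds the existing entry and is routed to the same pod rather than creating a new one.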

Monitoring

Health Checks

# Check all services
kubectl get pods -n cmbcluster

# Backend health
curl https://api.yourdomain.com/health

# Frontend health  
curl https://yourdomain.com/_stcore/health

Logs

# Backend logs
kubectl logs -f deployment/cmbcluster-backend -n cmbcluster

# Frontend logs
kubectl logs -f deployment/cmbcluster-frontend -n cmbcluster

# User environment logs
kubectl logs <pod-name> -n cmbcluster

# Ingress logs
kubectl logs -f deployment/ingress-nginx-controller -n ingress-nginx

Metrics

The platform exposes Prometheus metrics:

# Backend metrics
cmbcluster_active_users
cmbcluster_pods_created_total
cmbcluster_authentication_requests_total

# Resource metrics  
cmbcluster_cpu_usage
cmbcluster_memory_usage
cmbcluster_storage_usage
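
For readers unfamiliar with what a scrape of these metrics looks like, the sketch below renders values in the Prometheus text exposition format. It is illustrative only: the backend presumably relies on a Prometheus client library, the gauge/counter typing is an assumption based on the `_total` naming convention, and the numbers are made up.

```python
def render_metrics(values: dict[str, float]) -> str:
    """Render metrics as Prometheus text: a TYPE line, then 'name value'."""
    lines = []
    for name, value in sorted(values.items()):
        # Assumption: names ending in _total are counters, the rest gauges.
        kind = "counter" if name.endswith("_total") else "gauge"
        lines.append(f"# TYPE {name} {kind}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Sample values for demonstration only.
print(render_metrics({
    "cmbcluster_active_users": 3,
    "cmbcluster_pods_created_total": 17,
}))
```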

Security

Authentication Flow

sequenceDiagram
    participant U as User
    participant P as Proxy
    participant H as Hub  
    participant G as Google OAuth
    
    U->>P: Access platform
    P->>H: Check authentication
    H->>G: Redirect to OAuth
    G->>H: Return user info
    H->>H: Create JWT token
    H->>P: Redirect with token
    P->>U: Access granted
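
The "Create JWT token" step can be illustrated with a standard-library sketch of HS256 signing and verification. This shows the mechanism only and is not the backend's implementation, which may well use a JWT library instead. The claim names (`sub`, `iat`, `exp`) are standard JWT claims, while the helper names and the choice of the user's email as subject are assumptions. The 24-hour default mirrors `TOKEN_EXPIRE_HOURS`, and the secret plays the role of `SECRET_KEY`.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def create_token(email: str, secret: str, expire_hours: int = 24,
                 now: Optional[float] = None) -> str:
    """Build header.payload.signature signed with HMAC-SHA256."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": email, "iat": issued, "exp": issued + expire_hours * 3600}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    signature = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the payload if the signature matches and the token is unexpired."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        return None
    expected = hmac.new(
        secret.encode(), f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        return None
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    return payload if payload["exp"] > current else None


token = create_token("researcher@example.org", "change-me")
print(verify_token(token, "change-me") is not None)
```

A token signed with one key fails verification under any other key, which is why `SECRET_KEY` must stay private and be the same across backend replicas.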

Security Features

  • TLS 1.3: All traffic encrypted in transit
  • Network Policies: Pod-to-pod communication restrictions
  • Pod Security: Non-root containers, read-only filesystems
  • RBAC: Kubernetes role-based access control
  • Secrets Management: Encrypted credential storage

Data Protection

  • Workspace Isolation: Each user has dedicated storage
  • Encryption at Rest: Persistent volumes encrypted
  • Data Retention: Configurable cleanup policies
  • Backup Support: Regular workspace backups

Troubleshooting

Common Issues

Pod Won't Start

# Check pod status
kubectl get pods -n cmbcluster
kubectl describe pod <pod-name> -n cmbcluster
kubectl logs <pod-name> -n cmbcluster

# Common causes:
# - Image pull errors
# - Resource constraints  
# - Storage mounting issues
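
When reading `kubectl describe pod` output, the reason strings Kubernetes reports usually point at one of these three causes. The lookup below is a rough, hypothetical triage aid: the reason names are ones Kubernetes emits, but the mapping is a simplification (for example, `Unschedulable` can also result from taints or node selectors).

```python
# Rough mapping from Kubernetes reason strings to the common causes listed above.
CAUSES = {
    "ErrImagePull": "image pull error",
    "ImagePullBackOff": "image pull error",
    "OOMKilled": "resource constraints",
    "Unschedulable": "resource constraints",
    "FailedMount": "storage mounting issue",
    "FailedAttachVolume": "storage mounting issue",
}


def likely_cause(reason: str) -> str:
    """Return the probable cause category for a reported reason."""
    return CAUSES.get(reason, "unknown: inspect the pod events and logs")


for reason in ("ImagePullBackOff", "FailedMount", "CrashLoopBackOff"):
    print(f"{reason}: {likely_cause(reason)}")
```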

Authentication Errors

# Check OAuth configuration
kubectl get secret cmbcluster-secrets -n cmbcluster -o yaml

# Verify redirect URLs in Google Cloud Console
# Ensure domain matches configuration

Ingress Issues

# Check ingress status
kubectl get ingress -n cmbcluster
kubectl describe ingress cmbcluster-ingress -n cmbcluster

# Verify certificates
kubectl get certificates -n cmbcluster
kubectl describe certificate cmbcluster-tls -n cmbcluster

Storage Problems

# Check persistent volumes
kubectl get pv
kubectl get pvc -n cmbcluster

# Storage class issues
kubectl get storageclass

Debug Commands

# Get all resources
kubectl get all -n cmbcluster

# Check events
kubectl get events -n cmbcluster --sort-by='.lastTimestamp'

# Pod shell access (for debugging)
kubectl exec -it <pod-name> -n cmbcluster -- /bin/bash

# Port forwarding for local access
kubectl port-forward service/cmbcluster-backend 8000:80 -n cmbcluster

Testing

Unit Tests

# Run from the repository root; the subshells keep the working directory unchanged

# Backend tests
(cd backend && python -m pytest tests/)

# Frontend tests
(cd frontend && python -m pytest tests/)

Integration Tests

# End-to-end testing
pytest tests/integration/

# Load testing
locust -f tests/load/locustfile.py

Development Testing

# Test local deployment
make dev
curl http://localhost:8000/health
curl http://localhost:8501/_stcore/health

Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Workflow

# 1. Fork and clone repository
git clone https://github.com/yourusername/cmbcluster.git
cd cmbcluster

# 2. Create feature branch
git checkout -b feature/your-feature-name

# 3. Make changes and test locally
make dev
make test

# 4. Stage, commit, and push
git add -A
git commit -m "Add your feature"
git push origin feature/your-feature-name

# 5. Create pull request

Code Standards

  • Python: Follow PEP 8, use Black formatter
  • Docker: Multi-stage builds, minimal base images
  • Kubernetes: Follow security best practices
  • Documentation: Update README and inline docs

Roadmap

Near Term (Q1 2025)

  • GPU Support: CUDA-enabled environments for ML workloads
  • Advanced Monitoring: Grafana dashboards and alerting
  • User Quotas: Storage and compute limits per user
  • Backup System: Automated workspace backups

Medium Term (Q2-Q3 2025)

  • Multi-cloud Support: AWS EKS and Azure AKS deployment
  • Jupyter Integration: Built-in Jupyter notebook support
  • Collaborative Features: Real-time collaboration tools
  • Data Pipeline Integration: Connect with external data sources

Long Term (Q4 2025+)

  • Enterprise SSO: SAML and LDAP integration
  • Advanced Analytics: Usage analytics and cost optimization
  • Custom Environments: User-defined container images
  • Federation: Multi-cluster deployments

Support

Getting Help

Community

  • Slack: Join our CMBCluster Slack
  • Monthly Meetings: First Friday of each month at 10 AM PST
  • Office Hours: Wednesdays 2-3 PM PST

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Built on the JupyterHub architecture pattern
  • Inspired by the research community needs
  • Special thanks to early adopters and contributors
  • Container orchestration powered by Kubernetes

CMBCluster - Empowering research through scalable, secure, and collaborative computing environments.

For more information, visit our GitHub repository or contact us at [email protected].
