Clustrix


Clustrix is a Python package that enables seamless distributed computing on clusters. With a simple decorator, you can execute any Python function remotely on cluster resources while automatically handling dependency management, environment setup, and result collection.

Features

  • Simple Decorator Interface: Just add @cluster to any function
  • Interactive Jupyter Widget: %%clusterfy magic command with GUI configuration manager
  • Multiple Cluster Support: SLURM, PBS, SGE, Kubernetes, and SSH
  • Automatic Dependency Management: Captures and replicates your exact Python environment
  • Native Cost Monitoring: Built-in cost tracking for all major cloud providers
  • Loop Parallelization: Automatically distributes loops across cluster nodes
  • Flexible Configuration: Easy setup with config files, environment variables, or interactive widget
  • Error Handling: Comprehensive error reporting and job monitoring

Quick Start

Installation

pip install clustrix

Basic Configuration

import clustrix

# Configure your cluster
clustrix.configure(
    cluster_type='slurm',
    cluster_host='your-cluster.example.com',
    username='your-username',
    default_cores=4,
    default_memory='8GB'
)

Using the Decorator

from clustrix import cluster

@cluster(cores=8, memory='16GB', time='02:00:00')
def expensive_computation(data, iterations=1000):
    import numpy as np
    arr = np.asarray(data)  # accept plain Python lists as well as arrays
    result = 0.0
    for _ in range(iterations):
        result += np.sum(arr ** 2)
    return result

# This function will execute on the cluster
data = [1, 2, 3, 4, 5]
result = expensive_computation(data, iterations=10000)
print(f"Result: {result}")

Jupyter Notebook Integration

Clustrix provides seamless integration with Jupyter notebooks through an interactive widget:

First, import clustrix in its own cell (this auto-loads the magic command):

import clustrix

Then invoke the widget in a separate cell, since cell magics must appear at the top of a cell:

%%clusterfy
# Interactive widget appears with:
# - Dropdown to select configurations
# - Forms to create/edit cluster setups
# - One-click configuration application
# - Save/load configurations to files

The widget includes pre-built templates for:

  • Local Development: Run jobs on your local machine
  • AWS GPU Instances: p3.2xlarge, p3.8xlarge templates
  • Google Cloud: CPU and GPU instance configurations
  • Azure: Virtual machine templates with GPU support
  • SLURM HPC: University cluster configurations
  • Kubernetes: Container-based execution

Configuration File

Create a clustrix.yml file in your project directory:

cluster_type: slurm
cluster_host: cluster.example.com
username: myuser
key_file: ~/.ssh/id_rsa

default_cores: 4
default_memory: 8GB
default_time: "01:00:00"
default_partition: gpu

remote_work_dir: /scratch/myuser/clustrix
conda_env_name: myproject

auto_parallel: true
max_parallel_jobs: 50
cleanup_on_success: true

module_loads:
  - python/3.9
  - cuda/11.2

environment_variables:
  CUDA_VISIBLE_DEVICES: "0,1"
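To apply this file explicitly from Python, one option is to load the YAML yourself and forward it to clustrix.configure(). A minimal sketch, assuming PyYAML is installed and that every key in clustrix.yml maps to a configure() keyword:

import yaml
import clustrix

# Read the project's clustrix.yml and pass its keys to configure().
with open("clustrix.yml") as f:
    settings = yaml.safe_load(f)

clustrix.configure(**settings)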

Advanced Usage

Cost Monitoring

Clustrix includes built-in cost monitoring for cloud providers:

from clustrix import cost_tracking_decorator, get_cost_monitor

# Automatic cost tracking with decorator
@cost_tracking_decorator('aws', 'p3.2xlarge')  
@cluster(cores=8, memory='60GB')
def expensive_training():
    # Your training code here
    pass

# Manual cost monitoring
monitor = get_cost_monitor('gcp')
cost_estimate = monitor.estimate_cost('n2-standard-4', hours_used=2.0)
print(f"Estimated cost: ${cost_estimate.estimated_cost:.2f}")

# Get pricing information
pricing = monitor.get_pricing_info()
recommendations = monitor.get_cost_optimization_recommendations()

Supported cloud providers: AWS, Google Cloud, Azure, Lambda Cloud

Custom Resource Requirements

@cluster(
    cores=16,
    memory='32GB',
    time='04:00:00',
    partition='gpu',
    environment='tensorflow-env'
)
def train_model(data, epochs=100):
    # Your machine learning code here
    pass

Manual Parallelization Control

@cluster(parallel=False)  # Disable automatic loop parallelization
def sequential_computation(data):
    result = []
    for item in data:
        result.append(process_item(item))
    return result

@cluster(parallel=True)   # Enable automatic loop parallelization
def parallel_computation(data):
    results = []
    for item in data:  # This loop will be automatically distributed
        results.append(expensive_operation(item))
    return results

Different Cluster Types

# SLURM cluster
clustrix.configure(cluster_type='slurm', cluster_host='slurm.example.com')

# PBS cluster  
clustrix.configure(cluster_type='pbs', cluster_host='pbs.example.com')

# Kubernetes cluster
clustrix.configure(cluster_type='kubernetes')

# Simple SSH execution (no scheduler)
clustrix.configure(cluster_type='ssh', cluster_host='server.example.com')
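SGE clusters follow the same pattern. The exact cluster_type string is not shown above, so treat 'sge' here as an assumption and verify it against the documentation:

# SGE cluster (cluster_type string assumed)
clustrix.configure(cluster_type='sge', cluster_host='sge.example.com')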

Command Line Interface

# Configure Clustrix
clustrix config --cluster-type slurm --cluster-host cluster.example.com --cores 8

# Check current configuration
clustrix config

# Load configuration from file
clustrix load my-config.yml

# Check cluster status
clustrix status

How It Works

  1. Function Serialization: Clustrix captures your function, arguments, and dependencies using advanced serialization (a sketch of the idea follows this list)
  2. Environment Replication: Creates an identical Python environment on the cluster with all required packages
  3. Job Submission: Submits your function as a job to the cluster scheduler
  4. Execution: Runs your function on cluster resources with specified requirements
  5. Result Collection: Automatically retrieves results once execution completes
  6. Cleanup: Optionally cleans up temporary files and environments
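To make step 1 concrete, here is a minimal sketch of serializing a function together with its arguments using cloudpickle. This illustrates the general technique, not Clustrix's actual internals:

import cloudpickle

def add(a, b):
    return a + b

# Bundle the function and its arguments into a single byte string...
payload = cloudpickle.dumps((add, (2, 3), {}))

# ...ship `payload` to the cluster, then reconstruct and call it there.
func, args, kwargs = cloudpickle.loads(payload)
assert func(*args, **kwargs) == 5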

Supported Cluster Types

  • SLURM: Full support for Slurm Workload Manager
  • PBS/Torque: Support for PBS Professional and Torque
  • SGE: Sun Grid Engine support
  • Kubernetes: Execute jobs as Kubernetes pods
  • SSH: Direct execution via SSH (no scheduler)

Dependencies

Clustrix automatically handles dependency management by:

  • Capturing your current Python environment with pip freeze (see the sketch after this list)
  • Creating virtual environments on cluster nodes
  • Installing exact package versions to match your local environment
  • Supporting conda environments for complex scientific software stacks
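As a rough illustration of the capture step, a local environment can be snapshotted with pip freeze. This is a sketch of the general approach, not Clustrix's exact mechanism:

import subprocess
import sys

# Record the active environment's exact package versions.
frozen = subprocess.run(
    [sys.executable, "-m", "pip", "freeze"],
    capture_output=True, text=True, check=True,
).stdout

# Write a requirements file that a remote virtualenv can install from.
with open("requirements.txt", "w") as f:
    f.write(frozen)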

Error Handling and Monitoring

import clustrix
from clustrix import ClusterExecutor

# Monitor job status
executor = ClusterExecutor(clustrix.get_config())
job_id = "12345"
status = executor._check_job_status(job_id)

# Cancel jobs if needed
executor.cancel_job(job_id)
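For long-running jobs, the same calls can be wrapped in a simple polling loop. A sketch only; the status strings returned by _check_job_status are assumptions here:

import time

# Poll until the job leaves the queue (status values assumed for illustration).
while True:
    status = executor._check_job_status(job_id)
    if status not in ("pending", "running"):
        break
    time.sleep(30)

print(f"Job {job_id} finished with status: {status}")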

Examples

Machine Learning Training

@cluster(cores=8, memory='32GB', time='12:00:00', partition='gpu')
def train_neural_network(training_data, model_config):
    import tensorflow as tf
    
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
    # training_data should yield (features, labels) pairs, e.g. a tf.data.Dataset
    model.fit(training_data, epochs=model_config['epochs'])
    
    return model.get_weights()

# Execute training on cluster
weights = train_neural_network(my_data, {'epochs': 50})

Scientific Computing

@cluster(cores=16, memory='64GB')
def monte_carlo_simulation(n_samples=1000000):
    import numpy as np
    
    # This loop will be automatically parallelized
    results = []
    for i in range(n_samples):
        x, y = np.random.random(2)
        if x*x + y*y <= 1:
            results.append(1)
        else:
            results.append(0)
    
    pi_estimate = 4 * sum(results) / len(results)
    return pi_estimate

pi_value = monte_carlo_simulation(10000000)

Data Processing Pipeline

@cluster(cores=8, memory='16GB')
def process_large_dataset(file_path, chunk_size=10000):
    import pandas as pd
    
    results = []
    for chunk in pd.read_csv(file_path, chunksize=chunk_size):
        # Process each chunk
        processed = chunk.groupby('category').sum()
        results.append(processed)
    
    return pd.concat(results)

# Process data on cluster
processed_data = process_large_dataset('/path/to/large_file.csv')

Contributing

We welcome contributions! Please see our Contributing Guide for details.

License

Clustrix is released under the MIT License. See LICENSE for details.

Support