GitHub - ElliotXie/CASSIA: CASSIA: A multi-agent LLM-based single-cell annotation framework

CASSIA (Collaborative Agent System for Single-cell Interpretable Annotation) is a tool that enhances cell type annotation using multi-agent Large Language Models (LLMs).

🌐 CASSIA Web UI - A web interface supporting most of CASSIA’s functionality

📚 Complete R Documentation/Vignette

📝 Example Python workflow/Vignette

🤖 LLMs Annotation Benchmark

📰 News

2025-05-05
📊 CASSIA annotation benchmark is now online!
The latest update introduces a new benchmarking platform that evaluates how different LLMs perform on single-cell annotation tasks, including accuracy and cost.
LLaMA4 Maverick, Gemini 2.5 Flash, and DeepSeekV3 are the top three most balanced options—nearly free!
🔧 A new auto-merge function unifies CASSIA output across different levels, making subclustering much easier.
🐛 Fixed a bug in the annotation boost agent to improve downstream refinement.

2025-04-19
🔄 CASSIA adds a retry mechanism and optimized report storage!
The latest update introduces an automatic retry mechanism for failed tasks and optimizes how reports are stored for easier access and management.
🎨 The CASSIA logo has been drawn and added to the project!

2025-04-17
🚀 CASSIA now supports automatic single-cell annotation benchmarking!
The latest update introduces a new function that enables fully automated benchmarking of single-cell annotation. Results are evaluated automatically using LLMs, achieving performance on par with human experts.
A dedicated benchmark website is coming soon—stay tuned!

🏗️ Installation (R, for Python tutorial please visit here)

# Install dependencies
install.packages("devtools")
install.packages("reticulate")

# Install CASSIA
devtools::install_github("ElliotXie/CASSIA/CASSIA_R")

Note: If the environment is not set up correctly the first time, restart R and run the code below:

library(CASSIA)
setup_cassia_env()

🔑 Set Up API Keys

It should take about 3 minutes to get your API key.

We recommend starting with OpenRouter since it provides access to most models through a single API key.

# For OpenRouter
setLLMApiKey("your_openrouter_api_key", provider = "openrouter", persist = TRUE)

# For OpenAI
setLLMApiKey("your_openai_api_key", provider = "openai", persist = TRUE)

# For Anthropic
setLLMApiKey("your_anthropic_api_key", provider = "anthropic", persist = TRUE)

API Provider Guides:

🧬 Example Data

CASSIA includes example marker data in two formats:

# Load example data
markers_unprocessed <- loadExampleMarkers(processed = FALSE)  # Direct Seurat output
markers_processed <- loadExampleMarkers(processed = TRUE)     # Processed format
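
To get a feel for the unprocessed format, it can help to look at the columns that Seurat's `FindAllMarkers()` returns. The sketch below builds a small made-up table with those columns and keeps the top markers per cluster by `avg_log2FC`. The gene names and values are illustrative only and are not CASSIA's example data, and the helper `top_markers_per_cluster()` is hypothetical, not part of CASSIA.

```r
# Toy table with the columns Seurat's FindAllMarkers() returns.
# Values are made up for illustration; this is not CASSIA's example data.
toy_markers <- data.frame(
  p_val      = c(1e-50, 1e-40, 1e-30, 1e-45, 1e-20, 1e-10),
  avg_log2FC = c(3.2, 2.5, 1.1, 4.0, 2.2, 0.8),
  pct.1      = c(0.95, 0.90, 0.60, 0.98, 0.70, 0.50),
  pct.2      = c(0.05, 0.10, 0.30, 0.02, 0.20, 0.40),
  p_val_adj  = c(1e-46, 1e-36, 1e-26, 1e-41, 1e-16, 1e-6),
  cluster    = c("0", "0", "0", "1", "1", "1"),
  gene       = c("CD3D", "CD3E", "IL7R", "MS4A1", "CD79A", "CD74"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of CASSIA): keep the n genes with the
# highest avg_log2FC in each cluster.
top_markers_per_cluster <- function(markers, n = 2) {
  by_cluster <- split(markers, markers$cluster)
  top <- lapply(by_cluster, function(df) {
    df <- df[order(df$avg_log2FC, decreasing = TRUE), ]
    head(df, n)
  })
  out <- do.call(rbind, top)
  rownames(out) <- NULL
  out
}

top2 <- top_markers_per_cluster(toy_markers, n = 2)
print(top2[, c("cluster", "gene", "avg_log2FC")])
```

With the real example data, `head(markers_unprocessed)` and `colnames(markers_unprocessed)` show the actual layout.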

⚙️ Pipeline Usage

# The default provider is set to OpenRouter.

runCASSIA_pipeline(
    output_file_name = "cassia_test",   # Base name for output files
    tissue = "Large Intestine",         # Tissue type (e.g., "brain")
    species = "Human",                  # Species (e.g., "human")
    marker = markers_unprocessed,       # Marker data from Seurat's FindAllMarkers()
    max_workers = 4                     # Number of parallel workers
)

🤖 Supported Models

You can choose any model for annotation and scoring. Some commonly used models are listed below. OpenRouter supports most currently popular models, although some have not been extensively benchmarked in the CASSIA paper, so feel free to experiment with them.

OpenRouter

google/gemini-2.5-flash-preview: One of the best-performing low-cost models, comparable to models like gpt-4o (most recommended)
deepseek/deepseek-chat-v3-0324: One of the best-performing open-source models; tends to give very detailed annotations (recommended)
deepseek/deepseek-chat-v3-0324:free: Free but slower

OpenAI

gpt-4o: Used in the benchmark

Anthropic

claude-3-7-sonnet-latest: The latest high-performance model

📤 Output

The pipeline generates the following key files:

Complete Annotation Results CSV File
Annotation Summary HTML Report
Annotation Boost Agent Report for Low Quality Annotation

🧰 Troubleshooting

Authentication (Error 401)

# Check if API key is set correctly
key <- Sys.getenv("ANTHROPIC_API_KEY")
print(key)  # Should not be empty

# Reset API key if needed
setLLMApiKey("your_api_key", provider = "anthropic", persist = TRUE)
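
When several providers are in use, a quick check of which keys the current R session can see may save some guesswork. The sketch below is a hypothetical helper, not part of CASSIA. Only `ANTHROPIC_API_KEY` appears in this README; the OpenRouter and OpenAI variable names are assumptions. It reports whether each key is set without printing the key itself.

```r
# Hypothetical helper (not part of CASSIA): report which provider keys are
# visible to the current R session. ANTHROPIC_API_KEY is taken from the
# example above; the other two variable names are assumptions.
check_api_keys <- function(vars = c(
    openrouter = "OPENROUTER_API_KEY",  # assumed variable name
    openai     = "OPENAI_API_KEY",      # assumed variable name
    anthropic  = "ANTHROPIC_API_KEY"
  )) {
  is_set <- nzchar(Sys.getenv(vars))    # TRUE where the variable is non-empty
  names(is_set) <- names(vars)
  is_set
}

status <- check_api_keys()
for (provider in names(status)) {
  message(provider, ": ", if (status[[provider]]) "set" else "missing")
}
```

If a key shows as missing after `setLLMApiKey(..., persist = TRUE)`, restarting R before checking again is a reasonable next step.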

File Errors

Use absolute paths when necessary
Check file permissions
Ensure files aren't open in other programs
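
The first two checks above can be sketched in base R. The helper `check_output_path()` below is hypothetical and not part of CASSIA; it resolves an output path against its directory and reports whether that directory exists and is writable. Base R has no portable way to tell whether another program has a file open, so that check remains manual.

```r
# Hypothetical helper (not part of CASSIA): basic checks before writing output.
check_output_path <- function(path) {
  # Resolve the directory first so that a not-yet-existing file still
  # gets an absolute path.
  out_dir  <- normalizePath(dirname(path), mustWork = FALSE)
  abs_path <- file.path(out_dir, basename(path))
  list(
    absolute_path = abs_path,
    dir_exists    = dir.exists(out_dir),
    # file.access() returns 0 when the requested access is allowed (mode 2 = write)
    dir_writable  = unname(file.access(out_dir, mode = 2) == 0),
    file_exists   = file.exists(abs_path)
  )
}

res <- check_output_path(file.path(tempdir(), "cassia_test.csv"))
str(res)
```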

Best Practices

Keep API keys secure
Maintain sufficient API credits

Note: This README covers only basic CASSIA functionality. For a complete tutorial including advanced features and detailed examples, please visit: CASSIA Complete Tutorial.

📖 Citation

📖 Read our preprint (v2, latest)

📖 Original preprint (v1, historical)

CASSIA: a multi-agent large language model for reference-free, interpretable, and automated cell annotation of single-cell RNA-sequencing data
Elliot Xie, Lingxin Cheng, Jack Shireman, Yujia Cai, Jihua Liu, Chitrasen Mohanty, Mahua Dey, Christina Kendziorski
bioRxiv 2024.12.04.626476; doi: https://doi.org/10.1101/2024.12.04.626476

📬 Contact

If you have any questions or need help, feel free to email us. We are always happy to help: [email protected]

If you find this project helpful, please share it with your friends and give this repo a star ⭐. Many thanks!
