Visit our Hugging Face organization (click the links above), search for models and datasets starting with "LIMI", and you will find everything you need. Enjoy!
To learn more about LIMI, feel free to explore our documentation and resources. Our release consists of the following sections:
- Model Zoo & Quick Start: Basic usage and demonstrations with Transformers, vLLM, and SGLang for LIMI and LIMI-Air;
- Training: Instructions for fine-tuning and post-training with slime framework and distributed training scripts;
- Evaluation: Comprehensive evaluation suite with metrics for agentic capabilities assessment;
- Framework Integration: Usage of LIMI with frameworks for agentic applications, tool use, and reasoning tasks.
- 2025.09.23: 🚀 LIMI paper is now available on arXiv! Check out our paper for detailed methodology and experimental results.
- 2025.09.23: 🤗 Released LIMI models on Hugging Face! Both LIMI (355B) and LIMI-Air (106B) are now available.
- 2025.09.23: 📊 Released the LIMI training dataset with 78 carefully curated samples on Hugging Face.
LIMI establishes the Agency Efficiency Principle: machine autonomy emerges not from data abundance but from strategic curation of high-quality agentic demonstrations. This discovery fundamentally reshapes how we develop autonomous AI systems, suggesting that mastering agency requires understanding its essence, not scaling training data. As industries transition from thinking AI to working AI, LIMI provides a paradigm for the sustainable cultivation of truly agentic intelligence.
- A New Data Paradigm: We challenge the traditional "more is better" data philosophy by achieving superior AI agency with only 78 high-quality samples, demonstrating that data quality far outweighs quantity.
- Resource Efficiency: By focusing on core capabilities instead of massive datasets, we significantly reduce the computational resources required for training while effectively boosting the model's performance on complex tasks.
- Focus on Productive Workers: Our approach cultivates AI's essential ability to act as a "worker" that autonomously identifies problems, plans, and executes tasks, rather than merely "thinking" and "generating."
- Outperforming Leading Models: On AgencyBench, LIMI significantly surpasses multiple large-scale models, achieving a performance boost of up to 53.7% with only 1/128th of the sample size.
Our models achieve state-of-the-art performance across multiple agentic evaluation tasks:
| Model | FTFC (↑) | RC@3 (↑) | SR@3 (↑) | Avg. |
|---|---|---|---|---|
| GLM-4.5-Air | 15.0 | 16.1 | 20.0 | 17.0 |
| GLM-4.5 | 37.8 | 50.0 | 47.4 | 45.1 |
| GLM-4.5-CodeAgent | 48.0 | 48.0 | 47.5 | 47.8 |
| LIMI-Air | 35.4 | 34.3 | 33.1 | 34.3 |
| LIMI | 71.7 | 74.2 | 74.6 | 73.5 |
For detailed benchmark results, experimental setup, and comprehensive comparisons, please refer to our paper.
Our LIMI models are available on Hugging Face 🤗:
| Model | Backbone | Size | Link |
|---|---|---|---|
| LIMI | GLM-4.5 | 355B | 🤗 |
| LIMI-Air | GLM-4.5-Air | 106B | 🤗 |
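If you prefer to fetch the weights ahead of time (for example, on a machine with better bandwidth), the sketch below uses `huggingface_hub`. The `GAIR/LIMI` repo id is taken from the quick-start code further down; the `GAIR/LIMI-Air` id is our assumption for the Air variant.

```python
# Optional: pre-download model weights with huggingface_hub.
from huggingface_hub import snapshot_download

local_dir = snapshot_download("GAIR/LIMI")        # LIMI (355B); id from the quick start
# local_dir = snapshot_download("GAIR/LIMI-Air")  # assumed id for LIMI-Air (106B)
print(local_dir)  # local path to the downloaded snapshot
```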
We release our datasets through Hugging Face 🤗:
| Dataset | Description | Link |
|---|---|---|
| LIMI | Updated training set for the paper (78 samples) | 🤗 |
Our models are fine-tuned from GLM-4.5 and are compatible with most mainstream frameworks, such as HF Transformers, SGLang, Megatron, and slime.
Start with HF Transformers
```bash
# Install required packages (accelerate is needed for device_map="auto")
pip install transformers accelerate
```
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Initialize model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "GAIR/LIMI",
    torch_dtype="auto",
    trust_remote_code=True,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("GAIR/LIMI", trust_remote_code=True)

# Prepare input messages (we use this template and system prompt during training and inference)
messages = [
    {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
    {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."}
]

# Format input using the chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Tokenize input
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Generate response
outputs = model.generate(
    **inputs,
    max_new_tokens=128000,
    temperature=0.6,
    top_p=0.95,
    do_sample=True
)

# Decode and print only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
Start with vLLM
```bash
# Install required packages
pip install vllm
```
```python
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

# Initialize the model
llm = LLM(
    model="GAIR/LIMI",
    tensor_parallel_size=4,  # adjust based on available GPUs
    trust_remote_code=True,
    swap_space=60,
    gpu_memory_utilization=0.96,
)

# Prepare input messages (we use this template and system prompt during training and inference)
messages = [
    {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
    {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."}
]

# Set up the tokenizer and apply the chat template
tokenizer = AutoTokenizer.from_pretrained("GAIR/LIMI", trust_remote_code=True)
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Configure generation parameters
sampling_params = SamplingParams(
    temperature=0.6,
    max_tokens=128000,
    top_p=0.95,
)

# Generate and print the response
output = llm.generate(text, sampling_params)
print(output[0].outputs[0].text)
```
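Start with SGLang

The quick-start sections above cover Transformers and vLLM; SGLang (also listed under Model Zoo & Quick Start) can serve the model as well. Below is a minimal sketch using SGLang's standard server CLI; adjust `--tp` to your available GPUs, and treat the exact flags and port as a starting point rather than a verified configuration.

```bash
# Install SGLang (see the SGLang docs for the recommended extras)
pip install "sglang[all]"

# Launch an OpenAI-compatible server for LIMI
python -m sglang.launch_server \
    --model-path GAIR/LIMI \
    --tp 4 \
    --trust-remote-code \
    --port 30000
```

Once the server is running, requests can be sent to its OpenAI-compatible `/v1/chat/completions` endpoint on the chosen port, using the same system prompt and chat template shown above.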
We use the slime framework for training, which provides a convenient and efficient training pipeline.
1. Environment Setup
   - Set up slime following its official documentation.
   - Ensure all dependencies are properly installed and configured.
2. Data Preparation
   - Obtain the LIMI dataset from 🤗 Hugging Face (a download sketch follows this list).
3. Configuration
   - Use our provided training script.
   - The script file contains all necessary hyperparameters and training settings.
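As referenced in the Data Preparation step, here is a minimal sketch for pulling the training set with the `datasets` library. The dataset id `GAIR/LIMI` and the `train` split name are assumptions based on the model repo naming; check the dataset card linked above for the actual id.

```python
# Download the 78-sample LIMI training set (dataset id assumed).
from datasets import load_dataset

dataset = load_dataset("GAIR/LIMI", split="train")
print(len(dataset))   # expected: 78 curated samples
print(dataset[0])     # inspect one training example
```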
To support the rigorous assessment of agentic capabilities described in this work, we release a comprehensive evaluation suite. The framework is designed to benchmark the agency of large language models (LLMs) on the held-out evaluation subset.
The evaluation module implements three key metrics: First-Turn Functional Completeness (FTFC), Success Rate (SR@R), and Remaining Chances (RC@R), with a computational budget of R = 3 rounds. For the detailed benchmark tasks, please refer to AgencyBench.
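For intuition only, the sketch below shows how per-task results might be aggregated into FTFC, SR@3, and RC@3. The record fields and the RC normalization are hypothetical placeholders; the authoritative metric definitions are in the paper and the released evaluation suite.

```python
# Hypothetical aggregation sketch for AgencyBench-style metrics.
# Field names and the RC normalization are placeholders, not the
# official definitions (see the paper / evaluation suite for those).
from dataclasses import dataclass
from typing import Optional

R = 3  # computational budget: rounds per task

@dataclass
class TaskRecord:
    first_turn_complete: float    # fraction of requirements met in turn 1
    success_round: Optional[int]  # 1-based round of success, or None

def aggregate(records: list[TaskRecord]) -> dict[str, float]:
    n = len(records)
    ftfc = sum(r.first_turn_complete for r in records) / n
    sr = sum(1 for r in records
             if r.success_round is not None and r.success_round <= R) / n
    # Remaining chances: fraction of the budget left after success,
    # counted as 0.0 for failures (placeholder normalization).
    rc = sum((R - r.success_round) / R
             for r in records
             if r.success_round is not None and r.success_round <= R) / n
    return {"FTFC": ftfc, f"SR@{R}": sr, f"RC@{R}": rc,
            "Avg.": (ftfc + sr + rc) / 3}
```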
This project is licensed under the MIT License - see the LICENSE file for details.
```bibtex
@misc{xiao2025limiagency,
      title={LIMI: Less is More for Agency},
      author={Yang Xiao and Mohan Jiang and Jie Sun and Keyu Li and Jifan Lin and Yumin Zhuang and Ji Zeng and Shijie Xia and Qishuo Hua and Xuefeng Li and Xiaojie Cai and Tongyu Wang and Yue Zhang and Liming Liu and Xia Wu and Jinlong Hou and Yuan Cheng and Wenjie Li and Xiang Wang and Dequan Wang and Pengfei Liu},
      year={2025},
      eprint={2509.17567},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2509.17567},
}
```