
Development Roadmap (2025 Q3) #7736

@zhyncs

Here is the development roadmap for 2025 Q3. Contributions and feedback are welcome (Join Slack). The next roadmap, for 2025 Q4, is #12780.

Focus

  • Feature compatibility and reliability: Make all advanced features (such as P/D disaggregation, all parallelisms, speculative decoding, and load balancing) fully compatible with each other, and bring them to production-level reliability.
  • Usability: easy installation on all backends; simple launch scripts for large-scale deployments.
  • Kernel optimizations for new generations of hardware (Blackwell, MI350, TPU, etc.).
  • Reinforcement learning training framework integration.

Core refactor

Speculative decoding

KVCache system

Kernel

Parallelism

PD Disaggregation

Quantization

RL framework integration

  • AREAL, slime, veRL integration (sorted alphabetically)
  • Faster weight sync
  • Reproduce DeepSeek/Kimi + GRPO training

Multi-LoRA serving

Hardware

Model coverage

API layer
