Repositories list
22 repositories
- Intelligent Mixture-of-Models Router for Efficient LLM Inference
- Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
- Community-maintained hardware plugin for vLLM on Ascend
- rfcs (Public)