ModelTC

56 repositories

LightLLM
Public
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
nlp deep-learning llama gpt model-serving llm openai-triton
Python
•
Apache License 2.0
•277•3.6k•79•22•Updated Sep 6, 2025
LightX2V
Public
Light Video Generation Inference Framework
video-generation diffusion-models hunyuan-video wan-video
Python
•30•541•22•1•Updated Sep 5, 2025
mtc-token-healing
Public
Token healing implementation in Rust
Rust
•
Apache License 2.0
•0•4•0•4•Updated Sep 5, 2025
ComfyUI-Lightx2vWrapper
Public
ComfyUI custom node for lightx2v
comfyui comfyui-nodes
Python
•
MIT License
•5•36•0•0•Updated Sep 5, 2025
LightKernel
Public
HTML
•
Apache License 2.0
•0•1•0•0•Updated Sep 5, 2025
lightllm-blog
Public
SCSS
•
MIT License
•0•1•0•0•Updated Sep 4, 2025
general-sam-py
Public
Python bindings for general-sam and some utilities
Python
•
Apache License 2.0
•0•4•0•3•Updated Sep 2, 2025
Qwen-Image-Lightning
Public
Qwen-Image-Lightning: Speed up Qwen-Image model with distillation
Python
•
Apache License 2.0
•24•575•5•0•Updated Aug 29, 2025
Wan2.2-Lightning
Public
Wan2.2-Lightning: Speed up the Wan2.2 model with distillation
Python
•
Apache License 2.0
•371•130•14•0•Updated Aug 28, 2025
LightCompress
Public
A powerful toolkit for compressing large models including LLM, VLM, and video generation models.
benchmark deployment tool evaluation pruning quantization wan awq large-language-models llm
Python
•
Apache License 2.0
•61•556•38•0•Updated Aug 22, 2025
general-sam
Public
A general suffix automaton implementation in Rust with Python bindings
Rust
•
Apache License 2.0
•0•7•0•3•Updated Aug 18, 2025
greedy-tokenizer
Public
Greedily tokenize strings with the longest tokens iteratively.
Python
•
Apache License 2.0
•0•0•0•2•Updated Aug 18, 2025
SageAttention
Public
Quantized attention achieves speedups of 2-5x and 3-11x compared to FlashAttention and xformers, respectively, without losing end-to-end metrics across language, image, and video models.
Cuda
•
Apache License 2.0
•209•2•0•0•Updated Aug 16, 2025
fa3
Public
Python
•
BSD 3-Clause "New" or "Revised" License
•1•0•0•0•Updated Aug 7, 2025
flash-attn-3-build
Public
Dockerfile
•2•0•0•0•Updated Jul 24, 2025
HarmoniCa
Public
[ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration".
acceleration icml dit pixart diffusion-models diffusion-transformer pixart-sigma feature-caching icml-2025
Python
•
Apache License 2.0
•1•42•2•0•Updated Jul 10, 2025
TFMQ-DM
Public
[CVPR 2024 Highlight & TPAMI 2025] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".
highlight quantization cvpr ldm diffusion-models tpami post-training-quantization ddim stable-diffusion cvpr2024
Jupyter Notebook
•
Apache License 2.0
•4•103•0•0•Updated Jul 10, 2025
LightTTS
Public
Light-tts is a lightweight TTS inference framework optimized for CosyVoice2, enabling fast and scalable speech synthesis in Python.
Python
•
Apache License 2.0
•0•7•0•0•Updated Jun 24, 2025
OmniBal
Public
[ICML 2025] This is the official PyTorch implementation of "OmniBal: Towards Fast Instruction-Tuning for Vision-Language Models via Omniverse Computation Balance".
vlm icml-2025
Python
•
Apache License 2.0
•3•24•3•0•Updated Jun 16, 2025
lightx2v_comfyui_node
Public
0•0•0•0•Updated Apr 28, 2025
MQBench
Public
Model Quantization Benchmark
Python
•
Apache License 2.0
•141•833•10•5•Updated Apr 20, 2025
flash-attention
Public
Fast and memory-efficient exact attention
Python
•
BSD 3-Clause "New" or "Revised" License
•2k•0•0•0•Updated Apr 17, 2025
verl
Public
verl: Volcano Engine Reinforcement Learning for LLMs
Python
•
Apache License 2.0
•2.3k•1•0•0•Updated Mar 17, 2025
LLM_QAT
Public
Python
•0•0•0•0•Updated Feb 19, 2025
modeltc.github.io
Public
HTML
•0•0•0•0•Updated Jan 25, 2025
quant_horizon
Public
Cuda
•
Apache License 2.0
•1•11•0•0•Updated Jan 10, 2025
EasyLLM
Public
Built upon Megatron-DeepSpeed and the HuggingFace Trainer, EasyLLM reorganizes the code logic with a focus on usability while preserving training efficiency.
Python
•
Apache License 2.0
•8•49•0•0•Updated Sep 18, 2024
DeepSpeed
Public
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Python
•
Apache License 2.0
•4.5k•0•0•0•Updated Sep 13, 2024
opencompass
Public
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) across 100+ datasets.
Python
•
Apache License 2.0
•656•1•0•0•Updated Sep 6, 2024
xtuner
Public
An efficient, flexible, and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
Python
•
Apache License 2.0
•354•0•0•0•Updated Aug 22, 2024