DefTruth

Follow

🎯

#pragma unroll

DefTruth DefTruth

🎯

#pragma unroll

Follow

@xlite-dev, @vipshop, LeetCUDA.

1.8k followers · 153 following

@xlite-dev, @vipshop
Guangzhou, China
20:52 (UTC +08:00)
https://github.com/xlite-dev

Achievements

Achievements

Organizations

DefTruth/README.md

🏢 Group: Qwner. @xlite-dev | @vipshop | Prev. @PaddlePaddle 🏰

🛠 Creator: lite.ai.toolkit | Awesome-LLM-Inference | LeetCUDA | ffpa-attn 🎧

🖥 HGEMM | 🤗cache-dit | Awesome-DiT-Inference | torchlm 🖱

🎉 Contributor: FastDeploy | vLLM | SGLang | Many Others ⚙️

✉️ Contact: [email protected] | GitHub: DefTruth | 知乎: DefTruth 🤖

❤ I love open source, bro, and I think you do too. ❤

Pinned Loading

xlite-dev/LeetCUDA xlite-dev/LeetCUDA Public

📚LeetCUDA: 200+ CUDA/Tensor Cores Kernels, HGEMM, FA-2 MMA.

Cuda 4.8k 519
xlite-dev/lite.ai.toolkit xlite-dev/lite.ai.toolkit Public

🛠 A lite C++ AI toolkit: 100+🎉 models with MNN, ORT and TRT.

C++ 4.1k 743
xlite-dev/Awesome-LLM-Inference xlite-dev/Awesome-LLM-Inference Public

📚A curated list of Awesome LLM Inference Papers with Codes.

Python 4.1k 287
PaddlePaddle/FastDeploy PaddlePaddle/FastDeploy Public

Large Language Model Deployment Toolkit

Cuda 3.2k 483
xlite-dev/ffpa-attn xlite-dev/ffpa-attn Public

📚FFPA: Extend FA-2 with Split-D for large headdim, 2x↑ vs SDPA.

Cuda 186 8
vipshop/cache-dit vipshop/cache-dit Public

🤗CacheDiT: A Training-free and Easy-to-use Cache Acceleration Toolbox for DiT (DBCache, DBPrune, FBCache, etc)

Python 23 2