Pull requests: flashinfer-ai/flashinfer
#1242 Add trtllm-gen attention kernel with FP8 Q/K/V and FP4/FP8 output (opened Jul 11, 2025 by weireweire)
#1241 feat: Support MXFP8 x MXFP4 CUTLASS grouped GEMM (opened Jul 11, 2025 by jinyangyuan-nvidia)
#1235 feat: Restore convenience FLASHINFER_ENABLE_AOT option (opened Jul 8, 2025 by mgorny)
#1231 [Feature] Support batch prefill for POD Attention (opened Jul 8, 2025 by Edenzzzz)
#1214 Feature/sm100 low latency nvfp4 kernels [priority: high] (opened Jul 4, 2025 by azhurkevich)
#1171 Use flashinfer softmax in top_k_top_p_sampling_from_logits (opened Jun 24, 2025 by lgeiger)
#1145 Port AllGather/ReduceScatter from TensorRT-LLM (opened Jun 15, 2025 by wenscarl)
#1141 [WIP] refactor: unifying return status of different backend implementations (opened Jun 12, 2025 by yzh119)
#944 [RFC] Redesigned CMake build infrastructure for C++ API (opened Mar 14, 2025 by diptorupd)