Pull requests: HabanaAI/vllm-hpu-extension

[SW-235186] Enable group indexing support
#349 opened Aug 25, 2025 by jmamzax
Fix for Llama4 models (targets main)
#341 opened Aug 19, 2025 by vidyasiv
Pass Attn Sinks to Fused SDPA
#339 opened Aug 18, 2025 by SKRohit
[SW-227615] adding quant config files to vllm
#336 opened Aug 14, 2025 by linoybu
Fix for Llama4 models
#329 opened Aug 11, 2025 by vidyasiv
Add flag pin_memory to call from hpu.py in vllm
#325 opened Aug 5, 2025 by xuechendi
Add Calibration Script for SGLang FP8
#318 opened Jul 29, 2025 by SKRohit
Add block_softmax_adjustment and block_softmax kernels
#289 opened Jul 16, 2025 by czhu15
Add pre-commit static checks
#247 opened Jun 30, 2025 by kzawora-intel
Exponential bucketing tweaks
#224 opened Jun 13, 2025 by madamczyk-intel
Add useful internal vllm test
#200 opened May 27, 2025 by nirda7 Draft
Optimized MoE on Gaudi
#159 opened Apr 18, 2025 by gyou2021 Draft
[FIX] fp8 gc compile error
#110 opened Mar 4, 2025 by maktukmak Draft