Pull requests: vllm-project/vllm
Pull requests list
[V0 deprecation] Remove _set_default_args_v0 function
#25409
opened Sep 22, 2025 by
Isotr0py
1 of 5 tasks
Roll back uniform decode with mixed batch cudagraph
v1
#25407
opened Sep 22, 2025 by
MatthewBonanni
5 tasks
[Speculators][Speculative Decoding] Fix gpt-oss eagle3 accuracy issue
bug
Something isn't working
gpt-oss
Related to GPT-OSS models
llama
Related to Llama models
speculative-decoding
v1
#25406
opened Sep 22, 2025 by
jiahanc
5 tasks
[Bugfix] Fix DeepSeekV31ToolParser to correctly parse multiple tools in non-streaming output
deepseek
Related to DeepSeek models
frontend
tool-calling
#25405
opened Sep 22, 2025 by
taohui
5 tasks
[Core] Optimize LoRA weight loading
ready
ONLY add when PR is ready to merge/full CI is needed
#25403
opened Sep 22, 2025 by
jeejeelee
5 tasks
[Bugfix] Fix missing clear_connector_metadata
kv-connector
v1
#25397
opened Sep 22, 2025 by
NickLucche
[CI Failure] Fix fp8 kv cache on <SM90
ci-failure
Issue about an unexpected test failure in CI
ready
ONLY add when PR is ready to merge/full CI is needed
#25396
opened Sep 22, 2025 by
mgoin
5 tasks
[CI/Build] Skip Qwen3-VL initialization tests until models are actually released
qwen
Related to Qwen models
ready
ONLY add when PR is ready to merge/full CI is needed
#25394
opened Sep 22, 2025 by
DarkLight1337
5 tasks
[PERF][Qwen3-next] Speedup chunk_gated_delta_rule_fwd_h (part of GDN attn)
qwen
Related to Qwen models
#25393
opened Sep 22, 2025 by
vadiklyutiy
Draft
Revert "[Metrics] Hide deprecated metrics with gpu_ prefix (#24245)"
v1
#25392
opened Sep 22, 2025 by
markmc
[Compiler] Disable Inductor standalone compile by default
ready
ONLY add when PR is ready to merge/full CI is needed
[NIXL][Misc] Expose metrics from NIXL for logging to CLI
ci/build
kv-connector
v1
#25388
opened Sep 22, 2025 by
NickLucche
[Ray][CPU] Ray executor and Ray DP support for CPU backend
ci/build
documentation
Improvements or additions to documentation
v1
#25386
opened Sep 22, 2025 by
alex-coniasse
5 tasks
[Bugfix] Add Flash Attention guards in MLACommonImpl constructor
v1
#25385
opened Sep 22, 2025 by
kzawora-intel
[V0 Deprecation][KVConnector] Remove KVConnector v1/v0 differentiation
ci/build
documentation
Improvements or additions to documentation
kv-connector
ready
ONLY add when PR is ready to merge/full CI is needed
tpu
Related to Google TPUs
v1
#25376
opened Sep 22, 2025 by
NickLucche
[Model] Support multi-vector retrieval
documentation
Improvements or additions to documentation
qwen
Related to Qwen models
[Docs] wheel larger than limit
documentation
Improvements or additions to documentation
#25367
opened Sep 22, 2025 by
pfk-beta
[Bugfix] Qwen3-next generate ! always
qwen
Related to Qwen models
#25365
opened Sep 22, 2025 by
yych0745
5 tasks
[Core] Enable KV cache connector + hybrid allocator
kv-connector
tpu
Related to Google TPUs
v1
#25363
opened Sep 22, 2025 by
KuntaiDu
5 tasks