Pull requests: vllm-project/vllm
Pull requests list
[PERF] Decouple projections from GDN custom op
qwen
Related to Qwen models
#27512
opened Oct 25, 2025 by
vadiklyutiy
Add standalone multimodal encoder benchmark
frontend
performance
Performance-related issues
#27511
opened Oct 25, 2025 by
alhridoy
add cpu device support for nixl_connector
kv-connector
#27510
opened Oct 25, 2025 by
ZhengHongming888
5 tasks
qwen3moe on gh200
qwen
Related to Qwen models
#27507
opened Oct 25, 2025 by
bhaktatejas922
[Multimodal] Move profiling info out of processing info
deepseek
Related to DeepSeek models
llama
Related to Llama models
multi-modality
Related to multi-modality (#4194)
qwen
Related to Qwen models
#27506
opened Oct 25, 2025 by
DarkLight1337
Draft
7 tasks
Clarify V0→V1 error; keep SamplingParams importable when VLLM_USE_V1=0
frontend
v1
#27503
opened Oct 25, 2025 by
nick-allison
3 of 5 tasks
Prefill / Decode Split into Compiled Region
#27501
opened Oct 25, 2025 by
therealnaveenkamal
1 of 5 tasks
feat: make extraInit containers fully configurable in helm chart
documentation
Improvements or additions to documentation
#27497
opened Oct 25, 2025 by
HanFa
3 of 5 tasks
[Bugfix] fix empty prompts for async-engine mode in benchmark throughput
performance
Performance-related issues
ready
ONLY add when PR is ready to merge/full CI is needed
#27494
opened Oct 25, 2025 by
luccafong
Add more dims for batch invariant shims
#27489
opened Oct 24, 2025 by
bwasti
3 of 5 tasks
[Chore] Optimize P2PNCCLEngine http_address
kv-connector
ready
ONLY add when PR is ready to merge/full CI is needed
#27488
opened Oct 24, 2025 by
yewentao256
[Bugfix][LoRA][FusedMoE] Select MxFP4 Backend based on LoRA Enablement
ready
ONLY add when PR is ready to merge/full CI is needed
#27487
opened Oct 24, 2025 by
varun-sundar-rabindranath
[Refactor] Add Shared Block Max Reduction Helper
#27483
opened Oct 24, 2025 by
harishappana-git
5 tasks
[Test] Draft: Nixl fault tests
ci/build
kv-connector
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#27481
opened Oct 24, 2025 by
wseaton
[Test] Batch Invariant: Unit test using parameterized backend
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#27478
opened Oct 24, 2025 by
yewentao256
[Rocm][fused_moe][fp4] view weight to torch.float4_e2m1fn_x2 when running aiter fused moe for fp4 model
rocm
Related to AMD ROCm
#27474
opened Oct 24, 2025 by
zejunchen-zejun
[Kernel] Enable moe LoRA kernel support FP16
ready
ONLY add when PR is ready to merge/full CI is needed
#27468
opened Oct 24, 2025 by
jeejeelee
5 tasks
[Bugfix] Fix processor initialization for model from modelscope instead of HF
ready
ONLY add when PR is ready to merge/full CI is needed
#27461
opened Oct 24, 2025 by
lengrongfu
5 tasks
[Performance][MLA][ROCm] Remove redundant D2D copy in deepseek
deepseek
Related to DeepSeek models
rocm
Related to AMD ROCm
v1
#27457
opened Oct 24, 2025 by
ganyi1996ppo
5 tasks
[Model] [Bugfix] Fix inconsistencies in the handling of layer names
#27453
opened Oct 24, 2025 by
Alnusjaponica
Draft
2 tasks
Fix decoding server's logprobs handling in Prefill/Decode disaggregation mode
frontend
kv-connector
v1
#27449
opened Oct 24, 2025 by
Prowindy
5 tasks