Pull requests: vllm-project/vllm
Pull requests list
[Frontend] Make RequestIdMiddleware return the internal request_id
frontend
ready
ONLY add when PR is ready to merge/full CI is needed
#27983
opened Nov 3, 2025 by
markmc
[Quantization] support gpt-oss for quantized kv cache weight loading
gpt-oss
Related to GPT-OSS models
#27980
opened Nov 3, 2025 by
xuebwang-amd
5 tasks
[KVConnector] Enable get_block_ids_with_load_errors() in LMCache connector
kv-connector
#27978
opened Nov 3, 2025 by
ziruiliu
5 tasks
fix(benchmarks): Remove hardcoded dtype in hf backend
performance
Performance-related issues
#27976
opened Nov 3, 2025 by
git-jxj
3 of 5 tasks
[Refactor] Lazy import tool_parser
deepseek
Related to DeepSeek models
documentation
Improvements or additions to documentation
frontend
llama
Related to Llama models
tool-calling
#27974
opened Nov 3, 2025 by
chaunceyjiang
5 tasks
[Model] fix ernie45 reasoning_parser
ready
ONLY add when PR is ready to merge/full CI is needed
#27973
opened Nov 3, 2025 by
CSWYF3634076
[Bugfix] Handle escaped characters in GLM tool parser to prevent double serialization
ci/build
frontend
gpt-oss
Related to GPT-OSS models
tool-calling
v1
#27970
opened Nov 3, 2025 by
soaringk
3 of 5 tasks
[Model][Bugfix] fix pipeline parallelism support for NemotronH
#27968
opened Nov 3, 2025 by
tomeras91
[Model] app optimal triton fused moe configs for NemotronH MoE
performance
Performance-related issues
#27967
opened Nov 3, 2025 by
tomeras91
[Bugfix][ROCm] Fix AITER attention backend for deepseek-ocr model
deepseek
Related to DeepSeek models
rocm
Related to AMD ROCm
v1
#27965
opened Nov 3, 2025 by
vllmellm
5 tasks
[Doc][Last/N] Improve all pooling task | Refactor pooling-related documentation
documentation
Improvements or additions to documentation
[Refactor] to simplify and extract the shared logic between chat completion and responses
frontend
ready
ONLY add when PR is ready to merge/full CI is needed
tool-calling
#27961
opened Nov 3, 2025 by
chaunceyjiang
5 tasks
[LoRA][FusedMoE] Introduce FusedMoEPermuteExpertsUnpermuteWithLoRA
needs-rebase
#27959
opened Nov 3, 2025 by
varun-sundar-rabindranath
[V0 deprecation] Remove VLLM_USE_V1 usage in most modules
documentation
Improvements or additions to documentation
frontend
kv-connector
multi-modality
Related to multi-modality (#4194)
structured-output
v1
#27955
opened Nov 3, 2025 by
wangxiyuan
5 tasks
[CPU] Refactor CPU attention backend
ci/build
v1
#27954
opened Nov 3, 2025 by
bigPYJ1151
2 of 5 tasks
[HARDWARE][CPU] Add Option for Disabling Binding to Specific CPU Cores
documentation
Improvements or additions to documentation
v1
#27953
opened Nov 3, 2025 by
StanHatko
4 of 5 tasks
[CI/Build] Update checking logic in cutlass_group_gemm_supported
moe
rocm
Related to AMD ROCm
#27948
opened Nov 2, 2025 by
zhewenl
Fix hard-coded parameter name in gemma3n.py
#27946
opened Nov 2, 2025 by
seungduk-yanolja
5 tasks
[CI/Build] Update LM Eval Version in AMD CI
ci/build
rocm
Related to AMD ROCm
#27944
opened Nov 2, 2025 by
zhewenl
[V1][Perf] Optimize Medusa proposer: reduce sync overhead
speculative-decoding
v1
#27943
opened Nov 2, 2025 by
skyloevil
[Metrics] [KVConnector] Add Offloading Connector metrics
kv-connector
v1
#27942
opened Nov 2, 2025 by
omerpaz95
[Bugfix][Core] Load plugins in new processes created by fork
#27940
opened Nov 2, 2025 by
matan-dup
3 of 5 tasks