Pull requests: triton-inference-server/server
Removing unused values for TensorRT-LLM container build
#8472
opened Oct 23, 2025 by
mc-nv
ci: Add support for max_inflight_requests parameter to prevent unbounded memory growth in ensemble models
PR: ci
Changes to our CI configuration files and scripts
#8458
opened Oct 13, 2025 by
pskiran1
7 of 20 tasks
feat: Add Hermes tool call parser for OpenAI API
#8456
opened Oct 12, 2025 by
amit-timalsina
11 of 12 tasks
Feat: revamp build.py CLI to improve usability and maintainability
#8437
opened Oct 2, 2025 by
kpedro88
9 of 22 tasks
feat: Minor improvements to build.py
build
Issues pertaining to builds
enhancement
New feature or request
#8362
opened Aug 19, 2025 by
kpedro88
6 of 22 tasks
fix: WAR for Python CUDA library unknown race condition
PR: fix
A bug fix
#8360
opened Aug 19, 2025 by
GuanLuo
feat: Added --build_variant flag for cpu only build.
#8329
opened Aug 5, 2025 by
Sunidhi-Gaonkar1
4 of 22 tasks
feat: add parameters in onprem k8s chart (volume, resources & env. variables)
#8324
opened Aug 1, 2025 by
vladmirtxrx
3 of 22 tasks
Support tokenizer override per model for multi-model Triton + vLLM serving with OpenAI-Compatible
#8321
opened Jul 31, 2025 by
JunmooByun
11 of 13 tasks
docs: Fix typos and grammar issues in markdown files
#8306
opened Jul 23, 2025 by
cluster2600
12 of 13 tasks
fix: Fix the server runtime errors on cpu only platform and with pytorch backend
#8272
opened Jun 27, 2025 by
snadampal
6 of 21 tasks
docs: fix capitalization of Triton Inference Server
#8252
opened Jun 13, 2025 by
ShriyashP
5 of 13 tasks
feat: Add guided decoding support to OpenAI frontend
#8245
opened Jun 11, 2025 by
pei0033
7 of 22 tasks
docs: update the link formats for additional security networking guides
#8229
opened Jun 2, 2025 by
xander-aphe-hatschi
22 tasks
refactor: replace tf model with onnx model for L0_response_cache
#8114
opened Apr 2, 2025 by
ziqifan617
Draft
[build]: Add rt_base_image parameter to differentiate triton build base image and runtime base image
#8064
opened Mar 12, 2025 by
nv-tusharma
Draft
5 of 20 tasks
feat: GRPC Callback API migration for Non Inference
#8062
opened Mar 11, 2025 by
indrajit96
7 of 20 tasks
Build: Build using the PA binaries and whl if available.
#8043
opened Feb 27, 2025 by
pvijayakrish
8 of 20 tasks