Pull requests: sgl-project/sglang
- #11701: [Test] support llm-compressor: w8a8_fp8_block, wNa16 (opened Oct 16, 2025 by Wangzheee)
- #11494: [quantization] AWQ Marlin doesn't work when dtype is bfloat16 (opened Oct 12, 2025 by kevin85421) — labels: express-lane (a PR may be merged without a full CI check), run-ci
- #10750: [6/n] decouple quantization implementation from vLLM dependency (opened Sep 22, 2025 by Hongbosherlock) — labels: ready-to-merge (the PR is ready to merge after the CI is green), run-ci
- #10153: integrate autoRound quantization algorithm (opened Sep 8, 2025 by WeiweiZhang1) — labels: high priority, intel, run-ci
- #10078: feat: Add FP4 (E2M1) KV Cache Support with Quantization Utilities for MLA (opened Sep 5, 2025 by JackChuang) — labels: high priority, quant (LLM Quantization), run-ci
- #9918: [2/N][Bug] Fix w4afp8 MoE NaN issue (python) (opened Sep 2, 2025 by yuhyao) — label: high priority
- #9248: Support w4a16 quantization for CompressedTensorsMoEMethod (opened Aug 16, 2025 by RaymondWang0)
- #8014: Fix W8A8Int8Config init error (opened Jul 14, 2025 by lambert0312) — label: high priority
- #6867: [Kernel] feat: add flexprefill for long-context (opened Jun 4, 2025 by artetaout) — label: high priority
- #5872: Fix channel-wise INT8 moe config tuning error (opened Apr 29, 2025 by lambert0312) — label: high priority