[1/N] add fp8 fp32 scale support for custom RL model #368

Open
yiakwy-xpu-ml-framework-team wants to merge 2 commits into antirez:main from yiakwy-xpu-ml-framework-team:add_fp8_fp32_scale_support

yiakwy-xpu-ml-framework-team commented Jun 9, 2026

Background

We added an FP8 RL+SFT version of DeepSeek V4 with week-0 support; in our internal evaluation it surpassed the DeepSeek V4 baseline in all major dimensions.

Hence we want to add 2-bit support for DeepSeek V4 with our Expert Pruning technology:
[Screenshot 2026-06-09 15:40:50]

Note that on H100/H800 we usually don't use E8M0 for the scale, since it introduces runtime overhead. An FP32 scale works best.
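To make the two scale formats concrete, here is a minimal Python sketch, not code from this PR. It assumes a per-block scale chosen so that the block's absolute maximum maps onto the E4M3 maximum (448), and it treats E8M0 as an exponent-only byte with bias 127 that decodes to a power of two. The function names and the sample block are hypothetical.

```python
import math

E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3

def fp32_scale(block):
    """Per-block scale that maps the block's absolute maximum onto E4M3_MAX."""
    amax = max(abs(x) for x in block)
    return amax / E4M3_MAX if amax > 0 else 1.0

def e8m0_encode(scale):
    """Round a positive scale up to a power of two; return the biased exponent byte."""
    # frexp gives scale == mantissa * 2**exponent with 0.5 <= mantissa < 1
    mantissa, exponent = math.frexp(scale)
    power = exponent - 1 if mantissa == 0.5 else exponent
    return max(0, min(254, power + 127))  # 255 is reserved for NaN

def e8m0_decode(byte):
    """Expand an E8M0 byte back to a float: 2**(byte - 127)."""
    return 2.0 ** (byte - 127)

block = [0.5, -3.0, 1.75, 10.0]           # hypothetical weights
s32 = fp32_scale(block)                   # exact ratio 10 / 448
s8 = e8m0_decode(e8m0_encode(s32))        # nearest power of two above it: 2**-5
print(s32, s8)
```

With the power-of-two scale, the largest element of this block lands at 320 rather than 448, so part of the E4M3 range goes unused, and the E8M0 byte still has to be expanded to a float before it can multiply the weights. The sketch does not measure the runtime overhead mentioned above.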

Author

@antirez could you have a look at it?

antirez commented Jun 9, 2026

Owner

Hi, the PR itself has a few quality issues, but above all it is not clear why it would be useful for the project as a whole, given that we convert from the DS4 Hugging Face formats.

yiakwy-xpu-ml-framework-team commented Jun 10, 2026
Author
[Screenshot 2026-06-10 09:14:48]

Quantization is successful.

@antirez Thank you for the quick response, let me explain.

  • Our SFT/RL model of DeepSeek V4 has an embedding layer of type bf16 or int32, while the DeepSeek model has an embedding of type int64.

  • Since we run on the Hopper platform, our expert weights are stored as E4M3 FP8 and the weight scales as FP32 for best performance (which can be verified in SGLang):

    [Image: sglang fp8 serving]

    Customer DSV4 SGLang FP8 serving on the Hopper platform with identity injection, private/public knowledge injection, and an enhanced security shield module
  • The Hugging Face model is not an SGLang-compatible version, while ours is; and the Hugging Face version does not account for converting SFT/RL models from BF16 to FP8 variants.
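As an illustration of the storage layout described in the list above (E4M3 weights with an FP32 scale), the following Python sketch decodes E4M3 bytes and applies a scale. It is not the PR's code. It assumes the E4M3 variant with 4 exponent bits (bias 7), 3 mantissa bits, and no infinities, and it assumes the scale is stored as a little-endian FP32 value; the function names are hypothetical.

```python
import struct

def e4m3_to_float(byte):
    """Decode one FP8 E4M3 byte (1 sign, 4 exponent, 3 mantissa bits, bias 7)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0x0F
    man = byte & 0x07
    if exp == 0x0F and man == 0x07:
        return float("nan")                       # the only NaN pattern; no infinities
    if exp == 0:
        return sign * (man / 8.0) * 2.0 ** -6     # subnormal values
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)

def dequantize_block(q_bytes, scale_fp32_bytes):
    """Multiply each decoded E4M3 weight by the block's FP32 scale."""
    (scale,) = struct.unpack("<f", scale_fp32_bytes)  # assumed little-endian
    return [e4m3_to_float(b) * scale for b in q_bytes]

# Usage: two weights (1.0 and the E4M3 maximum, 448.0) with a scale of 0.5
weights = dequantize_block(bytes([0x38, 0x7E]), struct.pack("<f", 0.5))
print(weights)
```

Because the scale is already a float, applying it is a single multiply per weight, with no exponent-to-float expansion step.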

The model is tuned specifically to handle Cantonese, Mandarin Chinese, and English efficiently.

yiakwy-xpu-ml-framework-team commented Jun 11, 2026
Author

Hi @antirez, I can make sure the modification generates a correct model checkpoint for ds4. I would appreciate your attention.

GB10 (2-bit dsv4-sft-rl, 15 tokens/sec):

[Screenshot 2026-06-12 17:54:44]

Serving with raw model (no system prompt):

[Image: 2135b13f7dc854cf793c7d7e849dc316]
