Any hope for DeepSeek v3 MTP support? #11455

Green0-0 started this conversation in Ideas

"Based on our evaluation, the acceptance rate of the second token prediction ranges between 85% and 90% across various generation topics, demonstrating consistent reliability. This high acceptance rate enables DeepSeek-V3 to achieve a significantly improved decoding speed, delivering 1.8 times TPS (Tokens Per Second)."
— DeepSeek-V3 technical report
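The quoted numbers are consistent with a simple back-of-envelope model: with a single MTP head, each decode step emits one guaranteed token plus one speculative token that is accepted with probability p, so expected throughput is roughly (1 + p)x. A minimal sketch of that estimate (illustrative only, not from the report; it assumes verifying the draft token adds negligible cost because it rides along in the same batched forward pass):

```python
# Back-of-envelope MTP speedup estimate (illustrative assumption, not the
# report's methodology): one extra draft token per step, free verification.
def expected_speedup(acceptance_rate: float) -> float:
    """Expected tokens emitted per forward pass with a single MTP head."""
    return 1.0 + acceptance_rate

for p in (0.85, 0.90):
    print(f"acceptance {p:.0%} -> ~{expected_speedup(p):.2f}x TPS")
```

For the reported 85-90% acceptance range this gives ~1.85-1.90x, in line with the 1.8x TPS figure once real-world overheads are subtracted.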


Replies: 2 comments


It may also be worth looking at the DeepSeek-VL2 models, which share the same vocabulary as DeepSeek-V3. This one, maybe? Then it could be offloaded to the GPU.
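A shared vocabulary matters because llama.cpp's existing speculative-decoding example requires the draft and target models to use compatible tokenizers. A hypothetical invocation (the model paths are placeholders, and the exact flag names may differ between builds, so check `llama-speculative --help`):

```shell
# Hypothetical: a small vocab-compatible model acting as the draft model.
./llama-speculative \
  -m deepseek-v3.gguf \
  -md small-draft-model.gguf \
  --draft 8 \
  -p "The capital of France is"
```

This is classic two-model speculative decoding rather than MTP proper, but it is a way to get a speedup today while the draft model sits on the GPU.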


It is surprising that llama.cpp does not support MTP yet. In the future, most models will likely ship with MTP heads.
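For context, the decode loop MTP enables is a draft-and-verify scheme: the MTP head proposes token t+2 alongside the model's token t+1, and the proposal is kept only if the main model would have produced it anyway, so greedy output is unchanged. A toy, self-contained sketch of that loop (the "model" and "MTP head" here are deterministic stand-in functions invented for illustration; a real engine would batch the verification into the next forward pass instead of calling the model again):

```python
def main_next(tokens):
    # Toy stand-in for the main model's greedy next-token prediction.
    return (sum(tokens) * 31 + 7) % 100

def mtp_draft(tokens, t_next):
    # Toy stand-in for the MTP head: guesses token t+2 given the context
    # and the model's token t+1; deliberately wrong sometimes so the
    # rejection path is exercised.
    guess = main_next(tokens + [t_next])
    return guess if len(tokens) % 4 else (guess + 1) % 100

def generate_plain(prompt, n):
    # Ordinary autoregressive decoding: one token per model call.
    tokens = list(prompt)
    for _ in range(n):
        tokens.append(main_next(tokens))
    return tokens

def generate_mtp(prompt, n):
    # Draft-and-verify decoding with a single MTP head.
    tokens = list(prompt)
    produced = 0
    while produced < n:
        t1 = main_next(tokens)            # token t+1 from the main head
        tokens.append(t1)
        produced += 1
        if produced >= n:
            break
        draft = mtp_draft(tokens[:-1], t1)  # speculative token t+2
        # Verify the draft against the main model; in a real engine this
        # check shares the next batched forward pass, so an accepted draft
        # is a token gained for (almost) free.
        if main_next(tokens) == draft:
            tokens.append(draft)
            produced += 1
        # On mismatch the draft is simply discarded.
    return tokens
```

Because an accepted draft always equals what greedy decoding would have emitted, `generate_mtp` and `generate_plain` produce identical outputs; the win is fewer full decode steps per token.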
