Add Q4_0 and Q8_0 quantization support for 310p npu Ascend CANN · ggml-org/llama.cpp · Discussion #16484

cecuca
Oct 9, 2025

Add support for running quantized weights on 310P NPUs such as the Atlas 300I Duo cards. No backend currently supports quantization on the 310P except MindIE (INT8 only, and it requires a special format conversion). Quantization support would make this card the best value for local inference.
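
For context, here is a minimal sketch of what the Q8_0 format involves, assuming ggml's usual block layout (32 weights per block, stored as one fp16 scale plus 32 int8 values). It is plain Python for illustration only, not CANN or llama.cpp code, and the function names `quantize_q8_0` and `dequantize_q8_0` are hypothetical.

```python
import struct

QK8_0 = 32  # weights per block, assuming ggml's Q8_0 layout


def quantize_q8_0(values):
    """Split floats into blocks of 32 and return a list of (scale, int8 list)."""
    if len(values) % QK8_0 != 0:
        raise ValueError("length must be a multiple of 32")
    blocks = []
    for i in range(0, len(values), QK8_0):
        chunk = values[i:i + QK8_0]
        amax = max(abs(v) for v in chunk)
        d = amax / 127.0
        # Round the scale through half precision, since the format stores it as fp16.
        d = struct.unpack("<e", struct.pack("<e", d))[0]
        inv = 1.0 / d if d else 0.0
        # Python's round() uses banker's rounding; a C implementation may round
        # halves differently. The clamp guards against fp16 scale rounding.
        qs = [max(-127, min(127, round(v * inv))) for v in chunk]
        blocks.append((d, qs))
    return blocks


def dequantize_q8_0(blocks):
    """Reconstruct approximate floats: each weight is scale * int8 value."""
    return [d * q for d, qs in blocks for q in qs]
```

Q4_0 follows the same per-block idea with 4-bit values, so a backend needs to unpack the block scale and integer payload before (or while) doing the matrix multiply.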

Replies: 0 comments
