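DeepSparkInference, the inference model zoo, is a core project of the DeepSpark open-source community and was officially open-sourced in March 2024. The first release selected 48 inference model examples covering computer vision, natural language processing, speech recognition, and other fields; more AI domains will be added over time.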
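The models in DeepSparkInference come with inference examples and guidance documents for running on the domestically developed inference engines IGIE or ixRT. Some models also include evaluation results measured on the domestically developed general-purpose GPU 智铠100 (Zhikai 100).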
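IGIE (Iluvatar GPU Inference Engine) is a high-performance, general-purpose, end-to-end AI inference engine built on the TVM framework. It supports multi-framework model import, quantization, graph optimization, multiple operator libraries, multiple backends, and automatic operator tuning, providing an easy-to-deploy, high-throughput, low-latency solution for inference scenarios.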
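ixRT (Iluvatar CoreX RunTime) is a high-performance inference engine developed in-house by 天数智芯 (Iluvatar CoreX). It focuses on getting the most out of Iluvatar CoreX general-purpose GPUs to deliver high-performance inference for models across domains. ixRT supports dynamic-shape inference, plugins, and INT8/FP16 inference.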
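To make the INT8 precision mentioned above more concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It illustrates the general idea only; it is not the IGIE or ixRT API, and the function names `quantize_int8` and `dequantize_int8` are hypothetical. Real engines typically choose scales through calibration rather than from a single maximum value.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127].

    Hypothetical helper for illustration; not part of IGIE or ixRT.
    """
    max_abs = max(abs(v) for v in values)
    # One shared scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from INT8 codes and the shared scale."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.003, 1.0]
    codes, scale = quantize_int8(weights)
    restored = dequantize_int8(codes, scale)
    print("codes:", codes)
    print("scale:", scale)
    print("restored:", restored)
```

Small values such as `0.003` collapse to zero at this scale, which is the kind of rounding error that makes INT8 accuracy worth checking against the FP16 results for the same model.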
DeepSparkInference is updated quarterly. Later releases will gradually broaden the model categories and expand large-model inference.
| Model | Engine | Supported | IXUCA SDK |
|-------|--------|-----------|-----------|
| Baichuan2-7B | vLLM | ✅ | 4.3.0 |
| ChatGLM-3-6B | vLLM | ✅ | 4.3.0 |
| ChatGLM-3-6B-32K | vLLM | ✅ | 4.3.0 |
| CosyVoice2-0.5B | PyTorch | ✅ | 4.3.0 |
| DeepSeek-R1-Distill-Llama-8B | vLLM | ✅ | 4.3.0 |
| DeepSeek-R1-Distill-Llama-70B | vLLM | ✅ | 4.3.0 |
| DeepSeek-R1-Distill-Qwen-1.5B | vLLM | ✅ | 4.3.0 |
| DeepSeek-R1-Distill-Qwen-7B | vLLM | ✅ | 4.3.0 |
| DeepSeek-R1-Distill-Qwen-14B | vLLM | ✅ | 4.3.0 |
| DeepSeek-R1-Distill-Qwen-32B | vLLM | ✅ | 4.3.0 |
| ERNIE-4.5-21B-A3B | FastDeploy | ✅ | 4.3.0 |
| ERNIE-4.5-300B-A47B | FastDeploy | ✅ | 4.3.0 |
| GLM-4V | vLLM | ✅ | 4.3.0 |
| InternLM3 | LMDeploy | ✅ | 4.3.0 |
| Llama2-7B | vLLM | ✅ | 4.3.0 |
| Llama2-7B | TRT-LLM | ✅ | 4.3.0 |
| Llama2-13B | TRT-LLM | ✅ | 4.3.0 |
| Llama2-70B | TRT-LLM | ✅ | 4.3.0 |
| Llama3-70B | vLLM | ✅ | 4.3.0 |
| E5-V | vLLM | ✅ | 4.3.0 |
| MiniCPM-o | vLLM | ✅ | 4.3.0 |
| MiniCPM-V | vLLM | ✅ | 4.3.0 |
| Qwen-7B | vLLM | ✅ | 4.3.0 |
| Qwen-VL | vLLM | ✅ | 4.3.0 |
| Qwen2-VL | vLLM | ✅ | 4.3.0 |
| Qwen2.5-VL | vLLM | ✅ | 4.3.0 |
| Qwen1.5-7B | vLLM | ✅ | 4.3.0 |
| Qwen1.5-7B | TGI | ✅ | 4.3.0 |
| Qwen1.5-14B | vLLM | ✅ | 4.3.0 |
| Qwen1.5-32B Chat | vLLM | ✅ | 4.3.0 |
| Qwen1.5-72B | vLLM | ✅ | 4.3.0 |
| Qwen2-7B Instruct | vLLM | ✅ | 4.3.0 |
| Qwen2-72B Instruct | vLLM | ✅ | 4.3.0 |
| StableLM2-1.6B | vLLM | ✅ | 4.3.0 |
| Whisper | vLLM | ✅ | 4.3.0 |
| Model |
Prec. |
IGIE |
ixRT |
IXUCA SDK |
| AlexNet |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
✅ |
4.3.0 |
| CLIP |
FP16 |
✅ |
✅ |
4.3.0 |
| Conformer-B |
FP16 |
✅ |
4.3.0 |
| ConvNeXt-Base |
FP16 |
✅ |
✅ |
4.3.0 |
| ConvNext-S |
FP16 |
✅ |
4.3.0 |
| ConvNeXt-Small |
FP16 |
✅ |
✅ |
4.3.0 |
| ConvNeXt-Tiny |
FP16 |
✅ |
4.3.0 |
| CSPDarkNet53 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| CSPResNet50 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| CSPResNeXt50 |
FP16 |
✅ |
✅ |
4.3.0 |
| DeiT-tiny |
FP16 |
✅ |
✅ |
4.3.0 |
| DenseNet121 |
FP16 |
✅ |
✅ |
4.3.0 |
| DenseNet161 |
FP16 |
✅ |
✅ |
4.3.0 |
| DenseNet169 |
FP16 |
✅ |
✅ |
4.3.0 |
| DenseNet201 |
FP16 |
✅ |
✅ |
4.3.0 |
| EfficientNet-B0 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| EfficientNet-B1 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| EfficientNet-B2 |
FP16 |
✅ |
✅ |
4.3.0 |
| EfficientNet-B3 |
FP16 |
✅ |
✅ |
4.3.0 |
| EfficientNet-B4 |
FP16 |
✅ |
✅ |
4.3.0 |
| EfficientNet-B5 |
FP16 |
✅ |
✅ |
4.3.0 |
| EfficientNet-B6 |
FP16 |
✅ |
4.3.0 |
| EfficientNetV2 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| EfficientNetv2_rw_t |
FP16 |
✅ |
✅ |
4.3.0 |
| EfficientNetv2_s |
FP16 |
✅ |
✅ |
4.3.0 |
| GoogLeNet |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
✅ |
4.3.0 |
| HRNet-W18 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| InceptionV3 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
✅ |
4.3.0 |
| Inception-ResNet-V2 |
FP16 |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| Mixer_B |
FP16 |
✅ |
4.3.0 |
| MNASNet0_5 |
FP16 |
✅ |
4.3.0 |
| MNASNet0_75 |
FP16 |
✅ |
4.3.0 |
| MNASNet1_0 |
FP16 |
✅ |
4.3.0 |
| MNASNet1_3 |
FP16 |
✅ |
4.3.0 |
| MobileNetV2 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
✅ |
4.3.0 |
| MobileNetV3_Large |
FP16 |
✅ |
4.3.0 |
| MobileNetV3_Small |
FP16 |
✅ |
✅ |
4.3.0 |
| MViTv2_base |
FP16 |
✅ |
4.2.0 |
| RegNet_x_16gf |
FP16 |
✅ |
4.3.0 |
| RegNet_x_1_6gf |
FP16 |
✅ |
4.3.0 |
| RegNet_x_3_2gf |
FP16 |
✅ |
4.3.0 |
| RegNet_x_32gf |
FP16 |
✅ |
4.3.0 |
| RegNet_x_400mf |
FP16 |
✅ |
4.3.0 |
| RegNet_y_1_6gf |
FP16 |
✅ |
4.3.0 |
| RegNet_y_16gf |
FP16 |
✅ |
4.3.0 |
| RegNet_y_3_2gf |
FP16 |
✅ |
4.3.0 |
| RegNet_y_32gf |
FP16 |
✅ |
4.3.0 |
| RegNet_y_400mf |
FP16 |
✅ |
4.3.0 |
| RepVGG |
FP16 |
✅ |
✅ |
4.3.0 |
| Res2Net50 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| ResNeSt50 |
FP16 |
✅ |
4.3.0 |
| ResNet101 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
✅ |
4.3.0 |
| ResNet152 |
FP16 |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| ResNet18 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
✅ |
4.3.0 |
| ResNet34 |
FP16 |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| ResNet50 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| ResNetV1D50 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| ResNeXt50_32x4d |
FP16 |
✅ |
✅ |
4.3.0 |
| ResNeXt101_64x4d |
FP16 |
✅ |
✅ |
4.3.0 |
| ResNeXt101_32x8d |
FP16 |
✅ |
✅ |
4.3.0 |
| SEResNet50 |
FP16 |
✅ |
4.3.0 |
| ShuffleNetV1 |
FP16 |
✅ |
4.3.0 |
| ShuffleNetV2_x0_5 |
FP16 |
✅ |
✅ |
4.3.0 |
| ShuffleNetV2_x1_0 |
FP16 |
✅ |
✅ |
4.3.0 |
| ShuffleNetV2_x1_5 |
FP16 |
✅ |
✅ |
4.3.0 |
| ShuffleNetV2_x2_0 |
FP16 |
✅ |
✅ |
4.3.0 |
| SqueezeNet 1.0 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| SqueezeNet 1.1 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| SVT Base |
FP16 |
✅ |
4.3.0 |
| Swin Transformer |
FP16 |
✅ |
4.3.0 |
| Swin Transformer Large |
FP16 |
✅ |
4.3.0 |
| Twins_PCPVT |
FP16 |
✅ |
4.3.0 |
| VAN_B0 |
FP16 |
✅ |
4.3.0 |
| VGG11 |
FP16 |
✅ |
4.3.0 |
| VGG13 |
FP16 |
✅ |
4.3.0 |
| VGG13_BN |
FP16 |
✅ |
4.3.0 |
| VGG16 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| VGG19 |
FP16 |
✅ |
4.3.0 |
| VGG19_BN |
FP16 |
✅ |
4.3.0 |
| ViT |
FP16 |
✅ |
4.3.0 |
| Wide ResNet50 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
✅ |
4.3.0 |
| Wide ResNet101 |
FP16 |
✅ |
4.3.0 |
| Model |
Prec. |
IGIE |
ixRT |
IXUCA SDK |
| ATSS |
FP16 |
✅ |
✅ |
4.3.0 |
| CenterNet |
FP16 |
✅ |
✅ |
4.3.0 |
| DETR |
FP16 |
✅ |
4.3.0 |
| FCOS |
FP16 |
✅ |
✅ |
4.3.0 |
| FoveaBox |
FP16 |
✅ |
✅ |
4.3.0 |
| FSAF |
FP16 |
✅ |
✅ |
4.3.0 |
| GFL |
FP16 |
✅ |
4.3.0 |
| HRNet |
FP16 |
✅ |
✅ |
4.3.0 |
| PAA |
FP16 |
✅ |
✅ |
4.3.0 |
| RetinaFace |
FP16 |
✅ |
✅ |
4.3.0 |
| RetinaNet |
FP16 |
✅ |
✅ |
4.3.0 |
| RTMDet |
FP16 |
✅ |
4.3.0 |
| SABL |
FP16 |
✅ |
4.3.0 |
| SSD |
FP16 |
✅ |
4.3.0 |
| YOLOF |
FP16 |
✅ |
4.3.0 |
| YOLOv3 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
✅ |
4.3.0 |
| YOLOv4 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
✅ |
4.3.0 |
| YOLOv5 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
✅ |
4.3.0 |
| YOLOv5s |
FP16 |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| YOLOv6 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| YOLOv7 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
✅ |
4.3.0 |
| YOLOv8 |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
✅ |
4.3.0 |
| YOLOv9 |
FP16 |
✅ |
✅ |
4.3.0 |
| YOLOv10 |
FP16 |
✅ |
✅ |
4.3.0 |
| YOLOv11 |
FP16 |
✅ |
✅ |
4.3.0 |
| YOLOv12 |
FP16 |
✅ |
4.3.0 |
| YOLOv13 |
FP16 |
✅ |
4.3.0 |
| YOLOX |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
✅ |
4.3.0 |
| Model |
Prec. |
IGIE |
ixRT |
IXUCA SDK |
| FaceNet |
FP16 |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| Model | Prec. | IGIE | IXUCA SDK |
|-------|-------|------|-----------|
| Kie_layoutXLM | FP16 | ✅ | 4.3.0 |
| SVTR | FP16 | ✅ | 4.3.0 |
| Model |
Prec. |
IGIE |
ixRT |
IXUCA SDK |
| HRNetPose |
FP16 |
✅ |
4.3.0 |
| Lightweight OpenPose |
FP16 |
✅ |
4.3.0 |
| RTMPose |
FP16 |
✅ |
✅ |
4.3.0 |
| Model |
Prec. |
IGIE |
ixRT |
IXUCA SDK |
| Mask R-CNN |
FP16 |
✅ |
4.2.0 |
| SOLOv1 |
FP16 |
✅ |
4.3.0 |
| Model |
Prec. |
IGIE |
ixRT |
IXUCA SDK |
| UNet |
FP16 |
✅ |
4.3.0 |
| Model |
Prec. |
IGIE |
ixRT |
IXUCA SDK |
| FastReID |
FP16 |
✅ |
4.3.0 |
| DeepSort |
FP16 |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| RepNet-Vehicle-ReID |
FP16 |
✅ |
4.3.0 |
| Model |
vLLM |
IxFormer |
IXUCA SDK |
| Aria |
✅ |
4.3.0 |
| Chameleon-7B |
✅ |
4.3.0 |
| CLIP |
✅ |
4.3.0 |
| Fuyu-8B |
✅ |
4.3.0 |
| H2OVL Mississippi |
✅ |
4.3.0 |
| Idefics3 |
✅ |
4.3.0 |
| InternVL2-4B |
✅ |
4.3.0 |
| LLaVA |
✅ |
4.3.0 |
| LLaVA-Next-Video-7B |
✅ |
4.3.0 |
| Llama-3.2 |
✅ |
4.3.0 |
| MiniCPM-V 2 |
✅ |
4.3.0 |
| Pixtral |
✅ |
4.3.0 |
| Model |
Prec. |
IGIE |
ixRT |
IXUCA SDK |
| ALBERT |
FP16 |
✅ |
4.3.0 |
| BERT Base NER |
INT8 |
✅ |
4.3.0 |
| BERT Base SQuAD |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
4.3.0 |
| BERT Large SQuAD |
FP16 |
✅ |
✅ |
4.3.0 |
| INT8 |
✅ |
✅ |
4.3.0 |
| DeBERTa |
FP16 |
✅ |
4.3.0 |
| RoBERTa |
FP16 |
✅ |
4.3.0 |
| RoFormer |
FP16 |
✅ |
4.3.0 |
| VideoBERT |
FP16 |
✅ |
4.2.0 |
| Model |
Prec. |
IGIE |
ixRT |
IXUCA SDK |
| Conformer |
FP16 |
✅ |
✅ |
4.3.0 |
| Transformer ASR |
FP16 |
✅ |
4.2.0 |
| Model |
Prec. |
IGIE |
ixRT |
IXUCA SDK |
| Wide & Deep |
FP16 |
✅ |
4.3.0 |
| Docker Installer | IXUCA SDK | Introduction |
|------------------|-----------|--------------|
| corex-docker-installer-4.3.0-*-py3.10-x86_64.run | 4.3.0 | For small-model inference |
| corex-docker-installer-4.3.0-*-llm-py3.10-x86_64.run | 4.3.0 | For large-model inference |
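The two installer names listed above differ only in an `-llm` segment, and the `*` in the small-model name can also absorb that segment, so a naive wildcard match on the first pattern accepts both files. The following sketch shows one way to tell them apart using Python's standard `fnmatch` module. The helper `classify_installer` and the `BUILD` placeholder in the sample file names are assumptions for illustration; the real value behind `*` is not given here.

```python
from fnmatch import fnmatch

# Patterns copied from the installer table; "*" stands for the elided part of the name.
SMALL_MODEL_PATTERN = "corex-docker-installer-4.3.0-*-py3.10-x86_64.run"
LLM_PATTERN = "corex-docker-installer-4.3.0-*-llm-py3.10-x86_64.run"


def classify_installer(filename: str) -> str:
    """Return "llm", "small-model", or "unknown" for an installer file name."""
    # Check the LLM pattern first: the small-model glob also matches LLM
    # installer names, because "*" can swallow the "-llm" segment.
    if fnmatch(filename, LLM_PATTERN):
        return "llm"
    if fnmatch(filename, SMALL_MODEL_PATTERN):
        return "small-model"
    return "unknown"


if __name__ == "__main__":
    # Hypothetical file names; "BUILD" is a placeholder, not a real build tag.
    for name in (
        "corex-docker-installer-4.3.0-BUILD-py3.10-x86_64.run",
        "corex-docker-installer-4.3.0-BUILD-llm-py3.10-x86_64.run",
        "README.md",
    ):
        print(name, "->", classify_installer(name))
```

Checking the more specific pattern first keeps the classification unambiguous without having to know what the wildcard expands to.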
Please refer to the DeepSpark Code of Conduct on Gitee or on GitHub.
Please contact contact@deepspark.org.cn.
Please refer to the DeepSparkInference Contributing Guidelines.
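DeepSparkInference only provides download and preprocessing scripts for public datasets. These datasets do not belong to DeepSparkInference, and DeepSparkInference is not responsible for their quality or maintenance. Please make sure you have permission to use these datasets. Models trained on them may be used only for non-commercial research and education.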
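To dataset owners:
If you do not want your dataset published on DeepSparkInference, or you want to update a dataset in DeepSparkInference that belongs to you, please submit an issue on Gitee or GitHub. We will remove or update it according to your issue. Thank you sincerely for your support of and contributions to our community.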
This project is licensed under Apache-2.0.