sm-120

Here are 7 public repositories matching this topic...

LianHe-BI / Blackwell-optimized-llama.cpp-Docker-image

Blackwell-optimized llama.cpp Docker image – works on all NVIDIA GPUs, but tuned for RTX 50 series. Built from scratch with CUDA 12.8, sm_120, NVFP4-ready. 250+ tok/s on 4B F16. Includes llama-chat script.

docker cpp docker-image cuda python3 pytorch nvidia quantization performance-optimization ready-to-use llm llamacpp rtx-50-series nvfp4 sm-120

Updated Mar 28, 2026

Andgihat / llama-cpp-mtp-turboquant-sm120-blackwell-windows

Star 6

Windows prebuilt of llama.cpp combining Multi-Token Prediction (MTP) + TurboQuant KV cache compression + native sm_120 (Blackwell consumer GPU, FP4 tensor cores). For RTX 5060 Ti / 5070 / 5080 / 5090.

windows prebuilt mtp blackwell llama-cpp rtx-5090 cuda-12-8 sm-120 turboquant rtx-50 rtx-5060ti

Updated Jun 5, 2026

D3velop-llc / csm-rtx5090

Star 2

Optimized CSM-1B TTS pipeline for RTX 5090 (Blackwell sm_120). CUDA graph replay via patched HF Transformers. ~0.46x RTF.

text-to-speech streaming pytorch tts sesame csm huggingface blackwell torch-compile rtx-5090 sm-120 cuda-graphs

Updated Apr 5, 2026
Python

mikecaronna / GEN3C

Star 0

GEN3C: Generative Novel 3D Captions - Adapted for NVIDIA Blackwell GPU architecture (sm_120). Includes automatic GPU detection, CPU-based T5 text encoding for Blackwell compatibility, and full backward compatibility with older GPUs.

pytorch nvidia video-generation blackwell gen3c cuda-12-8 sm-120 transformer-engine rtx-blackwell