cpu-reference

Here is 1 public repository matching this topic...

AHX47 / flash-moe-universal

Cross‐platform inference engine for huge AI models (1B–397B). Runs on any CPU (x86_64/ARM64) with AVX2/NEON, supports dense & MoE models (Qwen, Llama, Mistral...). GPU backends (Metal, OpenCL, CUDA) coming soon. No Python, no frameworks – pure C with optional PyQt5 GUI.

metal neon opencl x86-64 cuda moe avx2 arm64 pyqt5-desktop-application tui-app apple-silicon qwen ai-local cpu-reference ahx47

Updated Jun 2, 2026
C


