Releases: Zhayr1/bitmamba.cpp

v1.0.0 - Windows Release

30 Jan 02:53
@Zhayr1 Zhayr1
5a5bebc
This commit was created on GitHub.com and signed with GitHub’s verified signature.
GPG key ID: B5690EEEBB952194

🚀 BitMamba (v1.0.0)
This is the initial release of the optimized C++ inference engine for BitMamba. It features custom AVX2/FMA kernels and OpenMP parallelization to achieve maximum token generation speed.

✨ Key Features
Top Performance: Reaches ~50 tokens/sec on standard hardware (1B model) thanks to handwritten SIMD kernels.

Standalone Binary: Statically compiled, so there is no need to install Python, MinGW, or external DLLs.

Low Footprint: Highly efficient RAM usage (~630MB).

Parallelization: Uses OpenMP to maximize CPU core usage.
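
As a rough illustration of the OpenMP parallelization mentioned above, the sketch below splits the rows of a matrix-vector product across CPU cores. It is not taken from the bitmamba.cpp sources: the function name matvec, the dense row-major float layout, and the scheduling choice are all assumptions made for the example, and the real engine uses its own handwritten AVX2/FMA kernels.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the bitmamba.cpp sources):
// computes y = W * x for a row-major matrix W of size rows x cols.
std::vector<float> matvec(const std::vector<float>& W,
                          const std::vector<float>& x,
                          std::size_t rows, std::size_t cols) {
    std::vector<float> y(rows, 0.0f);
    // Each output row is independent, so rows can be split across threads.
    // Without -fopenmp the pragmas are ignored and the loop runs serially.
    #pragma omp parallel for schedule(static)
    for (long long r = 0; r < static_cast<long long>(rows); ++r) {
        const float* row = W.data() + static_cast<std::size_t>(r) * cols;
        float acc = 0.0f;
        // Hint to the compiler that this reduction is safe to vectorize.
        #pragma omp simd reduction(+:acc)
        for (std::size_t c = 0; c < cols; ++c) {
            acc += row[c] * x[c];
        }
        y[static_cast<std::size_t>(r)] = acc;
    }
    return y;
}
```

Built with GCC or Clang, a command along the lines of g++ -O2 -fopenmp -mavx2 -mfma would enable both the threading and the vectorization hints; the flags actually used to build bitmamba.exe are not stated in this release.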

📥 Download Model Weights

Download the compressed 1B model weights directly from Hugging Face by running the following command in PowerShell or CMD (in Windows PowerShell 5.1, curl is an alias for Invoke-WebRequest, so type curl.exe instead of curl):

curl -L -O https://huggingface.co/Zhayr1/BitMamba-2-1B/resolve/main/bitmamba_cpp/bitmamba_1b.bin

You also need to download tokenizer.bin:

curl -L -O https://github.com/Zhayr1/bitmamba.cpp/raw/refs/heads/main/tokenizer.bin

📦 Downloads

bitmamba.exe: Windows x64 binary.

Requirements: An x64 CPU with AVX2 support (Intel Haswell, 2013 or later, or AMD Excavator/Ryzen or later).

Note: If your CPU lacks AVX2 (for example, an Intel CPU released before 2013), this binary may crash with an Illegal Instruction error.
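
To check a machine ahead of time rather than hitting the crash, a small program can query the CPU's feature flags. The sketch below is one way to do that, assuming GCC or Clang on x86 (for example a MinGW-w64 toolchain); the function name cpu_has_avx2_fma is hypothetical and is not part of bitmamba.cpp.

```cpp
// Sketch: report whether the current CPU advertises AVX2 and FMA.
// Assumes GCC or Clang targeting x86; anything else reports false.
bool cpu_has_avx2_fma() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // populate the feature data queried below
    return __builtin_cpu_supports("avx2") && __builtin_cpu_supports("fma");
#else
    return false;  // unknown toolchain or architecture: "not detected"
#endif
}
```

A caller could print a readable message and exit when this returns false. The helper itself has to be compiled without AVX2 code generation, otherwise the check could fault before it ever runs.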

🛠️ How to Run (Windows)
Open PowerShell or CMD in the folder containing the .exe and run:

.\bitmamba.exe <model_path> "<prompt>" <mode: tokenizer | raw> <temp> <top_p> <min_p> <top_k> <max_tokens> <ctx_len>

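Example:

.\bitmamba.exe bitmamba_1b.bin "Hello, I am" tokenizer 0.7 1.1 0.05 0.9 40 200

Note: Read positionally against the usage line above, these values would set top_p to 1.1 and top_k to 0.9, which fall outside the usual ranges for those parameters (top_p is at most 1 and top_k is a whole number). The values fit an order such as temperature, repetition penalty, min_p, top_p, top_k, max_tokens better, so confirm the argument order in the repository README before relying on either line.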
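
For readers unfamiliar with the sampling arguments, the sketch below shows how temperature, top_k, top_p, and min_p are commonly combined to narrow the next-token distribution. It is a generic illustration rather than the code inside bitmamba.exe: the SamplerParams struct, the filtered_probs function, the default values, and the order in which the filters are applied are all assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical parameter bundle; names mirror the CLI placeholders only.
struct SamplerParams {
    float temp  = 0.7f;   // must be > 0; lower values sharpen the distribution
    float top_p = 0.9f;   // nucleus mass in (0, 1]
    float min_p = 0.05f;  // keep tokens with p >= min_p * p_max
    int   top_k = 40;     // keep at most this many tokens (<= 0 disables)
};

// Returns renormalized probabilities after all filters; dropped tokens get 0.
std::vector<float> filtered_probs(const std::vector<float>& logits,
                                  const SamplerParams& sp) {
    const std::size_t n = logits.size();
    std::vector<float> probs(n, 0.0f);
    if (n == 0) return probs;

    // Temperature-scaled softmax, shifted by the max logit for stability.
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        probs[i] = std::exp((logits[i] - max_logit) / sp.temp);
        total += probs[i];
    }
    for (float& p : probs) p /= total;

    // Token indices sorted by descending probability.
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
        [&](std::size_t a, std::size_t b) { return probs[a] > probs[b]; });

    // top_k: cap the number of surviving tokens.
    std::size_t keep = n;
    if (sp.top_k > 0) keep = std::min(keep, static_cast<std::size_t>(sp.top_k));

    // top_p: smallest prefix whose cumulative mass reaches top_p.
    float cum = 0.0f;
    for (std::size_t r = 0; r < keep; ++r) {
        cum += probs[order[r]];
        if (cum >= sp.top_p) { keep = r + 1; break; }
    }

    // min_p: drop tokens below min_p times the top probability.
    const float floor_p = sp.min_p * probs[order[0]];
    while (keep > 1 && probs[order[keep - 1]] < floor_p) --keep;

    // Zero out everything else and renormalize the survivors.
    std::vector<float> out(n, 0.0f);
    float kept = 0.0f;
    for (std::size_t r = 0; r < keep; ++r) kept += probs[order[r]];
    for (std::size_t r = 0; r < keep; ++r) out[order[r]] = probs[order[r]] / kept;
    return out;
}
```

The next token would then be drawn at random from the returned distribution. Lower temp or top_p values, or a higher min_p, make the output more predictable; the opposite settings make it more varied.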