💫StarCoder in C++

This is a C++ example running 💫 StarCoder inference using the ggml library.

The program can run on the CPU - no video card is required.

The example supports the following 💫 StarCoder models:

bigcode/starcoder
bigcode/gpt_bigcode-santacoder aka the smol StarCoder
HuggingFaceH4/starchat-beta - a coding assistant based on StarCoderPlus

Sample performance on MacBook M1 Pro:

TODO

Sample output:

$ ./bin/starcoder -h
usage: ./bin/starcoder [options]
options:
 -h, --help show this help message and exit
 -s SEED, --seed SEED RNG seed (default: -1)
 -t N, --threads N number of threads to use during computation (default: 8)
 -p PROMPT, --prompt PROMPT
 prompt to start generation with (default: random)
 -n N, --n_predict N number of tokens to predict (default: 200)
 --top_k N top-k sampling (default: 40)
 --top_p N top-p sampling (default: 0.9)
 --temp N temperature (default: 1.0)
 -b N, --batch_size N batch size for prompt processing (default: 8)
 -m FNAME, --model FNAME
 model path (default: models/starcoder-117M/ggml-model.bin)
$ ./bin/starcoder -m ../models/bigcode/gpt_bigcode-santacoder-ggml-q4_1.bin -p "def fibonnaci(" -t 4 --top_k 0 --top_p 0.95 --temp 0.2 
main: seed = 1683881276
starcoder_model_load: loading model from '../models/bigcode/gpt_bigcode-santacoder-ggml-q4_1.bin'
starcoder_model_load: n_vocab = 49280
starcoder_model_load: n_ctx = 2048
starcoder_model_load: n_embd = 2048
starcoder_model_load: n_head = 16
starcoder_model_load: n_layer = 24
starcoder_model_load: ftype = 3
starcoder_model_load: ggml ctx size = 1794.90 MB
starcoder_model_load: memory size = 768.00 MB, n_mem = 49152
starcoder_model_load: model size = 1026.83 MB
main: prompt: 'def fibonnaci('
main: number of tokens in prompt = 7, first 8 tokens: 563 24240 78 2658 64 2819 7 
def fibonnaci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n-1) + fibonacci(n-2)
print(fibo(10))
main: mem per token = 9597928 bytes
main: load time = 480.43 ms
main: sample time = 26.21 ms
main: predict time = 3987.95 ms / 19.36 ms per token
main: total time = 4580.56 ms
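
To compare runs, the per-token figure in the timing lines can be turned into a throughput number. The snippet below is a minimal sketch, not part of the repository: it assumes the `predict time` line keeps the format shown in the sample output, and the helper name `tokens_per_second` is hypothetical.

```python
import re

# Timing lines copied from the sample output above.
LOG = """\
main: predict time = 3987.95 ms / 19.36 ms per token
main: total time = 4580.56 ms
"""

def tokens_per_second(log):
    """Return generation throughput derived from the 'predict time' line."""
    match = re.search(
        r"predict time\s*=\s*[\d.]+ ms / ([\d.]+) ms per token", log
    )
    if match is None:
        raise ValueError("no 'predict time' line found in log")
    ms_per_token = float(match.group(1))
    return 1000.0 / ms_per_token

print(f"{tokens_per_second(LOG):.1f} tokens/s")
```

For the run above this works out to roughly 52 tokens per second on 4 threads.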

Quick start

git clone https://github.com/bigcode-project/starcoder.cpp
cd starcoder.cpp
# Convert HF model to ggml
python convert-hf-to-ggml.py bigcode/gpt_bigcode-santacoder
# Build ggml libraries
make
# quantize the model
./quantize models/bigcode/gpt_bigcode-santacoder-ggml.bin models/bigcode/gpt_bigcode-santacoder-ggml-q4_1.bin 3
# run inference
./main -m models/bigcode/gpt_bigcode-santacoder-ggml-q4_1.bin -p "def fibonnaci(" --top_k 0 --top_p 0.95 --temp 0.2
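
The steps above share one naming pattern: the converter output, the quantizer input and output, and the `-m` argument all derive from the Hugging Face model id. The following is a small sketch of that pattern, assuming the `models/<model id>-ggml.bin` layout used in the commands above; the function `quick_start_commands` is hypothetical and not part of the repository.

```python
import shlex

def quick_start_commands(model_id, prompt):
    """Build the quick-start commands for one model as argv lists."""
    # Path prefix assumed from the commands shown in the quick start.
    base = f"models/{model_id}-ggml"
    return [
        ["python", "convert-hf-to-ggml.py", model_id],
        ["make"],
        # The trailing "3" is the quantization type used in the quick start.
        ["./quantize", f"{base}.bin", f"{base}-q4_1.bin", "3"],
        ["./main", "-m", f"{base}-q4_1.bin", "-p", prompt,
         "--top_k", "0", "--top_p", "0.95", "--temp", "0.2"],
    ]

for cmd in quick_start_commands("bigcode/gpt_bigcode-santacoder", "def fibonnaci("):
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root if you prefer to drive the steps from a script.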

Downloading and converting the original models (💫 StarCoder)

You can download the original model and convert it to ggml format using the script convert-hf-to-ggml.py:

# Convert HF model to ggml
python convert-hf-to-ggml.py bigcode/gpt_bigcode-santacoder

This conversion requires Python and the Transformers library to be installed on your computer.
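
Before starting a long download, it can help to confirm that the required package is importable. This is a minimal sketch using only the standard library; the helper `missing_modules` is hypothetical, and the suggestion to install from `requirements.txt` assumes that file lists the converter's dependencies.

```python
import importlib.util
import sys

def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules(["transformers"])
    if missing:
        print("missing:", ", ".join(missing), file=sys.stderr)
        # Assumption: requirements.txt covers what the converter needs.
        print("try: pip install -r requirements.txt")
    else:
        print("ready to run convert-hf-to-ggml.py")
```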

Quantizing the models

You can also quantize the ggml models to 4-bit integers. The final argument of the command below, 3, selects the q4_1 quantization type:

# quantize the model
./quantize models/bigcode/gpt_bigcode-santacoder-ggml.bin models/bigcode/gpt_bigcode-santacoder-ggml-q4_1.bin 3

Model	Original size	Quantized size	Quantization type
`bigcode/gpt_bigcode-santacoder`	5396.45 MB	1026.83 MB	4-bit integer (q4_1)
`bigcode/starcoder`	71628.23 MB	13596.23 MB	4-bit integer (q4_1)
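
The sizes in the table imply a similar reduction for both models. The short calculation below is only an illustration of that arithmetic, using the figures from the table; the helper name `compression_ratio` is hypothetical.

```python
# (original MB, quantized MB) as listed in the table above.
SIZES_MB = {
    "bigcode/gpt_bigcode-santacoder": (5396.45, 1026.83),
    "bigcode/starcoder": (71628.23, 13596.23),
}

def compression_ratio(original_mb, quantized_mb):
    """Return how many times smaller the quantized file is."""
    return original_mb / quantized_mb

for name, (original, quantized) in SIZES_MB.items():
    ratio = compression_ratio(original, quantized)
    share = 100 * quantized / original
    print(f"{name}: {ratio:.2f}x smaller ({share:.1f}% of original size)")
```

Both models end up a little over five times smaller after q4_1 quantization.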

iOS App

The repo includes a proof-of-concept iOS app in the StarCoderApp directory. You need to provide the converted (and possibly quantized) model weights, placing a file called bigcode_ggml_model.bin.bin inside that folder. This is what it looks like on an iPhone:

[Image: starcoder-ios-screenshot]
