# StarCoder 2

<p align="center"><a href="https://huggingface.co/bigcode">[🤗 Models & Datasets]</a> | <a href="https://drive.google.com/file/d/17iGn3c-sYNiLyRSY-A85QOzgzGnGiVI3/view">[Paper]</a>
</p>

StarCoder2 is a family of code generation models (3B, 7B, and 15B), trained on 600+ programming languages from [The Stack v2]() and some natural language text such as Wikipedia, arXiv, and GitHub issues. The models use Grouped Query Attention and a context window of 16,384 tokens with sliding-window attention of 4,096 tokens. The 3B & 7B models were trained on 3+ trillion tokens, while the 15B was trained on 4+ trillion tokens.
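The sliding-window scheme above can be sketched in a few lines. This is an illustration only, not model code; `visible_keys` is a hypothetical helper showing which (causal) key positions a query position may attend to under a 4,096-token window:

```python
# Illustrative sketch: which key positions a query at position `i` can
# attend to under causal sliding-window attention with a 4,096-token window.
WINDOW = 4096

def visible_keys(i: int, window: int = WINDOW) -> tuple[int, int]:
    """Inclusive (start, end) range of key positions query `i` attends to."""
    return max(0, i - window + 1), i

# Early positions see everything before them; later positions only the
# most recent `window` tokens.
assert visible_keys(100) == (0, 100)
assert visible_keys(10_000) == (5_905, 10_000)
```

Positional information still spans the full 16,384-token context; the window only bounds how far back each attention step looks.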
```python
print( pipe("def hello():") )
```

## Text-generation-inference

```bash
docker run -p 8080:80 -v $PWD/data:/data -e HUGGING_FACE_HUB_TOKEN=<YOUR BIGCODE ENABLED TOKEN> -d ghcr.io/huggingface/text-generation-inference:latest --model-id bigcode/starcoder2-15b --max-total-tokens 8192
```
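Once the container is up, it can be queried over TGI's `/generate` HTTP endpoint. A minimal sketch, assuming the server is reachable on `localhost:8080` (the port mapped above); `build_payload` and `generate` are hypothetical helpers, not part of TGI:

```python
import json
import urllib.request

def build_payload(prompt: str, max_new_tokens: int = 64) -> dict:
    # Request body shape for TGI's /generate endpoint.
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}

def generate(prompt: str, url: str = "http://localhost:8080/generate") -> str:
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]

# Example (requires the running server):
# print(generate("def hello():"))
```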