
Commit 54821be
update usage section
1 parent 3786335
File tree: 1 file changed (+9, -8)

README.md

Lines changed: 9 additions & 8 deletions
@@ -35,10 +35,13 @@ export HF_TOKEN=xxx
 ```
 
 ## Model usage and memory footprint
-Here are some examples to load the model and generate code. Ensure you've installed `transformers` from source (it should be the case if you used `requirements.txt`). We also include the memory footprint of the largest model, `StarCoder2-15B`, for each setup.
-
+Here are some examples to load the model and generate code, with the memory footprint of the largest model, `StarCoder2-15B`. Ensure you've installed `transformers` from source (it should be the case if you used `requirements.txt`)
+```bash
+pip install git+https://github.com/huggingface/transformers.git
+```
 
-### Running the model on CPU/ one GPU / multi GPU
+### Running the model on CPU/GPU/multi GPU
+* _Using full precision_
 ```python
 # pip install git+https://github.com/huggingface/transformers.git # TODO: merge PR to main
 from transformers import AutoModelForCausalLM, AutoTokenizer
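The hunk cuts the full-precision example off after the imports. For orientation, here is a minimal sketch of the complete pattern, assuming the `bigcode/starcoder2-15b` checkpoint named elsewhere in this diff and a placeholder prompt; everything past the visible import line is an assumption, not text from the commit:

```python
# Minimal full-precision sketch; the prompt and the elided middle of the
# snippet are assumptions, only the import and generate/decode calls are
# visible in the hunk above.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder2-15b"  # largest model, per the README
device = "cuda"                        # change to "cpu" for CPU-only runs

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# Full precision (fp32) costs ~4 bytes per parameter, roughly 60 GB for 15B.
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(device)
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
```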
@@ -55,10 +58,7 @@ outputs = model.generate(inputs)
 print(tokenizer.decode(outputs[0]))
 ```
 
-### Running the model on a GPU using different precisions
-
 * _Using `torch.bfloat16`_
-
 ```python
 # pip install accelerate
 import torch
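Again the snippet is truncated after `import torch`. A sketch of the usual bfloat16 loading pattern follows; `torch_dtype=torch.bfloat16` and `device_map="auto"` are assumptions suggested by the `# pip install accelerate` comment rather than lines visible in this diff:

```python
# Sketch of the bfloat16 variant; checkpoint as above, prompt is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder2-15b"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# bf16 halves the fp32 footprint (~2 bytes/param), which is consistent with
# the "Memory footprint: 32251.33 MB" figure quoted in the next hunk.
# device_map="auto" (via accelerate) places or shards across available GPUs.
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(model.device)
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
```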
@@ -79,7 +79,7 @@ print(tokenizer.decode(outputs[0]))
 Memory footprint: 32251.33 MB
 ```
 
-#### Quantized Versions through `bitsandbytes`
+### Quantized Versions through `bitsandbytes`
 * _Using 8-bit precision (int8)_
 
 ```python
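Only the heading and the opening fence of the 8-bit example are visible here. As a sketch of what an int8 `bitsandbytes` example typically looks like with the `transformers` API (the `BitsAndBytesConfig` usage is an assumption, not shown in this diff):

```python
# Sketch of 8-bit loading through bitsandbytes; checkpoint as above.
# pip install bitsandbytes accelerate
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

checkpoint = "bigcode/starcoder2-15b"
quantization_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# int8 weights take ~1 byte per parameter, roughly half the bf16 footprint.
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, quantization_config=quantization_config, device_map="auto"
)

inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(model.device)
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
```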
@@ -117,7 +117,8 @@ pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, device=0)
 print( pipe("def hello():") )
 ```
 
-## Text-generation-inference: TODO
+## Text-generation-inference:
+TODO
 
 ```bash
 docker run -p 8080:80 -v $PWD/data:/data -e HUGGING_FACE_HUB_TOKEN=<YOUR BIGCODE ENABLED TOKEN> -d ghcr.io/huggingface/text-generation-inference:latest --model-id bigcode/starcoder2-15b --max-total-tokens 8192
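The docker command starts a text-generation-inference server on port 8080. As a hedged illustration of how one might query it, using TGI's `/generate` REST endpoint and the `def hello():` prompt from the pipeline example above (the parameters shown are illustrative, not from the commit):

```python
# Sketch of a client call against the TGI container started above; endpoint
# path and payload shape follow TGI's documented REST API.
import requests

response = requests.post(
    "http://localhost:8080/generate",
    json={"inputs": "def hello():", "parameters": {"max_new_tokens": 64}},
    timeout=60,
)
print(response.json()["generated_text"])
```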
