
Bitsandbytes quantization and QLORA fine-tuning #1389

nikoladj99 started this conversation in General

Hello friends!

I want to fine-tune a quantized RoBERTa-base model using the QLoRA approach. Below is the configuration.

import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization; the classification head
# is excluded from quantization via llm_int8_skip_modules.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    llm_int8_skip_modules=["classifier"],
)

model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base",
    num_labels=2,
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
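For context, the LoRA adapters are attached on top of this quantized model roughly as follows (a minimal sketch assuming the peft library; the r, lora_alpha, and target_modules values here are illustrative placeholders, not a recommendation):

from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Upcasts the non-quantized parts (LayerNorm, biases, ...) to float32 and
# prepares the model for gradient checkpointing with k-bit base weights.
model = prepare_model_for_kbit_training(model)

# Illustrative LoRA config; in the Hugging Face RoBERTa implementation the
# attention projections are named "query" and "value".
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="SEQ_CLS",
    target_modules=["query", "value"],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()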

What I’m not sure I understand: when I look at the datatypes of the LoRA matrices, they are in float32. Also, after calling prepare_model_for_kbit_training, the non-weight parts of the layers (bias, LayerNorm, ...) are converted to float32 as well. Do they and the LoRA matrices have to stay in 32-bit format, or can they somehow be converted to 16-bit? When the LoRA matrices are combined with the model weight matrices, are the LoRA matrices cast to bfloat16, or is everything cast to float32? And is the memory-saving potential of quantization actually realized if some layers remain in 32-bit format?
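To make the question concrete, this is a sketch of how the dtypes can be inspected after prepare_model_for_kbit_training and get_peft_model, and the kind of downcast being asked about (whether such a cast is numerically safe for training is exactly the open question):

import torch

# List which parameters ended up in float32 and whether they are trainable.
for name, param in model.named_parameters():
    if param.dtype == torch.float32:
        print(name, param.dtype, "trainable:", param.requires_grad)

# The cast in question: pushing the float32 tensors (LoRA A/B matrices,
# plus the biases/LayerNorms that were upcast) down to bfloat16.
for _, param in model.named_parameters():
    if param.dtype == torch.float32:
        param.data = param.data.to(torch.bfloat16)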
