66 questions
1 vote · 0 answers · 69 views
How to pass P_map: dict[str, torch.Tensor] to PEFT (LoRA)?
My proxy goal is to change LoRA from h = (W + BA)x to h = (W + BAP)x. Preliminary code is attached for reference.
My actual goal is to train a model with the following loss: Θ̃ = arg min_{Δ̂} ‖f_(...
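A minimal sketch of the modification the asker describes, assuming P is a fixed (non-trainable) matrix looked up from P_map by module name; the LoRAWithP class and its wiring are illustrative, not a PEFT API:

```python
import torch
import torch.nn as nn

class LoRAWithP(nn.Module):
    """Hypothetical h = (W + B A P) x: W frozen, A/B trainable, P a fixed buffer."""
    def __init__(self, base_linear: nn.Linear, P: torch.Tensor, r: int = 8):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():      # freeze W
            p.requires_grad_(False)
        in_f, out_f = base_linear.in_features, base_linear.out_features
        self.register_buffer("P", P)          # (in_f, in_f), excluded from training
        self.A = nn.Parameter(torch.randn(r, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, r))

    def forward(self, x):
        # W x + B A P x, computed as three small matmuls on row-vector inputs
        return self.base(x) + ((x @ self.P.T) @ self.A.T) @ self.B.T
```

One way to use it is to walk model.named_modules(), replace each targeted nn.Linear with LoRAWithP(module, P_map[name]), and train only A and B; since stock LoraConfig does not accept per-module tensors such as P_map, this bypasses PEFT's injection rather than extending it.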
3 votes · 0 answers · 63 views
Azure ML Endpoint Fails with HFValidationError even after using pathlib.Path
I am trying to deploy a fine-tuned Mistral-7B model on an Azure ML Online Endpoint. The deployment repeatedly fails during the init() phase of the scoring script with a huggingface_hub.errors....
1 vote · 0 answers · 66 views
ValueError when resuming LoRA fine-tuning with sentence-transformers CrossEncoderTrainer: "Unrecognized model" error
I'm fine-tuning a CrossEncoder model with LoRA using the sentence-transformers library on Kaggle (12-hour limit). I need to resume training from a checkpoint, but I'm getting a ValueError when trying to ...
1 vote · 0 answers · 119 views
Fine-tuned LLaMA 2 7B with QLoRA, but reloading fails: missing 4-bit metadata. Likely saved after LoRA + resize. Need a proper 4-bit save method
I've been working on fine-tuning LLaMA 2 7B using QLoRA with bitsandbytes 4-bit quantization and ran into a weird issue. I did adaptive pretraining on Arabic data with a custom tokenizer (vocab size ~...
0 votes · 1 answer · 345 views
What should I set as LoRA target_modules for each stage of continued pretraining on Qwen2.5-VL-Instruct using Unsloth?
I would like to perform continued pretraining of Qwen2.5-VL-Instruct using Unsloth + LoRA, following a three-stage training process:
Stage 1: Train only the projector (Alignment)
Stage 2: Train both ...
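For orientation only, a sketch of how per-stage target_modules could be narrowed with plain peft; the suffixes below are placeholders assumed from typical Qwen2.5-VL naming, and Unsloth's own get_peft_model wrapper may expose this differently, so check the real Linear-layer names before relying on them:

```python
from peft import LoraConfig

# Placeholder suffixes: replace with the actual Linear-layer names found in
# [n for n, m in model.named_modules() if isinstance(m, torch.nn.Linear)].
STAGE_TARGETS = {
    "stage1_projector": ["mlp.0", "mlp.2"],                      # merger/projector Linears (assumed)
    "stage2_vision_plus_projector": ["qkv", "proj",
                                     "mlp.0", "mlp.2"],          # vision tower (assumed)
    "stage3_language": ["q_proj", "k_proj", "v_proj", "o_proj",
                        "gate_proj", "up_proj", "down_proj"],    # standard LLM block names
}

def lora_config_for(stage: str) -> LoraConfig:
    # One LoraConfig per stage, differing only in which modules receive adapters
    return LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                      target_modules=STAGE_TARGETS[stage])
```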
2 votes · 1 answer · 815 views
How to properly save and load a PEFT-trained Unsloth model with resized token embeddings?
I'm using Unsloth's FastVisionModel with the base model unsloth/qwen2-VL-2B-Instruct to train on a dataset that includes text with many unique characters. Here's the overall process I followed:
...
0 votes · 0 answers · 156 views
How to resolve 'grad strides mismatching' warning in custom Kronecker Linear layer using torch.einsum?
I'm implementing a more efficient version of lokr.Linear from the LoKr module in PEFT. The current implementation uses torch.kron to construct the delta_weight before applying rank dropout, but this ...
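A small sketch of the einsum route the question points at: computing the Kronecker-factored forward pass without materializing kron(A, B). Calling .contiguous() on the einsum output is a common way to avoid stride-mismatch warnings, though whether it applies to this exact LoKr case is an assumption:

```python
import torch

def kron_linear(x, A, B):
    """y = x @ kron(A, B).T without building kron(A, B).

    x: (batch, m*p), A: (n, m), B: (q, p)  ->  y: (batch, n*q)
    """
    n, m = A.shape
    q, p = B.shape
    xb = x.reshape(x.shape[0], m, p)
    # out[b, i, j] = sum over m, p of A[i, m] * B[j, p] * x[b, m, p]
    y = torch.einsum("nm,qp,bmp->bnq", A, B, xb)
    return y.reshape(x.shape[0], n * q).contiguous()  # contiguous output often silences stride warnings

# Sanity check against the materialized Kronecker product
A = torch.randn(3, 4); B = torch.randn(5, 2); x = torch.randn(7, 4 * 2)
assert torch.allclose(kron_linear(x, A, B), x @ torch.kron(A, B).T, atol=1e-5)
```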
0 votes · 1 answer · 635 views
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! during training on multi-GPU setup
I'm facing an issue when training a model using PEFT and LoRA on a multi-GPU setup with PyTorch and Hugging Face Transformers. The error I get is:
RuntimeError: Expected all tensors to be on the same ...
9 votes · 1 answer · 6k views
TypeError in SFTTrainer Initialization: Unexpected Keyword Argument 'tokenizer'
Question:
I am trying to fine-tune the Mistral-7B-Instruct-v0.1-GPTQ model using SFTTrainer from trl. However, when running my script in Google Colab, I encounter the following error:
TypeError: ...
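For context, this error typically appears because newer trl releases dropped SFTTrainer's tokenizer keyword in favour of the processing_class argument inherited from transformers.Trainer (the exact version cutoff is not verified here). A minimal sketch under that assumption, with model, tokenizer, and train_dataset prepared as in the question:

```python
from trl import SFTTrainer, SFTConfig

# Older scripts passed tokenizer=...; newer trl raises TypeError for that keyword.
trainer = SFTTrainer(
    model=model,                      # the quantized Mistral model from the question
    args=SFTConfig(output_dir="outputs"),
    train_dataset=train_dataset,
    processing_class=tokenizer,       # replacement for the old tokenizer= keyword
)
trainer.train()
```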
0 votes · 0 answers · 219 views
LoRA Adapter Loading Issue with Llama 3.1 8B - Missing Keys Warning
I'm having trouble loading my LoRA adapters for inference after fine-tuning Llama 3.1 8B. When I try to load the adapter files in a new session, I get a warning about missing adapter keys:
/usr/local/...
3 votes · 1 answer · 2k views
TypeError: SFTTrainer.__init__() got an unexpected keyword argument 'dataset_text_field'
I am trying to fine-tune a language model using SFTTrainer from the trl library in Google Colab. However, I am encountering the following error:
TypeError Traceback (...
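This is likely the same family of trl API change: dataset_text_field is a field of SFTConfig rather than a keyword of SFTTrainer in newer trl versions (an assumption based on current trl releases, not the asker's exact version). A minimal sketch, with model and train_dataset as in the question:

```python
from trl import SFTTrainer, SFTConfig

# dataset_text_field moved from SFTTrainer(...) into SFTConfig.
args = SFTConfig(output_dir="outputs", dataset_text_field="text")
trainer = SFTTrainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
```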
1 vote · 0 answers · 85 views
Is there any difference between these two implementations of LoRA (Low-Rank Adaptation)?
We all know that LoRA is a low-rank adaptation method, which can be formulated as follows: x = W_0 * x + (A @ B) * x. I have two different code implementations of this. Are there any differences ...
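The two forms usually differ only in where the low-rank product is applied: keeping the update separate, h = W_0 x + B(Ax), versus folding it into the weight, h = (W_0 + BA)x. A short numerical check under the usual LoRA shape convention (A: r × d_in, B: d_out × r):

```python
import torch

d_in, d_out, r = 5, 6, 2
W0 = torch.randn(d_out, d_in)
A = torch.randn(r, d_in) * 0.1      # down-projection
B = torch.randn(d_out, r) * 0.1     # up-projection
x = torch.randn(3, d_in)

h1 = x @ W0.T + (x @ A.T) @ B.T     # update kept separate (two small matmuls)
h2 = x @ (W0 + B @ A).T             # update merged into the weight

print(torch.allclose(h1, h2, atol=1e-5))  # True: identical up to floating-point error
```

If the two snippets in the question differ beyond this (for example in the alpha/r scaling, dropout placement, or which factor is zero-initialized), those details rather than the algebra are where the behaviour diverges.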
1 vote · 1 answer · 172 views
How to merge a PEFT model into the base model in Transformers (Hugging Face)?
I tried to merge a PEFT fine-tuned model into the original one. The Hugging Face API only outputs the "extra weights" from fine-tuning as a .safetensors file, and my attempts to merge them failed.
I wonder how ...
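For reference, the usual route in peft is PeftModel.from_pretrained followed by merge_and_unload(), which folds the LoRA deltas into the base weights and returns a plain transformers model; the model id and paths below are placeholders:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("base-model-id")                     # placeholder id
merged = PeftModel.from_pretrained(base, "path/to/adapter").merge_and_unload()   # fold LoRA into W

merged.save_pretrained("merged-model")   # standalone weights, no adapter .safetensors needed
AutoTokenizer.from_pretrained("base-model-id").save_pretrained("merged-model")
```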
1 vote · 1 answer · 292 views
Do I have to write a custom AutoModel transformers class for "TypeError: NVEmbedModel.forward() got an unexpected keyword argument 'inputs_embeds'"?
I am trying to fine-tune the nvidia/NV-Embed-v2 model from Hugging Face using LoRA from the peft library. I am facing the "TypeError: NVEmbedModel.forward() got an unexpected keyword argument '...
2 votes · 0 answers · 648 views
PEFT library installed but PEFT is not identified at runtime
I am trying to run inference with the ChemVLM model (https://huggingface.co/AI4Chem/ChemVLM-26B). While trying to run the Python code, I get the error
ImportError: This modeling file requires the following packages ...