89 questions
0 votes · 1 answer · 64 views
Tuning starting and final learning rate
If you use cosine decay, for example, and you have a starting learning rate and a final learning rate, can you tune those hyperparameters so that the final learning rate is some ratio of the starting learning ...
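One way to frame that ratio, sketched in PyTorch with illustrative values (base_lr, final_ratio, and num_epochs are assumptions, not taken from the question), is to derive the schedule's floor from the starting rate:

import torch

base_lr = 3e-4       # illustrative starting learning rate
final_ratio = 0.1    # tune this ratio instead of the final rate directly
num_epochs = 100

model = torch.nn.Linear(10, 2)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=base_lr)

# eta_min is the floor the cosine schedule decays to; deriving it from
# base_lr collapses the two hyperparameters into a single ratio to tune.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=num_epochs, eta_min=base_lr * final_ratio
)

for epoch in range(num_epochs):
    optimizer.step()   # real training step elided
    scheduler.step()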
1 vote · 1 answer · 169 views
AttributeError When Updating Learning Rate in Keras Using K.set_value
I'm trying to update the learning rate of my Keras model dynamically during training. I'm using the following code:
import tensorflow as tf
from tensorflow.keras import backend as K
model = keras....
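This AttributeError often comes from referencing the old optimizer.lr attribute, which recent Keras versions no longer expose. A minimal sketch (assuming a TF 2.x-style Keras optimizer; the tiny model below is a placeholder, not the question's truncated one) that assigns the learning_rate variable directly:

import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])  # placeholder model
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss="mse")

new_lr = 1e-4
# The optimizer exposes its rate as `learning_rate` (a variable), not `lr`,
# and the variable can be assigned directly instead of via K.set_value.
model.optimizer.learning_rate.assign(new_lr)
print(model.optimizer.learning_rate)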
1 vote · 0 answers · 22 views
Trying to implement ReduceLROnPlateau in darknet (AlexeyAB's version) but having issues with it
I have been trying to implement a ReduceLROnPlateau scheduler that tracks the loss history and automatically updates the learning rate whenever a loss plateau is detected.
I have made ...
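Darknet itself is C, but the plateau logic is small enough to sketch in Python as a reference (class and parameter names below are hypothetical, not from the asker's code):

# Framework-agnostic sketch of ReduceLROnPlateau-style bookkeeping; the same
# logic could be ported into darknet's C training loop.
class PlateauReducer:
    def __init__(self, lr, factor=0.1, patience=5, min_lr=1e-6):
        self.lr = lr
        self.factor = factor        # multiply the LR by this on a plateau
        self.patience = patience    # epochs without improvement to tolerate
        self.min_lr = min_lr
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss):
        if loss < self.best:
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_epochs = 0
        return self.lr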
2 votes · 1 answer · 2k views
Learning rate in torch.optim.AdamW has no effect?
I am working on fine-tuning BLIP-2 on the RSICD dataset using LoRA. I am working on colab, using an A100. I am strangely finding that when I set the learning rate in the code below, it has no effect. ...
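A quick sanity check (a sketch, not the asker's code; the tiny model is a placeholder) is to read the rate back out of the optimizer's parameter groups, since a Trainer or scheduler built elsewhere can silently override the value passed to AdamW:

import torch

model = torch.nn.Linear(8, 2)  # placeholder for the LoRA-wrapped BLIP-2 model
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=5e-5
)
for group in optimizer.param_groups:
    print(group["lr"])  # should print 5e-05 for every parameter group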
1 vote · 0 answers · 537 views
Optimal hyperparameters for fine-tuning an LLM
Could I ask for help? I am fine-tuning the LLM Llama3 8B (with LoRA) for text classification. I am using the Trainer from Hugging Face. I am looking for the optimal ...
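There is no single optimal setting, but as a hedged illustration the values below are commonly cited starting points for LoRA fine-tuning with the Hugging Face Trainer (every number is an assumption to sweep, not a recommendation from the question):

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./out",
    learning_rate=2e-4,              # LoRA usually tolerates larger rates than full fine-tuning
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # effective batch size of 16 per device
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    weight_decay=0.01,
)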
1 vote · 0 answers · 246 views
Optimal Learning Rate and Batch Size for LLM Training
What are the best practices for optimizing batch size and learning rate in training Large Language Models (LLMs)?
How should these hyperparameters be adjusted relative to each other for efficient ...
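One widely quoted heuristic (not a guarantee, and the reference pair below is purely illustrative) is to scale the learning rate with the batch size from a known-good configuration:

ref_batch_size = 256   # hypothetical reference configuration
ref_lr = 3e-4

def scaled_lr(batch_size):
    # Linear scaling rule; square-root scaling, ref_lr * (batch_size / ref_batch_size) ** 0.5,
    # is a common alternative with Adam-style optimizers.
    return ref_lr * batch_size / ref_batch_size

print(scaled_lr(1024))  # 0.0012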
2 votes · 1 answer · 894 views
How to print learning rate per epoch with PyTorch Lightning?
I am having a problem with printing (logging) the learning rate per epoch in PyTorch Lightning (PL). TensorFlow logs the learning rate by default. As the PL guide suggested, I wrote the following code:
class ...
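Lightning also ships a callback for exactly this; a minimal sketch (model and datamodule are placeholders for the asker's own objects):

import pytorch_lightning as pl
from pytorch_lightning.callbacks import LearningRateMonitor

# Logs the current LR of every configured scheduler to the Trainer's logger.
lr_monitor = LearningRateMonitor(logging_interval="epoch")
trainer = pl.Trainer(max_epochs=10, callbacks=[lr_monitor])
# trainer.fit(model, datamodule=dm)  # model / dm are the user's own objects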
2 votes · 0 answers · 74 views
tfjs: Adam learning rate decay
How do I train a TensorFlow model with the Adam optimizer so that the learning rate decays during training, using TensorFlow.js (not Python)? I cannot find that the library provides an exponential decay ...
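The question targets TensorFlow.js, but the decay itself is plain arithmetic that can be recomputed between fit calls and used to rebuild the optimizer with the new rate; a sketch of the formula (shown in Python to match the other examples, with illustrative constants):

initial_lr = 1e-3
decay_rate = 0.96
decay_steps = 1000

def decayed_lr(step):
    # Standard exponential decay of the learning rate over training steps.
    return initial_lr * decay_rate ** (step / decay_steps)

print(decayed_lr(0), decayed_lr(1000), decayed_lr(2000))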
1 vote · 1 answer · 942 views
PyTorch Lightning's ReduceLRonPlateau not working properly
I have been trying to write a Lightning module that uses both a warmup and the ReduceLROnPlateau annealing scheduler, and something really odd is happening. If the program reduces the learning rate, the ...
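For reference, the plain ReduceLROnPlateau wiring in Lightning looks roughly like this (a sketch with a placeholder layer; the warmup part, which is where the interaction usually goes wrong, is omitted):

import torch
import pytorch_lightning as pl

class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(8, 1)  # placeholder module

    # training_step / validation_step omitted; only the scheduler wiring is shown
    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
        scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
            optimizer, mode="min", factor=0.1, patience=3
        )
        return {
            "optimizer": optimizer,
            "lr_scheduler": {
                "scheduler": scheduler,
                "monitor": "val_loss",  # must match a metric logged via self.log
                "interval": "epoch",
            },
        }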
1 vote · 3 answers · 6k views
How to fix the learning rate for Hugging Face's Trainer?
I'm training a model with the following parameters:
Seq2SeqTrainingArguments(
output_dir = "./out",
overwrite_output_dir = True,
do_train ...
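If the goal is a learning rate that stays constant for the whole run, one option (a sketch assuming a recent transformers version) is to switch off the Trainer's default linear-decay schedule:

from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="./out",
    overwrite_output_dir=True,
    learning_rate=5e-5,              # illustrative value
    lr_scheduler_type="constant",    # no decay and no warmup
    warmup_steps=0,
)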
1 vote · 1 answer · 531 views
Unusual Learning Rate Finder Curve: Loss Lowest at Smallest Learning Rate
I'm using PyTorch Lightning's LR Finder but am getting an atypical curve. The loss starts at its lowest point when the learning rate is at its smallest, increases until it plateaus, and then exhibits ...
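Before trusting the suggestion on a curve like that, it can help to rerun the finder over a wider range and inspect the raw plot; a sketch using Lightning's Tuner (model is a placeholder, so the calls on it are commented out):

import pytorch_lightning as pl
from pytorch_lightning.tuner import Tuner

trainer = pl.Trainer(max_epochs=1)
tuner = Tuner(trainer)
# lr_finder = tuner.lr_find(model, min_lr=1e-8, max_lr=1.0, num_training=200)
# fig = lr_finder.plot(suggest=True)   # loss vs. learning rate
# print(lr_finder.suggestion())        # picks the steepest descent, not the minimum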
0 votes · 2 answers · 1k views
PyTorch Lightning learning rate tuner giving unexpected results
I'm trying to find an optimal learning rate using PyTorch Lightning's pl.tuner.Tuner, but the results aren't as expected.
The model I am running is a linear classifier on top of a BertForSequenceClassification AutoModel.
...
5 votes · 1 answer · 5k views
Why do we multiply learning rate by gradient accumulation steps in PyTorch?
Loss functions in PyTorch use "mean" reduction by default, which means the model's gradient has roughly the same magnitude regardless of batch size. It makes sense that you would want to scale the ...
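A sketch of why that scaling shows up (illustrative shapes and step counts): dividing each micro-batch loss by the number of accumulation steps makes the accumulated gradient equal the mean over the larger effective batch, so the learning rate then follows the same scaling you would apply when simply enlarging the batch:

import torch

model = torch.nn.Linear(4, 1)                      # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
criterion = torch.nn.MSELoss()                     # default reduction="mean"
accum_steps = 4

optimizer.zero_grad()
for step in range(8):
    x, y = torch.randn(16, 4), torch.randn(16, 1)  # random stand-in data
    loss = criterion(model(x), y) / accum_steps    # average across micro-batches
    loss.backward()
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()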
1 vote · 1 answer · 460 views
Getting rid of the clutter of `.lr_find_` in PyTorch Lightning?
When using Lightning's built-in LR finder:
# Create a Tuner
tuner = Tuner(trainer)
# finds learning rate automatically
# sets hparams.lr or hparams.learning_rate to that learning rate
tuner....
2 votes · 1 answer · 3k views
How MultiStepLR works in PyTorch
I'm new to PyTorch and am working on a toy example to understand how weight decay works with the learning rate passed into the optimizer. When I use MultiStepLR, I was expecting it to decrease the learning ...
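A minimal sketch of MultiStepLR's behaviour (illustrative milestones and rates, with the real training step elided):

import torch

# MultiStepLR multiplies the optimizer's LR by `gamma` each time the epoch
# counter crosses a milestone; it does not touch the weight_decay setting.
model = torch.nn.Linear(4, 1)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[3, 6], gamma=0.1
)

for epoch in range(8):
    print(epoch, optimizer.param_groups[0]["lr"])
    optimizer.step()     # real training step elided
    scheduler.step()
# LR used per epoch: 0.1 for epochs 0-2, 0.01 for epochs 3-5, 0.001 from epoch 6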