20 questions
3
votes
1
answer
1k
views
PyTorch Checkpointing Error: Recomputed Tensor Metadata Mismatch in Global Representation with Extra Sampling
I’m working on a PyTorch model where I compute a "global representation" through a forward pipeline. This pipeline is subsequently used in an extra sampling procedure later on in the network. When I ...
2
votes
0
answers
155
views
Does cuBLAS support mixed precision matrix multiplication in the form C[f32] = A[bf16] * B[f32]?
I'm looking into mixed precision for deep learning LLMs. The intermediates are mostly F32, and the weights could be any other type, such as BF16, F16, or even quantized types like Q8_0 and Q4_0. It would be very useful if ...
1
vote
1
answer
203
views
I want to strictly use Tensor Cores for running inference of a pretrained full precision CNN model in Pytorch
I have been analyzing the maximum throughput I can get from my device for a specific CNN model using a GPU. My GPU has CUDA cores as well as Tensor Cores, so I want to simultaneously run the model on ...
-1
votes
1
answer
1k
views
How to save memory using half precision while keeping the original weights in single?
I'm trying to save memory while training a model that uses single precision weights by doing the calculations in half precision.
I tried using autocast, and the model does prediction in half precision ...
0
votes
2
answers
859
views
Float16 mixed precision being slower than regular float32, keras, tensorflow 2.0
I am using TensorFlow 2.10 on Windows with an NVIDIA RTX 2060 SUPER (with Tensor Cores) for deep learning. But when I enable float16 mixed precision, the time per epoch actually becomes slower than ...
3
votes
1
answer
707
views
What's the gradients dtype during mixed precision training?
I want to figure out how torch.cuda.amp.autocast works, so I ran an experiment. The code is as follows:
class CustomModel(nn.Module):
    def __init__(self, input_size, hidden_size,...
1
vote
0
answers
1k
views
Pytorch automatic mixed precision - cast whole code block to float32
I have a complex model that I would like to train in mixed precision. To do this, I use the torch.amp package. I can enable AMP for the whole model using with torch.cuda.amp.autocast(enabled=...
3
votes
1
answer
2k
views
Does Automatic Mixed Precision (AMP) halve the parameters of a model?
Before I knew about automatic mixed precision, I manually halved the model and data using half() for training with half precision. But the training result is not good at all.
Then I used the automatic ...
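A minimal sketch of one reason a fully halved model can train poorly, assuming plain SGD-style additive updates: near 1.0, half precision cannot represent increments much smaller than about 0.001, so small weight updates are rounded away. The helper names (`to_fp16`, `to_fp32`, `train_fp16_weights`, `train_fp32_master`) are hypothetical, and the example uses only Python's `struct` module to emulate rounding, not PyTorch itself.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def train_fp16_weights(weight: float, update: float, steps: int) -> float:
    """Keep the weight itself in half precision (like calling half() on the model)."""
    w = to_fp16(weight)
    for _ in range(steps):
        w = to_fp16(w + to_fp16(update))  # result is rounded back to fp16 each step
    return w


def train_fp32_master(weight: float, update: float, steps: int) -> float:
    """Keep a single-precision master weight; only the update is computed in fp16."""
    w = to_fp32(weight)
    for _ in range(steps):
        w = to_fp32(w + to_fp16(update))  # accumulation happens in fp32
    return w


fp16_result = train_fp16_weights(1.0, 1e-4, 1000)
fp32_result = train_fp32_master(1.0, 1e-4, 1000)
print(fp16_result, fp32_result)
```

In this sketch the half-precision weight never moves away from 1.0, while the single-precision master weight ends up close to 1.1. This is the general motivation for keeping parameters in single precision and casting only selected operations to half precision.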
2
votes
1
answer
6k
views
How to Enable Mixed precision training
I'm trying to train a deep learning model in VS Code, and I would like to use the GPU for that. I have CUDA 11.6, an NVIDIA GeForce GTX 1650, TensorFlow-gpu==2.5.0, and pip version 21.2.3 on Windows 10. ...
1
vote
1
answer
1k
views
Convert a trained model to use mixed precision in Tensorflow
In order to improve the latency of a trained model, I tried to use Tensorflow mixed-precision.
Just setting the policy as mentioned in https://www.tensorflow.org/guide/mixed_precision does not seem to ...
0
votes
1
answer
1k
views
PyTorch loading GradScaler from checkpoint
I am saving my model, optimizer, scheduler, and scaler in a general checkpoint.
Now when I load them, they load properly, but after the first iteration scaler.step(optimizer) throws this error:
...
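For context on what a gradient scaler is for, here is a small sketch of the loss-scaling idea, assuming a single gradient value and a fixed scale factor. The value 65536 and the helper name `to_fp16` are chosen for illustration only. Rounding is emulated with Python's `struct` module, so this is not the PyTorch implementation.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


true_grad = 1e-8   # a gradient too small for half precision
scale = 65536.0    # illustrative scale factor (assumption)

# Without scaling, the gradient underflows to zero when stored in fp16.
unscaled_fp16 = to_fp16(true_grad)

# With scaling, the stored value is large enough to survive in fp16 ...
scaled_fp16 = to_fp16(true_grad * scale)

# ... and dividing by the scale in higher precision recovers the gradient.
recovered = scaled_fp16 / scale

print(unscaled_fp16, scaled_fp16, recovered)
```

Because the scale factor is part of the training state in this scheme, it is generally saved and restored together with the model and optimizer.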
0
votes
1
answer
3k
views
Sigmoid vs Binary Cross Entropy Loss
In my torch model, the last layer is torch.nn.Sigmoid() and the loss is torch.nn.BCELoss.
In the training step, the following error occurs:
RuntimeError: torch.nn.functional....
0
votes
0
answers
298
views
How to use automatic mixed precision with TensorFlow?
I can't get Automatic Mixed Precision to work with TensorFlow 2.3.2 (on Windows 10).
I have a TF_ENABLE_AUTO_MIXED_PRECISION variable set to 1 in my system environment. I have enabled memory growth ...
2
votes
1
answer
3k
views
Pytorch mixed precision learning, torch.cuda.amp running slower than normal
I am trying to run inference with a standard resnet18 model from torchvision.models. The model was trained without any mixed precision, purely in FP32.
However, I want ...
1
vote
0
answers
429
views
Dtype error when using Mixed Precision and building EfficientNetB0 Model
System information
OS Platform and Distribution : macOS
TensorFlow installed from : Colab
TensorFlow version : 2.5.0
Python version: python 3.7
GPU model and memory: Tesla T4
Error
TypeError: Input '...