gradcheck fails for ot.emd2 #564
Hi,
thank you for making this great Python package.
I'm currently trying to use EMD as a loss function, and I wanted to check that gradcheck passes.
However, ot.emd2 fails gradcheck with a Jacobian mismatch error, while gradcheck on ot.sinkhorn2 returns True.
import torch
from torch.autograd import gradcheck
import ot

device = torch.device("cpu")  # device was not shown in the original snippet; same behaviour observed
reg = 0.1                     # reg was not shown in the original snippet; example value for ot.sinkhorn2

factory = torch.tensor([0, 1.0, 0, 0], requires_grad=True).to(device)
center = torch.tensor([0.5, 0.0, 0.5, 0], requires_grad=True).to(device)
M = torch.tensor([[0.0, 1.0, 1.0, 1.414],
                  [1.0, 0.0, 1.414, 1.0],
                  [1.0, 1.414, 0.0, 1.0],
                  [1.414, 1.0, 1.0, 0.0]], requires_grad=True).to(device)

print(gradcheck(ot.emd2, (factory, center, M)))            # fails: Jacobian mismatch
print(gradcheck(ot.sinkhorn2, (factory, center, M, reg)))  # True, but with
# UserWarning: Warning: numerical errors at iteration 0
#   warnings.warn('Warning: numerical errors at iteration %d' % ii)
The error output from gradcheck on ot.emd2:
[screenshot of the Jacobian mismatch error]
Thanks for noting this. We definitely need to look into it. Still, please note that EMD is not differentiable and its gradient is not unique, so it is normal for the numerical and analytical gradients to differ.
Also note that when a weight is 0 there is an infinity of possible sub-gradients with respect to that weight, so we simply return one value that is fast to compute and does not violate the KKT conditions at convergence. Here you can see that for the first input the value of the gradient for the only non-zero weight is close (around 1). We will look into it, but keep in mind that, because of the non-differentiability, you are better off optimizing inside the simplex (which is the case for the output of a softmax).
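To illustrate that last point, here is a minimal sketch (not part of the original reply) of optimizing through ot.emd2 while keeping the weights strictly inside the simplex via a softmax parameterization. It assumes a recent POT release with the PyTorch backend, and it reuses the cost matrix from the snippet above; the optimizer settings are arbitrary example values.

```python
# Minimal sketch (assumption: POT with the PyTorch backend, so ot.emd2
# backpropagates through its torch inputs).
import torch
import ot

# Unconstrained parameters; softmax maps them to strictly positive weights
# summing to 1, so the iterates stay inside the simplex where the
# sub-gradient issue at zero weights is avoided.
logits = torch.zeros(4, requires_grad=True)

center = torch.tensor([0.5, 0.0, 0.5, 0.0])
M = torch.tensor([[0.0, 1.0, 1.0, 1.414],
                  [1.0, 0.0, 1.414, 1.0],
                  [1.0, 1.414, 0.0, 1.0],
                  [1.414, 1.0, 1.0, 0.0]])

opt = torch.optim.SGD([logits], lr=0.5)
for _ in range(100):
    opt.zero_grad()
    factory = torch.softmax(logits, dim=0)  # strictly positive weights
    loss = ot.emd2(factory, center, M)      # EMD value used as the loss
    loss.backward()                         # sub-gradient flows through the softmax
    opt.step()

print(torch.softmax(logits, dim=0))  # weights move toward the target histogram
```

This does not make EMD smooth, and gradcheck is still not expected to pass exactly, but it keeps the weights away from the boundary of the simplex where the sub-gradient is not unique.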
Beta Was this translation helpful? Give feedback.