I have two matrices, and I need to calculate the Euclidean distance between every row of matrix A and every row of matrix B.
In the first solution I loop over the rows of the two matrices: for each row in matrix A, I take a row from B, square the element-wise subtraction, then sum it and take the square root.
I end up with a matrix of size (#rows in A) x (#rows in B):
import numpy as np
A = np.random.random((50, 3072))
B = np.random.random((500, 3072))
# First solution
d = np.zeros((50, 500))
for i in range(50):
    for j in range(500):
        d[i, j] = np.sqrt(np.sum(np.square(B[j] - A[i])))
In the second solution I did this with broadcasting, and it works. But when I increase the number of rows in A and B ... it becomes very, very slow. Is there a faster way without looping?
Solution 2:
# Second solution
test = np.sqrt(np.sum(np.square(A[:, np.newaxis] - B), axis=2))
# print check
print(np.unique(test == d))
2 Answers
It's not pretty, but this gives a factor-3 speed improvement:
d = (A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)
d -= 2 * np.squeeze(A.dot(B[..., np.newaxis]), axis=-1)
d **= 0.5
This is based on the identity
$$ (a - b)^2 = a^2 + b^2 - 2ab $$
and so, ignoring the fudging with indices,
$$ \sum(a - b)^2 = \sum a^2 + \sum b^2 - 2\sum ab $$
The squared terms are just
(A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)
and $\sum ab = \vec A \cdot \vec B$. This can be broadcast with a bit of fudging of the axes:
np.squeeze(A.dot(B[..., np.newaxis]), axis=-1)
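Putting the pieces together, here is a minimal self-contained sketch of the approach. Two small deviations from the snippet above, both my own additions: the cross term is written as the simpler `A.dot(B.T)` (equivalent to the `squeeze` version), and the squared distances are clamped at zero, since floating-point cancellation in `a² + b² - 2ab` can leave tiny negative values before the square root.

```python
import numpy as np

def pairwise_euclidean(A, B):
    # Expand (a - b)^2 = a^2 + b^2 - 2ab row-wise:
    # sq[i, j] = sum(A[i]**2) + sum(B[j]**2) - 2 * dot(A[i], B[j])
    sq = (A**2).sum(axis=-1)[:, np.newaxis] + (B**2).sum(axis=-1)
    sq -= 2 * A.dot(B.T)           # cross term for every row pair
    np.maximum(sq, 0, out=sq)      # clamp tiny negatives from rounding
    return np.sqrt(sq)

A = np.random.random((50, 3072))
B = np.random.random((500, 3072))
d = pairwise_euclidean(A, B)
print(d.shape)  # (50, 500)
```

This avoids ever materializing the (50, 500, 3072) intermediate array that the broadcasting solution creates, which is where most of the slowdown comes from.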
You may also try sklearn.metrics.pairwise_distances. Basically, here's what it will look like:
import numpy as np
from sklearn.metrics import pairwise_distances
A = np.random.random((50, 3072))
B = np.random.random((500, 3072))
d = pairwise_distances(A, B, metric='euclidean')
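If SciPy is available, `scipy.spatial.distance.cdist` is another common option for the same job (not mentioned in the answers above, so treat it as a suggestion); it computes the full distance matrix in compiled code:

```python
import numpy as np
from scipy.spatial.distance import cdist  # assumes SciPy is installed

A = np.random.random((50, 3072))
B = np.random.random((500, 3072))
d = cdist(A, B, metric='euclidean')  # shape (50, 500)
```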