This code does exactly what I want, but I would like to get rid of the nested loop to make it more Pythonic. I have been trying to use broadcasting, including playing with np.newaxis, but cannot produce the same result.
import numpy as np

M1 = np.array([[11, 11, 11], [12, 12, 12], [13, 13, 13]])
M2 = np.array([[21, 21, 21], [22, 22, 22], [23, 23, 23], [24, 24, 24]])
m1_rows = M1.shape[0]
m2_rows = M2.shape[0]
d = np.empty((m1_rows, m2_rows))
# fill d[i, j] with fun applied to row i of M1 and row j of M2
for i in range(m1_rows):
    for j in range(m2_rows):
        d[i, j] = fun(M1[i], M2[j])
Some additional details: M1 and M2 will always be 2-dimensional NumPy arrays. They will always have the same number of columns, but the number of rows can vary.
[Edit]
def fun(a, b):
    # squared Euclidean distance between two rows
    return np.sum(np.square(a - b))
1 Answer
Your instinct to use broadcasting to compute the answer was correct.
To figure out how to broadcast correctly, it can be helpful to do some "dimensional analysis", to borrow a term from physics.
You have two arrays, of size [r1 x c] and [r2 x c], and your output is [r1 x r2]. Inside your loop, fun sums along the c axis.
If we reshape the matrices to [r1 x 1 x c] and [1 x r2 x c], then elementwise operations between them broadcast to [r1 x r2 x c].
Summing along the c axis gives us [r1 x r2], which is what we want!
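The shape bookkeeping above can be checked directly. This is only a sketch with zero-filled placeholder arrays; the names r1, r2, c, a, b and pairwise are illustrative and not taken from the question.

```python
import numpy as np

r1, r2, c = 3, 4, 3        # illustrative sizes, matching M1 and M2 in the question
a = np.zeros((r1, c))      # placeholder standing in for M1
b = np.zeros((r2, c))      # placeholder standing in for M2

# [r1 x 1 x c] against [1 x r2 x c] broadcasts to [r1 x r2 x c]
pairwise = a[:, np.newaxis, :] - b[np.newaxis, :, :]
print(pairwise.shape)               # (3, 4, 3)

# summing over the last axis (the c axis) leaves [r1 x r2]
print(pairwise.sum(axis=2).shape)   # (3, 4)
```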
So, an educated guess might be the following:
# [r1 x 1 x c] - [1 x r2 x c]
# (plain M2 would also work here: broadcasting prepends the leading axis itself)
diffs = M1[:, np.newaxis, :] - M2[np.newaxis, :, :]
# contract the c-axis
d = (diffs**2).sum(axis=2)
To check whether our intuition was correct, note that the first two indices of diffs index a pair of rows: diffs[i, j] holds the values of a-b that your code computes for M1[i] and M2[j].
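A quick way to confirm this is to run both versions on the arrays from the question and compare. The names d_loop and d_broadcast below are illustrative; everything else is taken from the question and the snippet above.

```python
import numpy as np

def fun(a, b):
    # squared Euclidean distance, as defined in the question
    return np.sum(np.square(a - b))

M1 = np.array([[11, 11, 11], [12, 12, 12], [13, 13, 13]])
M2 = np.array([[21, 21, 21], [22, 22, 22], [23, 23, 23], [24, 24, 24]])

# reference result from the original nested loop
d_loop = np.empty((M1.shape[0], M2.shape[0]))
for i in range(M1.shape[0]):
    for j in range(M2.shape[0]):
        d_loop[i, j] = fun(M1[i], M2[j])

# broadcast version: [r1 x 1 x c] - [r2 x c] -> [r1 x r2 x c], then sum over c
diffs = M1[:, np.newaxis, :] - M2
d_broadcast = (diffs ** 2).sum(axis=2)

print(np.allclose(d_loop, d_broadcast))   # True
print(d_broadcast[0])                     # [300 363 432 507]
```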
Hope this helps!
P.S. This solution has a major downside in terms of memory use. The inputs take O(r1*c + r2*c) memory and your loop-based solution takes O(r1*r2) for the output, whereas this broadcast solution builds an intermediate array of size O(r1*r2*c). If c is small this doesn't matter much, so if your points are 2D or 3D it should be fine. But if your points are high-dimensional, this could become a serious problem.
I'm not aware of a loopless solution with nice memory use, but would be happy to be proven wrong!
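One candidate, offered as a sketch rather than a definitive answer: for this particular fun, the identity |a-b|^2 = |a|^2 + |b|^2 - 2*(a.b) lets the cross term be computed as a matrix product, so the largest intermediate is [r1 x r2]. The helper name sq_dists is hypothetical. The trade-off is accuracy: the subtraction can lose precision when rows are nearly identical, and may even produce tiny negative values, which the sketch clips to zero.

```python
import numpy as np

def sq_dists(A, B):
    """Pairwise squared Euclidean distances between rows of A and rows of B.

    Hypothetical helper, not from the question. Uses
    |a - b|^2 = |a|^2 + |b|^2 - 2 * (a . b), so no [r1 x r2 x c] array is built.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    a_sq = np.einsum("ij,ij->i", A, A)   # squared norm of each row of A, shape [r1]
    b_sq = np.einsum("ij,ij->i", B, B)   # squared norm of each row of B, shape [r2]
    cross = A @ B.T                      # all row dot products, shape [r1 x r2]
    d = a_sq[:, np.newaxis] + b_sq[np.newaxis, :] - 2.0 * cross
    return np.maximum(d, 0.0)            # clip small negatives caused by rounding

M1 = np.array([[11, 11, 11], [12, 12, 12], [13, 13, 13]])
M2 = np.array([[21, 21, 21], [22, 22, 22], [23, 23, 23], [24, 24, 24]])
print(sq_dists(M1, M2).shape)   # (3, 4)
```

If SciPy is available, scipy.spatial.distance.cdist(M1, M2, "sqeuclidean") computes the same matrix without materialising the three-dimensional intermediate.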
Comments:
- this is an alternative solution, not a code review (Billal BEGUERADJ, Feb 20, 2023 at 13:12)
- dimensional analysis has to do with analysing the units of variables, not the dimensions of matrices (Reinderien, Feb 20, 2023 at 13:59)
- Your solution is correct, but yes, this isn't particularly a review. (Reinderien, Feb 20, 2023 at 14:01)
- Ah apologies, I seem to have misunderstood the point of this stack exchange! I'm happy to delete this answer if you would prefer. And yes, I'm aware that dimensional analysis is to do with analyzing units of variables - but I think the analogy here works and is informative. (BaileyA, Feb 20, 2023 at 14:13)