Here is the problem:
Given a 2D numpy array 'a' of size n×m and a 1D numpy array 'b' of size m, find the Euclidean distance of the vector 'b' from each row of the matrix 'a'. Fill the results in a numpy array. Follow-up: could you solve it without loops?
a = np.array([[1, 1],
              [0, 1],
              [1, 3],
              [4, 5]])
b = np.array([1, 1])
print(dist(a, b))
>>[0,1,2,5]
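For reference, the quantity being asked for can be written out as below. The index notation (a_ij for the entry in row i and column j, d_i for the i-th result) is an assumption added for illustration and is not part of the original problem statement.

```latex
d_i = \lVert a_i - b \rVert_2 = \sqrt{\sum_{j=1}^{m} \left(a_{ij} - b_j\right)^2}, \qquad i = 1, \dots, n
```

For the last row of the example, d_4 = sqrt((4 - 1)^2 + (5 - 1)^2) = sqrt(25) = 5, which matches the expected output.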
And here is my solution:
import math
import numpy as np

def dist(a, b):
    # Compute the distance from b to each row of a, one row at a time.
    distances = []
    for i in a:
        distances.append(math.sqrt((i[0] - b[0]) ** 2 + (i[1] - b[1]) ** 2))
    return distances

a = np.array([[1, 1],
              [0, 1],
              [1, 3],
              [4, 5]])
print(dist(a, [1, 1]))
I wonder how this can be solved more elegantly, and how the follow-up task can be implemented.
1 Answer
Don't use math in a Numpy context.
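As a small sketch of why (the variable names here are made up for illustration): math.sqrt() only accepts a single scalar, so it rejects an array with more than one element, whereas np.sqrt() is applied element-wise.

```python
import math

import numpy as np

squares = np.array([0.0, 1.0, 4.0, 25.0])

# math.sqrt() expects one scalar; a multi-element array cannot be converted.
try:
    math.sqrt(squares)
    math_accepts_arrays = True
except TypeError:
    math_accepts_arrays = False

# np.sqrt() is a ufunc and operates on every element at once.
roots = np.sqrt(squares)

print(math_accepts_arrays)
print(roots)
```

This is why the original loop had to unpack each row into scalars before calling math.sqrt().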
You should vectorise your loop. You had figured this out in a now-deleted answer, where you also passed axis=1. However, in context, I think the axis more likely to hold true for other array shapes is -1, not 1.
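To illustrate the difference between the two axis choices, here is a rough sketch with made-up inputs. It assumes the vector components always live on the last axis, which is the convention argued for above.

```python
import numpy as np

# One lone vector (1-D) and a 2x2 grid of 2-component vectors (3-D).
one_vector = np.array([3.0, 4.0])
grid = np.array([[[3.0, 4.0], [0.0, 0.0]],
                 [[6.0, 8.0], [1.0, 0.0]]])

# axis=-1 refers to the vector components whatever the dimensionality is.
norm_of_one = np.linalg.norm(one_vector, axis=-1)
norms_last_axis = np.linalg.norm(grid, axis=-1)

# axis=1 does not exist for a 1-D input, so NumPy raises an AxisError
# (which is a subclass of ValueError).
try:
    np.linalg.norm(one_vector, axis=1)
    axis1_works_for_1d = True
except ValueError:
    axis1_works_for_1d = False

# For the 3-D input, axis=1 runs without error but aggregates over the
# middle axis, mixing components of different vectors.
norms_middle_axis = np.linalg.norm(grid, axis=1)

print(norm_of_one)
print(norms_last_axis)
print(axis1_works_for_1d)
print(norms_middle_axis)
```

For the 2D case in the question, both choices give the same answer, since axis 1 is the last axis there.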
If you're writing this as a convenience function - which the question seems to suggest - then you can make it more generic by accepting array-likes that may or may not already be Numpy arrays; and still supporting a parametric axis that only defaults to the last one. subtract() will do the array coercion and broadcasting for you.
import numpy as np
import numpy.typing  # makes sure np.typing is loaded on NumPy versions that do not import it automatically


def dist(
    a: np.typing.ArrayLike, b: np.typing.ArrayLike, axis: int = -1,
) -> np.ndarray:
    """
    A wrapper for norm() that always operates on the difference between two vectors using
    standard broadcasting rules, and by default aggregates on the last axis.
    """
    diff = np.subtract(a, b)
    return np.linalg.norm(diff, axis=axis)


def test() -> None:
    a = (
        (1, 1),
        (0, 1),
        (1, 3),
        (4, 5),
    )
    b = (1, 1)
    actual = dist(a, b)
    # Absolute comparison only, allowing for floating-point noise.
    assert np.allclose(actual, (0, 1, 2, 5), rtol=0, atol=1e-14)


if __name__ == '__main__':
    test()
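As a usage sketch of the coercion and broadcasting behaviour: the function body below repeats dist() without type hints so that the snippet runs on its own, and the names rows, columns, others and pairwise are made up for illustration.

```python
import numpy as np


def dist(a, b, axis=-1):
    # Same body as dist() above, repeated so this snippet is self-contained.
    return np.linalg.norm(np.subtract(a, b), axis=axis)


# Plain nested lists are coerced to arrays by np.subtract().
rows = [[1, 1], [0, 1], [1, 3], [4, 5]]
from_lists = dist(rows, [1, 1])

# Vectors stored as columns: aggregate over axis 0 instead of the last axis.
# b is given shape (2, 1) so that it broadcasts against the (2, 4) array.
columns = np.array(rows).T
from_columns = dist(columns, [[1], [1]], axis=0)

# Distances between every row of `rows` and every row of `others`:
# shapes (4, 1, 2) and (2, 2) broadcast to (4, 2, 2), reduced to (4, 2).
others = np.array([[1, 1], [4, 5]])
pairwise = dist(np.array(rows)[:, np.newaxis, :], others)

print(from_lists)
print(from_columns)
print(pairwise)
```

The pairwise case builds an intermediate array of shape (4, 2, 2), so for large inputs the memory cost of broadcasting is worth keeping in mind.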
Reference: numpy.linalg.norm()