I'm writing some code in Python that takes a set of [x,y] coordinates (in a list of tuples) and also an x-value. I need to find the two closest x coordinates to my x-value.
To find the closest x, I created a list of all the x-coordinates, passed that list to min() with a lambda key that measures the distance to my x-value, and then found the index of that x.
Finding the 2nd closest value was a bit trickier. I created a new list that excluded the closest x and followed the same process.
Code is below. Any suggestions on how to improve this or make it more efficient?
def closest_2_points(list_of_tuples, x_value):
    x_points = []
    for i in list_of_tuples:
        x_points.append(i[0])
    # find closest point
    closest_xpoint_1 = min(x_points, key=lambda z: abs(z - x_value))
    closest_xpoint_1_idx = x_points.index(closest_xpoint_1)
    # find 2nd closest point
    x_points_2 = [x for x in x_points if x != closest_xpoint_1]
    closest_xpoint_2 = min(x_points_2, key=lambda z: abs(z - x_value))
    closest_xpoint_2_idx = x_points.index(closest_xpoint_2)
    closest_points = [(list_of_tuples[closest_xpoint_1_idx][0], list_of_tuples[closest_xpoint_1_idx][1]),
                      (list_of_tuples[closest_xpoint_2_idx][0], list_of_tuples[closest_xpoint_2_idx][1])]
    return closest_points
When running the function, it should look something like this:
closest_2_points([(4,6),(2,5),(0,4),(-2,3)], 6)
And return something like this:
[(4, 6), (2, 5)]
2 Answers
If there are duplicate points in your list, and the closest point turns out to be one of these points, this line of code will remove all of the duplicates:
x_points_2 = [x for x in x_points if x != closest_xpoint_1]
So, instead of returning the 2 closest points, it may return just one of the closest, and some other nearby point.
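For instance (a made-up input, not from the original post, in which two points share the closest x-value of 4), the posted function skips the duplicate and falls back to a farther point:
>>> closest_2_points([(4, 6), (4, 7), (0, 4)], 6)
[(4, 6), (0, 4)]
Here the two closest points are actually (4, 6) and (4, 7).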
Finding the n smallest, n closest, etc. is usually best done using Python's heapq.nsmallest(n, iterable, key=None) function. From the documentation:
Return a list with the n smallest elements from the dataset defined by iterable. key, if provided, specifies a function of one argument that is used to extract a comparison key from each element in iterable (for example, key=str.lower). Equivalent to: sorted(iterable, key=key)[:n].
[This function] performs best for smaller values of n. For larger values, it is more efficient to use the sorted() function. Also, when n == 1, it is more efficient to use the built-in min() and max() functions. If repeated usage of these functions is required, consider turning the iterable into an actual heap.
Example: a closest n points function, with n defaulting to 2:
import heapq
def closest_points(list_of_tuples, x_value, n=2):
    return heapq.nsmallest(n, list_of_tuples, key=lambda pnt: abs(pnt[0] - x_value))
Example:
>>> closest_points([(4, 6), (2, 5), (0, 4), (-2, 3)], 6)
[(4, 6), (2, 5)]
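For comparison, here is a minimal sketch of the sorted()-based equivalent mentioned in the docs quote above (the name closest_points_sorted is just for illustration; it sorts the whole list, which the docs note is only worthwhile for larger n):
def closest_points_sorted(list_of_tuples, x_value, n=2):
    # full sort by distance to x_value, then keep the first n points
    return sorted(list_of_tuples, key=lambda pnt: abs(pnt[0] - x_value))[:n]
>>> closest_points_sorted([(4, 6), (2, 5), (0, 4), (-2, 3)], 6)
[(4, 6), (2, 5)]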
Comment (Ragnar Lothbrok, Apr 24, 2020): Thanks for the feedback. In this particular instance, removing duplicates would not be an issue (in fact I'd want to remove them), but I neglected to mention that particular assumption in the original post. Thanks for your answer; I was not aware of the heapq library.
Welcome to Code Review, Ragnar! This was a fun one.
Using a list comprehension, you can create a list of tuples with 3 values:
- Distance to x
- Index of the point
- x value of the point
Then you can sort that list and take as many items from it as desired:
x = 6
points = [(4, 6), (2, 5), (0, 4), (-2, 3)]
closest_points = sorted((abs(x - point[0]), index, point[0])
                        for index, point in enumerate(points))
Then you can slice the sorted list and take values from there:
closest_points[:2]
> [(2, 0, 4), (4, 1, 2)]
In this example, you would take:
- 0 as the index of the first element, with 4 as its x value
- 1 as the index of the second element, with 2 as its x value
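If you want the original (x, y) tuples back, as in the question's expected output, a minimal sketch (reusing points and closest_points from above; the name two_closest is just for illustration) is to index back into points:
# map the kept indices back to the original point tuples
two_closest = [points[index] for _, index, _ in closest_points[:2]]
print(two_closest)  # [(4, 6), (2, 5)]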
Comment (Apr 24, 2020): Please explain how this is better than the OP's. Can you also explain why \$O(n\log n)\$ time is better than \$O(n)\$?