Find the closest number in a sorted list to a given target number

Question 1

Given a list of n integers sorted in ascending order, find the closest number (value) in the list to a given target number. How quickly can you do it?

I'm interested to know if there's any feedback for my code.

def getClosestValue(arr, target):
 n = len(arr)
 left = 0
 right = n - 1
 mid = 0
 # edge case
 if (target >= arr[n - 1]):
 return arr[n - 1]
 # BSearch solution: Time & Space: Log(N) 
 while (left < right):
 mid = (left + right) // 2 # find the mid
 if (arr[mid] == target):
 return arr[mid]
 if (target < arr[mid]):
 # If target is greater than previous
 # to mid, return closest of two
 if (mid > 0 and target > arr[mid - 1]):
 return findClosest(arr[mid - 1], arr[mid], target)
 # update right
 right = mid
 else:
 if (mid < n - 1 and target < arr[mid + 1]):
 return findClosest(arr[mid], arr[mid + 1], target)
 # update i
 left = mid + 1
 return arr[mid]
# findClosest
# We find the closest by taking the difference
# between the target and both values. It assumes
# that val2 is greater than val1 and target lies
# between these two.
def findClosest(val1, val2, target):
 if (target - val1 >= val2 - target):
 return val2
 else:
 return val1
# Sample code
arr = [1, 2, 3, 4, 5, 5, 8, 10]
target = 7
print(getClosestValue(arr, target))

Question 2

Conventions

The Style guide to follow in Python is PEP8. If you do so, your code blends right into the python community and it is immediately easier to follow. On that regard you'll see that the functions name's don't follow the recommended style.

getClosestValue

Should actually be:

get_closest_value

The relevant quote from PEP8 for this case

Use the function naming rules: lowercase with words separated by underscores as necessary to improve readability.

Still in this topic, all the parenthesis used in the ifs are redundant and break the style.

if (target >= arr[n - 1]):

Should be turned into:

if target >= arr[n - 1]:

Although i suspect that this may be a habit brought from other languages.

Edge cases

You contemplated the case where the value is the highest or above, which gives you a quick way out:

# edge case
if (target >= arr[n - 1]):
 return arr[n - 1]

However there is no corresponding edge case for when the value is the lowest or below. I believe you merely forgot it.

#edge case - first or below all
if target <= arr[0]:
 return arr[0]

Logic

Now both edge cases are covered, so the while can be simplified to only have the navigation to the corresponding item if there is one.

while (left < right):
 mid = (left + right) // 2 # find the mid
 if arr[mid] == target:
 return arr[mid]
 if target < arr[mid]:
 right = mid
 else:
 left = mid + 1

After this last while if there isn't an exact match, calculate the nearest to the one you ended on, and return it:

if target < arr[mid]:
 return find_closest(arr[mid - 1], arr[mid], target)
else:
 return find_closest(arr[mid], arr[mid + 1], target)

Note that i don't need to check if mid - 1 or mid + 1 is a valid index, because if it wasn't it would mean that the value was below or above all elements, and those cases were already covered in the beginning of the function. This is not only simpler but also more efficient since it only checks for the closest on the very end and not at each iteration.

Taking @bipll suggestion you can restructure the while a bit. Considering you are only getting if arr[mid] == target at the very last iteration you can first check for < or >. This avoids making one extra check that will fail most of the times:

while left < right:
 mid = (left + right) // 2 # find the mid
 if target < arr[mid]:
 right = mid
 elif target > arr[mid]:
 left = mid + 1
 else:
 return arr[mid]

When i have a simple condition with a return on both cases i rather write them inline since its easy to read and a bit more concise:

def find_closest(val1, val2, target):
 return val2 if target - val1 >= val2 - target else val1

But don't go over the top on this. It can easily be harder to read depending on the complexity of the condition.

Full Modified Version

For you to get a better picture of all the changes i mentioned, i'll leave you with the full modified version:

def get_closest_value(arr, target):
 n = len(arr)
 left = 0
 right = n - 1
 mid = 0
 # edge case - last or above all
 if target >= arr[n - 1]:
 return arr[n - 1]
 # edge case - first or below all
 if target <= arr[0]:
 return arr[0]
 # BSearch solution: Time & Space: Log(N)
 while left < right:
 mid = (left + right) // 2 # find the mid
 if target < arr[mid]:
 right = mid
 elif target > arr[mid]:
 left = mid + 1
 else:
 return arr[mid]
 if target < arr[mid]:
 return find_closest(arr[mid - 1], arr[mid], target)
 else:
 return find_closest(arr[mid], arr[mid + 1], target)
# findClosest
# We find the closest by taking the difference
# between the target and both values. It assumes
# that val2 is greater than val1 and target lies
# between these two. 
def find_closest(val1, val2, target):
 return val2 if target - val1 >= val2 - target else val1

Question 3

From how you move right it can be assumed that arr[right] is always greater than the key, so the loop's condition can be extended to while left < right - 1:, which can reduce the search by one iteration. And doesn't it make sense to first check for both inequalities, and only after that handle the exact match case? The probability to find an exact match is not that high for moderately sparse arrays.

Question 4

I mean, inside the lookup loop, on each iteration first check if arr[mid] < target:, then elif arr[mid] > target:, and then in the last else: clause handle the accidental exact match, as the latter seems the least expected case (even for sparse arrays).

Question 5

@bipll On my final code the first check in the while is for equality with if arr[mid] == target:return arr[mid]. After that i check if its below with if target < arr[mid]: and the only remaining case is above which is the corresponding else. So i'm not sure i follow what you are trying to say.

Question 6

I am trying to say that on your final code, the very first check is for the least probable case, so a) this operation is always performed; and b) most of the time it is performed for nothing. It would make more sense to first check if array element is not equal to target, and only if not, consider the (unlikely) exact match in an else: clause.

Question 7

@bipll Oh i see now what you are trying to say and it makes sense. Thanks for the comment.

Question 8

python has a binary search implementation: https://docs.python.org/2/library/bisect.html

so just:

import bisect
ind = bisect.bisect_left(arr, target)

if you get ind> len(arr)-2, then the solution is arr[-1]. Otherwise you just hceck arr[ind] and arr[ind+1] to find whichever is closest

Strictly speaking, if you implemented your own binary search it should be just as fast. However, having a two line implementation is easier to read than defining your own functions

Isac Isac 5744 silver badges12 bronze badges · Accepted Answer · 2018-03-22 13:07:05Z

Conventions

The Style guide to follow in Python is PEP8. If you do so, your code blends right into the python community and it is immediately easier to follow. On that regard you'll see that the functions name's don't follow the recommended style.

getClosestValue

Should actually be:

get_closest_value

The relevant quote from PEP8 for this case

Use the function naming rules: lowercase with words separated by underscores as necessary to improve readability.

Still in this topic, all the parenthesis used in the ifs are redundant and break the style.

if (target >= arr[n - 1]):

Should be turned into:

if target >= arr[n - 1]:

Although i suspect that this may be a habit brought from other languages.

Edge cases

You contemplated the case where the value is the highest or above, which gives you a quick way out:

# edge case
if (target >= arr[n - 1]):
 return arr[n - 1]

However there is no corresponding edge case for when the value is the lowest or below. I believe you merely forgot it.

#edge case - first or below all
if target <= arr[0]:
 return arr[0]

Logic

Now both edge cases are covered, so the while can be simplified to only have the navigation to the corresponding item if there is one.

while (left < right):
 mid = (left + right) // 2 # find the mid
 if arr[mid] == target:
 return arr[mid]
 if target < arr[mid]:
 right = mid
 else:
 left = mid + 1

After this last while if there isn't an exact match, calculate the nearest to the one you ended on, and return it:

if target < arr[mid]:
 return find_closest(arr[mid - 1], arr[mid], target)
else:
 return find_closest(arr[mid], arr[mid + 1], target)

Note that i don't need to check if mid - 1 or mid + 1 is a valid index, because if it wasn't it would mean that the value was below or above all elements, and those cases were already covered in the beginning of the function. This is not only simpler but also more efficient since it only checks for the closest on the very end and not at each iteration.

Taking @bipll suggestion you can restructure the while a bit. Considering you are only getting if arr[mid] == target at the very last iteration you can first check for < or >. This avoids making one extra check that will fail most of the times:

while left < right:
 mid = (left + right) // 2 # find the mid
 if target < arr[mid]:
 right = mid
 elif target > arr[mid]:
 left = mid + 1
 else:
 return arr[mid]

When i have a simple condition with a return on both cases i rather write them inline since its easy to read and a bit more concise:

def find_closest(val1, val2, target):
 return val2 if target - val1 >= val2 - target else val1

But don't go over the top on this. It can easily be harder to read depending on the complexity of the condition.

Full Modified Version

For you to get a better picture of all the changes i mentioned, i'll leave you with the full modified version:

def get_closest_value(arr, target):
 n = len(arr)
 left = 0
 right = n - 1
 mid = 0
 # edge case - last or above all
 if target >= arr[n - 1]:
 return arr[n - 1]
 # edge case - first or below all
 if target <= arr[0]:
 return arr[0]
 # BSearch solution: Time & Space: Log(N)
 while left < right:
 mid = (left + right) // 2 # find the mid
 if target < arr[mid]:
 right = mid
 elif target > arr[mid]:
 left = mid + 1
 else:
 return arr[mid]
 if target < arr[mid]:
 return find_closest(arr[mid - 1], arr[mid], target)
 else:
 return find_closest(arr[mid], arr[mid + 1], target)
# findClosest
# We find the closest by taking the difference
# between the target and both values. It assumes
# that val2 is greater than val1 and target lies
# between these two. 
def find_closest(val1, val2, target):
 return val2 if target - val1 >= val2 - target else val1

From how you move right it can be assumed that arr[right] is always greater than the key, so the loop's condition can be extended to while left < right - 1:, which can reduce the search by one iteration. And doesn't it make sense to first check for both inequalities, and only after that handle the exact match case? The probability to find an exact match is not that high for moderately sparse arrays.
I mean, inside the lookup loop, on each iteration first check if arr[mid] < target:, then elif arr[mid] > target:, and then in the last else: clause handle the accidental exact match, as the latter seems the least expected case (even for sparse arrays).
@bipll On my final code the first check in the while is for equality with if arr[mid] == target:return arr[mid]. After that i check if its below with if target < arr[mid]: and the only remaining case is above which is the corresponding else. So i'm not sure i follow what you are trying to say.
I am trying to say that on your final code, the very first check is for the least probable case, so a) this operation is always performed; and b) most of the time it is performed for nothing. It would make more sense to first check if array element is not equal to target, and only if not, consider the (unlikely) exact match in an else: clause.
@bipll Oh i see now what you are trying to say and it makes sense. Thanks for the comment.

Stack Exchange Network

Find the closest number in a sorted list to a given target number

2 Answers 2

Conventions

Edge cases

Logic

Full Modified Version

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Find the closest number in a sorted list to a given target number

2 Answers 2

Conventions

Edge cases

Logic

Full Modified Version

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions