Find the value nearest to k in the sorted array

Question 1

Given a sorted array, return the 'closest' element to the input 'x'.

I do understand the merits of unit testing in separate files, but I deliberately added it to the main method for personal convenience, so don't consider that in your feedback.

I'm looking for request code review, optimizations and best practices.

public final class ClosestToK {
 private ClosestToK() { }
 /**
 * Given a sorted array returns the 'closest' element to the input 'x'.
 * 'closest' is defined as Math.min(Math.abs(x), Math.abs(y)) 
 * 
 * Expects a sorted array, and if array is not sorted then results are unpredictable.
 * 
 * If two values are equi-distant then the greater value is returned.
 * 
 * @param a The sorted array a
 * @return The nearest element
 */
 public static int getClosestK(int[] a, int x) {
 int low = 0;
 int high = a.length - 1;
 while (low <= high) {
 int mid = (low + high) / 2;
 // test lower case
 if (mid == 0) {
 if (a.length == 1) {
 return a[0];
 }
 return Math.min(Math.abs(a[0] - x), Math.abs(a[1] - x)) + x;
 }
 // test upper case
 if (mid == a.length - 1) {
 return a[a.length - 1];
 }
 // test equality
 if (a[mid] == x || a[mid + 1] == x) {
 return x;
 }
 // test perfect range.
 if (a[mid] < x && a[mid + 1] > x) {
 return Math.min(Math.abs(a[mid] - x), Math.abs(a[mid + 1] - x)) + x;
 }
 // keep searching.
 if (a[mid] < x) {
 low = mid + 1;
 } else { 
 high = mid;
 }
 }
 throw new IllegalArgumentException("The array cannot be empty");
 }
 public static void main(String[] args) {
 // normal case.
 int[] a1 = {10, 20, 30, 40};
 assertEquals(30, getClosestK(a1, 28));
 // equidistant
 int[] a2 = {10, 20, 30, 40};
 assertEquals(30, getClosestK(a2, 25));
 // edge case lower boundary
 int[] a3 = {10, 20, 30, 40};
 assertEquals(10, getClosestK(a3, 5));
 int[] a4 = {10, 20, 30, 40};
 assertEquals(10, getClosestK(a4, 10));
 // edge case higher boundary 
 int[] a5 = {10, 20, 30, 40};
 assertEquals(40, getClosestK(a5, 45));
 int[] a6 = {10, 20, 30, 40};
 assertEquals(40, getClosestK(a6, 40));
 // case equal to 
 int[] a7 = {10, 20, 30, 40};
 assertEquals(30, getClosestK(a7, 30));
 }
}

Question 2

your 'closest' definition seems queer. Are you sure you didn't mean Math.Min(Math.Abs(x-el[i]) where i is all elements of the array?

Question 3

yes thats what i meant

Question 4

Are values expected to be all different ?

Question 5

not sure what you mean ?

Question 6

May we have duplicate values in the array ? ie for i < j, do we have array[i] < array[j] or array[i] <= array[j] ?

Question 7

Here's my solution which seems to work fine :

public static int getClosestK(int[] a, int x) {
 int low = 0;
 int high = a.length - 1;
 if (high < 0)
 throw new IllegalArgumentException("The array cannot be empty");
 while (low < high) {
 int mid = (low + high) / 2;
 assert(mid < high);
 int d1 = Math.abs(a[mid ] - x);
 int d2 = Math.abs(a[mid+1] - x);
 if (d2 <= d1)
 {
 low = mid+1;
 }
 else
 {
 high = mid;
 }
 }
 return a[high];
}

Here's the principle : the interval [low, high] will contain the closest element to x at any time. At the beginning, this interval is the whole array ([low, high] = [0, length-1]). At each iteration, we'll make it strictly smaller. When the range is limited to a single element, this is the element we are looking for.

In order to make the range smaller, at each iteration, we'll consider mid the middle point of [low, high]. Because of the way mid is computed, mid+1 is also in the range. We'll check if the closest value is at mid or mid+1 and update high or low accordingly. One can check that the range actually gets smaller.

Edit to answer to comments:

As spotted by @Vick-Chijwani , this code doesn't handle perfectly the scenarios where an element appears multiple times in the input. One can add the following working tests to the code :

 // case similar elements
 int[] a8 = {10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10};
 assertEquals(10, getClosestK(a8, 9));
 assertEquals(10, getClosestK(a8, 10));
 assertEquals(10, getClosestK(a8, 11));

but this one will fail :

 int[] a9 = {1, 2, 100, 100, 101};
 assertEquals(3, getClosestK(a9, 2)); // actually returns 100

Question 8

What if a[mid] == a[mid+1]? I imagine you'd have to walk to the left or right until they are unequal, then proceed to compute d1 and d2. But then you lose the worst-case O(log n) guarantee.

Question 9

Hi @VickyChijwani, thanks for your feedback. I'm AFK for a few days but I'll try to think about it and let you know asap :-)

Question 10

Thanks! No hurry. The idea of walking left and right that I suggested, seems to work well enough for my use-case.

Question 11

@VickyChijwani Everything seems to work fine. I've added an edit to my answer. Please let me know if you need more details. Basically, when 2 (consecutive) elements are equal, we have d1 == d2 (which is equal to 0 if this is the element we where looking for) and we reduce the search range accordingly : low = mid+1;. Please let me know if you see an issue.

Question 12

Your solution returns 100 for a = [1, 2, 100, 100, 101] and x = 3, which is incorrect. The reason is that, when d1 == d2, you don't have enough information to decide which half of the array to check next, so you must somehow make them unequal before proceeding.

Question 13

Are you allowed to use java.util.Arrays.binarySearch ? If you don't need to detect whether or not the array is sorted, this will give you the one or two closest elements quickly.

public static int getClosestK(int[] a, int x) {
 int idx = java.util.Arrays.binarySearch(a, x);
 if ( idx < 0 ) {
 idx = -idx - 1;
 }
 if ( idx == 0 ) { // littler than any
 return a[idx];
 } else if ( idx == a.length ) { // greater than any
 return a[idx - 1];
 }
 return d(x, a[idx - 1]) < d(x, a[idx]) ? a[idx - 1] : a[idx];
}
private static int d(int lhs, int rhs) {
 return Math.abs(lhs - rhs);
}

Question 14

Your solution is complex because you are trying to mix at least two different problems in a single algorithm. Sorry my examples are in C++, but I hope you get the algorithmic idea.

Here is what a standard binary search looks like:

/**
 * \brief Find the first non-zero element of an ordered array
 *
 * @param a array to search in
 * @param s offset to start searching from
 *
 * @return offset of the first non-zero element, \e a.size() if none found
 */
template <class A>
size_t binary_search(const A& a, size_t s = 0)
{
 size_t from = s, to = a.size(), mid;
 while (from != to) {
 mid = (from + to) / 2;
 if (a[mid]) to = mid;
 else from = mid + 1;
 }
 return from;
}

If a is in ascending order, you will need to find the position pos of its first element that is greater than x, i.e. substitute

a[mid] > x

for a[mid], where x can be made an additional input parameter:

template <class A, class T>
size_t binary_search(const A& a, const T& x, size_t s = 0)

(in a more generic design, a would really be a customizable function object to be called by a(mid)).

Having found pos:

if pos == 0 (all elements > x), return 0 (this includes the case of a being empty);
if pos == a.size() (not found; all elements <= x), return a.size() - 1 (last element);
otherwise, return one of the two positions pos - 1 and pos depending on which element of a is closer to x.

That is:

template <class A, class T>
size_t find_closest(const A& a, const T& x)
{
 size_t pos = binary_search(a, x);
 return pos == 0 ? 0 :
 pos == a.size() ? a.size() - 1 :
 x - a[pos - 1] < a[pos] - x ? pos - 1 : pos;
}

Note that due to known ordering, you don't need Math.abs. Also note that in case of ambiguities (equalities), this approach returns the rightmost possible position of all equivalent ones.

This is clear, O(log n), and does not mess with Math.abs or anything problem-specific during the critical search process. It separates the two problems.

The function object approach could be

size_t pos = binary_search([&](size_t p){return a[p] > x;});

using a lambda in C++11, and leaving binary_search as originally defined, except for changing a[mid] to a(mid).

Question 15

I agree, this is much cleaner and clearer. The binary search algorithm should be abstracted away with the proper functor argument. (+1) However : template arguments are arguments and deserve proper names !

Question 16

A minor thing to consider is caching values you lookup, for example, in each step of your while you calculate a.length - 1, since this value does not change you could have create a
int last = a.length - 1; which would make the following fractionally faster and more readable:

 // test upper case <- unfortunate choice of comment ? upper bound perhaps?
 if (mid == last) {
 return a[last];
 }

I would also consider caching a[mid] and a[mid+1] into well named variables.

Finally, this:

if (a.length == 1) {
 return a[0];
}

Belongs in front of the while loop.

Question 17

You should avoid undefined behaviour: you say

Expects a sorted array, and if array is not sorted then results are unpredictable

but humans make errors all the time and giving a random result if the user forgets to enter a sorted input is not nice.

At the cost of some performance I would add (should be easy to translate into Java):

def is_sorted(array):
 for index, item in enumerate(array):
 if index != 0:
 item_before = array[index-1]
 if not item >= item_before:
 return False
 return True

and in your function:

if not is_sorted(array):
 raise IllegalArgumentException("Array is not sorted.")

If you really care about speed you can add a flag insecure so that when the user wants full speed and knows what he's doing the check can be skipped.

Question 18

The same again and again...

Do not sum endpoints' indices:

 int mid = (low + high) / 2;

Subtract them instead:

 int mid = low + (high - low) / 2;

This will prevent an arithmetic overflow and possible getting mid < 0 in case your array's length exceeds a half of the int type maximum value.

Question 19

+1 for @Josay's answer. In case the array a can contain equal entries, please consider this slight improvement:

public static int getClosestK(int[] a, int x) {
 int low = 0;
 int high = a.length - 1;
 if ( high < 0 )
 throw new IllegalArgumentException("The array cannot be empty");
 OUTER:
 while ( low < high ) {
 int mid = (low + high) / 2;
 assert ( mid < high );
 int d1 = Math.abs(a[mid ] - x);
 int d2 = Math.abs(a[mid+1] - x);
 if ( d2 < d1 )
 low = mid + 1;
 else if ( d2 > d1 )
 high = mid;
 else { // --- handling "d1 == d2" ---
 for ( int right=mid+2; right<=high; right++ ) {
 d2 = Math.abs(a[right] - x);
 if ( d2 < d1 ) {
 low = right;
 continue OUTER;
 } else if ( d2 > d1 ) {
 high = mid;
 continue OUTER;
 }
 }
 high = mid;
 }
 }
 return a[high];
}

--- UPDATE ---

Why the hate (-1)? Handling d1 == d2 is important as the array is subset wrongly otherwise.

E.g., consider a=[1,2,2,3] and x=0. Then d1=d2=2 and Josay's answer sets low=mid+1 which subsets the array to [2,3]. This subset does not contain the correct answer (1) anymore.

Question 20

You have presented an alternative solution, but haven't reviewed the code. Please explain your reasoning (how your solution works and how it improves upon the original) so that the author can learn from your thought process.

Question 21

The slight improvement is --- handling "d1 == d2" ---.

Question 22

Sure, but why is it an improvement? What advantages does it have over the original? You need to elaborate on your changes.

SylvainD SylvainD 29.7k1 gold badge49 silver badges93 bronze badges · Accepted Answer · 2014-04-16 10:47:23Z

Here's my solution which seems to work fine :

public static int getClosestK(int[] a, int x) {
 int low = 0;
 int high = a.length - 1;
 if (high < 0)
 throw new IllegalArgumentException("The array cannot be empty");
 while (low < high) {
 int mid = (low + high) / 2;
 assert(mid < high);
 int d1 = Math.abs(a[mid ] - x);
 int d2 = Math.abs(a[mid+1] - x);
 if (d2 <= d1)
 {
 low = mid+1;
 }
 else
 {
 high = mid;
 }
 }
 return a[high];
}

Here's the principle : the interval [low, high] will contain the closest element to x at any time. At the beginning, this interval is the whole array ([low, high] = [0, length-1]). At each iteration, we'll make it strictly smaller. When the range is limited to a single element, this is the element we are looking for.

In order to make the range smaller, at each iteration, we'll consider mid the middle point of [low, high]. Because of the way mid is computed, mid+1 is also in the range. We'll check if the closest value is at mid or mid+1 and update high or low accordingly. One can check that the range actually gets smaller.

Edit to answer to comments:

As spotted by @Vick-Chijwani , this code doesn't handle perfectly the scenarios where an element appears multiple times in the input. One can add the following working tests to the code :

 // case similar elements
 int[] a8 = {10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10};
 assertEquals(10, getClosestK(a8, 9));
 assertEquals(10, getClosestK(a8, 10));
 assertEquals(10, getClosestK(a8, 11));

but this one will fail :

 int[] a9 = {1, 2, 100, 100, 101};
 assertEquals(3, getClosestK(a9, 2)); // actually returns 100

What if a[mid] == a[mid+1]? I imagine you'd have to walk to the left or right until they are unequal, then proceed to compute d1 and d2. But then you lose the worst-case O(log n) guarantee.
Hi @VickyChijwani, thanks for your feedback. I'm AFK for a few days but I'll try to think about it and let you know asap :-)
Thanks! No hurry. The idea of walking left and right that I suggested, seems to work well enough for my use-case.
@VickyChijwani Everything seems to work fine. I've added an edit to my answer. Please let me know if you need more details. Basically, when 2 (consecutive) elements are equal, we have d1 == d2 (which is equal to 0 if this is the element we where looking for) and we reduce the search range accordingly : low = mid+1;. Please let me know if you see an issue.
Your solution returns 100 for a = [1, 2, 100, 100, 101] and x = 3, which is incorrect. The reason is that, when d1 == d2, you don't have enough information to decide which half of the array to check next, so you must somehow make them unequal before proceeding.

Stack Exchange Network

Find the value nearest to k in the sorted array

7 Answers 7

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Find the value nearest to k in the sorted array

7 Answers 7

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions