I have implemented Binary search code for a given sorted array, written unit test too, Please review my BST unit test code and let me know your comments for improvements.
class BinarySearch {
public int binSearch(int[] sortedArray, int key) {
return search(0, sortedArray.length - 1, key, sortedArray);
}
private static int search(int start, int end, int key, int[] sortedArray) {
int mid = start + ((end-start)/2);
if (mid >= start && mid <= end) {
if (sortedArray[mid] == key) {
return mid;
} else if (sortedArray[mid] > key) {
return search(start, mid-1, key, sortedArray);
} else if(sortedArray[mid] < key){
return search(mid+1, end, key, sortedArray);
}
}
return -1;
}
}
public class BinarySearchTest {
private BinarySearch binarySearchCUT;
private static final int UNSUCCESSFUL = -1;
@Before
public void setUp(){
binarySearchCUT = new BinarySearch();
}
@Test
public void testShouldReturnUnsuccessfulOnEmptyArray() {
assertEquals(UNSUCCESSFUL, binarySearchCUT.binSearch(new int[]{}, 0));
}
@Test
public void testShouldReturnUnsuccessfulOnLeftBound() {
assertEquals(UNSUCCESSFUL, binarySearchCUT.binSearch(new int[]{1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100}, 0));
}
@Test
public void testShouldReturnUnsuccessfulOnRightBound() {
assertEquals(UNSUCCESSFUL, binarySearchCUT.binSearch(new int[]{1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100}, 101));
}
@Test
public void testShouldReturnSuccessfulOnLeftBound() {
assertEquals(0, binarySearchCUT.binSearch(new int[]{1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100}, 1));
}
@Test
public void testShouldReturnSuccessfulOnRightBound() {
assertEquals(12, binarySearchCUT.binSearch(new int[]{1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100}, 100));
}
@Test
public void testShouldReturnSuccessfulOnMid() {
assertEquals(7, binarySearchCUT.binSearch(new int[]{1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100}, 19));
}
@Test
public void testShouldReturnSuccessfulOnMidGreaterThanGivenNumber() {
assertEquals(5, binarySearchCUT.binSearch(new int[]{1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100}, 12));
}
@Test
public void testShouldReturnSuccessfulOnMidLesserThanGivenNumber() {
assertEquals(10, binarySearchCUT.binSearch(new int[]{1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100}, 69));
}
}
4 Answers 4
BinarySearch
The BinarySearch
class should be public
since it contains utility methods that are generally useful. To do this, write public class BinarySearch
instead of class BinarySearch
.
The binSearch
method should be static
since it does not access any fields from the BinarySearch
class. After all, that class doesn't have any fields that could be accessed. To do this, write public static int binSearch
instead of public int binSearch
.
When you make the binSearch
method static, you no longer need the new BinarySearch()
in the unit test. This is good since it sounds weird to have a "new binary search". It's confusing to hear this, since "binary search" is an algorithm and not a countable thing. It makes more sense to speak of "the binary search algorithm", avoiding the word "new". On the other hand, "new house" or "new sheet of paper" sounds much more natural.
The helper method currently is:
private static int search(int start, int end, int key, int[] sortedArray) {
The order of the parameters is not the usual one. Usually the array comes first, before its indexes. In this case, this would mean:
private static int search(int[] sortedArray, int start, int end, int key) {
When you compare your code with the predefined Arrays.binarySearch and the helper function, you can see that the code is quite similar, which is a good sign.
One useful thing that the predefined Arrays.binarySearch
does is that if the element is not found, it returns a hint about the index where it would be found. This is useful to see the closest existing value.
BinarySearchTest
Your existing unit tests look good. You can leave out the word test
at the beginning of the method names to make the names a little shorter. They are quite long, but that's ok.
There are test cases missing:
- An element from the middle cannot be found, for example 5 in
[1, 2, 6, 10]
. - The array contains duplicate values, for example 3 in
[1, 2, 3, 3, 3, 3, 3, 3, 3]
.
In the latter case, it is unspecified whether binSearch
should return 2 or 3 or any other index. All you can do in your unit test is to check that `array[binSearch(array, key)] == key.
-
2\$\begingroup\$
it returns the index where it would be found.
But then you lose information about whether or not the object is included or not. The C# standard library solves this by returning the binary complement of the index (which would be a negative number). \$\endgroup\$JAD– JAD2020年01月02日 15:08:13 +00:00Commented Jan 2, 2020 at 15:08 -
\$\begingroup\$ @JAD That's exactly what Java is doing as well. \$\endgroup\$Roland Illig– Roland Illig2020年01月02日 16:34:11 +00:00Commented Jan 2, 2020 at 16:34
-
\$\begingroup\$ Well look at that. I initially misread
return -(low + 1);
as if it would just return-1
. \$\endgroup\$JAD– JAD2020年01月03日 07:05:23 +00:00Commented Jan 3, 2020 at 7:05
Your code looks good but just wanted to suggest below
if (mid >= start && mid <= end)
- this condition will never fail becausemid
will either equal tostart
orend
so you can skip this because this is taking unnecessary condition check time- You can check if
start
is less than or equal toend
otherwise you will end up with wrongmid
value. By putting this condition method will not calculatemid
value but return -1 and it will improve performance.
`
private static int search(int start, int end, int key, int[] sortedArray) {
if(start <= end) {
int mid = start + ((end - start) / 2);
//if (mid >= start && mid <= end) {
if (sortedArray[mid] == key) {
return mid;
} else if (sortedArray[mid] > key) {
return search(start, mid - 1, key, sortedArray);
} else if (sortedArray[mid] < key) {
return search(mid + 1, end, key, sortedArray);
}
//}
}
return -1;
}
`
I have some suggestions for you.
1) In the search
method, extract the sortedArray[mid]
into a variable.
//[...]
int currentMid = sortedArray[mid];
if (currentMid == key) {
//[...]
} else if (currentMid > key) {
//[...]
} else if(currentMid < key){
//[...]
}
For the array part, instead of passing an array, you can use the varargs
. In my opinion, it will be easier to test since you don’t need to create a new array each time, but it can be uglier since the other parameters are also integers.
public class BinarySearchTest {
private BinarySearch binarySearchCUT;
private static final int UNSUCCESSFUL = -1;
@Before
public void setUp(){
binarySearchCUT = new BinarySearch();
}
@Test
public void testShouldReturnUnsuccessfulOnEmptyArray() {
assertEquals(UNSUCCESSFUL, binarySearchCUT.binSearch(0));
}
@Test
public void testShouldReturnUnsuccessfulOnLeftBound() {
assertEquals(UNSUCCESSFUL, binarySearchCUT.binSearch(0, 1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100));
}
@Test
public void testShouldReturnUnsuccessfulOnRightBound() {
assertEquals(UNSUCCESSFUL, binarySearchCUT.binSearch(101, 1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100));
}
@Test
public void testShouldReturnSuccessfulOnLeftBound() {
assertEquals(0, binarySearchCUT.binSearch(1, 1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100));
}
@Test
public void testShouldReturnSuccessfulOnRightBound() {
assertEquals(12, binarySearchCUT.binSearch(100, 1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100));
}
@Test
public void testShouldReturnSuccessfulOnMid() {
assertEquals(7, binarySearchCUT.binSearch(19, 1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100));
}
@Test
public void testShouldReturnSuccessfulOnMidGreaterThanGivenNumber() {
assertEquals(5, binarySearchCUT.binSearch(12, 1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100));
}
@Test
public void testShouldReturnSuccessfulOnMidLesserThanGivenNumber() {
assertEquals(10, binarySearchCUT.binSearch(69, 1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100));
}
static class BinarySearch {
public int binSearch(int key, int... sortedArray) {
return search(0, sortedArray.length - 1, key, sortedArray);
}
private static int search(int start, int end, int key, int... sortedArray) {
int mid = start + ((end-start)/2);
if (mid >= start && mid <= end) {
int currentMid = sortedArray[mid];
if (currentMid == key) {
return mid;
} else if (currentMid > key) {
return search(start, mid-1, key, sortedArray);
} else if(currentMid < key){
return search(mid+1, end, key, sortedArray);
}
}
return -1;
}
}
}
```
-
\$\begingroup\$ I think a vararg signature makes the whole method less easy to read and understand. And since he is using the same Array in all tests, he could just put it in a constant in his test-class. I think it is more error prone this way - e.g. counting of indexes in the array is harder. I would prefer the Tests to read like this:
arr=[]...; assert(binSearch(arr[10], arr)).is(10)
it is easier to see what the expected result represents. \$\endgroup\$Falco– Falco2020年01月03日 09:20:46 +00:00Commented Jan 3, 2020 at 9:20
One significant issue that has not been addressed in previous answers (all of which make excellent points), is that of deterministic behaviour.
Your code is going to produce unexpected, and inconsistent results if the data in the sorted array is not unique. You clearly indicate that the array is pre-sorted, but you do not specify any restraints on the uniqueness of the values.
It is standard in search routines to return the index of the first value in the array that matches the search term, but your code will return the index of "some" matching value.
When you find the match, scan backwards to find the first instance..., so your match code would change from:
if (sortedArray[mid] == key) { return mid; } else if
to more like:
if (sortedArray[mid] == key) {
// Ensure we return the first matching instance in the array.
while (mid > 0 && sortedArray[mid -1] == key) {
mid--;
}
return mid;
} else if
-
\$\begingroup\$ Interesting. Wouldn't that make the worst case running time O(N) [since if the array is all the same number, you'd loop through N/2 or so times] versus O(log(N))? \$\endgroup\$Foon– Foon2020年01月02日 22:10:17 +00:00Commented Jan 2, 2020 at 22:10
-
\$\begingroup\$ Note that Java at least specifically says: " If the list contains multiple elements equal to the specified object, there is no guarantee which one will be found. " It's a valid thing to think about, but I disagree it's a universal standard to return the first (docs.oracle.com/javase/7/docs/api/java/util/… for Java, though counter point is docs.python.org/2/library/bisect.html shows how to get the first (or last) with bisect) \$\endgroup\$Foon– Foon2020年01月02日 22:18:47 +00:00Commented Jan 2, 2020 at 22:18
-
\$\begingroup\$ (Check of github.com/python/cpython/blob/master/Lib/bisect.py suggests slightly tweaking the algorithm will support getting the first element for uniques but still be O(log(N))) \$\endgroup\$Foon– Foon2020年01月02日 22:32:05 +00:00Commented Jan 2, 2020 at 22:32
testSearch(int[] arr, int searchFor, int expected)
and put your test-array into a constant. The individual tests would then just betestSeach(normArray, 4, normArray[4])
much shorter to read and reason about. \$\endgroup\$Optional
s. \$\endgroup\$