I know that there is a function bsearch
present in stdlib.h
but still I want to implement this.
This is my code for binary search. Any and all reviews are welcome.
#include<stdio.h>
#include<stdlib.h>
int compare (const void * a, const void * b)
{
return (*(int*)a - *(int*)b);
}
int bin_search(int arr[], int min_index, int max_index, int element)
{
/*
Searches for an element in the array arr
Returns fist index of element if present else returns -1
*/
if (min_index > max_index){
return -1;
}
else
{
//Don't change this assignment of mid_point. It avoids overflow
int mid_point = min_index + (max_index - min_index)/2;
if (arr[mid_point] > element){
return bin_search(arr, min_index, mid_point - 1, element);
}
else if (arr[mid_point] < element){
return bin_search(arr, mid_point + 1, max_index, element);
}
else{
return mid_point;
}
}
}
int main()
{
int length;
while (1)
{
printf("Enter a positive length: ");
scanf("%d", &length);
if (length > 1){
break;
}
else{
printf("You entered length = %d\n\n", length);
}
}
int *arr = malloc(sizeof(int) * length);
if (arr == NULL)
{
perror("The following error occurred");
exit(-1);
}
for (int i = 0; i < length; i++){
scanf("%d", &arr[i]);
}
int element;
printf("\nEnter the element to be searched: ");
scanf("%d", &element);
qsort(arr, length, sizeof(int), compare);
int index = bin_search(arr, 0, length - 1, element);
if (index == -1){
printf("\nElement not in the array");
}
else{
printf("\nIndex of element is %d", index);
}
free(arr);
return 0;
}
3 Answers 3
The functions says it "Returns fist index of element if present". If I give it the numbers 2, 2, 2 and ask it to find 2 it says the 1, but the first index with value 2 is clearly 0.
Some minor comments on the code:
the
arr
parameter tobin_search
should beconst
. This tells the compiler and the reader that the array is not changed by the function. The compiler will then enforce this if you, by mistake, try to modify the array data. The reader/user knows that her data is unchanged after the call.the parameter names
min_index
andmax_index
could be shortened tomin
andmax
. Giving names an appropriate size is a service to the reader (auto-completion by the IDE is a service to you). In general, the shorter the names, the less dense the code and the easier it is to read. This can be taken too far of course, once names become meaningless.Note that it would be more normal to pass the start of the array and its size instead of the array plus two offsets.
functions could be static. This is of no significance in a one-file program but becomes important with bigger programs. Making functions and global variables static restricts their scope to the file which allows extra optimisation and reduces namespace pollution.
the output message needs a trailing \n
there is no prompt for the input values - ok that is trivial. Personally I find this sort of test better with values entered on the command line.
exit status is normally 1 (EXIT_FAILURE) on failure, not -1. On success it is 0 (EXIT_SUCCESS). These are UNIX conventions.
EDIT
You questioned the use of const
. Its utility when used on parameters passed by
reference - pointers/array passed into functions - is, I think clear.
However, its use with parameters passed by value is not so clear. It plays no part in the interface seen by callers of a function, as it can have no influence on the caller. You can declare a function prototype in a public header like this
int bin_search(const int arr[], int min, int max, int element);
and then define the function implementation like this:
int bin_search(const int arr[], const int min, const int max, const int element) {...}
And the compiler will be quite happy with the difference.
So it is purely an implementation issue. Hence you should definitely not use const
on pass-by-value parameters in public prototypes, only in the implementation (if at all). Used in the implementation, const
tells the reader and the compiler
that a parameter is not (and cannot be) changed. This gives the reader some
extra information and of course the compiler will enforce this read-only
behaviour.
So shouldn't you use it on all parameters that are not changed (and some might say that parameters should never be changed)? Good question! I never use const
on call parameters but would have no hesitation in adding const
to a local variable
int cirle_area(double radius)
{
const double pi = 3.14;
return pi * radius * radius;
}
A lot of good programming style concerns consistency, so my inconsistency here
is troubling. And my previous comment - that if you make the element
parameter const
, then you should be consistent and do the same for min
and
max
uses consistency as an argument!
I'm afraid I can do no better than that. I've programmed for 20 years in C
and have rarely seen const
applied to parameters. But maybe it should be.
-
\$\begingroup\$ I'll take care of the incorrect comment. About
arr
beingconst
why? I usedmin
andmax
butmax
is defined in C++ and the IDE that I use(Code:blocks) starts to highlight it as a keyword so I thought to change them. It has auto-completion so length of variables isn't a problem. Any particular reason why I should usestatic
functions? Can you tell me what it is this called so I can read up a bit(likestatic
for variables isstorage class
)? About exit status, is that a convention or what? \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月09日 17:19:58 +00:00Commented Jul 9, 2013 at 17:19 -
\$\begingroup\$ So
const
should be used whenever I am passing pointers but not changing the contents. Correct? \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月09日 17:55:04 +00:00Commented Jul 9, 2013 at 17:55 -
\$\begingroup\$ Yes, that is correct \$\endgroup\$William Morris– William Morris2013年07月09日 19:57:00 +00:00Commented Jul 9, 2013 at 19:57
-
\$\begingroup\$ Asked a question on programmers about use of the
const
keyword. Please give your comments and suggest improvements. \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月11日 14:07:15 +00:00Commented Jul 11, 2013 at 14:07 -
\$\begingroup\$ Signaling the array as const is a good idea, for the other parameters const it's meaningless and just bloats the code. As for the the identifiers, I would argue that max_index is better than just max. A good rule of thumb (that do not work for every case) is to have a noun in the mix. \$\endgroup\$Marcelo De Zen– Marcelo De Zen2016年07月23日 10:47:23 +00:00Commented Jul 23, 2016 at 10:47
If your average case is that the looked-for element is distributed uniformly randomly (and that your array is pulled from the same distribution), you actually gain in the average case by not returning early if arr[mid] == element.
William Morris's comment about static
was most likely meant to apply to your compare
helper function; presumably bin_search
will be declared in a header file to be included in other translation units, in which case it must not be declared static
.
Combining these comments:
int bin_search(const int arr[], int min, int max, const int element)
{
/*
Searches for an element in the array arr
Returns index of element if present else returns -1
*/
if (min >= max) {
return arr[max] == element ? max : -1;
}
//Don't change this assignment of mid. It avoids overflow
const int mid = min + (max - min) / 2;
if (arr[mid] < element) {
return bin_search(arr, mid + 1, max, element);
} else {
return bin_search(arr, min, mid, element);
}
This has exactly 2 * lg(N) + 1 comparisons (2 for each halving of the size, plus 1 on the final return). Your version has worst-case 3 * lg(N) comparisons and is worse in the average case (2.5 average comparisons per layer, saving on average (N-1)*2^-N + (N-2)*2^(N-1) + ... + 1*2^-1 layers, means that you get about 2.5 * (lg(N)-2) comparisons on average).
-
1\$\begingroup\$ My solution has the downside that it accesses uninitialized memory when
max
< 0 initially, which may happen e.g. when code doesn't check for an empty array before callingbin_search
\$\endgroup\$ruds– ruds2013年07月09日 19:54:27 +00:00Commented Jul 9, 2013 at 19:54 -
\$\begingroup\$ Any function that needs to called in other files by including the header files shouldn't be
static
. But these functions can access thestatic
functions from the files from which they are called. Am I correct? As an example of what I think you have said, in this question themerge_parts
function should bestatic
but themerge_sort
shouldn't be. Declaringmerge_parts
asstatic
won't effect the use ofmerge_sort
in another file. Correct me if I am wrong. \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月10日 08:53:57 +00:00Commented Jul 10, 2013 at 8:53 -
\$\begingroup\$ About declaring
mid
andelement
asconst
, should any variable, that isn't changed in the called function, be decalredconst
? I understand that declaringarr
asconst
is relevant because it is a pointer that can be changed but the calling function's original value passed to the variableelement
cannot be effected here. Is this declaration ofconst
for the purposes of clarity and perhaps a security measure in the scope of the called function? \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月10日 09:06:15 +00:00Commented Jul 10, 2013 at 9:06 -
\$\begingroup\$ Your code is missing a brace after the last else and shouldn't it be
2 * lg(N) + 2
. In the final case there are 2 comparisons, 1 for theif
and 1 inside it. \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月10日 09:13:17 +00:00Commented Jul 10, 2013 at 9:13 -
\$\begingroup\$ After some thinking I think I might have found the best of both worlds. I'll update with a code that uses the same number of comparisons as you have suggested while still taking care of accessing uninitialized memory. \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月12日 14:55:44 +00:00Commented Jul 12, 2013 at 14:55
Update bin_search
with a best-case scenario O(1), if the array has only 1 element.
int bin_search (int arr[], int min_index, int max_index, int element)
{
// this is the best case scenario
if(min_index == max_index)
{
if(arr[min_index] == element)
return min_index;
else
return -1;
}
if (min_index > max_index)
{
return -1;
}
else
{
int mid_index = (min_index + max_index) / 2;
if (arr[mid_index] > element)
{
return bin_search(arr, min_index, mid_index - 1, element);
}
else if (arr[mid_index] < element)
{
return bin_search(arr, mid_index + 1, max_index, element);
}
else
{
return mid_index;
}
}
}
There is also an iterative process.
int bin_search (const int arr[], int min, int max, int element)
{
int high = max+1;
int low = min;
while(low < (high-1))
{
int mid = (low + high)/2;
if(element < arr[mid])
high = mid;
else
low = mid;
}
if(arr[low] == element)
return low;
else
return -1; // not found
}
-
\$\begingroup\$ The complexity doesn't change a bit but performance suffers in this code. My implementation is tail-recursive so there shouldn't be much overhead for function calls. \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月09日 17:25:17 +00:00Commented Jul 9, 2013 at 17:25
-
\$\begingroup\$ @AseemBansal: Note, "tail-recursive" is still "recursive" -- and by default still has any other recursion's performance issues. You don't gain anything from it performancewise unless the compiler does tail-call optimizations (which it's not required to do). \$\endgroup\$cHao– cHao2013年07月09日 17:27:41 +00:00Commented Jul 9, 2013 at 17:27
-
\$\begingroup\$ @cHao What about gcc? I use Code:blocks IDE with gcc compiler. \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月09日 17:34:56 +00:00Commented Jul 9, 2013 at 17:34
-
\$\begingroup\$ @AseemBansal: Looks like if you compile with
-O2
,-O3
, or-Os
, or add-foptimize-sibling-calls
to the command line, it should at least try. At that point, the result depends on how good the detection is. \$\endgroup\$cHao– cHao2013年07月09日 18:05:27 +00:00Commented Jul 9, 2013 at 18:05 -
\$\begingroup\$ @AseemBansal and why my code is not "tail-recursive"? \$\endgroup\$Anirban Nag 'tintinmj'– Anirban Nag 'tintinmj'2013年07月09日 18:06:33 +00:00Commented Jul 9, 2013 at 18:06
element
aconst
and notmin
andmax
? Suchconst
markup on parameters passed by value does no harm but is not part of the function's interface - in other words the caller does not care. From an interface point of view,const
only has meaning for pointers, such asarr
. It could be considered to have some utility in telling the reader that the call parameters remain unchanged (but personally I rarely, if ever, use it for parameters passed by value). \$\endgroup\$const
. I commented on ruds' answer about this confusion as he added 1 moreconst
here. Perhaps you can answer my comments there. \$\endgroup\$