I know that there is a function bsearch present in stdlib.h but still I want to implement this.
This is my code for binary search. Any and all reviews are welcome.
#include<stdio.h>
#include<stdlib.h>
int compare (const void * a, const void * b)
{
return (*(int*)a - *(int*)b);
}
int bin_search(int arr[], int min_index, int max_index, int element)
{
/*
Searches for an element in the array arr
Returns fist index of element if present else returns -1
*/
if (min_index > max_index){
return -1;
}
else
{
//Don't change this assignment of mid_point. It avoids overflow
int mid_point = min_index + (max_index - min_index)/2;
if (arr[mid_point] > element){
return bin_search(arr, min_index, mid_point - 1, element);
}
else if (arr[mid_point] < element){
return bin_search(arr, mid_point + 1, max_index, element);
}
else{
return mid_point;
}
}
}
int main()
{
int length;
while (1)
{
printf("Enter a positive length: ");
scanf("%d", &length);
if (length > 1){
break;
}
else{
printf("You entered length = %d\n\n", length);
}
}
int *arr = malloc(sizeof(int) * length);
if (arr == NULL)
{
perror("The following error occurred");
exit(-1);
}
for (int i = 0; i < length; i++){
scanf("%d", &arr[i]);
}
int element;
printf("\nEnter the element to be searched: ");
scanf("%d", &element);
qsort(arr, length, sizeof(int), compare);
int index = bin_search(arr, 0, length - 1, element);
if (index == -1){
printf("\nElement not in the array");
}
else{
printf("\nIndex of element is %d", index);
}
free(arr);
return 0;
}
3 Answers 3
The functions says it "Returns fist index of element if present". If I give it the numbers 2, 2, 2 and ask it to find 2 it says the 1, but the first index with value 2 is clearly 0.
Some minor comments on the code:
the
arrparameter tobin_searchshould beconst. This tells the compiler and the reader that the array is not changed by the function. The compiler will then enforce this if you, by mistake, try to modify the array data. The reader/user knows that her data is unchanged after the call.the parameter names
min_indexandmax_indexcould be shortened tominandmax. Giving names an appropriate size is a service to the reader (auto-completion by the IDE is a service to you). In general, the shorter the names, the less dense the code and the easier it is to read. This can be taken too far of course, once names become meaningless.Note that it would be more normal to pass the start of the array and its size instead of the array plus two offsets.
functions could be static. This is of no significance in a one-file program but becomes important with bigger programs. Making functions and global variables static restricts their scope to the file which allows extra optimisation and reduces namespace pollution.
the output message needs a trailing \n
there is no prompt for the input values - ok that is trivial. Personally I find this sort of test better with values entered on the command line.
exit status is normally 1 (EXIT_FAILURE) on failure, not -1. On success it is 0 (EXIT_SUCCESS). These are UNIX conventions.
EDIT
You questioned the use of const. Its utility when used on parameters passed by
reference - pointers/array passed into functions - is, I think clear.
However, its use with parameters passed by value is not so clear. It plays no part in the interface seen by callers of a function, as it can have no influence on the caller. You can declare a function prototype in a public header like this
int bin_search(const int arr[], int min, int max, int element);
and then define the function implementation like this:
int bin_search(const int arr[], const int min, const int max, const int element) {...}
And the compiler will be quite happy with the difference.
So it is purely an implementation issue. Hence you should definitely not use const on pass-by-value parameters in public prototypes, only in the implementation (if at all). Used in the implementation, const tells the reader and the compiler
that a parameter is not (and cannot be) changed. This gives the reader some
extra information and of course the compiler will enforce this read-only
behaviour.
So shouldn't you use it on all parameters that are not changed (and some might say that parameters should never be changed)? Good question! I never use const on call parameters but would have no hesitation in adding const to a local variable
int cirle_area(double radius)
{
const double pi = 3.14;
return pi * radius * radius;
}
A lot of good programming style concerns consistency, so my inconsistency here
is troubling. And my previous comment - that if you make the element
parameter const, then you should be consistent and do the same for min and
max uses consistency as an argument!
I'm afraid I can do no better than that. I've programmed for 20 years in C
and have rarely seen const applied to parameters. But maybe it should be.
-
\$\begingroup\$ I'll take care of the incorrect comment. About
arrbeingconstwhy? I usedminandmaxbutmaxis defined in C++ and the IDE that I use(Code:blocks) starts to highlight it as a keyword so I thought to change them. It has auto-completion so length of variables isn't a problem. Any particular reason why I should usestaticfunctions? Can you tell me what it is this called so I can read up a bit(likestaticfor variables isstorage class)? About exit status, is that a convention or what? \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月09日 17:19:58 +00:00Commented Jul 9, 2013 at 17:19 -
\$\begingroup\$ So
constshould be used whenever I am passing pointers but not changing the contents. Correct? \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月09日 17:55:04 +00:00Commented Jul 9, 2013 at 17:55 -
\$\begingroup\$ Yes, that is correct \$\endgroup\$William Morris– William Morris2013年07月09日 19:57:00 +00:00Commented Jul 9, 2013 at 19:57
-
\$\begingroup\$ Asked a question on programmers about use of the
constkeyword. Please give your comments and suggest improvements. \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月11日 14:07:15 +00:00Commented Jul 11, 2013 at 14:07 -
\$\begingroup\$ Signaling the array as const is a good idea, for the other parameters const it's meaningless and just bloats the code. As for the the identifiers, I would argue that max_index is better than just max. A good rule of thumb (that do not work for every case) is to have a noun in the mix. \$\endgroup\$Marcelo De Zen– Marcelo De Zen2016年07月23日 10:47:23 +00:00Commented Jul 23, 2016 at 10:47
If your average case is that the looked-for element is distributed uniformly randomly (and that your array is pulled from the same distribution), you actually gain in the average case by not returning early if arr[mid] == element.
William Morris's comment about static was most likely meant to apply to your compare helper function; presumably bin_search will be declared in a header file to be included in other translation units, in which case it must not be declared static.
Combining these comments:
int bin_search(const int arr[], int min, int max, const int element)
{
/*
Searches for an element in the array arr
Returns index of element if present else returns -1
*/
if (min >= max) {
return arr[max] == element ? max : -1;
}
//Don't change this assignment of mid. It avoids overflow
const int mid = min + (max - min) / 2;
if (arr[mid] < element) {
return bin_search(arr, mid + 1, max, element);
} else {
return bin_search(arr, min, mid, element);
}
This has exactly 2 * lg(N) + 1 comparisons (2 for each halving of the size, plus 1 on the final return). Your version has worst-case 3 * lg(N) comparisons and is worse in the average case (2.5 average comparisons per layer, saving on average (N-1)*2^-N + (N-2)*2^(N-1) + ... + 1*2^-1 layers, means that you get about 2.5 * (lg(N)-2) comparisons on average).
-
1\$\begingroup\$ My solution has the downside that it accesses uninitialized memory when
max< 0 initially, which may happen e.g. when code doesn't check for an empty array before callingbin_search\$\endgroup\$ruds– ruds2013年07月09日 19:54:27 +00:00Commented Jul 9, 2013 at 19:54 -
\$\begingroup\$ Any function that needs to called in other files by including the header files shouldn't be
static. But these functions can access thestaticfunctions from the files from which they are called. Am I correct? As an example of what I think you have said, in this question themerge_partsfunction should bestaticbut themerge_sortshouldn't be. Declaringmerge_partsasstaticwon't effect the use ofmerge_sortin another file. Correct me if I am wrong. \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月10日 08:53:57 +00:00Commented Jul 10, 2013 at 8:53 -
\$\begingroup\$ About declaring
midandelementasconst, should any variable, that isn't changed in the called function, be decalredconst? I understand that declaringarrasconstis relevant because it is a pointer that can be changed but the calling function's original value passed to the variableelementcannot be effected here. Is this declaration ofconstfor the purposes of clarity and perhaps a security measure in the scope of the called function? \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月10日 09:06:15 +00:00Commented Jul 10, 2013 at 9:06 -
\$\begingroup\$ Your code is missing a brace after the last else and shouldn't it be
2 * lg(N) + 2. In the final case there are 2 comparisons, 1 for theifand 1 inside it. \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月10日 09:13:17 +00:00Commented Jul 10, 2013 at 9:13 -
\$\begingroup\$ After some thinking I think I might have found the best of both worlds. I'll update with a code that uses the same number of comparisons as you have suggested while still taking care of accessing uninitialized memory. \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月12日 14:55:44 +00:00Commented Jul 12, 2013 at 14:55
Update bin_search with a best-case scenario O(1), if the array has only 1 element.
int bin_search (int arr[], int min_index, int max_index, int element)
{
// this is the best case scenario
if(min_index == max_index)
{
if(arr[min_index] == element)
return min_index;
else
return -1;
}
if (min_index > max_index)
{
return -1;
}
else
{
int mid_index = (min_index + max_index) / 2;
if (arr[mid_index] > element)
{
return bin_search(arr, min_index, mid_index - 1, element);
}
else if (arr[mid_index] < element)
{
return bin_search(arr, mid_index + 1, max_index, element);
}
else
{
return mid_index;
}
}
}
There is also an iterative process.
int bin_search (const int arr[], int min, int max, int element)
{
int high = max+1;
int low = min;
while(low < (high-1))
{
int mid = (low + high)/2;
if(element < arr[mid])
high = mid;
else
low = mid;
}
if(arr[low] == element)
return low;
else
return -1; // not found
}
-
\$\begingroup\$ The complexity doesn't change a bit but performance suffers in this code. My implementation is tail-recursive so there shouldn't be much overhead for function calls. \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月09日 17:25:17 +00:00Commented Jul 9, 2013 at 17:25
-
\$\begingroup\$ @AseemBansal: Note, "tail-recursive" is still "recursive" -- and by default still has any other recursion's performance issues. You don't gain anything from it performancewise unless the compiler does tail-call optimizations (which it's not required to do). \$\endgroup\$cHao– cHao2013年07月09日 17:27:41 +00:00Commented Jul 9, 2013 at 17:27
-
\$\begingroup\$ @cHao What about gcc? I use Code:blocks IDE with gcc compiler. \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月09日 17:34:56 +00:00Commented Jul 9, 2013 at 17:34
-
\$\begingroup\$ @AseemBansal: Looks like if you compile with
-O2,-O3, or-Os, or add-foptimize-sibling-callsto the command line, it should at least try. At that point, the result depends on how good the detection is. \$\endgroup\$cHao– cHao2013年07月09日 18:05:27 +00:00Commented Jul 9, 2013 at 18:05 -
\$\begingroup\$ @AseemBansal and why my code is not "tail-recursive"? \$\endgroup\$Anirban Nag 'tintinmj'– Anirban Nag 'tintinmj'2013年07月09日 18:06:33 +00:00Commented Jul 9, 2013 at 18:06
elementaconstand notminandmax? Suchconstmarkup on parameters passed by value does no harm but is not part of the function's interface - in other words the caller does not care. From an interface point of view,constonly has meaning for pointers, such asarr. It could be considered to have some utility in telling the reader that the call parameters remain unchanged (but personally I rarely, if ever, use it for parameters passed by value). \$\endgroup\$const. I commented on ruds' answer about this confusion as he added 1 moreconsthere. Perhaps you can answer my comments there. \$\endgroup\$