I am upgrading my C++11 knowledge and repeating some essential algorithm. Here is binary search only in terms of iterators.
#include <iostream>
#include <vector>
#include <iterator>
using namespace std;
template<typename t_iter>
t_iter binarySearch(t_iter begin, t_iter end, typename iterator_traits<t_iter>::value_type q)
{
typedef iterator_traits<t_iter>::value_type t_elem;
size_t len = end - begin;
if (len < 2)
{
if (*begin == q)
{
return begin;
}
return end;
}
t_iter middleElem = begin + len / 2;
if (*middleElem < q)
{
return binarySearch(middleElem, end, q);
}
return binarySearch(begin, middleElem, q);
}
int main()
{
int tmp[] = {1, 2, 4, 5, 8, 10, 20, 35, 41, 42, 100};
size_t len = sizeof(tmp)/sizeof(tmp[0]);
vector<int> a(tmp, tmp + len);
int q = 35;
auto it = binarySearch(a.begin(), a.end(), q);
cout << distance(a.begin(), it) << endl;
}
2 Answers 2
Iterators may not be random access iterators which means the length calculation and the advance by half won't compile.
For this there is the distance
and advance
functions available which use the random access if it is available but fall back to counting and repeated incrementing when it isn't.
You never use the t_elem
type def.
In the recursion step both branches are equally likely and equally important, that is best signified and made more readable by using an else
branch rather than letting it fall through.
you can also implement as a loop to minimize recursion overhead (micro-optimization admittedly because automatic tail-call optimization would do the same).
template<typename t_iter>
t_iter binarySearch(t_iter begin, t_iter end, typename iterator_traits<t_iter>::value_type q)
{
auto len = std::distance(begin, end);
while(len>1)
{
t_iter middleElem = begin;
std::advance(middleElem, len / 2);
if (*middleElem < q)
{
begin = middleElem;
}
else
{
end = middleElem;
}
len = std::distance(begin, end);
}
if (*begin == q)
{
return begin;
}
return end;
}
-
\$\begingroup\$ You don't want to use binary_search with non-random iterators. So I suppose it is better to generate an error in that case: don't use distance and advance. \$\endgroup\$Emanuele Paolini– Emanuele Paolini2014年08月20日 11:34:11 +00:00Commented Aug 20, 2014 at 11:34
-
\$\begingroup\$ @EmanuelePaolini depends if the slowdown is in the compare or the increment, binary searching a linked list (with increments and counting) of strings is still faster than linear searching \$\endgroup\$ratchet freak– ratchet freak2014年08月20日 11:36:37 +00:00Commented Aug 20, 2014 at 11:36
-
\$\begingroup\$ I think that it may happen that a user does not realize that its container is not random access and uses binary search erroneously. Instead the case where a user really needs a binary search with a non random iterator is very uncommon and we could require the user to wrap his non-random iterator with a fake random one (by using advance to implement operator+) before performing binary search. \$\endgroup\$Emanuele Paolini– Emanuele Paolini2014年08月20日 11:53:29 +00:00Commented Aug 20, 2014 at 11:53
-
\$\begingroup\$ Thanks for such a detailed review! Ensuring that a random access iterator is passed would make sense IMHO, but I don't think there is a reasonable simple way to do this without concepts. \$\endgroup\$Nils– Nils2014年08月20日 12:04:31 +00:00Commented Aug 20, 2014 at 12:04
-
\$\begingroup\$ Ah and also the iterative solution is better of course :) \$\endgroup\$Nils– Nils2014年08月20日 12:05:51 +00:00Commented Aug 20, 2014 at 12:05
Undefined behavior
Your code will dereference an end iterator if the range is empty at this location:
if (len < 2)
{
if (*begin == q)
Because the range is empty, len == 0
and begin == end
. The end
should never be dereferenced for comparison (even if it might be a valid iterator).
Wrong difference type
The difference of two (random access) iterators results in a std::ptrdiff_t
not std::size_t
. The latter is unsigned while the former is a signed type!
Don't using namespace std;
using namespace std;
does more harm than good. Although programmers are lazy some std::
more won't hurt you. If you really want to avoid typing std::
multiple times then don't use using namespace
but for example using std::iterator_traits
. And make sure that these instances are as localized as possible (never outside of functions in header files).
Compiler error
You are using a dependent name and should be using typename
here:
typedef iterator_traits<t_iter>::value_type t_elem;
And this typedef
is not even used!
C++11
The only C++11 feature I can spot in your code is one instance of auto
on this line:
auto it = binarySearch(a.begin(), a.end(), q);
You have missed several opportunities to use C++11. E.g.:
int tmp[] = {1, 2, 4, 5, 8, 10, 20, 35, 41, 42, 100};
size_t len = sizeof(tmp)/sizeof(tmp[0]);
vector<int> a(tmp, tmp + len);
Why waste memory on tmp
when you could have done the initialization shorter and more secure:
vector<int> a = {1, 2, 4, 5, 8, 10, 20, 35, 41, 42, 100};
C++11 replaces typedefs
by using
declarations:
typedef typename iterator_traits<t_iter>::value_type t_elem;
becomes
using t_elem = typename iterator_traits<t_iter>::value_type;
Even more conveniently this using syntax allows for templates and does not need the indirection via member types (thereby reducing the need for typename
):
template <class Iterator>
using Value_type = typename std::iterator_traits<Iterator>::value_type;
template<typename t_iter>
t_iter binarySearch(t_iter begin, t_iter end, Value_type<t_iter> q)
There is another location for usage of auto
here:
t_iter middleElem = begin + len / 2;
Naming
t_iter
is not a very good type name, the usual convention would beiter_t
- one character names like
q
are not very helpful as well. How aboutwanted_value
middleElem
breaks with the naming ofbegin
andend
, to me it sounds like this variable should store the value of the element pointed to by an iteratormiddle
(which would IMHO be a more consistent name for this iterator)
-
\$\begingroup\$ Agreed with everything except the last point: This is just a simple example no real code. \$\endgroup\$Nils– Nils2014年08月20日 12:28:34 +00:00Commented Aug 20, 2014 at 12:28
-
\$\begingroup\$ @Nils: I have extended the list. There is no such thing as "no real code". Either it is code or it isn't. Having exceptions for "non real" code opens the door for letting this behavior creep into your daily practice ("nah this is just a quick hack, no real code") until it gets into production and trips you up there. \$\endgroup\$Nobody moving away from SE– Nobody moving away from SE2014年08月20日 12:31:51 +00:00Commented Aug 20, 2014 at 12:31
-
\$\begingroup\$ Yes I know VS2012 lacks of some support, but why using using instead of typedef? I do not se how this is better. \$\endgroup\$Nils– Nils2014年08月20日 12:32:19 +00:00Commented Aug 20, 2014 at 12:32
-
\$\begingroup\$ As I explained the
using
declaration is more powerful than atypedef
(it can be templated and it works more like a function when used for template meta functions). I would recommend that you update your compiler to a more C++11 compliant one if you really want to learn and use C++11. \$\endgroup\$Nobody moving away from SE– Nobody moving away from SE2014年08月20日 12:34:24 +00:00Commented Aug 20, 2014 at 12:34 -
\$\begingroup\$ Nice review! I think you can do even better than the using statement:
decltype(*begin)
. I think it's easier to read and it works even for custom iterators that haven't defined a specialization ofstd::iterator_traits
. \$\endgroup\$ruds– ruds2014年08月20日 12:45:50 +00:00Commented Aug 20, 2014 at 12:45
return end;
is unlikely to be the actualend
iterator (it is the end of the current range which has been subdivided many times). eg: finding2
in[1, 3, 4, 5]
will return the iterator for3
. \$\endgroup\$