I am trying to solve 4. Two Sum from this set of C++ interview questions. As far as I can tell, my solution has a time complexity of \$O(n \log n)\$. Can it be further optimized?
Write a function that, given a vector and a target sum, returns zero-based indices of any two distinct elements whose sum is equal to the target sum. If there are no such elements, the function should return (-1, -1). For example, findTwoSum({ 1, 3, 5, 7, 9 }, 12) should return a std::pair<int, int> containing any of the following pairs of indices:
- 1 and 4 (3 + 9 = 12)
- 2 and 3 (5 + 7 = 12)
- 3 and 2 (7 + 5 = 12)
- 4 and 1 (9 + 3 = 12)
Here is my approach, assuming that negative numbers cannot be present in the vector:
static std::pair<int, int> findTwoSum(const std::vector<int>& list, int sum)
{
    std::vector<int> com;
    std::vector<int>::iterator it;
    for (unsigned count = 0; count < list.size(); count++) {
        it = std::find(com.begin(), com.end(), list[count]);
        if (it != com.end()) {
            return std::make_pair(it - com.begin() + 1, --count);
        }
        else {
            if (sum >= list[count])
                com.push_back(sum - list[count]);
        }
    }
    return std::make_pair(-1, -1);
}
1 Answer
Normally, we'd use the vector's index type (std::vector<int>::size_type, i.e. std::size_t) for the return values. But unfortunately we're required to return negative values when the search fails, so I'd recommend a pair of std::ptrdiff_t instead. And since the question specifically asks for ints, I'd just insert a comment explaining that we're limited to arrays of up to INT_MAX elements (the O(n²) scaling probably reduces the practical range, anyway).
I don't see anywhere that says there can't be negative numbers present - if that's specified, then it would have been wise to quote that part. As it is, you've introduced a bug: because we're not storing numbers larger than sum into com, the index calculation it - com.begin() + 1 will be off by the number of large values that were omitted (also, where does the + 1 come from? Did you misread "zero-based indices" in the question?).
The vector com could grow to (in the worst case) the same size as the input vector. That's quite a lot of extra storage. It might be more efficient to leave the >= sum elements in place, and simply search the part of the input vector before the current element (no extra storage needed).
That looks like the following (making a few other simplifications, such as using an iterator instead of count, and reducing the scope of the find result):
static std::pair<int, int> findTwoSum(const std::vector<int>& list, int sum)
{
    for (auto it_b = list.begin(); it_b != list.end(); ++it_b) {
        if (auto it_a = std::find(list.begin(), it_b, sum - *it_b); it_a != it_b) {
            return {it_a - list.begin(), it_b - list.begin()};
        }
    }
    return {-1, -1};
}
We still have a pretty inefficient algorithm - it's O(n²), where n is the length of list, because for every element in list, we perform a linear search for its complement. We can reduce that, at the cost of reintroducing extra storage, by maintaining a set of seen values. That may seem little different from the present approach of maintaining a vector, but the advantage is that searching a hash-based container scales much better with size. What we actually need is a map, as we'll want to record the corresponding index to return as the result; the best choice is std::unordered_map:
Unordered map is an associative container that contains key-value pairs with unique keys. Search, insertion, and removal of elements have average constant-time complexity.
That gives us:
std::unordered_map<int, int> seen; // value -> index
for (auto it_b = list.begin(); it_b != list.end(); ++it_b) {
    if (auto it_a = seen.find(sum - *it_b); it_a != seen.end()) {
        return {it_a->second, it_b - list.begin()};
    } else {
        seen[*it_b] = it_b - list.begin();
    }
}
Modified code
#include <unordered_map>
#include <utility>
#include <vector>

static std::pair<int, int> findTwoSum(const std::vector<int>& list, int sum)
{
    std::unordered_map<int, int> seen; // value -> index
    for (auto it_b = list.begin(); it_b != list.end(); ++it_b) {
        if (auto it_a = seen.find(sum - *it_b); it_a != seen.end()) {
            return {it_a->second, it_b - list.begin()};
        } else {
            seen[*it_b] = it_b - list.begin();
        }
    }
    return {-1, -1};
}
And a very simple test program:
#include <iostream>

int main()
{
    auto const [a, b] = findTwoSum({ 1, 3, 5, 7, 9 }, 12);
    std::cout << a << ',' << b << '\n';
}
Comments:
- +1, although technically average \$O(1)\$ isn't necessarily \$O(1)\$. (L. F., Jun 16, 2019)
- Is ssize_t in namespace std? (I thought adding stuff to namespace std was Not Allowed, though I guess POSIX might ignore that?) Perhaps std::ptrdiff_t would be a more portable option. (user673679, Jun 17, 2019)
- @user673679, I thought ssize_t was standard, but it seems I was mistaken. std::ptrdiff_t does seem to be the most suitable choice. Thanks. (Toby Speight, Jun 17, 2019)
Comments on the question:
- Your solution is O(n^2), not O(n log n). In the worst case you will be traversing the whole list, which gives you O(n) for the traversal itself. In addition, in each iteration you do a linear search (std::find), which gives you O(n^2) in total.
- Sorting is O(n * lg n). Marching from the left and right to find a sum is likely O(n).