5
\$\begingroup\$

Suppose we have two sorted lists, and we want to find one element from the first, and the other element from the 2nd list, where the sum of the two elements equal to a given target. If there are multiple combinations, need to find all the combinations.

Any bugs, performance improvement ideas in terms of algorithm time complexity or code style advice is highly appreciated. My basic idea is to scan the first list from the beginning, and scan the second list from the end, a similar approach to how we resolve the 2-sum classic problem in one list.

def two_sum(list1, list2, target, result):
 i = 0
 j = len(list2) - 1
 while i < len(list1) and j >= 0:
 if list1[i] + list2[j] == target:
 result.append((list1[i], list2[j]))
 i += 1
 j -= 1
 while i < len(list1) and list1[i] == list1[i-1]:
 i += 1
 while j >= 0 and list2[j] == list2[j+1]:
 j -= 1
 elif list1[i] + list2[j] > target:
 j -= 1
 else:
 i += 1
if __name__ == "__main__":
 list1 = [1,3,5,7]
 list2 = [2,3,4,5]
 result = []
 two_sum(list1, list2, 6, result)
 print result
Jamal
35.2k13 gold badges134 silver badges238 bronze badges
asked Jan 19, 2017 at 23:14
\$\endgroup\$

4 Answers 4

4
\$\begingroup\$

Assuming that your lists are sorted and this is something that comes out of problem description and the part where you need to find something in 99% of that kind of questions you want to use binary search. So basically all your code could be written like this:

from bisect import bisect_left
def binary_search(a, x):
 pos = bisect_left(a, x)
 return pos if pos != len(a) and a[pos] == x else -1
def two_sum(a, b, target):
 result = []
 for num in a:
 index = binary_search(b, target-num)
 if index != -1:
 result.append((num, b[index]))
 return result

Now if you want to save some memory, you might want to make two_sum a generator, which will make it look like this:

def two_sum(a, b, target):
 result = []
 for num in a:
 index = binary_search(b, target-num)
 if index != -1:
 yield num, b[index]

I cannot really call my answer a review to your code because I completely overwrote a solution for this. But as I mentioned in the beginning whenever a problem says something about sorted lists and searching on it, most likely you will use bsearch in your solution.

answered Jan 20, 2017 at 10:08
\$\endgroup\$
6
  • \$\begingroup\$ Nice idea, Alex, vote up. Your algorithm time complexity is O(n log n)? But I think my algorithm time complexity is only O(n)? \$\endgroup\$ Commented Jan 20, 2017 at 21:59
  • 1
    \$\begingroup\$ yes it is. This one is O(n*log(m)) where m is a length of list on which you will do bsearch. \$\endgroup\$ Commented Jan 22, 2017 at 9:24
  • \$\begingroup\$ Thanks Alex, but since log(m) > 1, so overall n * log(m) > n, correct? If so, a bit surprised binary search is slower in this case, since in most cases, binary search is the most efficient way. Any thoughts? \$\endgroup\$ Commented Jan 23, 2017 at 5:07
  • 1
    \$\begingroup\$ @LinMa it is slower in my current implementation, but you can actually modify it. so say for a number with index X in list A you've found a pair with index Y on list B. then for number X+1 in list A you do bsearch on slice Y:len(B) and so on. But in the worst case, you will still get your n*log(m). I think if I will care about code complexity I will just use something that you wrote, but if not I will go for bsearch solution because it's easier to read and understand even thou it's slower in this case. \$\endgroup\$ Commented Jan 23, 2017 at 9:45
  • \$\begingroup\$ Thanks Alex, I think the code I write (original post) time complexity is O(m+n), where m and n are length of two lists? \$\endgroup\$ Commented Feb 6, 2017 at 3:33
3
\$\begingroup\$

I re-constructed your algorithm using generators, which I find easier on the eyes (instead of indexing) for these sort of list traversals. Other than that, the code is the same.

def two_sum(list1, list2, target):
 # Get a generator for each list
 l1 = iter(list1)
 l2 = reversed(list2)
 # loop and append to results list
 result = []
 try:
 # get the first sample from each list
 x = next(l1)
 y = next(l2)
 while True:
 # If we find a match, record it
 if x + y == target:
 new_pair = x, y
 result.append(new_pair)
 # get next unique elements
 x = next(l1)
 while x == new_pair[0]:
 x = l1.next()
 y = next(l2)
 while y == new_pair[1]:
 y = next(l2)
 # if no match, then get new element from one list
 elif x + y > target:
 y = next(l2)
 else:
 x = next(l1)
 # when one of the generators runs out of elements it will assert
 except StopIteration:
 pass
 return result
print(two_sum([1, 3, 5, 7], [2, 3, 3, 5], 6))
answered Jan 20, 2017 at 0:04
\$\endgroup\$
6
  • \$\begingroup\$ Thanks Stephen, nice re-write and vote up. Since we are using same method, do you think any issues in our method -- issues I mean functional bugs, e.g. we may miss some pairs, we may return some wrong pairs, or return some duplicate pairs? \$\endgroup\$ Commented Jan 20, 2017 at 5:34
  • 1
    \$\begingroup\$ I think it is correct. I changed your test data to include a duplicate pair. \$\endgroup\$ Commented Jan 20, 2017 at 5:39
  • 2
    \$\begingroup\$ (xx for xx in list1) ==iter(list1), (yy for yy in reversed(list2))==reversed(list2) since reversed is already an iterator. Now you should use function next instead of method .next, since this is more common way in python. so l1.next becomes next(l1). \$\endgroup\$ Commented Jan 20, 2017 at 10:16
  • \$\begingroup\$ @StephenRauch, have you read Alex's binary search? Actually I might be wrong, I think our method time complexity is O(n), but using binary search need a loop (O(n)) and binary search inside the loop (O(log n)), which is O(n log n)? Which is even slower? How do you think? \$\endgroup\$ Commented Jan 20, 2017 at 22:06
  • \$\begingroup\$ Thanks Stephen, but I think binary search based solution time complexity is not O(n) and should be O(n log n), correct? \$\endgroup\$ Commented Jan 22, 2017 at 23:32
2
\$\begingroup\$

Hah, Alex was faster with the binary search version :) I submit my version because we can exploit the fact both lists are ordered, so you should use a second bsearch moreover you don't have to do binary search over the whole set, we can use the upper subset.

If the lists are large enough, this algorithm supposed to be much faster.

Please forgive me the C-ish code, I don't know python, I've just tried to find out the syntax from your code. :) Maybe you can Pythonize it. (Yes, that's why I implemented my binary search, I don't know the libs )

Especially I couldn't find an effective way for handle sublists. I think list[from:to] would memcpy the data which is a huge burden (maybe I'm not right). This way I had to simulate C pointers in the following code.

# find the greatest index of element 
# list where element is not greater than target
def bsearch(l, target, s, e):
 m = int((s+e)/2)
 if l[m] == target or e == m or s == m :
 return m
 if l[m] > target :
 return bsearch(l,target,s,m)
 return bsearch(l,target,m,e)
def two_sum(l1,l2,target):
 l2 = [target-x for x in reversed(l2)]
 b2 = 0
 b1 = 0
 t = l1[0]
 res = []
 while True:
 b2 = bsearch(l2, t, b2 ,len(l2)-1)
 if l2[b2] == t :
 res.append((t,target-t))
 b2 += 1
 if b2 >= len(l2): return res
 t = l2[b2]
 b1 = bsearch(l1, t, b1 ,len(l1)-1)
 if l1[b1] == t :
 res.append((target-t,t))
 b1 += 1
 if b1 >= len(l1): return res
 t = l1[b1]
l1=[1,3,6,7,10,18,19,22,28,30]
l2=[5,8,9,11,18,19,21,32,34]
print(two_sum(l1,l2,26))

Alternatively:

def two_sum(A,B,target):
 L = B = [target-x for x in reversed(B)]
 u = 0 # list pointers
 p = [0,0]
 resA = [] # results
 res = resB = []
 t = A[0]
 while True:
 print (L, u,':', p[u], t, resA, resB)
 p[u] = bsearch(L, t, p[u] ,len(L)-1)
 if L[p[u]] == t :
 res.append(t)
 p[u] += 1
 if p[u] >= len(L): 
 return [(target-x,x) for x in resA] + [(x,target-x) for x in resB]
 t = L[p[u]]
 # swap everything
 res = resA if res==resB else resB
 L = A if L==B else B
 u = 0 if u==1 else 1

However, in lack of real pointers it's a bit messy. I had to use the form p[u] instead of the clean p, because integers seems to be immutable in python. Of course len() should be factored out from the loop, it's just for simplicity.

answered Jan 20, 2017 at 14:59
\$\endgroup\$
5
  • \$\begingroup\$ Thanks goteguru, have a chance to study your code today. Confused by two things, (1) for line b2 = bsearch(l2, t, b2 ,len(l2)-1), you search inside the 2nd list? Why? I think we need one element from the first list and the other element from the 2nd list, whose sum equal to a given numbers, (2) for this line l2 = [target-x for x in reversed(l2)], I do not see a strong benefit of reverse. If I read your code wrong, please feel free to correct me. \$\endgroup\$ Commented Jan 22, 2017 at 23:34
  • \$\begingroup\$ BTW, vote up your posts, which gives me further thinking. \$\endgroup\$ Commented Jan 22, 2017 at 23:34
  • \$\begingroup\$ Transformation (2) is needed for transforming the problem space from "finding fixed sum" to "finding equal numbers". Reversion is needed to get similar lists (eg. both increasing). We could have used two different bsearch functions, but I think this way the algo is cleaner. \$\endgroup\$ Commented Jan 25, 2017 at 10:49
  • \$\begingroup\$ For (1) yes, the second list is searched. The algorithm is the following: take first element i from list A, bsearch for k | k∈B, k<=i in list B. If k=i report it. Drop starting elements from list B to k (you can be sure, there are no more solutions here, because of the ordering). Now let i =k, exchange the lists and repeat the bsearching/exchanging steps until there are any elements left. Here I didn't really exchange the lists but did the steps sequentially because of the simplification of result appending (I did a trick here :). \$\endgroup\$ Commented Jan 25, 2017 at 11:02
  • \$\begingroup\$ I've added an another implementation with the above mentioned list swap, but I had to tinker with those immutable integers to make them mutable :-) \$\endgroup\$ Commented Jan 25, 2017 at 12:15
2
\$\begingroup\$

Dictionary lookups in Python are usually \$O(1)\$ and going through a list once is \$O(n)\,ドル so I would do something like this:

def two_sum(list1, list2, target, result):
 results = []
 list2_dict = {n:1 for n in list2}
 for num in list1:
 if (list2_dict.get(target-num, None) != None):
 results.append([num, target-num])
 return results
answered Jan 25, 2017 at 13:42
\$\endgroup\$
2
  • \$\begingroup\$ Thanks ChatterOne for the response and vote up, but I think your implementation space complexity is O(n), which seems larger than the other solutions post here, including mine, correct. :) \$\endgroup\$ Commented Feb 6, 2017 at 3:34
  • \$\begingroup\$ Yes, its space complexity is \$O(n)\,ドル but as I said in the other comment about which to choose, "it depends" :-) There were no space constraints in the specifications and if you want, once you build the dictionary you can re-use it for the same values. \$\endgroup\$ Commented Feb 6, 2017 at 8:00

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.