Two sum with two lists

Question 1

Suppose we have two sorted lists, and we want to find one element from the first, and the other element from the 2nd list, where the sum of the two elements equal to a given target. If there are multiple combinations, need to find all the combinations.

Any bugs, performance improvement ideas in terms of algorithm time complexity or code style advice is highly appreciated. My basic idea is to scan the first list from the beginning, and scan the second list from the end, a similar approach to how we resolve the 2-sum classic problem in one list.

def two_sum(list1, list2, target, result):
 i = 0
 j = len(list2) - 1
 while i < len(list1) and j >= 0:
 if list1[i] + list2[j] == target:
 result.append((list1[i], list2[j]))
 i += 1
 j -= 1
 while i < len(list1) and list1[i] == list1[i-1]:
 i += 1
 while j >= 0 and list2[j] == list2[j+1]:
 j -= 1
 elif list1[i] + list2[j] > target:
 j -= 1
 else:
 i += 1
if __name__ == "__main__":
 list1 = [1,3,5,7]
 list2 = [2,3,4,5]
 result = []
 two_sum(list1, list2, 6, result)
 print result

Question 2

Assuming that your lists are sorted and this is something that comes out of problem description and the part where you need to find something in 99% of that kind of questions you want to use binary search. So basically all your code could be written like this:

from bisect import bisect_left
def binary_search(a, x):
 pos = bisect_left(a, x)
 return pos if pos != len(a) and a[pos] == x else -1
def two_sum(a, b, target):
 result = []
 for num in a:
 index = binary_search(b, target-num)
 if index != -1:
 result.append((num, b[index]))
 return result

Now if you want to save some memory, you might want to make two_sum a generator, which will make it look like this:

def two_sum(a, b, target):
 result = []
 for num in a:
 index = binary_search(b, target-num)
 if index != -1:
 yield num, b[index]

I cannot really call my answer a review to your code because I completely overwrote a solution for this. But as I mentioned in the beginning whenever a problem says something about sorted lists and searching on it, most likely you will use bsearch in your solution.

Question 3

Nice idea, Alex, vote up. Your algorithm time complexity is O(n log n)? But I think my algorithm time complexity is only O(n)?

Question 4

yes it is. This one is O(n*log(m)) where m is a length of list on which you will do bsearch.

Question 5

Thanks Alex, but since log(m) > 1, so overall n * log(m) > n, correct? If so, a bit surprised binary search is slower in this case, since in most cases, binary search is the most efficient way. Any thoughts?

Question 6

@LinMa it is slower in my current implementation, but you can actually modify it. so say for a number with index X in list A you've found a pair with index Y on list B. then for number X+1 in list A you do bsearch on slice Y:len(B) and so on. But in the worst case, you will still get your n*log(m). I think if I will care about code complexity I will just use something that you wrote, but if not I will go for bsearch solution because it's easier to read and understand even thou it's slower in this case.

Question 7

Thanks Alex, I think the code I write (original post) time complexity is O(m+n), where m and n are length of two lists?

Question 8

I re-constructed your algorithm using generators, which I find easier on the eyes (instead of indexing) for these sort of list traversals. Other than that, the code is the same.

def two_sum(list1, list2, target):
 # Get a generator for each list
 l1 = iter(list1)
 l2 = reversed(list2)
 # loop and append to results list
 result = []
 try:
 # get the first sample from each list
 x = next(l1)
 y = next(l2)
 while True:
 # If we find a match, record it
 if x + y == target:
 new_pair = x, y
 result.append(new_pair)
 # get next unique elements
 x = next(l1)
 while x == new_pair[0]:
 x = l1.next()
 y = next(l2)
 while y == new_pair[1]:
 y = next(l2)
 # if no match, then get new element from one list
 elif x + y > target:
 y = next(l2)
 else:
 x = next(l1)
 # when one of the generators runs out of elements it will assert
 except StopIteration:
 pass
 return result
print(two_sum([1, 3, 5, 7], [2, 3, 3, 5], 6))

Question 9

Thanks Stephen, nice re-write and vote up. Since we are using same method, do you think any issues in our method -- issues I mean functional bugs, e.g. we may miss some pairs, we may return some wrong pairs, or return some duplicate pairs?

Question 10

I think it is correct. I changed your test data to include a duplicate pair.

Question 11

(xx for xx in list1) ==iter(list1), (yy for yy in reversed(list2))==reversed(list2) since reversed is already an iterator. Now you should use function next instead of method .next, since this is more common way in python. so l1.next becomes next(l1).

Question 12

@StephenRauch, have you read Alex's binary search? Actually I might be wrong, I think our method time complexity is O(n), but using binary search need a loop (O(n)) and binary search inside the loop (O(log n)), which is O(n log n)? Which is even slower? How do you think?

Question 13

Thanks Stephen, but I think binary search based solution time complexity is not O(n) and should be O(n log n), correct?

Question 14

Hah, Alex was faster with the binary search version :) I submit my version because we can exploit the fact both lists are ordered, so you should use a second bsearch moreover you don't have to do binary search over the whole set, we can use the upper subset.

If the lists are large enough, this algorithm supposed to be much faster.

Please forgive me the C-ish code, I don't know python, I've just tried to find out the syntax from your code. :) Maybe you can Pythonize it. (Yes, that's why I implemented my binary search, I don't know the libs )

Especially I couldn't find an effective way for handle sublists. I think list[from:to] would memcpy the data which is a huge burden (maybe I'm not right). This way I had to simulate C pointers in the following code.

# find the greatest index of element 
# list where element is not greater than target
def bsearch(l, target, s, e):
 m = int((s+e)/2)
 if l[m] == target or e == m or s == m :
 return m
 if l[m] > target :
 return bsearch(l,target,s,m)
 return bsearch(l,target,m,e)
def two_sum(l1,l2,target):
 l2 = [target-x for x in reversed(l2)]
 b2 = 0
 b1 = 0
 t = l1[0]
 res = []
 while True:
 b2 = bsearch(l2, t, b2 ,len(l2)-1)
 if l2[b2] == t :
 res.append((t,target-t))
 b2 += 1
 if b2 >= len(l2): return res
 t = l2[b2]
 b1 = bsearch(l1, t, b1 ,len(l1)-1)
 if l1[b1] == t :
 res.append((target-t,t))
 b1 += 1
 if b1 >= len(l1): return res
 t = l1[b1]
l1=[1,3,6,7,10,18,19,22,28,30]
l2=[5,8,9,11,18,19,21,32,34]
print(two_sum(l1,l2,26))

Alternatively:

def two_sum(A,B,target):
 L = B = [target-x for x in reversed(B)]
 u = 0 # list pointers
 p = [0,0]
 resA = [] # results
 res = resB = []
 t = A[0]
 while True:
 print (L, u,':', p[u], t, resA, resB)
 p[u] = bsearch(L, t, p[u] ,len(L)-1)
 if L[p[u]] == t :
 res.append(t)
 p[u] += 1
 if p[u] >= len(L): 
 return [(target-x,x) for x in resA] + [(x,target-x) for x in resB]
 t = L[p[u]]
 # swap everything
 res = resA if res==resB else resB
 L = A if L==B else B
 u = 0 if u==1 else 1

However, in lack of real pointers it's a bit messy. I had to use the form p[u] instead of the clean p, because integers seems to be immutable in python. Of course len() should be factored out from the loop, it's just for simplicity.

Question 15

Thanks goteguru, have a chance to study your code today. Confused by two things, (1) for line b2 = bsearch(l2, t, b2 ,len(l2)-1), you search inside the 2nd list? Why? I think we need one element from the first list and the other element from the 2nd list, whose sum equal to a given numbers, (2) for this line l2 = [target-x for x in reversed(l2)], I do not see a strong benefit of reverse. If I read your code wrong, please feel free to correct me.

Question 16

BTW, vote up your posts, which gives me further thinking.

Question 17

Transformation (2) is needed for transforming the problem space from "finding fixed sum" to "finding equal numbers". Reversion is needed to get similar lists (eg. both increasing). We could have used two different bsearch functions, but I think this way the algo is cleaner.

Question 18

For (1) yes, the second list is searched. The algorithm is the following: take first element i from list A, bsearch for k | k∈B, k<=i in list B. If k=i report it. Drop starting elements from list B to k (you can be sure, there are no more solutions here, because of the ordering). Now let i =k, exchange the lists and repeat the bsearching/exchanging steps until there are any elements left. Here I didn't really exchange the lists but did the steps sequentially because of the simplification of result appending (I did a trick here :).

Question 19

I've added an another implementation with the above mentioned list swap, but I had to tinker with those immutable integers to make them mutable :-)

Question 20

Dictionary lookups in Python are usually \$O(1)\$ and going through a list once is \$O(n)\,ドル so I would do something like this:

def two_sum(list1, list2, target, result):
 results = []
 list2_dict = {n:1 for n in list2}
 for num in list1:
 if (list2_dict.get(target-num, None) != None):
 results.append([num, target-num])
 return results

Question 21

Thanks ChatterOne for the response and vote up, but I think your implementation space complexity is O(n), which seems larger than the other solutions post here, including mine, correct. :)

Question 22

Yes, its space complexity is \$O(n)\,ドル but as I said in the other comment about which to choose, "it depends" :-) There were no space constraints in the specifications and if you want, once you build the dictionary you can re-use it for the same values.

Alex Alex 1,6528 silver badges12 bronze badges · Answer 1 · 2017-01-20 10:08:03Z

Assuming that your lists are sorted and this is something that comes out of problem description and the part where you need to find something in 99% of that kind of questions you want to use binary search. So basically all your code could be written like this:

from bisect import bisect_left
def binary_search(a, x):
 pos = bisect_left(a, x)
 return pos if pos != len(a) and a[pos] == x else -1
def two_sum(a, b, target):
 result = []
 for num in a:
 index = binary_search(b, target-num)
 if index != -1:
 result.append((num, b[index]))
 return result

Now if you want to save some memory, you might want to make two_sum a generator, which will make it look like this:

def two_sum(a, b, target):
 result = []
 for num in a:
 index = binary_search(b, target-num)
 if index != -1:
 yield num, b[index]

I cannot really call my answer a review to your code because I completely overwrote a solution for this. But as I mentioned in the beginning whenever a problem says something about sorted lists and searching on it, most likely you will use bsearch in your solution.

Nice idea, Alex, vote up. Your algorithm time complexity is O(n log n)? But I think my algorithm time complexity is only O(n)?
yes it is. This one is O(n*log(m)) where m is a length of list on which you will do bsearch.
Thanks Alex, but since log(m) > 1, so overall n * log(m) > n, correct? If so, a bit surprised binary search is slower in this case, since in most cases, binary search is the most efficient way. Any thoughts?
@LinMa it is slower in my current implementation, but you can actually modify it. so say for a number with index X in list A you've found a pair with index Y on list B. then for number X+1 in list A you do bsearch on slice Y:len(B) and so on. But in the worst case, you will still get your n*log(m). I think if I will care about code complexity I will just use something that you wrote, but if not I will go for bsearch solution because it's easier to read and understand even thou it's slower in this case.
Thanks Alex, I think the code I write (original post) time complexity is O(m+n), where m and n are length of two lists?

Stephen Rauch Stephen Rauch 4,31412 gold badges24 silver badges36 bronze badges · Answer 2 · 2017-01-20 00:04:31Z

3

\$\begingroup\$

I re-constructed your algorithm using generators, which I find easier on the eyes (instead of indexing) for these sort of list traversals. Other than that, the code is the same.

def two_sum(list1, list2, target):
 # Get a generator for each list
 l1 = iter(list1)
 l2 = reversed(list2)
 # loop and append to results list
 result = []
 try:
 # get the first sample from each list
 x = next(l1)
 y = next(l2)
 while True:
 # If we find a match, record it
 if x + y == target:
 new_pair = x, y
 result.append(new_pair)
 # get next unique elements
 x = next(l1)
 while x == new_pair[0]:
 x = l1.next()
 y = next(l2)
 while y == new_pair[1]:
 y = next(l2)
 # if no match, then get new element from one list
 elif x + y > target:
 y = next(l2)
 else:
 x = next(l1)
 # when one of the generators runs out of elements it will assert
 except StopIteration:
 pass
 return result
print(two_sum([1, 3, 5, 7], [2, 3, 3, 5], 6))

Share

edited Jan 20, 2017 at 15:51

answered Jan 20, 2017 at 0:04

Stephen Rauch's user avatar

Stephen Rauch Stephen Rauch

4,31412 gold badges24 silver badges36 bronze badges

\$\endgroup\$

6

\$\begingroup\$ Thanks Stephen, nice re-write and vote up. Since we are using same method, do you think any issues in our method -- issues I mean functional bugs, e.g. we may miss some pairs, we may return some wrong pairs, or return some duplicate pairs? \$\endgroup\$

Lin Ma
– Lin Ma

2017年01月20日 05:34:05 +00:00
Commented Jan 20, 2017 at 5:34
1

\$\begingroup\$ I think it is correct. I changed your test data to include a duplicate pair. \$\endgroup\$

Stephen Rauch
– Stephen Rauch

2017年01月20日 05:39:16 +00:00
Commented Jan 20, 2017 at 5:39
2

\$\begingroup\$ (xx for xx in list1) ==iter(list1), (yy for yy in reversed(list2))==reversed(list2) since reversed is already an iterator. Now you should use function next instead of method .next, since this is more common way in python. so l1.next becomes next(l1). \$\endgroup\$

Alex
– Alex

2017年01月20日 10:16:56 +00:00
Commented Jan 20, 2017 at 10:16
\$\begingroup\$ @StephenRauch, have you read Alex's binary search? Actually I might be wrong, I think our method time complexity is O(n), but using binary search need a loop (O(n)) and binary search inside the loop (O(log n)), which is O(n log n)? Which is even slower? How do you think? \$\endgroup\$

Lin Ma
– Lin Ma

2017年01月20日 22:06:00 +00:00
Commented Jan 20, 2017 at 22:06
\$\begingroup\$ Thanks Stephen, but I think binary search based solution time complexity is not O(n) and should be O(n log n), correct? \$\endgroup\$

Lin Ma
– Lin Ma

2017年01月22日 23:32:15 +00:00
Commented Jan 22, 2017 at 23:32

| Show 1 more comment

goteguru goteguru 1214 bronze badges · Answer 3 · 2017-01-20 14:59:31Z

Hah, Alex was faster with the binary search version :) I submit my version because we can exploit the fact both lists are ordered, so you should use a second bsearch moreover you don't have to do binary search over the whole set, we can use the upper subset.

If the lists are large enough, this algorithm supposed to be much faster.

Please forgive me the C-ish code, I don't know python, I've just tried to find out the syntax from your code. :) Maybe you can Pythonize it. (Yes, that's why I implemented my binary search, I don't know the libs )

Especially I couldn't find an effective way for handle sublists. I think list[from:to] would memcpy the data which is a huge burden (maybe I'm not right). This way I had to simulate C pointers in the following code.

# find the greatest index of element 
# list where element is not greater than target
def bsearch(l, target, s, e):
 m = int((s+e)/2)
 if l[m] == target or e == m or s == m :
 return m
 if l[m] > target :
 return bsearch(l,target,s,m)
 return bsearch(l,target,m,e)
def two_sum(l1,l2,target):
 l2 = [target-x for x in reversed(l2)]
 b2 = 0
 b1 = 0
 t = l1[0]
 res = []
 while True:
 b2 = bsearch(l2, t, b2 ,len(l2)-1)
 if l2[b2] == t :
 res.append((t,target-t))
 b2 += 1
 if b2 >= len(l2): return res
 t = l2[b2]
 b1 = bsearch(l1, t, b1 ,len(l1)-1)
 if l1[b1] == t :
 res.append((target-t,t))
 b1 += 1
 if b1 >= len(l1): return res
 t = l1[b1]
l1=[1,3,6,7,10,18,19,22,28,30]
l2=[5,8,9,11,18,19,21,32,34]
print(two_sum(l1,l2,26))

Alternatively:

def two_sum(A,B,target):
 L = B = [target-x for x in reversed(B)]
 u = 0 # list pointers
 p = [0,0]
 resA = [] # results
 res = resB = []
 t = A[0]
 while True:
 print (L, u,':', p[u], t, resA, resB)
 p[u] = bsearch(L, t, p[u] ,len(L)-1)
 if L[p[u]] == t :
 res.append(t)
 p[u] += 1
 if p[u] >= len(L): 
 return [(target-x,x) for x in resA] + [(x,target-x) for x in resB]
 t = L[p[u]]
 # swap everything
 res = resA if res==resB else resB
 L = A if L==B else B
 u = 0 if u==1 else 1

However, in lack of real pointers it's a bit messy. I had to use the form p[u] instead of the clean p, because integers seems to be immutable in python. Of course len() should be factored out from the loop, it's just for simplicity.

Thanks goteguru, have a chance to study your code today. Confused by two things, (1) for line b2 = bsearch(l2, t, b2 ,len(l2)-1), you search inside the 2nd list? Why? I think we need one element from the first list and the other element from the 2nd list, whose sum equal to a given numbers, (2) for this line l2 = [target-x for x in reversed(l2)], I do not see a strong benefit of reverse. If I read your code wrong, please feel free to correct me.
Transformation (2) is needed for transforming the problem space from "finding fixed sum" to "finding equal numbers". Reversion is needed to get similar lists (eg. both increasing). We could have used two different bsearch functions, but I think this way the algo is cleaner.
For (1) yes, the second list is searched. The algorithm is the following: take first element i from list A, bsearch for k | k∈B, k<=i in list B. If k=i report it. Drop starting elements from list B to k (you can be sure, there are no more solutions here, because of the ordering). Now let i =k, exchange the lists and repeat the bsearching/exchanging steps until there are any elements left. Here I didn't really exchange the lists but did the steps sequentially because of the simplification of result appending (I did a trick here :).
I've added an another implementation with the above mentioned list swap, but I had to tinker with those immutable integers to make them mutable :-)

ChatterOne ChatterOne 2,85512 silver badges18 bronze badges · Answer 4 · 2017-01-25 13:42:55Z

2

\$\begingroup\$

Dictionary lookups in Python are usually \$O(1)\$ and going through a list once is \$O(n)\,ドル so I would do something like this:

def two_sum(list1, list2, target, result):
 results = []
 list2_dict = {n:1 for n in list2}
 for num in list1:
 if (list2_dict.get(target-num, None) != None):
 results.append([num, target-num])
 return results

Share

answered Jan 25, 2017 at 13:42

ChatterOne's user avatar

ChatterOne ChatterOne

2,85512 silver badges18 bronze badges

\$\endgroup\$

2

\$\begingroup\$ Thanks ChatterOne for the response and vote up, but I think your implementation space complexity is O(n), which seems larger than the other solutions post here, including mine, correct. :) \$\endgroup\$

Lin Ma
– Lin Ma

2017年02月06日 03:34:26 +00:00
Commented Feb 6, 2017 at 3:34
\$\begingroup\$ Yes, its space complexity is \$O(n)\,ドル but as I said in the other comment about which to choose, "it depends" :-) There were no space constraints in the specifications and if you want, once you build the dictionary you can re-use it for the same values. \$\endgroup\$

ChatterOne
– ChatterOne

2017年02月06日 08:00:54 +00:00
Commented Feb 6, 2017 at 8:00

Add a comment |

Stack Exchange Network

Two sum with two lists

4 Answers 4

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Two sum with two lists

4 Answers 4

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions