The following function sum_pairs(ints, s)
takes two inputs: ints
is a list with integer values and s
is a integer number. The function sum_pairs
returns the values (x, y)
in ints
such that \$s = x + y\$ and the index of y
is the lowest for the (x, y)
in ints
such that \$s = x + y\$. This is a problem taken from codewars. The function passes all the test cases. However, when I attempt to run the code for a big list, the server codewars gives me the following answer: "Process was terminated. It took longer than 12000ms to complete".
I am wondering whether the code I have written to solve the problem can be optimised or should I use a different approach to give a solution.
def sum_pairs(ints, s):
begin = 1
end = len(ints)
memo = []
while begin < end:
try:
memo.append([begin-1, ints.index(s - ints[begin-1], begin, end)])
begin += 1
end = memo[-1][1]
except:
begin +=1
if memo == []:
return None
else:
return [ints[memo[-1][0]], ints[memo[-1][1]]]
The function sum_pairs(ints, s)
searches for \$y = s - x\$ in the list ints
using the function index()
. If a value of y
is found the index for the pair (x,y)
is added to memo
, then the range for the search (begin, end)
are updated. Else only the lower limit begin
for the search is updated.
-
\$\begingroup\$ sort and you should be able to do in in O(n logn) \$\endgroup\$paparazzo– paparazzo2017年08月08日 14:58:17 +00:00Commented Aug 8, 2017 at 14:58
3 Answers 3
Your code takes too long, because to check if a number is in array you use .index
, effectively iterating over the whole array. However, since you need lots of these lookups, you can speed them up with more suitable data structure - set
will do:
def sum_pairs_lookup(ints, s):
waiting_for_pair = set()
for y in ints:
if (s - y) in waiting_for_pair:
return [s - y, y]
waiting_for_pair.add(y)
Also instead of looking for corresponding y
, you can look for corresponding x
, starting from the array's beginning. In this case the first match would give you the result.
Edit: changed to accomodate points raised in comments, current version passes tests
-
\$\begingroup\$ @ Daerdemandt Beautiful ! it works like a charm, as you have said. My first approach didn't perform very well not only because the data structure I used. I think the main problem was the O(^2) algorithm involved. If you need to look for a particular element on a data structure with a big number of elements do you think always it's a good idea to transform the problem in a
set
problem ? \$\endgroup\$Masaguaro– Masaguaro2017年08月10日 14:28:00 +00:00Commented Aug 10, 2017 at 14:28 -
\$\begingroup\$ @ Daerdemandt is there any faster that
a in set()
\$\endgroup\$Masaguaro– Masaguaro2017年08月10日 14:41:49 +00:00Commented Aug 10, 2017 at 14:41 -
\$\begingroup\$ @Masaguaro I wouldn't say
set
ordict
are always the best solution, but quite often they are. Under the hood they are hash-based, which gives us sweet O(1) lookups most of the time. More than that, you should expect any collaborator to be familiar with them, so I think they (and their common derivatives) should be the go-to solution. However, be aware of tradeoffs, sometimes, say, Bloom filter is what you need. \$\endgroup\$Daerdemandt– Daerdemandt2017年08月10日 17:56:49 +00:00Commented Aug 10, 2017 at 17:56
Your code's complexity is \$O(n^2)\,ドル this means that worst case it's on-par with looping through ints
in a nested loop, and returning the maximum for x
, or the minimum for y
. And so it's on-par with:
def sum_pairs(ints, s):
try:
return max(
(x, y)
for x in ints
for y in ints
if x + y = s
)
except ValueError:
return None
You could speed this up, if ints
is sorted, which your code implies, then you could change max
to next
, and flip the x and y. However, this still has \$O(n^2)\$ time.
The simplest way to force \$O(n)\$ time is to use a deque, and consume both ends of the sorted deque. This works by removing the tail of the deque to the point that \$x + y \le s\$. After this if \$x + y = s\,ドル then we return that. If not we take the next item at the head of the deque. Something like:
from collections import deque
def sum_pairs(ints, s):
q = deque(ints)
try:
while True:
y = q.popleft()
n = s - y
while q[-1] > n:
q.pop()
if q[-1] == n:
return q[-1], y
except IndexError:
return None
-
\$\begingroup\$ Is normal control flow with exceptions considered normal / good style in Python? \$\endgroup\$Daniel Jour– Daniel Jour2017年08月08日 14:40:53 +00:00Commented Aug 8, 2017 at 14:40
-
\$\begingroup\$ @DanielJour It's not bad style - "What is the EAFP principle in Python?". But they can be slow, however since we go to the except one, if there are no matches, I don't think the clean-up speed matters too much here. In the code above not using exceptions would be harder to read IMO. I think using LBYL would also be slower here. \$\endgroup\$2017年08月08日 14:47:18 +00:00Commented Aug 8, 2017 at 14:47
-
\$\begingroup\$ @peilonrayz is good idea to use more efficient data structures. However, If your run your script with
ints = [5, 3, 7, 7]
ands = 10
the result will be None, but the right result for this case is[3,7]
\$\endgroup\$Masaguaro– Masaguaro2017年08月09日 07:46:37 +00:00Commented Aug 9, 2017 at 7:46 -
\$\begingroup\$ @Masaguaro That as it assumes sorted input - "consume both ends of the sorted deque." \$\endgroup\$2017年08月09日 08:04:25 +00:00Commented Aug 9, 2017 at 8:04
-
\$\begingroup\$ @Peilonrayz it needs always to take (consume) the left hand side element
x
of the listints
to combine with the right hand side elementy
of the list such asy = s - x
. As you can see in the example that I gave you, for the first element5
there is not an elementy
on the right hand side of the list such asy = s -x
. So we need to keep those elements of the list to combine with the second element3
, and so on. \$\endgroup\$Masaguaro– Masaguaro2017年08月09日 08:19:14 +00:00Commented Aug 9, 2017 at 8:19
To expose more the logic, this is a list version code of the @Daerdemandt idea. Two lines shorter but not as fast as as it was required:
def sum_pairs(ints, s):
for i in range(1,len(ints)):
if (s - ints[i]) in ints[0:i]:
return [s - ints[i], ints[i]]
-
\$\begingroup\$ Sorry @TobySpeight, perhaps i didn't do it right. It doesn't improve the solution. It's just another version using another data structure.... I can delete if it's necessary .. \$\endgroup\$Masaguaro– Masaguaro2017年08月10日 15:09:05 +00:00Commented Aug 10, 2017 at 15:09
-
\$\begingroup\$ The reasoning seems clear to me. \$\endgroup\$janos– janos2017年08月10日 15:17:53 +00:00Commented Aug 10, 2017 at 15:17
Explore related questions
See similar questions with these tags.