2
\$\begingroup\$

Working on the 2 element subset sum problem (finding two numbers in a list that sum to a given value), I came up with a relatively short solution code-wise.

The code could fit on a single line but I'm not super familiar with python so is this is a naive way to solve/implement it?

How efficient is creating a set and intersecting it with a list vs using a loop?

l = [10, 7, 11, 3, 5, 11] 
k = 22
#create a set of the differences between K and L including duplicate elements that sum to K
diff = {k-x for i, x in enumerate(l) if k != 2*x or x in l[i+1:len(l)]}
#Return true if an element of Diff is in L
print(bool(diff.intersection(l)))
asked May 3, 2019 at 19:14
\$\endgroup\$

2 Answers 2

1
\$\begingroup\$

Usually Python is not terribly fast when it comes to "hand-written" loops. So my prediction (also supported by some preliminary profiling using timeit, see below) would be that a "hand-written" loop is slower than the C loop used in the implementation of set.intersection. The effect should/will become more significant for larger lists.

A quick non-performance related note: you can substitute l[i+1:len(l)] by l[i+1:]. This is because the end of a slice defaults to the end of the sequence.

import timeit
SETUP = """
l = [10, 7, 11, 3, 5, 11]
k = 22
diff = {k-x for i, x in enumerate(l) if k != 2*x or x in l[i+1:]}
"""
print(
 sum(timeit.repeat("bool(diff.intersection(l))", setup=SETUP,
 repeat=10, number=100000)) / 10
)
print(
 sum(timeit.repeat("bool([i for i in l if i in diff])", setup=SETUP,
 repeat=10, number=100000)) / 10
)
answered May 4, 2019 at 13:52
\$\endgroup\$
1
\$\begingroup\$
#create a set of the differences between K and L including duplicate elements that sum to K
diff = {k-x for i, x in enumerate(l) if k != 2*x or x in l[i+1:len(l)]}

The slice can be simplified:

diff = {k-x for i, x in enumerate(l) if k != 2*x or x in l[i+1:]}

In my opinion, the or x in l[i+1:] is a bit too tricky to use without further explanation. My immediate reaction on seeing it was that it undermined the whole point of using a set by creating a quadratic worst case. On further reflection I see that (assuming Python is sensibly implemented, which I think is a safe assumption here) it doesn't, but it's better to preempt the doubt with a comment.

Alternatively, handle the special case separately by checking whether k is even and if so counting instances of k // 2.


As an optimisation, you can keep only the half of each pair in diff:

diff = {k-x for x in l if 2*x < k}

#Return true if an element of Diff is in L
print(bool(diff.intersection(l)))

This is a bit cryptic. I think it might be more pythonic to use any:

print(any(x in l if x in diff))

That's also potentially faster, because it can abort on finding the first solution.

answered May 4, 2019 at 14:21
\$\endgroup\$
1
  • \$\begingroup\$ i is used in the second condition in the original set construction so it cannot be dropped so easily at this stage in the optimization. \$\endgroup\$ Commented May 4, 2019 at 14:29

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.