Here's my solution for the LeetCode's Two Sum problem.
Problem:
Given an array of integers, return indices of the two numbers such that they add up to a specific target.
You may assume that each input would have exactly one solution, and you may not use the same element twice.
Example:
Given
nums = [2, 7, 11, 15], target = 9
Because
nums[0] + nums[1] = 2 + 7 = 9
,return [0, 1]
My solution:
def twoSum(nums, target):
"""
:type nums: List[int]
:type target: int
:rtype: List[int]
"""
num_lst = list(range(len(nums)))
for indx, num in enumerate(num_lst):
for num_other in num_lst[indx+1:]:
if nums[num] + nums[num_other] == target:
return [num, num_other]
else:
continue
return None
I would love feedback on code efficiency and style/formatting.
-
2\$\begingroup\$ Would any of the lists contain negative numbers? \$\endgroup\$MSeifert– MSeifert2019年01月26日 14:00:50 +00:00Commented Jan 26, 2019 at 14:00
-
\$\begingroup\$ @MSeifert I don't know, but feel free to assume yes \$\endgroup\$Mysterious Otter– Mysterious Otter2019年01月27日 18:06:39 +00:00Commented Jan 27, 2019 at 18:06
3 Answers 3
Code Style
Your code contains a few lines that accomplish nothing and obfuscate your intent:
else: continue
If the conditional is false, you'll automatically
continue
on to the next iteration without having to tell the program to do that.return None
All Python functions implicitly
return None
. While PEP 8 appears to endorse this practice ("explicit is better than implicit"), it seems noisy to me.num_lst = list(range(len(nums)))
effectively generates a list of all the indices in thenums
input list. Then, you immediatelyenumerate
this list, which produces pairs of identical indicesindx, num
. If all you're attempting to do is iterate, this is significant obfuscation; simply callenumerate
directly onnums
to produce index-element tuples:def twoSum(self, nums, target): for i, num in enumerate(nums): for j in range(i + 1, len(nums)): if num + nums[j] == target: return [i, j]
This makes the intent much clearer: there are no duplicate variables with different names representing the same thing. It also saves unnecessary space and overhead associated with creating a list from a range.
- Following on the previous item,
indx, num
andnum_lst
are confusing variable names, especially when they're all actually indices (which are technically numbers).
Efficiency
This code is inefficient, running in quadratic time, or \$\mathcal{O}(n^2)\$. Leetcode is generous to let this pass (but won't be so forgiving in the future!). The reason for this is the nested loop; for every element in your list, you iterate over every other element to draw comparisons. A linear solution should finish in ~65 ms, while this takes ~4400 ms.
Here is an efficient solution that runs in \$\mathcal{O}(n)\$ time:
hist = {} for i, n in enumerate(nums): if target - n in hist: return [hist[target-n], i] hist[n] = i
How does this work? The magic of hashing. The dictionary
hist
offers constant \$\mathcal{O}(1)\$ lookup time. Whenever we visit a new element innums
, we check to see if its sum complement is in the dictionary; else, we store it in the dictionary as anum => index
pair.This is the classic time-space tradeoff: the quadratic solution is slow but space efficient, while this solution takes more space but gains a huge boost in speed. In almost every case, choose speed over space.
For completeness, even if you were in a space-constrained environment, there is a fast solution that uses \$\mathcal{O}(1)\$ space and \$\mathcal{O}(n\log{}n)\$ time. This solution is worth knowing about for the practicality of the technique and the fact that it's a common interview follow-up. The way it works is:
- Sort
nums
. - Create two pointers representing an index at 0 and an index at
len(nums) - 1
. - Sum the elements at the pointers.
- If they produce the desired sum, return the pointer indices.
- Otherwise, if the sum is less than the target, increment the left pointer
- Otherwise, decrement the right pointer.
- Go back to step 3 unless the pointers are pointing to the same element, in which case return failure.
- Sort
Be wary of list slicing; it's often a hidden linear performance hit. Removing this slice as the nested loop code above illustrates doesn't improve the quadratic time complexity, but it does reduce overhead.
Now you're ready to try 3 Sum!
-
2\$\begingroup\$ As for returning
None
, see the relevant section of PEP 8. \$\endgroup\$Solomon Ucko– Solomon Ucko2019年01月26日 01:48:42 +00:00Commented Jan 26, 2019 at 1:48 -
\$\begingroup\$ That's true, Python does say "explicit is better than implicit". I can amend my recommendation to be "at least be aware that Python statements implicitly
return None
". Maybe Python should also putelse: continue
at the end of every loop, just to be explicit :-) \$\endgroup\$ggorlen– ggorlen2019年01月26日 01:58:35 +00:00Commented Jan 26, 2019 at 1:58 -
\$\begingroup\$ Python should, but doesn't know not to copy the entire thing. It has no such optimisation. \$\endgroup\$wizzwizz4– wizzwizz42019年01月26日 16:44:13 +00:00Commented Jan 26, 2019 at 16:44
-
\$\begingroup\$ @wizzwizz4 No lazy copying? E.g. return a pointer in O(1) to the slice element and then wait for mutation to perform a copy? I'd like to update if incorrect here. \$\endgroup\$ggorlen– ggorlen2019年01月26日 17:20:29 +00:00Commented Jan 26, 2019 at 17:20
-
\$\begingroup\$ @ggorlen Apparently not. Try it online! Rule of thumb: Python performs no optimisations at all. \$\endgroup\$wizzwizz4– wizzwizz42019年01月26日 17:54:08 +00:00Commented Jan 26, 2019 at 17:54
num_lst = list(range(len(nums)))
for indx, num in enumerate(num_lst):
I'm not sure if I'm missing something, but I think not. I ran this code
nums = [2,5,7,9]
num_lst = list(range(len(nums)))
list(enumerate(num_lst))
output : [(0, 0), (1, 1), (2, 2), (3, 3)]
So why do you create the list and then enumerate it? Maybe what you want to do is simply : enumerate(nums)
then enumerate(nums[index+1:])
on your other loop? A simpler way would be to only use the ranges, as I'll show below.
Also, given your input, there's a possibility that a single number would be higher than the target, in this case you shouldn't make the second iteration.
You also don't need the else: continue
, as it's going to continue
either way.
I'd end up with :
def twoSum(nums, target):
"""
:type nums: List[int]
:type target: int
:rtype: List[int]
"""
for i1 in range(len(nums)):
if nums[i1] > target:
continue
for i2 in range(i1 + 1, len(nums)):
if nums[i1] + nums[i2] == target:
return [i1, i2]
return None
Without knowing the expected input size, it's hard to focus on performance improvements. The main goal of my review was to remove what seemed like a misunderstanding in your code and in my opinion the code is clearer now.
-
1\$\begingroup\$ Your code is still
O(n**2)
, so I wouldn't say it offers any significant performance boost. \$\endgroup\$Eric Duminil– Eric Duminil2019年01月26日 15:12:51 +00:00Commented Jan 26, 2019 at 15:12 -
\$\begingroup\$ Also, your code has a few bugs. It doesn't work with negative numbers, it doesn't even work with
0
reliably (twoSum([2,0], 2)
) and it uses the same number twice (twoSum([1, 1], 2)
). :-/ \$\endgroup\$Eric Duminil– Eric Duminil2019年01月26日 15:16:52 +00:00Commented Jan 26, 2019 at 15:16 -
\$\begingroup\$ @EricDuminil The latter is fine; number != element. \$\endgroup\$wizzwizz4– wizzwizz42019年01月26日 16:46:20 +00:00Commented Jan 26, 2019 at 16:46
-
1\$\begingroup\$ @wizzwizz4: Thanks for the comment. You're right. I meant to write
twoSum([1], 2)
, which should returnNone
, not[0, 0]
. The bug is here, my description was incorrect. \$\endgroup\$Eric Duminil– Eric Duminil2019年01月26日 16:53:52 +00:00Commented Jan 26, 2019 at 16:53 -
\$\begingroup\$ @EricDuminil I mainly wanted to focus on some of the bloat to simplify it, went fast and introduced bugs, lol. And without the expected size of input it's hard to tell if there's a real performance value to my answer (and to any other one at that, if we always expect 4 numbers, performance isn't really an issue). I also wrongfully assumed that we dealt with positive non-zero integers. \$\endgroup\$IEatBagels– IEatBagels2019年01月29日 19:23:12 +00:00Commented Jan 29, 2019 at 19:23
You can use itertools.combinations for a more readable (and likely faster) for
loop. As long as returning a list
isn't a requirement, I would consider it better style to return a tuple
instead. (Especially since it allows you to convey the list length.) Also, as long as the current name isn't a requirement, it is preferable to use snake_case
for function and variable names.
from itertools import combinations
def twoSum(nums, target):
"""
:type nums: List[int]
:type target: int
:rtype: List[int]
"""
for (i, first), (j, second) in combinations(enumerate(nums), 2):
if first + second == target:
return [i, j]
return None
-
1\$\begingroup\$ You don't need to create
num_list
anymore. Also,combinations
requires (at least in Python 3.6) a second argumentr
which specifies the length of the combinations. Here,r
should be 2. \$\endgroup\$Schmuddi– Schmuddi2019年01月26日 09:31:00 +00:00Commented Jan 26, 2019 at 9:31