So I get a user input of a lot of numbers in a single line like this
100 100 200 300 300 300 400 500 500 700
.
I want to store the distinct ones in a list. A simple approach is to do this:
nums = input().split(' ')
myList = list(set([(int(i) for i in nums]))
My question is, is this a faster/better way:
nums = input().split(' ')
setList = set()
myList = []
for i in nums:
if int(i) not in setList:
myList.append(int(i))
setList.add(int(i))
1 Answer 1
No, it is not. The reason is that list comprehensions are in general as fast as or faster than the explicit loop, because they might be performed in C, while the explicit loop will always execute in Python. Also, the set
constructor already takes care of the i not in setList
part and you need to traverse the whole list anyways.
Here is a shorter one-liner:
numbers = list(set(map(int, input().split())))
I used map
instead of a list comprehension, but you could also use a generator expression:
numbers = list(set(int(i) for i in input().split()))
Note that str.split
by default splits on whitespace, so there is no need to specify split(' ')
, unless you want a different result for strings like 100\n200
(newline) or 100 200
(double space).
As a timing comparisons, here are the four proposals as functions taking an input string and returning the desired list:
def op1(s):
nums = s.split(' ')
return list(set([int(i) for i in nums]))
def op2(s):
nums = s.split(' ')
setList = set()
myList = []
for i in nums:
if int(i) not in setList:
myList.append(int(i))
setList.add(int(i))
return myList
def graipher1(s):
return list(set(map(int, s.split())))
def graipher2(s):
return list(set(int(i) for i in s.split()))
And this is how I generated a bunch of random inputs of increasing size:
import random
inputs = [" ".join(map(str, [random.randrange(1000) for _ in range(i)])) for i in range(1, 2000, 50)]
With this, I ran the four functions to get this timing graph as a function of the number of numbers in the input:
Note that your first function performs almost as good as a straight set comprehension. Your alternative proposal is always the worst of all possibilities. The version with map
is the fastest of the approaches I tried.
(Note that my timings were done using Python 2.7, so your mileage may vary, but I am confident that your second proposal will still loose.)
-
\$\begingroup\$ Hi, you are correct. The approach 1 was significantly faster and I even added a
sort()
method becauseset()
distorts the order and even after that, approach 1 (which is graphier1 in your answer) was the fastest. Thank you very much for the explanation. \$\endgroup\$Parth Bhoiwala– Parth Bhoiwala2017年07月06日 15:49:06 +00:00Commented Jul 6, 2017 at 15:49 -
\$\begingroup\$ @ParthBhoiwala Updated the graph for higher number of numbers \$\endgroup\$Graipher– Graipher2017年07月06日 15:52:38 +00:00Commented Jul 6, 2017 at 15:52
-
1\$\begingroup\$ After researching online, you were correct, list comprehensions are performed in C which makes it significantly faster than for-loops. This is very interesting! \$\endgroup\$Parth Bhoiwala– Parth Bhoiwala2017年07月06日 15:59:02 +00:00Commented Jul 6, 2017 at 15:59