Converting string to a set of integers

Question 1

So I get a user input of a lot of numbers in a single line like this

100 100 200 300 300 300 400 500 500 700.

I want to store the distinct ones in a list. A simple approach is to do this:

nums = input().split(' ')
myList = list(set([(int(i) for i in nums]))

My question is, is this a faster/better way:

nums = input().split(' ')
setList = set()
myList = []
for i in nums:
 if int(i) not in setList:
 myList.append(int(i))
 setList.add(int(i))

Question 2

No, it is not. The reason is that list comprehensions are in general as fast as or faster than the explicit loop, because they might be performed in C, while the explicit loop will always execute in Python. Also, the set constructor already takes care of the i not in setList part and you need to traverse the whole list anyways.

Here is a shorter one-liner:

numbers = list(set(map(int, input().split())))

I used map instead of a list comprehension, but you could also use a generator expression:

numbers = list(set(int(i) for i in input().split()))

Note that str.split by default splits on whitespace, so there is no need to specify split(' '), unless you want a different result for strings like 100\n200(newline) or 100 200 (double space).

As a timing comparisons, here are the four proposals as functions taking an input string and returning the desired list:

def op1(s):
 nums = s.split(' ')
 return list(set([int(i) for i in nums]))
def op2(s):
 nums = s.split(' ')
 setList = set()
 myList = []
 for i in nums:
 if int(i) not in setList:
 myList.append(int(i))
 setList.add(int(i))
 return myList
def graipher1(s):
 return list(set(map(int, s.split())))
def graipher2(s):
 return list(set(int(i) for i in s.split()))

And this is how I generated a bunch of random inputs of increasing size:

import random
inputs = [" ".join(map(str, [random.randrange(1000) for _ in range(i)])) for i in range(1, 2000, 50)]

With this, I ran the four functions to get this timing graph as a function of the number of numbers in the input:

enter image description here

Note that your first function performs almost as good as a straight set comprehension. Your alternative proposal is always the worst of all possibilities. The version with map is the fastest of the approaches I tried.

(Note that my timings were done using Python 2.7, so your mileage may vary, but I am confident that your second proposal will still loose.)

Question 3

Hi, you are correct. The approach 1 was significantly faster and I even added a sort() method because set() distorts the order and even after that, approach 1 (which is graphier1 in your answer) was the fastest. Thank you very much for the explanation.

Question 4

@ParthBhoiwala Updated the graph for higher number of numbers

Question 5

After researching online, you were correct, list comprehensions are performed in C which makes it significantly faster than for-loops. This is very interesting!

Graipher Graipher 41.6k7 gold badges70 silver badges134 bronze badges · Accepted Answer · 2017-07-06 15:44:25Z

No, it is not. The reason is that list comprehensions are in general as fast as or faster than the explicit loop, because they might be performed in C, while the explicit loop will always execute in Python. Also, the set constructor already takes care of the i not in setList part and you need to traverse the whole list anyways.

Here is a shorter one-liner:

numbers = list(set(map(int, input().split())))

I used map instead of a list comprehension, but you could also use a generator expression:

numbers = list(set(int(i) for i in input().split()))

Note that str.split by default splits on whitespace, so there is no need to specify split(' '), unless you want a different result for strings like 100\n200(newline) or 100 200 (double space).

As a timing comparisons, here are the four proposals as functions taking an input string and returning the desired list:

def op1(s):
 nums = s.split(' ')
 return list(set([int(i) for i in nums]))
def op2(s):
 nums = s.split(' ')
 setList = set()
 myList = []
 for i in nums:
 if int(i) not in setList:
 myList.append(int(i))
 setList.add(int(i))
 return myList
def graipher1(s):
 return list(set(map(int, s.split())))
def graipher2(s):
 return list(set(int(i) for i in s.split()))

And this is how I generated a bunch of random inputs of increasing size:

import random
inputs = [" ".join(map(str, [random.randrange(1000) for _ in range(i)])) for i in range(1, 2000, 50)]

With this, I ran the four functions to get this timing graph as a function of the number of numbers in the input:

enter image description here

Note that your first function performs almost as good as a straight set comprehension. Your alternative proposal is always the worst of all possibilities. The version with map is the fastest of the approaches I tried.

(Note that my timings were done using Python 2.7, so your mileage may vary, but I am confident that your second proposal will still loose.)

Hi, you are correct. The approach 1 was significantly faster and I even added a sort() method because set() distorts the order and even after that, approach 1 (which is graphier1 in your answer) was the fastest. Thank you very much for the explanation.
@ParthBhoiwala Updated the graph for higher number of numbers
After researching online, you were correct, list comprehensions are performed in C which makes it significantly faster than for-loops. This is very interesting!

Stack Exchange Network

Converting string to a set of integers

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Converting string to a set of integers

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions