Only using randint, create a random list of unique numbers

Question 1

There are many sorting algorithms, so how about unsorting algorithms?

But how could one go about creating a random list of unique elements while not using shuffle, NumPy or any other ready made tools to do it?

def get_un_sorted(limit):
 numbers = list(range(limit))
 shuffled = [0] * limit
 for i in range(limit):
 index = randint(0, len(numbers) - 1)
 shuffled[i] = numbers[index]
 numbers.pop(index)
 return shuffled
def main():
 not_sorted = get_un_sorted(2 * 10 ** 4)
if __name__ == '__main__':
 main()

It doesn't seem so nifty to me and it is very slow, but it gets the work done.

Question 2

What does it mean "only using randint"? The code above uses other functions too, like range, list, etc? What are the other allowed and not allowed functions?

Question 3

To only use randint or perhaps lower functionality objects for seeding the list.

Question 4

You aren't using the built-in random.shuffle(), and you're making a copy rather than working in-place, but that's still a shuffle.

Question 5

So what I really should be asking is "review my shuffle" not "using randint.." ?

Question 6

Usually to shuffle you just need to swap elements. For each element in numbers find an index and swap that element with the element with the extracted index.

Question 7

Your question reminds me of a beautiful page that visually explains the efficacy of shuffling.

But here are the two main problems with your algorithm:

Your memory usage is bad, this is as you're not performing the shuffle in-place. As you have to make a new list, to keep the old intact and to make the new you need \$O(n)\$ memory. If you instead performed the shuffle in-place then you'd only need \$O(1)\$ memory.
You're removing from the middle of an array. As Pythons lists are internally arrays, when you remove anything that is not the item at the end, you have to shift all the previous data. This leads to list.pop having a worst case of \$O(n)\$ performance.

If you take this with the fact that you have to perform this action \$n\$ times, then you can quickly see this algorithm is \$O(n^2)\$.

Both the problems above have the solution of performing the shuffle in-place. To do this you want to swap the chosen item with the current index. So if we're on the first item, and we randomly pick the index four, then we'll swap the item at index zero and four. Or in Python:

i, j = 0, 4
numbers[i], numbers[j] = numbers[j], numbers[i]

Now that we can perform the swapping efficiently, you'd have to change what indexes you pick, when using randint. If we're on index 4 of 8, then you'd not want to pick an index below four.

I'd change your program to have a function shuffle that takes an array as input, swaps it in-place and returns None. And for you to use this rather than get_un_sorted.

And so you should be able to get something like, the Fisher–Yates shuffle:

def shuffle(l):
 for i in range(len(l) - 1):
 j = randint(i, len(l) - 1)
 l[i], l[j] = l[j], l[i]

Finally you could use randrange rather than randint as it's a short hand for having the upper bound be non-inclusive.

Question 8

You don't have to construct shuffled up front using [0] * limit. Instead, you can initialize shuffled to the empty list, and then append to it.
You use pop to remove an element from an arbitrary position in the list. I believe this takes O(n) time with respect to the length of the list. Thus, get_un_sorted requires O(n^2) time. That is probably why you are seeing slow performance.

Unfortunately, there isn't any easy way to get around this using Python's builtin types. However, the blist package provides a list-like datastructure with better asymptotic performance (O(log(n))) in this case.

Using a list-comprehension might increase the "niftyness factor".

def get_un_sorted(limit):
 numbers = list(range(limit))
 def pick_one():
 index = randint(0, len(numbers) - 1)
 return numbers.pop(index)
 return [pick_one() for i in range(limit)]

Though I wouldn't particularly recommend it -- performing mutations within a list comprehension is generally frowned upon.

Question 9

You can use randrange(a, b) instead of randint(a, b-1).

score 4 · Answer 1 · 2016-11-30 11:14:39Z

Your question reminds me of a beautiful page that visually explains the efficacy of shuffling.

But here are the two main problems with your algorithm:

Your memory usage is bad, this is as you're not performing the shuffle in-place. As you have to make a new list, to keep the old intact and to make the new you need \$O(n)\$ memory. If you instead performed the shuffle in-place then you'd only need \$O(1)\$ memory.
You're removing from the middle of an array. As Pythons lists are internally arrays, when you remove anything that is not the item at the end, you have to shift all the previous data. This leads to list.pop having a worst case of \$O(n)\$ performance.

If you take this with the fact that you have to perform this action \$n\$ times, then you can quickly see this algorithm is \$O(n^2)\$.

Both the problems above have the solution of performing the shuffle in-place. To do this you want to swap the chosen item with the current index. So if we're on the first item, and we randomly pick the index four, then we'll swap the item at index zero and four. Or in Python:

i, j = 0, 4
numbers[i], numbers[j] = numbers[j], numbers[i]

Now that we can perform the swapping efficiently, you'd have to change what indexes you pick, when using randint. If we're on index 4 of 8, then you'd not want to pick an index below four.

I'd change your program to have a function shuffle that takes an array as input, swaps it in-place and returns None. And for you to use this rather than get_un_sorted.

And so you should be able to get something like, the Fisher–Yates shuffle:

def shuffle(l):
 for i in range(len(l) - 1):
 j = randint(i, len(l) - 1)
 l[i], l[j] = l[j], l[i]

Finally you could use randrange rather than randint as it's a short hand for having the upper bound be non-inclusive.

Nathan Davis Nathan DavisNathan Davis 9054 silver badges14 bronze badges · Answer 2 · 2016-11-30 06:52:33Z

You don't have to construct shuffled up front using [0] * limit. Instead, you can initialize shuffled to the empty list, and then append to it.
You use pop to remove an element from an arbitrary position in the list. I believe this takes O(n) time with respect to the length of the list. Thus, get_un_sorted requires O(n^2) time. That is probably why you are seeing slow performance.

Unfortunately, there isn't any easy way to get around this using Python's builtin types. However, the blist package provides a list-like datastructure with better asymptotic performance (O(log(n))) in this case.

Using a list-comprehension might increase the "niftyness factor".

def get_un_sorted(limit):
 numbers = list(range(limit))
 def pick_one():
 index = randint(0, len(numbers) - 1)
 return numbers.pop(index)
 return [pick_one() for i in range(limit)]

Though I wouldn't particularly recommend it -- performing mutations within a list comprehension is generally frowned upon.

\$\begingroup\$ You can use randrange(a, b) instead of randint(a, b-1). \$\endgroup\$

301_Moved_Permanently
– 301_Moved_Permanently

2016年11月30日 08:15:52 +00:00
Commented Nov 30, 2016 at 8:15

Stack Exchange Network

Only using randint, create a random list of unique numbers

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Only using randint, create a random list of unique numbers

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions