Simple sorting algorithm speed

Question 1

I was working on an assignment in my Python class to sort 10 numbers but I wanted to be able to sort any set of numbers be it large or small. The assignment didn't really cover any sorting algorithms other than bubble sort so I started working and decided to just do something simple find the smallest number and remove it then put it in a temp list and then assign that list to the original to make it sorted. I was able to compress this down into the first sorting.

But I wanted to see if I could make it faster by also looking for the max and popping them both out in separate lists. But I was told python runs on a single thread and there were far more things to run than the first.

So why is the second sort slightly faster despite having more instructions than the first?

I ran this on python 3.8

enter image description here

import time
import random
rand_list = []
for i in range(0,10000):
 n = random.randint(1,100000)
 rand_list.append(n)
print('Random List Creation Complete')
list_to_sort1 = rand_list[:]
list_to_sort2 = rand_list[:]
''' Sort Smallest to Largest'''
start_time = time.time()
templist=[]
while 0 < len(list_to_sort1):
 templist.append(list_to_sort1.pop(list_to_sort1.index(min(list_to_sort1))))
list_to_sort1 = templist
print('\nSorting Complete: My sort took', time.time() - start_time, "to run")
''' Sort Smallest to Largest'''
start_time = time.time()
min_half = []
max_half = []
max_pos = 0
min_pos = 0
while 0 < len(list_to_sort2):
 max_pos = list_to_sort2.index(max(list_to_sort2))
 min_pos = list_to_sort2.index(min(list_to_sort2))
 if len(list_to_sort2) != 1:
 if max_pos > min_pos:
 max_half.append(list_to_sort2.pop(max_pos))
 min_half.append(list_to_sort2.pop(min_pos))
 else:
 min_half.append(list_to_sort2.pop(min_pos))
 max_half.append(list_to_sort2.pop(max_pos))
 else:
 min_half.append(list_to_sort2.pop(0))
max_half.reverse()
list_to_sort2 = min_half + max_half
print('\nSorting Complete: My sort 2 took', time.time() - start_time, "to run")
time.sleep(20)

Question 2

If list_to_sort2 has an odd length, the code will throw an IndexError, "pop from empty list". This is because elements are popped from the list in pairs (min and max). When there is only one element remaining, the second pop will be from an empty list.

Question 3

@RootTwo there is a if / else statement to account for that already

Question 4

Welcome to CodeReview@SE, where the idea is to improve code and coders by peer review. You re-invented "double-ended selection sort" and present code that leaves room for improvement - fits perfectly. It is your explicit question (why is the second sort slightly faster despite having more processes?) that does not (quite) fit in here, as well as begging clarification: what processes are you referring to?

Question 5

@greybeard yes I just edited the question I meant to say why is it faster despite having more instructions.

Question 6

So why is the second sort slightly faster [...]

On my machine, the first algorithm is "faster" (1.588s vs. 1.62s). The number of samples you have taken (1) is too low to draw conclusions. For significant statistics, you should run at least 20 times and then compare the average execution time.

[...] slightly faster [...]

The difference between your two runs is 4 ms. The precision of the default clock on Windows is ~16 ms., so 4 ms is even within measuring tolerance.

[...] having more processes

You still only have 1 process. It's not that you find the minimum and maximum in parallel. It's done one after the another.

other than bubble sort

Bubble sort is typically an in-place algorithm, which means that it operates on one list only. Your implementation works on two lists, templist and list_to_sort1. I would not call this a bubble sort implementation.

BTW: Python's min() and index() functions seem to be pretty optimized. A real bubble sort is slower by a factor of ~10 on my machine:

list_to_sort3 = rand_list[:]
start_time = time.time()
for i in range(len(list_to_sort3)):
 for j in range(i):
 if list_to_sort3[j] > list_to_sort3[i]:
 list_to_sort3[j], list_to_sort3[i] = list_to_sort3[i], list_to_sort3[j]
print('\nSorting Complete: Bubble sort took', time.time() - start_time, "to run")

(comments) I actually meant instructions not processes in reference to the second sort there is a lot more going on yet it takes about the same time.

In version 1 you have:

while 0 < len(list_to_sort1):
 templist.append(list_to_sort1.pop(list_to_sort1.index(min(list_to_sort1))))

This loop runs 10000 times, because it pop()s one item per loop. Thus it also calls min(), index() and append() 10000 times.

In version 2 you have:

while 0 < len(list_to_sort2):
 max_pos = list_to_sort2.index(max(list_to_sort2))
 min_pos = list_to_sort2.index(min(list_to_sort2))
 if len(list_to_sort2) != 1:
 if max_pos > min_pos:
 max_half.append(list_to_sort2.pop(max_pos))
 min_half.append(list_to_sort2.pop(min_pos))
 else:
 min_half.append(list_to_sort2.pop(min_pos))
 max_half.append(list_to_sort2.pop(max_pos))
 else:
 min_half.append(list_to_sort2.pop(0))

This loop only runs 5000 times, because you pop() twice per loop (either in if or in else). Thus it calls min(), max(), index() (twice), append() (twice) also 5000 times.

As you can see, the amount of method calls is the same, although it's more lines of written code.

Question 7

min() and index() are likely "optimized" by executing those loops in the interpreter implementation (written in C for the reference implementation), instead of running them as Python code. Python is known for its notoriously slow loops (I'm speaking in the context of numerical computations here).

Question 8

@AlexV Yes I noticed that they were much faster than writing a simple min function myself that is why I used them instead thanks for the clarification as to why they are faster

Question 9

@Thomas-Weller I actually meant instructions not processes in reference to the second sort there is a lot more going on yet it takes about the same time. Sorry for the confusion on that I corrected my question.

Question 10

@MatthewLozoya: I've added an explanation for the amount of instructions.

Thomas Weller Thomas Weller 1,64714 silver badges25 bronze badges · Accepted Answer · 2020-02-04 07:19:40Z

So why is the second sort slightly faster [...]

On my machine, the first algorithm is "faster" (1.588s vs. 1.62s). The number of samples you have taken (1) is too low to draw conclusions. For significant statistics, you should run at least 20 times and then compare the average execution time.

[...] slightly faster [...]

The difference between your two runs is 4 ms. The precision of the default clock on Windows is ~16 ms., so 4 ms is even within measuring tolerance.

[...] having more processes

You still only have 1 process. It's not that you find the minimum and maximum in parallel. It's done one after the another.

other than bubble sort

Bubble sort is typically an in-place algorithm, which means that it operates on one list only. Your implementation works on two lists, templist and list_to_sort1. I would not call this a bubble sort implementation.

BTW: Python's min() and index() functions seem to be pretty optimized. A real bubble sort is slower by a factor of ~10 on my machine:

list_to_sort3 = rand_list[:]
start_time = time.time()
for i in range(len(list_to_sort3)):
 for j in range(i):
 if list_to_sort3[j] > list_to_sort3[i]:
 list_to_sort3[j], list_to_sort3[i] = list_to_sort3[i], list_to_sort3[j]
print('\nSorting Complete: Bubble sort took', time.time() - start_time, "to run")

(comments) I actually meant instructions not processes in reference to the second sort there is a lot more going on yet it takes about the same time.

In version 1 you have:

while 0 < len(list_to_sort1):
 templist.append(list_to_sort1.pop(list_to_sort1.index(min(list_to_sort1))))

This loop runs 10000 times, because it pop()s one item per loop. Thus it also calls min(), index() and append() 10000 times.

In version 2 you have:

while 0 < len(list_to_sort2):
 max_pos = list_to_sort2.index(max(list_to_sort2))
 min_pos = list_to_sort2.index(min(list_to_sort2))
 if len(list_to_sort2) != 1:
 if max_pos > min_pos:
 max_half.append(list_to_sort2.pop(max_pos))
 min_half.append(list_to_sort2.pop(min_pos))
 else:
 min_half.append(list_to_sort2.pop(min_pos))
 max_half.append(list_to_sort2.pop(max_pos))
 else:
 min_half.append(list_to_sort2.pop(0))

This loop only runs 5000 times, because you pop() twice per loop (either in if or in else). Thus it calls min(), max(), index() (twice), append() (twice) also 5000 times.

As you can see, the amount of method calls is the same, although it's more lines of written code.

min() and index() are likely "optimized" by executing those loops in the interpreter implementation (written in C for the reference implementation), instead of running them as Python code. Python is known for its notoriously slow loops (I'm speaking in the context of numerical computations here).
@AlexV Yes I noticed that they were much faster than writing a simple min function myself that is why I used them instead thanks for the clarification as to why they are faster
@Thomas-Weller I actually meant instructions not processes in reference to the second sort there is a lot more going on yet it takes about the same time. Sorry for the confusion on that I corrected my question.
@MatthewLozoya: I've added an explanation for the amount of instructions.

Stack Exchange Network

Simple sorting algorithm speed

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Simple sorting algorithm speed

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions