I was working on an assignment in my Python class to sort 10 numbers but I wanted to be able to sort any set of numbers be it large or small. The assignment didn't really cover any sorting algorithms other than bubble sort so I started working and decided to just do something simple find the smallest number and remove it then put it in a temp list and then assign that list to the original to make it sorted. I was able to compress this down into the first sorting.
But I wanted to see if I could make it faster by also looking for the max and popping them both out in separate lists. But I was told python runs on a single thread and there were far more things to run than the first.
So why is the second sort slightly faster despite having more instructions than the first?
I ran this on python 3.8
import time
import random
rand_list = []
for i in range(0,10000):
n = random.randint(1,100000)
rand_list.append(n)
print('Random List Creation Complete')
list_to_sort1 = rand_list[:]
list_to_sort2 = rand_list[:]
''' Sort Smallest to Largest'''
start_time = time.time()
templist=[]
while 0 < len(list_to_sort1):
templist.append(list_to_sort1.pop(list_to_sort1.index(min(list_to_sort1))))
list_to_sort1 = templist
print('\nSorting Complete: My sort took', time.time() - start_time, "to run")
''' Sort Smallest to Largest'''
start_time = time.time()
min_half = []
max_half = []
max_pos = 0
min_pos = 0
while 0 < len(list_to_sort2):
max_pos = list_to_sort2.index(max(list_to_sort2))
min_pos = list_to_sort2.index(min(list_to_sort2))
if len(list_to_sort2) != 1:
if max_pos > min_pos:
max_half.append(list_to_sort2.pop(max_pos))
min_half.append(list_to_sort2.pop(min_pos))
else:
min_half.append(list_to_sort2.pop(min_pos))
max_half.append(list_to_sort2.pop(max_pos))
else:
min_half.append(list_to_sort2.pop(0))
max_half.reverse()
list_to_sort2 = min_half + max_half
print('\nSorting Complete: My sort 2 took', time.time() - start_time, "to run")
time.sleep(20)
1 Answer 1
So why is the second sort slightly faster [...]
On my machine, the first algorithm is "faster" (1.588s vs. 1.62s). The number of samples you have taken (1) is too low to draw conclusions. For significant statistics, you should run at least 20 times and then compare the average execution time.
[...] slightly faster [...]
The difference between your two runs is 4 ms. The precision of the default clock on Windows is ~16 ms., so 4 ms is even within measuring tolerance.
[...] having more processes
You still only have 1 process. It's not that you find the minimum and maximum in parallel. It's done one after the another.
other than bubble sort
Bubble sort is typically an in-place algorithm, which means that it operates on one list only. Your implementation works on two lists, templist
and list_to_sort1
. I would not call this a bubble sort implementation.
BTW:
Python's min()
and index()
functions seem to be pretty optimized.
A real bubble sort is slower by a factor of ~10 on my machine:
list_to_sort3 = rand_list[:]
start_time = time.time()
for i in range(len(list_to_sort3)):
for j in range(i):
if list_to_sort3[j] > list_to_sort3[i]:
list_to_sort3[j], list_to_sort3[i] = list_to_sort3[i], list_to_sort3[j]
print('\nSorting Complete: Bubble sort took', time.time() - start_time, "to run")
(comments) I actually meant instructions not processes in reference to the second sort there is a lot more going on yet it takes about the same time.
In version 1 you have:
while 0 < len(list_to_sort1):
templist.append(list_to_sort1.pop(list_to_sort1.index(min(list_to_sort1))))
This loop runs 10000 times, because it pop()
s one item per loop. Thus it also calls min()
, index()
and append()
10000 times.
In version 2 you have:
while 0 < len(list_to_sort2):
max_pos = list_to_sort2.index(max(list_to_sort2))
min_pos = list_to_sort2.index(min(list_to_sort2))
if len(list_to_sort2) != 1:
if max_pos > min_pos:
max_half.append(list_to_sort2.pop(max_pos))
min_half.append(list_to_sort2.pop(min_pos))
else:
min_half.append(list_to_sort2.pop(min_pos))
max_half.append(list_to_sort2.pop(max_pos))
else:
min_half.append(list_to_sort2.pop(0))
This loop only runs 5000 times, because you pop()
twice per loop (either in if
or in else
). Thus it calls min()
, max()
, index()
(twice), append()
(twice) also 5000 times.
As you can see, the amount of method calls is the same, although it's more lines of written code.
-
1\$\begingroup\$
min()
andindex()
are likely "optimized" by executing those loops in the interpreter implementation (written in C for the reference implementation), instead of running them as Python code. Python is known for its notoriously slow loops (I'm speaking in the context of numerical computations here). \$\endgroup\$AlexV– AlexV2020年02月04日 07:58:39 +00:00Commented Feb 4, 2020 at 7:58 -
\$\begingroup\$ @AlexV Yes I noticed that they were much faster than writing a simple min function myself that is why I used them instead thanks for the clarification as to why they are faster \$\endgroup\$Matt– Matt2020年02月04日 08:21:31 +00:00Commented Feb 4, 2020 at 8:21
-
\$\begingroup\$ @Thomas-Weller I actually meant instructions not processes in reference to the second sort there is a lot more going on yet it takes about the same time. Sorry for the confusion on that I corrected my question. \$\endgroup\$Matt– Matt2020年02月04日 08:31:46 +00:00Commented Feb 4, 2020 at 8:31
-
\$\begingroup\$ @MatthewLozoya: I've added an explanation for the amount of instructions. \$\endgroup\$Thomas Weller– Thomas Weller2020年02月04日 10:34:46 +00:00Commented Feb 4, 2020 at 10:34
Explore related questions
See similar questions with these tags.
why is the second sort slightly faster despite having more processes?
) that does not (quite) fit in here, as well as begging clarification: whatprocesses
are you referring to? \$\endgroup\$