Merge sort in Python

Question 1

I've written a implementation of a merge sort in Python. It seems correct and I have tested it with this:

l = list(range(1000))
random.shuffle(l) # random.shuffle is in-place
mergesort(l) == list(range(1000)) # returns True
# Stable sort
cards = list(itertools.product(range(4), range(13)))
random.shuffle(cards)
cards = mergesort(cards, key=lambda x: x[1]) # Sort by rank
cards = mergesort(cards, key=lambda x: x[0]) # Sort by suit
cards == list(itertools.product(range(4), range(13))) # returns True

I've also tested its performance, comparing it to sorted and the merge sort implementation in rosetta code:

rl = list(range(100))
random.shuffle(rl)
%timeit sorted(rl)
# 100000 loops, best of 3: 11.3 μs per loop
%timeit mergesort(rl) # My code
# 1000 loops, best of 3: 376 μs per loop
%timeit rcmerge_sort(rl) # From rosetta code
# 1000 loops, best of 3: 350 μs per loop

I'm looking for any suggestions on how to improve this code. I suspect there is a better way to do the mergelist function, particularly in how I tried to avoid code duplication like:

if top_a <= top_b:
 nl.append(top_a)
 try:
 top_a = next(it_a)
 except:
 ...
else:
 # duplicates above code

In my code I placed the iterators and first values in a list, then use the variable k as index, but this leads to hacks like abs(k-1) and placing magic numbers 0 and 1 in the code.

def mergesort(l, key=None):
 # Split the list into sublists of length 1
 sublists = [[x] for x in l]
 while len(sublists) > 1:
 new_sublists = []
 # Create a generator that yields two sublists at a time
 sublists_pairs = ((sublists[2*x], sublists[2*x+1]) 
 for x in range(len(sublists)//2))
 for a, b in sublists_pairs:
 new_sublists.append(mergelist(a, b, key))
 # If the length is odd, then there is one sublist that is not merged
 if len(sublists) % 2 != 0:
 new_sublists.append(sublists[-1])
 sublists = new_sublists
 return new_sublists[0]
def mergelist(a, b, key=None):
 nl = []
 # Iterators that yield values from a and b
 its = iter(a), iter(b)
 # The top of both lists
 tops = [next(it) for it in its]
 while True:
 # Determine the iterator that the next element should be taken from
 if key:
 k = 0 if key(tops[0]) <= key(tops[1]) else 1
 else:
 k = 0 if tops[0] <= tops[1] else 1
 nl.append(tops[k])
 try:
 # Update the top of the iterator
 tops[k] = next(its[k])
 except StopIteration:
 # Unless the iterator is empty, in which case get the rest of
 # the values from the other iterator
 # abs(k-1) is similar to (0 if k == 1 else 1)
 nl.append(tops[abs(k-1)])
 for e in its[abs(k-1)]:
 nl.append(e)
 return nl

Question 2

Using list(x) is better than [x] because it is more explicit.

Your while loop is weird, delete

try:
 # Update the top of the iterator
 tops[k] = next(its[k])
except StopIteration:
 # Unless the iterator is empty, in which case get the rest of
 # the values from the other iterator
 # abs(k-1) is similar to (0 if k == 1 else 1)
 nl.append(tops[abs(k-1)])
 for e in its[abs(k-1)]:
 nl.append(e)

and put something like

if (condition):
 #thing1
else:
 #thing2
 break

Question 3

Thanks for the reply. I think list(x) would make a list from the elements of x, but the program needs to create a list with a single element, which would be done with list as list((x,)) . Also, I don't understand what should I put in the if statement, can you elaborate?

Question 4

The if statement works in simpler cases, I think it is easier to read, in your case, I don't think it can be used, I'm not good at using iterators.

Question 5

The while loop can be written as follows (although I think it does not speed things up):

while True:
 # Determine the iterator that the next element should be taken from
 if key:
 low,high = (0,1) if key(tops[0]) <= key(tops[1]) else (1,0)
 else:
 low,high = (0,1) if tops[0] <= tops[1] else (1,0)
 nl.append(tops[low])
 # Update the top of the iterator
 tops[low] = next(its[low], None)
 if not tops[low]:
 # Unless the iterator is empty, in which case get the rest of
 # the values from the other iterator
 nl.append(tops[high])
 nl.extend(its[high])
 return nl

Question 6

Thank you for the reply. If does get rid of the ugly call to abs, but I'm not sure about replacing the try - catch block with an if statement. What if there's a 0 in the list to be sorted?

Question 7

Good point. Rewrite as: if tops[low]==None

Caridorc Caridorc 28.1k7 gold badges54 silver badges137 bronze badges · Answer 1 · 2014-10-26 18:48:27Z

Using list(x) is better than [x] because it is more explicit.

Your while loop is weird, delete

try:
 # Update the top of the iterator
 tops[k] = next(its[k])
except StopIteration:
 # Unless the iterator is empty, in which case get the rest of
 # the values from the other iterator
 # abs(k-1) is similar to (0 if k == 1 else 1)
 nl.append(tops[abs(k-1)])
 for e in its[abs(k-1)]:
 nl.append(e)

and put something like

if (condition):
 #thing1
else:
 #thing2
 break

Thanks for the reply. I think list(x) would make a list from the elements of x, but the program needs to create a list with a single element, which would be done with list as list((x,)) . Also, I don't understand what should I put in the if statement, can you elaborate?
The if statement works in simpler cases, I think it is easier to read, in your case, I don't think it can be used, I'm not good at using iterators.

Arvind Padmanabhan Arvind Padmanabhan 4292 silver badges5 bronze badges · Answer 2 · 2014-11-02 14:37:08Z

The while loop can be written as follows (although I think it does not speed things up):

while True:
 # Determine the iterator that the next element should be taken from
 if key:
 low,high = (0,1) if key(tops[0]) <= key(tops[1]) else (1,0)
 else:
 low,high = (0,1) if tops[0] <= tops[1] else (1,0)
 nl.append(tops[low])
 # Update the top of the iterator
 tops[low] = next(its[low], None)
 if not tops[low]:
 # Unless the iterator is empty, in which case get the rest of
 # the values from the other iterator
 nl.append(tops[high])
 nl.extend(its[high])
 return nl

Thank you for the reply. If does get rid of the ugly call to abs, but I'm not sure about replacing the try - catch block with an if statement. What if there's a 0 in the list to be sorted?

Stack Exchange Network

Merge sort in Python

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Merge sort in Python

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions