Mergesort optimization

Question 1

I am currently writing mergesort in Python and have implemented it to work. I have noticed, though, that it is very slow compared to the standard sorted() method. I understand it will always be faster, but I am open to suggestions in terms of optimization.

def merge(left,right):
 sortedArray=[]
 l_idx,r_idx=0,0 #set index for left and right array
 #compare values of left and right array
 while l_idx<len(left) and r_idx<len(right):
 if(left[l_idx] < right[r_idx]):
 sortedArray.append(left[l_idx])
 l_idx+=1
 else:
 sortedArray.append(right[r_idx])
 r_idx+=1
 if l_idx==len(left):
 sortedArray.extend(right[r_idx:])
 else:
 sortedArray.extend(left[l_idx:])
 return sortedArray
def mergesort(A):
 if len(A)<=1:
 return A
 middle=len(A)//2
 left,right=mergesort(A[:middle]),mergesort(A[middle:])
 return merge(left,right)

I welcome any feedback.

Question 2

One thing killing performance is bound to be allocating a fresh buffer in each and every call.

Question 3

Two issues:

It's not stable, for example mergesort([1, 1.0]) returns [1.0, 1]. Stability is desirable, and you could easily achieve it by using <= instead of <.
If the given list has more than one element, you return a new list. But if not, then you return the given list. That's inconsistent and dangerous, might trick someone into thinking you return a separate list, and then getting a problem when you don't.

And it's a simple algorithm, but I can barely see it because of the long and unusual variable names. I feel like I'm looking at a wall of characters. I'd use standard single letter names:

def merge(A, B):
 merged = []
 i = j = 0
 while i < len(A) and j < len(B):
 if A[i] <= B[j]:
 merged.append(A[i])
 i += 1
 else:
 merged.append(B[j])
 j += 1
 merged += A[i:] or B[j:]
 return merged

(I also shortened the part after the loop.)

And here's another way to make it faster, using a for loop for one of the inputs:

def merge(A, B):
 merged = []
 j = 0
 for a in A:
 while j < len(B) and B[j] < a:
 merged.append(B[j])
 j += 1
 merged.append(a)
 merged += B[j:]
 return merged

Or instead of j < len(B), we can make sure the last element comes from B (this modifies the merge inputs and assumes they're not empty, which is fine within the context of the mergesort function):

def merge(A, B):
 merged = []
 if B[-1] < A[-1]:
 B.append(A.pop())
 j = 0
 for a in A:
 while B[j] < a:
 merged.append(B[j])
 j += 1
 merged.append(a)
 merged += B[j:]
 return merged

Times for sorting 10000 random numbers using our various merge functions:

 31.7 ± 0.5 ms merge_yours
 31.8 ± 0.4 ms merge_my_first
 23.0 ± 0.5 ms merge_my_second
 17.3 ± 0.3 ms merge_my_third

Full code hidden in HTML comment here.

Attempt This Online!

Question 4

@greybeard They're temporary lists created by the sorting algorithm itself, which doesn't use them for anything else. It's fine to modify them.

Question 5

@greybeard I'd rather wrap the whole thing, which also lets me address the result inconsistency issue without creating singleton lists unnecessarily.

Question 6

Single letter variables, fine; but they shouldn't be in caps.

Question 7

@Reinderien Why not? Both CLRS and Wikipedia use the name A. It's a perfectly fine name. And it allows me to nicely write for a in A:.

Question 8

Nicely is subjective, but the objective measure here is PEP8 which reserves capitals for other purposes like class names and constants.

Question 9

Besides the optimisations which others have already suggested, you could also optimise your code by going beyond the standard merge sort algorithm.

A simple way to make your merge sort a bit more time-efficient is to turn it into a hybrid sorting algorithm, for example. For very small lists, the recursive function calls performed by merge sort can cause some unnecessary overhead. Some other non-recursive sorting algorithms, such as insertion sort, can be faster than merge sort on very small lists.

To implement a simple hybrid sorting algorithm based on merge sort, you can fix a threshold t for the length of the input list. If the length of the list is above the threshold t, you perform the standard split, recursive call, and merge as with merge sort, and if the length of the list is below the threshold, you just sort the list directly, using insertion sort. You might have to play around with different values for the threshold in order to get a function which performs better than the standard merge sort.

You could then also try implementing Timsort, a slightly more sophisticated hybrid algorithm which combines elements of merge sort and insertion sort, and which is used in the built-in sorted method in Python. Your implementation will likely not be as fast as the built-in implementation, but should still be faster than standard mergesort.

Question 10

Layout

Code is easier to read when there is whitespace around operators. The black program can be used to automatically reformat the code for readability.

It is common to omit the parentheses in if conditions like this:

 if(left[l_idx] < right[r_idx]):

black also helps with that.

Naming

The PEP 8 style guide recommends snake_case for function and variable names.

mergesort would be merge_sort.

It is common practice for array variables to be plural nouns. Also, the name A is not very descriptive. For example, if the array is a list of numbers, numbers would be better.

left could be lefts, and right could be rights.

sortedArray could be sorted_numbers.

Documentation

The PEP-8 guide also recommends adding docstrings and functions. It would be good to describe the input variable types as well as the return type. It would also be good for mergesort to mention that it is called recursively.

DRY

len(left) is repeated twice. It would be better to assign this to a variable before the while loop. This should be more efficient as well since you don't need to call len every time the while condition is evaluated. len(right) should also be assigned to a variable for efficiency.

Question 11

"For example, if the array is a list of numbers, numbers would be better" Where not type restricted, items is one neutral term.

Question 12

@greybeard Meh. I'd stick with the standard. Wikipedia calls it A. CLRS call it A. Sedgwick calls it a. And I find longer names distracting.

Question 13

Currently, the merge loop checks both indices before each iteration.

Initially, neither is "at the end".
The loop body changes only one: check this value.
When at end of the corresponding part, handle what's left of the other and return.

Question 14

I've recently seen this executed beautifully by rcgldr.

Question 15

lint

 sortedArray = []

PEP 8 asks that you spell it sorted_array.

Similarly, in the capitalized def mergesort(A):, prefer lowercase (a).

complexity

def mergesort(A): ...
 left, right = mergesort(A[:middle]), mergesort(A[middle:])

Please understand that slicing is expensive here. That is, you're making a copy of the entire array.

Better to create some class MergeSort: with a self.a array that might have a million entries, and pass around a single reference to the array.

Or if you stick with a fortran-style function, use a signature like

def mergesort(a: list[float], start_idx: int, end_idx: int):

no comment no comment 1,1006 silver badges10 bronze badges · Answer 1 · 2025-02-22 13:56:36Z

Two issues:

It's not stable, for example mergesort([1, 1.0]) returns [1.0, 1]. Stability is desirable, and you could easily achieve it by using <= instead of <.
If the given list has more than one element, you return a new list. But if not, then you return the given list. That's inconsistent and dangerous, might trick someone into thinking you return a separate list, and then getting a problem when you don't.

And it's a simple algorithm, but I can barely see it because of the long and unusual variable names. I feel like I'm looking at a wall of characters. I'd use standard single letter names:

def merge(A, B):
 merged = []
 i = j = 0
 while i < len(A) and j < len(B):
 if A[i] <= B[j]:
 merged.append(A[i])
 i += 1
 else:
 merged.append(B[j])
 j += 1
 merged += A[i:] or B[j:]
 return merged

(I also shortened the part after the loop.)

And here's another way to make it faster, using a for loop for one of the inputs:

def merge(A, B):
 merged = []
 j = 0
 for a in A:
 while j < len(B) and B[j] < a:
 merged.append(B[j])
 j += 1
 merged.append(a)
 merged += B[j:]
 return merged

Or instead of j < len(B), we can make sure the last element comes from B (this modifies the merge inputs and assumes they're not empty, which is fine within the context of the mergesort function):

def merge(A, B):
 merged = []
 if B[-1] < A[-1]:
 B.append(A.pop())
 j = 0
 for a in A:
 while B[j] < a:
 merged.append(B[j])
 j += 1
 merged.append(a)
 merged += B[j:]
 return merged

Times for sorting 10000 random numbers using our various merge functions:

 31.7 ± 0.5 ms merge_yours
 31.8 ± 0.4 ms merge_my_first
 23.0 ± 0.5 ms merge_my_second
 17.3 ± 0.3 ms merge_my_third

Full code hidden in HTML comment here.

Attempt This Online!

@greybeard They're temporary lists created by the sorting algorithm itself, which doesn't use them for anything else. It's fine to modify them.
@greybeard I'd rather wrap the whole thing, which also lets me address the result inconsistency issue without creating singleton lists unnecessarily.
Single letter variables, fine; but they shouldn't be in caps.
@Reinderien Why not? Both CLRS and Wikipedia use the name A. It's a perfectly fine name. And it allows me to nicely write for a in A:.
Nicely is subjective, but the objective measure here is PEP8 which reserves capitals for other purposes like class names and constants.

a3r a3r 412 bronze badges · Answer 2 · 2025-02-22 16:52:28Z

Besides the optimisations which others have already suggested, you could also optimise your code by going beyond the standard merge sort algorithm.

A simple way to make your merge sort a bit more time-efficient is to turn it into a hybrid sorting algorithm, for example. For very small lists, the recursive function calls performed by merge sort can cause some unnecessary overhead. Some other non-recursive sorting algorithms, such as insertion sort, can be faster than merge sort on very small lists.

To implement a simple hybrid sorting algorithm based on merge sort, you can fix a threshold t for the length of the input list. If the length of the list is above the threshold t, you perform the standard split, recursive call, and merge as with merge sort, and if the length of the list is below the threshold, you just sort the list directly, using insertion sort. You might have to play around with different values for the threshold in order to get a function which performs better than the standard merge sort.

You could then also try implementing Timsort, a slightly more sophisticated hybrid algorithm which combines elements of merge sort and insertion sort, and which is used in the built-in sorted method in Python. Your implementation will likely not be as fast as the built-in implementation, but should still be faster than standard mergesort.

toolic toolic 14.5k5 gold badges29 silver badges203 bronze badges · Answer 3 · 2025-02-20 12:41:14Z

Layout

Code is easier to read when there is whitespace around operators. The black program can be used to automatically reformat the code for readability.

It is common to omit the parentheses in if conditions like this:

 if(left[l_idx] < right[r_idx]):

black also helps with that.

Naming

The PEP 8 style guide recommends snake_case for function and variable names.

mergesort would be merge_sort.

It is common practice for array variables to be plural nouns. Also, the name A is not very descriptive. For example, if the array is a list of numbers, numbers would be better.

left could be lefts, and right could be rights.

sortedArray could be sorted_numbers.

Documentation

The PEP-8 guide also recommends adding docstrings and functions. It would be good to describe the input variable types as well as the return type. It would also be good for mergesort to mention that it is called recursively.

DRY

len(left) is repeated twice. It would be better to assign this to a variable before the while loop. This should be more efficient as well since you don't need to call len every time the while condition is evaluated. len(right) should also be assigned to a variable for efficiency.

"For example, if the array is a list of numbers, numbers would be better" Where not type restricted, items is one neutral term.
@greybeard Meh. I'd stick with the standard. Wikipedia calls it A. CLRS call it A. Sedgwick calls it a. And I find longer names distracting.

greybeard greybeard 7,3813 gold badges21 silver badges55 bronze badges · Answer 4 · 2025-02-22 10:48:41Z

3

\$\begingroup\$

Currently, the merge loop checks both indices before each iteration.

Initially, neither is "at the end".
The loop body changes only one: check this value.
When at end of the corresponding part, handle what's left of the other and return.

Share

answered Feb 22 at 10:48

greybeard's user avatar

greybeard greybeard

7,3813 gold badges21 silver badges55 bronze badges

\$\endgroup\$

1

1

\$\begingroup\$ I've recently seen this executed beautifully by rcgldr. \$\endgroup\$

greybeard
– greybeard

2025年02月22日 10:54:49 +00:00
Commented Feb 22 at 10:54

Add a comment |

J_H J_H 41.4k3 gold badges38 silver badges157 bronze badges · Answer 5 · 2025-02-22 01:47:30Z

lint

 sortedArray = []

PEP 8 asks that you spell it sorted_array.

Similarly, in the capitalized def mergesort(A):, prefer lowercase (a).

complexity

def mergesort(A): ...
 left, right = mergesort(A[:middle]), mergesort(A[middle:])

Please understand that slicing is expensive here. That is, you're making a copy of the entire array.

Better to create some class MergeSort: with a self.a array that might have a million entries, and pass around a single reference to the array.

Or if you stick with a fortran-style function, use a signature like

def mergesort(a: list[float], start_idx: int, end_idx: int):

Stack Exchange Network

Mergesort optimization

5 Answers 5

Layout

Naming

Documentation

DRY

lint

complexity

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

5 Answers 5

Layout

Naming

Documentation

lint

complexity

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related