Classic merge sort

Question 1

Here is my code for classic merge sort. My question is, each time I create a new list result to merge sub-parts, is there a way to save the space -- not creating a new list result?

Besides my above major question, any advice for code improvement, bugs, performance (algorithm time complexity perspective) is highly appreciated.

def merge_sort(numbers, start, end):
 if not numbers:
 return None
 if start == end:
 return [numbers[start]]
 mid = (start + end) // 2
 sub_part1 = merge_sort(numbers, start, mid)
 sub_part2 = merge_sort(numbers, mid+1, end)
 result = []
 i = 0
 j = 0
 while i < len(sub_part1) and j < len(sub_part2):
 if sub_part1[i] <= sub_part2[j]:
 result.append(sub_part1[i])
 i += 1
 else:
 result.append(sub_part2[j])
 j += 1
 if i < len(sub_part1) and j == len(sub_part2):
 result.extend(sub_part1[i:])
 if j < len(sub_part2) and i == len(sub_part1):
 result.extend(sub_part2[j:])
 return result
if __name__ == "__main__":
 print merge_sort([1,4,2,5,6,8,3,4,0], 0, 8)

Question 2

Step 1:

I would change the computation of mid to:

mid = start + (start - end) // 2

Although python won't give you the wrong answer if you use yours, you theoretically lose performance with lots of elements (>2⁶³).

Also, the code when you are doing the final part of the merge can be simplified to:

if i < len(sub_part1):
 result.extend(sub_part1[i:])
else:
 result.extend(sub_part2[j:])

Which is just cleaner. This can be further improved to:

result.extend(sub_part1[i:])
result.extend(sub_part2[j:])

because the cost of the if statement is more than the cost of appending nothing.

Step 2:

One of the tricks most implementations of mergesort use is they call a slightly different version of themselves at the beginning who's signature is:

def merge_sort(numbers, start, end, result):
 ...

The advantage of this is that you can alternate which list is your numbers and which list stores the result to avoid allocation or appending to arrays. (because you merge onto an existing result array).

Finally, if you want a fast implementation that feels like cheating (why I love Python), you can use this version from wikibooks.

from heapq import merge
def mergesort(w):
 if len(w)<2:
 return w
 else: 
 mid=len(w)//2
 return merge(mergesort(w[:mid]), mergesort(w[mid:]))

Question 3

Thanks Dex' ter, love all of your comments and vote up. Could you elaborate a bit more what do you mean you theoretically lose performance with lots of elements (>2^63)? And why mid = start + (start - end) // 2 is faster?

Question 4

BTW, Dex' ter, followed your advice and write some new code and post here, still do not figure out how to use numbers without using result, if you have any ideas, it will be great. My new post here => codereview.stackexchange.com/questions/149722/…

Question 5

The short answer is that if you have a lot of elements, start+end can be greater than 2^64, which will make python switch to slower big integer math. The bigger reason this is important is because if you are in a language like C (start + end) // 2 can be negative leading to very bad things. This actually was a bug in Python's Sort for several years.

Question 6

Thanks Oscar, is it possible to have only one copy or in-place do merge sort? I am not sure if merge will create a copy -- each time when doing a merge from two smaller array into a a bigger one?

Question 7

It is possible to both do merge with one copy, and in place. In place mergesort is really hard, and tends to be O(nlog^2(n)) rather than O(nlog(n)). One copy on the other hand is the way most mergesorts used in real life work. The basic way is to have 2 arrays and copy back and forth when you are merging.

Question 8

It's hard to reduce memory usage when mergesorting an array. However you can reduce a number of allocations and handle the task with just one copy of the input array. The general outline:

for an input array A and indices range start through end create an array B and copy all items of A to B,
double-array-merge-sort from B range start through end to a respective range of A,
drop/forget/destroy array B;

where 'double-array-merge-sort from a range of S to D' is implemented as:

if a given range of S is longer than 1 item:
- find a midpoint of S range to define two halves,
- double-array-merge-sort from a left half of D to a left half of S,
- double-array-merge-sort from a right half of D to a right half of S,
- merge both halves of S range into the respective D range
else
- nothing to do.

Question 9

Thanks for the advice CiaPan, and vote up. It seems from the reply of Dex' ter, we can re-use the same array -- even no need to make another copy? Any comments on that?

Question 10

Read the whole answer, not just the first two sentences, and you'll see the 'double-array-merge-sort' mentioned in the first part, is a routine described in the second part.

Question 11

I gave it a name 'double-array-merge-sort` because it sorts with merging procedure and it is designed to sort arrays. And it utilizes two initially equivalent arrays (hence 'double array') to accomplish the task.

Question 12

I didn't say destroy – I said 'drop/forget/destroy'. Please note I didn't mention nor used any specific programming language. I have even explicitly pointed out it's a 'general outline'. You are an implementor, and it's up to you to choose appropriate way of allocating and disposing temporary objects (be it dispose() in Pascal, free() in C, delete[] in C++ or just dropping for a garbage collector in Java)...

Question 13

As for your first comment: 1. The other answer was given by Oscar Smith, not by Dex' ter; user Dex' ter improved formatting of it. 2. Oscar Smith never said you can reuse the same array. He shows some nice improvements of the code, but none of them works in situ.

Oscar Smith Oscar Smith 3,71718 silver badges31 bronze badges · Answer 1 · 2016-12-12 04:24:31Z

Step 1:

I would change the computation of mid to:

mid = start + (start - end) // 2

Although python won't give you the wrong answer if you use yours, you theoretically lose performance with lots of elements (>2⁶³).

Also, the code when you are doing the final part of the merge can be simplified to:

if i < len(sub_part1):
 result.extend(sub_part1[i:])
else:
 result.extend(sub_part2[j:])

Which is just cleaner. This can be further improved to:

result.extend(sub_part1[i:])
result.extend(sub_part2[j:])

because the cost of the if statement is more than the cost of appending nothing.

Step 2:

One of the tricks most implementations of mergesort use is they call a slightly different version of themselves at the beginning who's signature is:

def merge_sort(numbers, start, end, result):
 ...

The advantage of this is that you can alternate which list is your numbers and which list stores the result to avoid allocation or appending to arrays. (because you merge onto an existing result array).

Finally, if you want a fast implementation that feels like cheating (why I love Python), you can use this version from wikibooks.

from heapq import merge
def mergesort(w):
 if len(w)<2:
 return w
 else: 
 mid=len(w)//2
 return merge(mergesort(w[:mid]), mergesort(w[mid:]))

Thanks Dex' ter, love all of your comments and vote up. Could you elaborate a bit more what do you mean you theoretically lose performance with lots of elements (>2^63)? And why mid = start + (start - end) // 2 is faster?
BTW, Dex' ter, followed your advice and write some new code and post here, still do not figure out how to use numbers without using result, if you have any ideas, it will be great. My new post here => codereview.stackexchange.com/questions/149722/…
The short answer is that if you have a lot of elements, start+end can be greater than 2^64, which will make python switch to slower big integer math. The bigger reason this is important is because if you are in a language like C (start + end) // 2 can be negative leading to very bad things. This actually was a bug in Python's Sort for several years.
Thanks Oscar, is it possible to have only one copy or in-place do merge sort? I am not sure if merge will create a copy -- each time when doing a merge from two smaller array into a a bigger one?
It is possible to both do merge with one copy, and in place. In place mergesort is really hard, and tends to be O(nlog^2(n)) rather than O(nlog(n)). One copy on the other hand is the way most mergesorts used in real life work. The basic way is to have 2 arrays and copy back and forth when you are merging.

CiaPan CiaPan 1,9101 gold badge12 silver badges17 bronze badges · Answer 2 · 2016-12-12 10:56:18Z

1

\$\begingroup\$

It's hard to reduce memory usage when mergesorting an array. However you can reduce a number of allocations and handle the task with just one copy of the input array. The general outline:

for an input array A and indices range start through end create an array B and copy all items of A to B,
double-array-merge-sort from B range start through end to a respective range of A,
drop/forget/destroy array B;

where 'double-array-merge-sort from a range of S to D' is implemented as:

if a given range of S is longer than 1 item:
- find a midpoint of S range to define two halves,
- double-array-merge-sort from a left half of D to a left half of S,
- double-array-merge-sort from a right half of D to a right half of S,
- merge both halves of S range into the respective D range
else
- nothing to do.

Share

answered Dec 12, 2016 at 10:56

CiaPan's user avatar

CiaPan CiaPan

1,9101 gold badge12 silver badges17 bronze badges

\$\endgroup\$

9

\$\begingroup\$ Thanks for the advice CiaPan, and vote up. It seems from the reply of Dex' ter, we can re-use the same array -- even no need to make another copy? Any comments on that? \$\endgroup\$

Lin Ma
– Lin Ma

2016年12月13日 05:44:02 +00:00
Commented Dec 13, 2016 at 5:44
1

\$\begingroup\$ Read the whole answer, not just the first two sentences, and you'll see the 'double-array-merge-sort' mentioned in the first part, is a routine described in the second part. \$\endgroup\$

CiaPan
– CiaPan

2016年12月13日 07:09:55 +00:00
Commented Dec 13, 2016 at 7:09
1

\$\begingroup\$ I gave it a name 'double-array-merge-sort` because it sorts with merging procedure and it is designed to sort arrays. And it utilizes two initially equivalent arrays (hence 'double array') to accomplish the task. \$\endgroup\$

CiaPan
– CiaPan

2016年12月13日 07:11:57 +00:00
Commented Dec 13, 2016 at 7:11
1

\$\begingroup\$ I didn't say destroy – I said 'drop/forget/destroy'. Please note I didn't mention nor used any specific programming language. I have even explicitly pointed out it's a 'general outline'. You are an implementor, and it's up to you to choose appropriate way of allocating and disposing temporary objects (be it dispose() in Pascal, free() in C, delete[] in C++ or just dropping for a garbage collector in Java)... \$\endgroup\$

CiaPan
– CiaPan

2016年12月13日 07:16:04 +00:00
Commented Dec 13, 2016 at 7:16
1

\$\begingroup\$ As for your first comment: 1. The other answer was given by Oscar Smith, not by Dex' ter; user Dex' ter improved formatting of it. 2. Oscar Smith never said you can reuse the same array. He shows some nice improvements of the code, but none of them works in situ. \$\endgroup\$

CiaPan
– CiaPan

2016年12月13日 09:13:43 +00:00
Commented Dec 13, 2016 at 9:13

| Show 4 more comments

Stack Exchange Network

Classic merge sort

2 Answers 2

Step 1:

Step 2:

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Classic merge sort

2 Answers 2

Step 1:

Step 2:

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions