Here is my code for classic merge sort. My question is, each time I create a new list result
to merge sub-parts, is there a way to save the space -- not creating a new list result
?
Besides my above major question, any advice for code improvement, bugs, performance (algorithm time complexity perspective) is highly appreciated.
def merge_sort(numbers, start, end):
if not numbers:
return None
if start == end:
return [numbers[start]]
mid = (start + end) // 2
sub_part1 = merge_sort(numbers, start, mid)
sub_part2 = merge_sort(numbers, mid+1, end)
result = []
i = 0
j = 0
while i < len(sub_part1) and j < len(sub_part2):
if sub_part1[i] <= sub_part2[j]:
result.append(sub_part1[i])
i += 1
else:
result.append(sub_part2[j])
j += 1
if i < len(sub_part1) and j == len(sub_part2):
result.extend(sub_part1[i:])
if j < len(sub_part2) and i == len(sub_part1):
result.extend(sub_part2[j:])
return result
if __name__ == "__main__":
print merge_sort([1,4,2,5,6,8,3,4,0], 0, 8)
2 Answers 2
Step 1:
I would change the computation of mid
to:
mid = start + (start - end) // 2
Although python won't give you the wrong answer if you use yours, you theoretically lose performance with lots of elements (>263).
Also, the code when you are doing the final part of the merge can be simplified to:
if i < len(sub_part1):
result.extend(sub_part1[i:])
else:
result.extend(sub_part2[j:])
Which is just cleaner. This can be further improved to:
result.extend(sub_part1[i:])
result.extend(sub_part2[j:])
because the cost of the if statement is more than the cost of appending nothing.
Step 2:
One of the tricks most implementations of mergesort use is they call a slightly different version of themselves at the beginning who's signature is:
def merge_sort(numbers, start, end, result):
...
The advantage of this is that you can alternate which list is your numbers and which list stores the result to avoid allocation or appending to arrays. (because you merge onto an existing result array).
Finally, if you want a fast implementation that feels like cheating (why I love Python), you can use this version from wikibooks.
from heapq import merge
def mergesort(w):
if len(w)<2:
return w
else:
mid=len(w)//2
return merge(mergesort(w[:mid]), mergesort(w[mid:]))
-
\$\begingroup\$ Thanks Dex' ter, love all of your comments and vote up. Could you elaborate a bit more what do you mean
you theoretically lose performance with lots of elements (>2^63)
? And whymid = start + (start - end) // 2
is faster? \$\endgroup\$Lin Ma– Lin Ma2016年12月13日 05:43:23 +00:00Commented Dec 13, 2016 at 5:43 -
\$\begingroup\$ BTW, Dex' ter, followed your advice and write some new code and post here, still do not figure out how to use
numbers
without usingresult
, if you have any ideas, it will be great. My new post here => codereview.stackexchange.com/questions/149722/… \$\endgroup\$Lin Ma– Lin Ma2016年12月13日 06:29:23 +00:00Commented Dec 13, 2016 at 6:29 -
1\$\begingroup\$ The short answer is that if you have a lot of elements,
start+end
can be greater than2^64
, which will make python switch to slower big integer math. The bigger reason this is important is because if you are in a language like C(start + end) // 2
can be negative leading to very bad things. This actually was a bug in Python's Sort for several years. \$\endgroup\$Oscar Smith– Oscar Smith2016年12月13日 06:51:27 +00:00Commented Dec 13, 2016 at 6:51 -
\$\begingroup\$ Thanks Oscar, is it possible to have only one copy or in-place do merge sort? I am not sure if
merge
will create a copy -- each time when doing a merge from two smaller array into a a bigger one? \$\endgroup\$Lin Ma– Lin Ma2016年12月14日 07:16:40 +00:00Commented Dec 14, 2016 at 7:16 -
\$\begingroup\$ It is possible to both do
merge
with one copy, and in place. In placemergesort
is really hard, and tends to beO(nlog^2(n))
rather thanO(nlog(n))
. One copy on the other hand is the way mostmergesorts
used in real life work. The basic way is to have 2 arrays and copy back and forth when you are merging. \$\endgroup\$Oscar Smith– Oscar Smith2016年12月14日 14:03:54 +00:00Commented Dec 14, 2016 at 14:03
It's hard to reduce memory usage when mergesorting an array. However you can reduce a number of allocations and handle the task with just one copy of the input array. The general outline:
- for an input array
A
and indices rangestart
throughend
create an arrayB
and copy all items ofA
toB
, - double-array-merge-sort from
B
rangestart
throughend
to a respective range ofA
, - drop/forget/destroy array
B
;
where 'double-array-merge-sort from a range of S
to D
' is implemented as:
- if a given range of
S
is longer than 1 item:- find a midpoint of
S
range to define two halves, - double-array-merge-sort from a left half of
D
to a left half ofS
, - double-array-merge-sort from a right half of
D
to a right half ofS
, - merge both halves of
S
range into the respectiveD
range
- find a midpoint of
- else
- nothing to do.
-
\$\begingroup\$ Thanks for the advice CiaPan, and vote up. It seems from the reply of Dex' ter, we can re-use the same array -- even no need to make another copy? Any comments on that? \$\endgroup\$Lin Ma– Lin Ma2016年12月13日 05:44:02 +00:00Commented Dec 13, 2016 at 5:44
-
1\$\begingroup\$ Read the whole answer, not just the first two sentences, and you'll see the 'double-array-merge-sort' mentioned in the first part, is a routine described in the second part. \$\endgroup\$CiaPan– CiaPan2016年12月13日 07:09:55 +00:00Commented Dec 13, 2016 at 7:09
-
1\$\begingroup\$ I gave it a name 'double-array-merge-sort` because it sorts with merging procedure and it is designed to sort arrays. And it utilizes two initially equivalent arrays (hence 'double array') to accomplish the task. \$\endgroup\$CiaPan– CiaPan2016年12月13日 07:11:57 +00:00Commented Dec 13, 2016 at 7:11
-
1\$\begingroup\$ I didn't say
destroy
– I said 'drop/forget/destroy'. Please note I didn't mention nor used any specific programming language. I have even explicitly pointed out it's a 'general outline'. You are an implementor, and it's up to you to choose appropriate way of allocating and disposing temporary objects (be itdispose()
in Pascal,free()
in C,delete[]
in C++ or just dropping for a garbage collector in Java)... \$\endgroup\$CiaPan– CiaPan2016年12月13日 07:16:04 +00:00Commented Dec 13, 2016 at 7:16 -
1\$\begingroup\$ As for your first comment: 1. The other answer was given by Oscar Smith, not by Dex' ter; user Dex' ter improved formatting of it. 2. Oscar Smith never said you can reuse the same array. He shows some nice improvements of the code, but none of them works in situ. \$\endgroup\$CiaPan– CiaPan2016年12月13日 09:13:43 +00:00Commented Dec 13, 2016 at 9:13