7
\$\begingroup\$

(Initial discussion from Classic merge sort, since it is new code, I start a new thread)

Post my code below, my major question is, I have to create another array result to hold sub-parts merge sort result. Is there a way I can just use original number to save additional space in result?

Any other comments on code bugs, performance (in terms of algorithm time complexity), code style, etc. are appreciated.

Code written in Python 2.7.

def merge_sort(numbers, start, end):
 if start == end:
 return
 pivot_index = start + (end-start)//2
 merge_sort(numbers, start, pivot_index)
 merge_sort(numbers, pivot_index+1, end)
 i = start
 j = pivot_index+1
 result = []
 while i <= pivot_index and j <= end:
 if numbers[i] <= numbers[j]:
 result.append(numbers[i])
 i+=1
 else:
 result.append(numbers[j])
 j+=1
 if i <= pivot_index:
 result.extend(numbers[i:pivot_index+1])
 if j <= end:
 result.extend(numbers[j:end+1])
 k=0
 for i in range(start, end+1):
 numbers[i] = result[k]
 k+=1
if __name__ == "__main__":
 numbers = [1,4,2,5,6,8,3,4,0]
 merge_sort(numbers, 0, len(numbers)-1)
 print numbers
asked Dec 13, 2016 at 6:28
\$\endgroup\$
5
  • 1
    \$\begingroup\$ Is in-place merge sorting arrays possible without at least doubling run time? (in your wording, number lacks "the plural-s") Are you aware of Ford-Johnson merge-insertion sort (and improvements, e.g. by T. D. Bui & Mai Thanh), "Practical in-place mergesort"s by Katajainen, Pasanen & Teuhola (based on Kronrod) or Huang & Langston, and the relatively new, non-stable QuickMergeSort? \$\endgroup\$ Commented Dec 13, 2016 at 7:21
  • \$\begingroup\$ @greybeard, nice post and did some study, but sure if merge-insertion sort from time complexity perspective, less efficient than the solution coderodde posted -- which only use one additional copy? Thanks. \$\endgroup\$ Commented Dec 15, 2016 at 9:03
  • 1
    \$\begingroup\$ sure if merge-insertion sort from time complexity perspective, less efficient than [merge sort with a single buffer allocation] well, the attempts at in-place merge sort before "Practical in-place mergesort" were not practical due to increased run time, "the practical ones" have been complicated, QuickMergeSort may not be a merge sort in everybody's book. \$\endgroup\$ Commented Dec 15, 2016 at 12:49
  • \$\begingroup\$ Thanks greybeard, in what scenarios do you think merge quick sort is useful than other merge sort implementation mechanisms, like what we discussed below? \$\endgroup\$ Commented Dec 19, 2016 at 7:03
  • \$\begingroup\$ BTW greybeard, how do you think Ford-Johnson merge-insertion sort? Do you think it is more practical comparing to quick merge sort? \$\endgroup\$ Commented Dec 19, 2016 at 7:04

1 Answer 1

2
\$\begingroup\$

You can make your merge sort run 35% faster by allocating the auxiliary memory only once and reusing it throughout the algorithm:

def coderodde_merge(source,
 target,
 source_offset,
 target_offset,
 left_run_length,
 right_run_length):
 left_run_index = source_offset
 right_run_index = source_offset + left_run_length
 left_run_index_bound = right_run_index
 right_run_index_bound = right_run_index + right_run_length
 while left_run_index != left_run_index_bound and right_run_index != right_run_index_bound:
 if source[right_run_index] < source[left_run_index]:
 target[target_offset] = source[right_run_index]
 right_run_index += 1
 else:
 target[target_offset] = source[left_run_index]
 left_run_index += 1
 target_offset += 1
 while left_run_index != left_run_index_bound:
 target[target_offset] = source[left_run_index]
 target_offset += 1
 left_run_index += 1
 while right_run_index != right_run_index_bound:
 target[target_offset] = source[right_run_index]
 target_offset += 1
 right_run_index += 1
def coderodde_mergesort_impl(source,
 target,
 source_offset,
 target_offset,
 range_length):
 if range_length < 2:
 return
 left_run_length = range_length // 2
 right_run_length = range_length - left_run_length
 coderodde_mergesort_impl(target,
 source,
 target_offset,
 source_offset,
 left_run_length)
 coderodde_mergesort_impl(target,
 source,
 target_offset + left_run_length,
 source_offset + left_run_length,
 right_run_length)
 coderodde_merge(source,
 target,
 source_offset,
 target_offset,
 left_run_length,
 right_run_length)
def coderodde_mergesort(array, start, end):
 range_length = end - start
 if range_length < 2:
 return
 aux = [array[index] for index in range(start, end)]
 coderodde_mergesort_impl(aux, array, 0, start, range_length)
def coderodde_mergesort_all(array):
 coderodde_mergesort(array, 0, len(array))

When running this demonstration, I get the following results:

OP mergesort in 17628 milliseconds.
coderodde mergesort in 11411 milliseconds.
Algorithms agree: True

Also, as an additional nitpick, by PEP 8 you should separate binary operators by a space before and after, i.e., not i+=1, but i += 1.

Check you range

If I do something as crazy as

ar = [1, 2, 3]
merge_sort(ar, 0, -1)

you will get a stack overflow.

answered Dec 13, 2016 at 7:52
\$\endgroup\$
9
  • 1
    \$\begingroup\$ Sorry, I cannot help you with merge-insertion sort. Ask, say, Morwenn. \$\endgroup\$ Commented Dec 14, 2016 at 11:21
  • 1
    \$\begingroup\$ @LinMa If you replace the word 'target' with 'destination' I hope you'll find a correlation of 'using destination as source, and source as destination' with variable names S & D here... \$\endgroup\$ Commented Dec 14, 2016 at 11:42
  • 1
    \$\begingroup\$ @LinMa Precisely, except this implementation is a bit more general: it assumes the part of the array to be sorted needn't start at index 0. That's why it needs additional start parameter to coderodde_mergesortroutine and separate starting indices source_offset and target_offset in each recursive call to a helper routine coderodde_mergesort_impl (implementing the algorithm I called double-array-merge-sort). \$\endgroup\$ Commented Dec 15, 2016 at 9:58
  • 1
    \$\begingroup\$ @LinMa Morwenn can tell you about merge-insertion sort. :) \$\endgroup\$ Commented Dec 15, 2016 at 10:19
  • 2
    \$\begingroup\$ @greybeard No, the copying is unnecessary. The second array is created as a copy of the subarray to be sorted: aux = [array[index] for index in range(start, end)] and no data is modified during stepping down the recursion, so when it comes to one-item range, the corresponding data in both arrays are the same, whether you land in the original array or in a copy. \$\endgroup\$ Commented Dec 16, 2016 at 7:07

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.