Problem: Implement merge sort in python. Don’t use Python’s built in sort or sorted. Assume your inputs will be sufficient for the memory you have.
How should I make this code better?
def merge_sort(list_sort):
"""splits the list in two parts until each part is left with one member"""
if len(list_sort) == 1:
return list_sort
if len(list_sort)>= 2:
x= len(list_sort) / 2
part_a = list_sort[:x]
part_b = list_sort[x:]
sorted_part_a = merge_sort(part_a)
sorted_part_b = merge_sort(part_b)
return merge(sorted_part_a, sorted_part_b)
def merge(left , right):
"""merges the two parts of list after sorting them"""
sorted_list = []
i = 0
while left[:] and right[:] :
if left [i] > right [i]:
sorted_list.append(right[i])
right.remove(right[i])
else :
sorted_list.append(left[i])
left.remove(left[i])
if left[:]:
sorted_list.extend(left[:])
elif right[:] :
sorted_list.extend(right[:])
return sorted_list
details = [1, 127, 56, 2, 1, 5, 7, 9, 11, 65, 12, 24, 76, 87, 123, 65, 8,32, 86, 123, 67, 1, 67, 92, 72, 39, 49, 12, 98, 52, 45, 19, 37, 22, 1, 66, 943, 415, 21, 785, 12, 698, 26, 36, 18, 97, 0, 63, 25, 85, 24, 94, 150 ]
print "List to be sorted = ", details
print "Sorted List = ", merge_sort(details)
-
1\$\begingroup\$ it would be better if the memory is sufficient for the input you have. \$\endgroup\$miracle173– miracle1732018年01月19日 13:23:54 +00:00Commented Jan 19, 2018 at 13:23
2 Answers 2
You have some huge inefficiencies in your code.
Every time you right left[:]
or right[:]
, it duplicates the list! You do that several times in merge()
, unnecessarily.
i = 0
is a pointless variable, the way you have used it: it's just an obfuscated way to say 0. Furthermore, right.remove(right[i])
and left.remove(left[i])
are just right.pop(0)
and left.pop(0)
— but more verbose and inefficient. Finally, it's extremely inefficient to .remove()
or .pop()
anything from the head of a list, since it will need to copy every subsequent element over to fill in the void.
So, what you should be doing instead is iterating through the left
and right
lists non-destructively, using two variables i
and j
to keep track of how far you've gone.
Sort algorithm
- instead of
if len(list_sort)>= 2:
, just useelse:
- I would get rid of all the steps inbetween.
list_sort
sounds too much likemerge_sort
, a method, not like a variable- to prevent surprises when using this algorithm in Python 3, use
//
for the floor division - you can use the fact
1 // 2
evaluates toFalse
boolean-wise
so it'd end up like:
def merge_sort_list(lost_list):
"""splits the list in two parts until each part is left with one member"""
x = len(lost_list) // 2
if not x: # length is 1
return lost_list
else:
return merge(merge_sort(lost_list[:x]), merge_sort(lost_list[x:]))
Merge algorithm
For reuse later, I would use a more generic algorithm using generators
, which doesn't alter the items passed into it. This will also help to make it easier to adapt the sorting algorithm for other containers
def merge(left, right):
try:
left = iter(left)
right = iter(right)
l = next(left)
r = next(right)
except StopIteration as exc:
raise ValueError('left and right should be iterables with at least length 1', exc)
try:
while True:
if l <= r:
yield l
l = None
l = next(left)
else:
yield r
r = next(right)
except StopIteration:
yield r if l is None else l # l is None if left is exhausted
yield from left
yield from right
return
thanks to PeylonRayz for noticing the bug
Alternative version
def merge(left, right):
try:
left = iter(left)
right = iter(right)
iterators = (
(next(left), 'left', left),
(next(right), 'right', right),
)
except StopIteration as exc:
raise ValueError('left and right should be iterables with at least length 1', exc)
try:
while True:
(value, key, iterator), other = min(iterators), max(iterators)
yield value
iterators = other, (next(iterator), key, iterator)
except StopIteration:
value, key, iterator = other
yield value
yield from iterator
I find this alternative version is slightly more elegant, and in case of ex aequo, it keeps left before right. It can be easily adapted to support descending merging
sorting dicts
def merge_sort_dict_keys(items):
import collections
sorted_keys = merge_sort_list(list(items.items()))
return collections.OrderedDict((key, value) for key, value in sorted_keys)
def merge_sort_dict_values(items):
import collections
sorted_keys = merge_sort_list(list((value, key) for key, value in items.items()))
return collections.OrderedDict((key, value) for value, key in sorted_keys)
-
\$\begingroup\$ I'm confused why you don't move the
while True
into thetry
, also if the lastl
is zero, thenyield l if l else r
won't work. Couldn't you instead usemax(l, r)
? \$\endgroup\$2018年01月19日 16:00:50 +00:00Commented Jan 19, 2018 at 16:00