2
\$\begingroup\$
def merge_sort(mslist):
 if len(mslist) < 2:
 return mslist
 left = mslist[:int(len(mslist)/2)]
 right = mslist[int(len(mslist)/2):]
 left = merge_sort(left)
 right = merge_sort(right)
 return merge(left, right)
def merge(left,right):
 result = []
 while len(left) > 0 or len(right) > 0:
 if len(left) > 0 and len(right) > 0:
 if left[0] <= right[0]:
 result.append(left[0])
 left.pop(0)
 else:
 result.append(right[0])
 right.pop(0)
 elif len(left) > 0:
 result.append(left[0])
 left.pop(0)
 elif len(right) > 0:
 result.append(right[0])
 right.pop(0)
 return result
def main():
 mslist = [-9,1,1,2,3,2,7,3,6,4,5,4,3,5,6,7,8,9,0,0]
 print(merge_sort(mslist))
if __name__ == "__main__":
 main()
  1. Please give me some comments about the actuall algorithm, i used the pseudo code from the wikipedia page as a reference.

  2. Please give me some Pythonic advice or any other general coding paradigm you feel should be followed.

I already made changes to my code regarding the amount of times len(direction) is done.

Jamal
35.2k13 gold badges134 silver badges238 bronze badges
asked Aug 7, 2014 at 12:06
\$\endgroup\$

3 Answers 3

1
\$\begingroup\$

You took into account remarks from your other question and I couldn't be happier about this but I still have a few comments to make.

I usually find it quite hard to read code when the same variable gets assigned multiple times and this is not required. It makes me think "ok, at this particular point, this corresponds to this type of value while afterward, it corresponds to this other type of value". Typically, in your merge_sort function, left and right stands for "unsorted list" and then for "sorted list" making things a bit awkward. You could go for a solution with more variables or you could go for a solution with ... no variables :

def merge_sort(l):
 if len(l) <= 1:
 return l
 return merge(
 merge_sort(l[:int(len(l)/2)]),
 merge_sort(l[int(len(l)/2):]))

Also, you could make things clearer by adding a variable to denote the middle index.

def merge_sort(l):
 if len(l) <= 1:
 return l
 mid = len(l)//2
 return merge(
 merge_sort(l[:mid]),
 merge_sort(l[mid:]))

In your merging procedure, you spend a lot of times checking the same conditions again and again. This is purely personal but I'd rather go for a solution where conditions are checked a minimal number of times :

while True:
 if len(left) > 0:
 if len(right) > 0:
 if left[0] <= right[0]:
 result.append(left[0])
 left.pop(0)
 else:
 result.append(right[0])
 right.pop(0)
 else:
 result.append(left[0])
 left.pop(0)
 elif len(right) > 0:
 result.append(right[0])
 right.pop(0)
 else:
 break
return result

The main drawback is that it removes the symetry which is a bit of an issue from an aesthethic point of view.

Now, it is time to do things in a more efficient way. The traditional merge algorithm keeps track of the 2 indices going through the 2 lists. If we implement it that way, we have a huge boost of performance as we don't have to call pop anymore :

def merge(left, right):
 result = []
 left_ind, right_ind = 0, 0
 left_len, right_len = len(left), len(right)
 while True:
 if left_ind < left_len:
 left_val = left[left_ind]
 if right_ind < right_len:
 right_val = right[right_ind]
 if left_val <= right_val:
 result.append(left_val)
 left_ind+=1
 else:
 result.append(right_val)
 right_ind+=1
 else:
 result.append(left_val)
 left_ind+=1
 elif right_ind < right_len:
 result.append(right[right_ind])
 right_ind+=1
 else:
 break
 return result

Just doing this, timing goes from :

(1, 5.2928924560546875e-05)
(10, 0.0007579326629638672)
(100, 0.011490106582641602)
(1000, 0.15107107162475586)
(10000, 4.067746877670288)

to :

(1, 3.790855407714844e-05)
(10, 0.0004601478576660156)
(100, 0.005733966827392578)
(1000, 0.06619787216186523)
(10000, 0.7241289615631104)

(Number on the left hand side is the number I multiply your list by before sorting it so it is proportionnal to the number of items, number on the right is the time it took on my desktop computer).

Finally, a last optimisation is to use the fact that there is no need to loop when we've reached the end of one of the list :

def merge(left, right):
 result = []
 left_ind, right_ind = 0, 0
 left_len, right_len = len(left), len(right)
 while True:
 if left_ind < left_len:
 if right_ind < right_len:
 left_val = left[left_ind]
 right_val = right[right_ind]
 if left_val <= right_val:
 result.append(left_val)
 left_ind+=1
 else:
 result.append(right_val)
 right_ind+=1
 else:
 # right is empty
 return result + left[left_ind:]
 elif right_ind < right_len:
 # left is empty
 return result + right[right_ind:]
 else:
 return result

Timing is now :

(1, 3.504753112792969e-05)
(10, 0.00040793418884277344)
(100, 0.004902839660644531)
(1000, 0.06235790252685547)
(10000, 0.6899728775024414)

Last comment to be nitpicky, your merge_sort has a quite unusual behavior as it returns a new list object sometimes (when the original list was longer than 1) and the original list sometimes. It could be cool to be consistent and always return a new list : return list(l).

answered Aug 7, 2014 at 16:24
\$\endgroup\$
0
1
\$\begingroup\$

Here:

if len(mslist) < 2:

I would write len(mslist) <= 1 because 1 is the interesting number here.

Here:

int(len(mslist)/2)

just write len(mslist)//2 which is explicit integer division.

Here:

 result.append(left[0])
 left.pop(0)

you can write result.append(left.pop()) since pop returns the element removed.

The elifs in the merge function are not properly aligned.

Notice that the use of the pop method could alter algorithm complexity (because I think that pop is \$O(n)\$). It could be better to use two indices to keep track of the position in the two lists.

answered Aug 7, 2014 at 12:17
\$\endgroup\$
1
  • \$\begingroup\$ Thanks! Nice advices! I did a quickfix on the elif alignment. Thanks :) \$\endgroup\$ Commented Aug 7, 2014 at 12:24
1
\$\begingroup\$

Testing for a validity of an index is very anti-pythonic. Follow Grace Hopper principle:

try:
 while True:
 while left[left_idx] < right[right_idx]:
 result.append(left[left_idx])
 left_idx += 1
 while left[left_idx] >= right[right_idx]:
 result.append(right[right_idx])
 right_idx += 1
except IndexError:
 # append the rest of the lists; one of them is guaranteed to be empty 
answered Aug 7, 2014 at 19:19
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.