Return to Answer

replaced http://codereview.stackexchange.com/ with https://codereview.stackexchange.com/

edited Apr 13, 2017 at 12:40

In the main loop you're not using elementNumber for anything. I like to rename these kind of variables to _. Although, an unused variable is usually a hint that perhaps there's a better way of doing things, like the algorithm suggested by @PranavRaj @PranavRaj.

added 1461 characters in body

Source Link

edited May 10, 2014 at 16:22

janos

edited May 10, 2014 at 16:22

janos

112.9k
15
154
396

Extra tricks for speed

Here's a slightly faster version of Pranav's solution:

left_i = 0
right_i = 0
left_sorted_len = len(left_sorted_seq)
right_sorted_len = len(right_sorted_seq)
merged = []
append = merged.append
while left_i < left_sorted_len and right_i < right_sorted_len:
 smallestInLeft = left_sorted_seq[left_i]
 smallestInRight = right_sorted_seq[right_i]
 if smallestInLeft < smallestInRight:
 append(smallestInLeft)
 left_i += 1
 else:
 append(smallestInRight)
 right_i += 1
if left_i < left_sorted_len:
 return merged + left_sorted_seq[left_i:]
return merged + right_sorted_seq[right_i:]

What makes this faster:

Pranav already precalculated the list lenghts (I renamed to left_sorted_len and right_sorted_len, following PEP8) but did not explain: doing so makes the loop faster, because len(...) is only evaluated once.
Notice the caching of merged.append method as append, to reduce the number of method lookups
Notice the return statements, split to two cases to avoid concatenating empty slices

I measured differences in speed using 10 iterations of sorting a random sequence of 1000 numbers. You probably don't have to optimize this to the maximum, that's why I made this the last point of my post, more as additional information rather than a real recommendation.

Extra tricks for speed

Here's a slightly faster version of Pranav's solution:

left_i = 0
right_i = 0
left_sorted_len = len(left_sorted_seq)
right_sorted_len = len(right_sorted_seq)
merged = []
append = merged.append
while left_i < left_sorted_len and right_i < right_sorted_len:
 smallestInLeft = left_sorted_seq[left_i]
 smallestInRight = right_sorted_seq[right_i]
 if smallestInLeft < smallestInRight:
 append(smallestInLeft)
 left_i += 1
 else:
 append(smallestInRight)
 right_i += 1
if left_i < left_sorted_len:
 return merged + left_sorted_seq[left_i:]
return merged + right_sorted_seq[right_i:]

What makes this faster:

Pranav already precalculated the list lenghts (I renamed to left_sorted_len and right_sorted_len, following PEP8) but did not explain: doing so makes the loop faster, because len(...) is only evaluated once.
Notice the caching of merged.append method as append, to reduce the number of method lookups
Notice the return statements, split to two cases to avoid concatenating empty slices

Source Link

answered May 10, 2014 at 16:12

janos

answered May 10, 2014 at 16:12

janos

112.9k
15
154
396

Empty lists are "false"

When you check if a list is empty or not, you don't need to write like this:

if len(left_sorted_seq) == 0:
 # empty list

You can write simply:

if not left_sorted_seq:
 # empty list

Exceptions are slow, only use for unexpected events

This is slower:

try:
 smallestInLeft = leftSortedSeq[leftPointer]
except IndexError:
 mergedSeq += rightSortedSeq[rightPointer:]
 return tuple(mergedSeq)

compared to this:

if leftPointer < len(leftSortedSeq):
 smallestInLeft = leftSortedSeq[leftPointer]
else:
 mergedSeq += rightSortedSeq[rightPointer:]
 return tuple(mergedSeq)

Exceptions are good for handling situations that normally should not happen. For example trying to insert a duplicate primary key into a database table is clearly some sort of mistaken operation, and exception handling is appropriate for that. In your sorting algorithm, reaching the end of the left or right sequence is perfectly normal, and is very much expected to happen during normal operations. If you benchmark your program with many iterations and sorting over larger sets, the time difference will be big.

Don't convert objects if you don't have to

This conversion to tuple is unnecessary:

return tuple(mergedSeq)

So don't do it. This will work just fine, and slightly faster:

return mergedSeq

Rename unused variables to `_`

Follow PEP8

PEP8 is the official style guide to Python. I recommend to follow it. You can install the pep8 command line tool with pip, it can detect all coding style violations recursively in your project directory.

lang-py

Return to Answer

Extra tricks for speed

Extra tricks for speed

Empty lists are "false"

Exceptions are slow, only use for unexpected events

Don't convert objects if you don't have to

Rename unused variables to _

Follow PEP8

Rename unused variables to `_`