In the main loop you're not using elementNumber
for anything. I like to rename these kind of variables to _
. Although, an unused variable is usually a hint that perhaps there's a better way of doing things, like the algorithm suggested by @PranavRaj @PranavRaj.
In the main loop you're not using elementNumber
for anything. I like to rename these kind of variables to _
. Although, an unused variable is usually a hint that perhaps there's a better way of doing things, like the algorithm suggested by @PranavRaj.
In the main loop you're not using elementNumber
for anything. I like to rename these kind of variables to _
. Although, an unused variable is usually a hint that perhaps there's a better way of doing things, like the algorithm suggested by @PranavRaj.
Extra tricks for speed
Here's a slightly faster version of Pranav's solution:
left_i = 0
right_i = 0
left_sorted_len = len(left_sorted_seq)
right_sorted_len = len(right_sorted_seq)
merged = []
append = merged.append
while left_i < left_sorted_len and right_i < right_sorted_len:
smallestInLeft = left_sorted_seq[left_i]
smallestInRight = right_sorted_seq[right_i]
if smallestInLeft < smallestInRight:
append(smallestInLeft)
left_i += 1
else:
append(smallestInRight)
right_i += 1
if left_i < left_sorted_len:
return merged + left_sorted_seq[left_i:]
return merged + right_sorted_seq[right_i:]
What makes this faster:
- Pranav already precalculated the list lenghts (I renamed to
left_sorted_len
andright_sorted_len
, following PEP8) but did not explain: doing so makes the loop faster, becauselen(...)
is only evaluated once. - Notice the caching of
merged.append
method asappend
, to reduce the number of method lookups - Notice the
return
statements, split to two cases to avoid concatenating empty slices
I measured differences in speed using 10 iterations of sorting a random sequence of 1000 numbers. You probably don't have to optimize this to the maximum, that's why I made this the last point of my post, more as additional information rather than a real recommendation.
Extra tricks for speed
Here's a slightly faster version of Pranav's solution:
left_i = 0
right_i = 0
left_sorted_len = len(left_sorted_seq)
right_sorted_len = len(right_sorted_seq)
merged = []
append = merged.append
while left_i < left_sorted_len and right_i < right_sorted_len:
smallestInLeft = left_sorted_seq[left_i]
smallestInRight = right_sorted_seq[right_i]
if smallestInLeft < smallestInRight:
append(smallestInLeft)
left_i += 1
else:
append(smallestInRight)
right_i += 1
if left_i < left_sorted_len:
return merged + left_sorted_seq[left_i:]
return merged + right_sorted_seq[right_i:]
What makes this faster:
- Pranav already precalculated the list lenghts (I renamed to
left_sorted_len
andright_sorted_len
, following PEP8) but did not explain: doing so makes the loop faster, becauselen(...)
is only evaluated once. - Notice the caching of
merged.append
method asappend
, to reduce the number of method lookups - Notice the
return
statements, split to two cases to avoid concatenating empty slices
I measured differences in speed using 10 iterations of sorting a random sequence of 1000 numbers. You probably don't have to optimize this to the maximum, that's why I made this the last point of my post, more as additional information rather than a real recommendation.
Empty lists are "false"
When you check if a list is empty or not, you don't need to write like this:
if len(left_sorted_seq) == 0:
# empty list
You can write simply:
if not left_sorted_seq:
# empty list
Exceptions are slow, only use for unexpected events
This is slower:
try: smallestInLeft = leftSortedSeq[leftPointer] except IndexError: mergedSeq += rightSortedSeq[rightPointer:] return tuple(mergedSeq)
compared to this:
if leftPointer < len(leftSortedSeq):
smallestInLeft = leftSortedSeq[leftPointer]
else:
mergedSeq += rightSortedSeq[rightPointer:]
return tuple(mergedSeq)
Exceptions are good for handling situations that normally should not happen. For example trying to insert a duplicate primary key into a database table is clearly some sort of mistaken operation, and exception handling is appropriate for that. In your sorting algorithm, reaching the end of the left or right sequence is perfectly normal, and is very much expected to happen during normal operations. If you benchmark your program with many iterations and sorting over larger sets, the time difference will be big.
Don't convert objects if you don't have to
This conversion to tuple is unnecessary:
return tuple(mergedSeq)
So don't do it. This will work just fine, and slightly faster:
return mergedSeq
Rename unused variables to _
In the main loop you're not using elementNumber
for anything. I like to rename these kind of variables to _
. Although, an unused variable is usually a hint that perhaps there's a better way of doing things, like the algorithm suggested by @PranavRaj.
Follow PEP8
PEP8 is the official style guide to Python. I recommend to follow it. You can install the pep8
command line tool with pip
, it can detect all coding style violations recursively in your project directory.