Merge Sort
Merge Sort algorithm is a general-purpose comparison-based sorting algorithm. Most implementations produce a stable sort, in which the order of equal elements is preserved.
Just for practicing, I've implemented the merge sort algorithm copied below and I'd appreciate it if you'd like to review any part of it for any small or big changes/improvements.
Code
from typing import List, TypeVar
import random
from scipy import stats
T = TypeVar("T")
def recursive_merge_sort(input_list: List[T]) -> List[T]:
"""
Recursive Merge Sort
-----------------------------------------------
Merge Sort is a Divide and Conquer algorithm.
It divides the input array in two halves,
calls itself for the two halves and then merges the two sorted halves.
The merge function is used for merging two halves.
Attributes:
- Time Complexity: O(N*Log N)
- Space Complexity: O(N)
- Stable Sort
"""
# Assigns the length of input list.
size_of_input_list = len(input_list)
# Creates a new list for sorted output list.
temp_output_list = [None] * size_of_input_list
# Sorts the input list.
recursive_sort(input_list, temp_output_list, 0, size_of_input_list - 1)
return temp_output_list
def recursive_sort(input_list: List[T], temp_output_list: List[T], first_index: int, last_index: int) -> List[T]:
"""
This method recursively sorts the divided sublists
"""
# Stops the recursion if there is only one element in the sublists.
if first_index == last_index:
return
# Otherwise, calculates the middle point.
mid_index = (first_index + last_index) // 2
# Then, calls the two sublists recursively.
recursive_sort(input_list, temp_output_list, first_index, mid_index)
recursive_sort(input_list, temp_output_list, mid_index + 1, last_index)
# Merges the two sublists.
merge_sublists(input_list, temp_output_list, first_index,
mid_index, mid_index + 1, last_index)
# Copies the sorted part into the temp_output_list.
copy_list(input_list, temp_output_list, first_index, last_index)
def merge_sublists(input_list: List[T], temp_output_list: List[T],
first_start_index: int, first_end_index: int,
second_start_index: int, second_end_index: int) -> List[T]:
"""
This method merges the two sorted sublists with three simple loops:
- If both sublists 1 and 2 have elements to be placed in the output merged list
- If sublists 1 has some elements left to be placed in the output merged list
- If sublists 2 has some elements left to be placed in the output merged list
e.g., sublist 1 [1, 3, 5, 7, 9]
e.g., sublist 2 [2, 4, 6, 8, 10, 12, 14]
- First while loop generates: [1, 2, 3, 4, 5, 6 , 7, 8, 9, 10]
- Second while loop just passes, since no elements left from the first sublist.
- Third while loop generates: [1, 2, 3, 4, 5, 6 , 7, 8, 9, 10, 12, 14]
"""
i = first_start_index
j = second_start_index
k = first_start_index
while i <= first_end_index and j <= second_end_index:
if input_list[i] <= input_list[j]:
temp_output_list[k] = input_list[i]
i += 1
else:
temp_output_list[k] = input_list[j]
j += 1
k += 1
while i <= first_end_index:
temp_output_list[k] = input_list[i]
i += 1
k += 1
while j <= second_end_index:
temp_output_list[k] = input_list[j]
j += 1
k += 1
def copy_list(input_list: List[T], temp_output_list: List[T], first_index: int, end_index: int) -> List[T]:
for i in range(first_index, end_index+1):
input_list[i] = temp_output_list[i]
if __name__ == "__main__":
# Creates a dash line string and a new line for in between the tests.
delimiter = "-" * 70 + "\n"
# Generates a random integer list.
TEST_LIST_INTEGER = random.sample(range(-100, 100), 15) * 3
print(f"""The unsorted integer array is:
{TEST_LIST_INTEGER}""")
print(delimiter)
# Generates a random float list.
TEST_LIST_FLOAT = stats.uniform(0, 100).rvs(45)
print(f"""The unsorted float array is:
{TEST_LIST_FLOAT}""")
print(delimiter)
# Sample float/integer test list for input.
INTEGER_FLOAT_INPUT = list(TEST_LIST_INTEGER + TEST_LIST_FLOAT)
# Sample float/integer test list for output.
INTEGER_FLOAT_OUTPUT = sorted(INTEGER_FLOAT_INPUT)
sorting_algorithms = [
("Recursive Merge Sort", recursive_merge_sort)
]
# Testing
for description, func in sorting_algorithms:
if (func(INTEGER_FLOAT_INPUT.copy()) == INTEGER_FLOAT_OUTPUT):
print(f"{description} Test was Successful.")
else:
print(f"{description} Test was not Successful.")
print(f"""{description} (Integer):
{func(TEST_LIST_INTEGER.copy())}""")
print(f"""{description} (Float):
{func(TEST_LIST_FLOAT.copy())}""")
print(delimiter)
References
1 Answer 1
Type hints and the truth
def recursive_sort(input_list: List[T], temp_output_list: List[T], first_index: int, last_index: int) -> List[T]:
This function actually returns None
. Same with merge_sublists
and copy_list
.
Loop like a native
while i <= first_end_index:
temp_output_list[k] = input_list[i]
i += 1
k += 1
should at least be using enumerate
on input_list
, which will give you both i
and the item from input_list
. If you use the index from enumerate
as an offset, you can avoid using k
as well.
All of that aside, you don't need to loop. Just do slice assignment; something like
temp_output_list[k: k+first_end_index-i] = input_list[i: i+first_end_index]
Many of your loops can use slices like that.
Variable names
No need to capitalize INTEGER_FLOAT_INPUT
and similar lists. They aren't constants.
-
\$\begingroup\$ Note that
input_list[i: i+first_end_index]
actually creates a copy of the slice of the input list. So it could sometimes lead to overhead in time / space usage. Meanwhile, usingitertools.islice
does not make copies but it would traverse from the beginning of the list until the starting index, therefore also adds overhead on running time. \$\endgroup\$GZ0– GZ02019年09月28日 15:19:51 +00:00Commented Sep 28, 2019 at 15:19
Explore related questions
See similar questions with these tags.