I've tried to implement merge sort in c with varying degrees of success, one was correctly sorting the array but leaking memory, one was neither sorting the array and leaking memory.
The following was an attempt at implementing the algorithm again with these points in mind:
- Memory must be freed in the scope it was allocated
- Pointers must not be reassigned without freeing the memory it was pointing to.
Is there any other good practices that I've missed to follow in my try? Are there any improvements that can be made in terms of readability, memory management, etc?
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void merge(int *sorted_array, const int *left_array, const int *right_array,
const size_t left_size, const size_t right_size) {
size_t l = 0;
size_t r = 0;
size_t i = 0;
while (l < left_size && r < right_size) {
if (left_array[l] >= right_array[r]) {
sorted_array[i++] = right_array[r++];
} else {
sorted_array[i++] = left_array[l++];
}
}
if (l < left_size) {
memcpy(sorted_array + i, left_array + l, sizeof(int) * (left_size - l));
}
if (r < right_size) {
memcpy(sorted_array + i, right_array + r, sizeof(int) * (right_size - r));
}
}
void merge_sort(int *sorted_array, const int *array, const size_t size) {
if (size == 1) {
memcpy(sorted_array, array, sizeof(int));
return;
}
const size_t middle = size / 2;
int *left_sorted = malloc(sizeof(int) * middle);
int *right_sorted = malloc(sizeof(int) * (size - middle));
merge_sort(left_sorted, array, middle);
merge_sort(right_sorted, array + middle, (size - middle));
merge(sorted_array, left_sorted, right_sorted, middle, size - middle);
free(left_sorted);
free(right_sorted);
}
int main(void) {
const int array[] = {1, 0, 2, 9, 3, 8, 4, 7, 5, 6};
int *sorted_array = malloc(sizeof(int) * (sizeof(array) / sizeof(int)));
merge_sort(sorted_array, array, sizeof(array) / sizeof(int));
for (int i = 0; i < sizeof(array) / sizeof(int); i++) {
printf("%d ", sorted_array[i]);
}
printf("\n");
free(sorted_array);
return 0;
}
-
\$\begingroup\$ Welcome to Code Review! "one was neither sorting the array and leaking memory" - was that occurring with the code above? If so, what input leads to that? \$\endgroup\$Sᴀᴍ Onᴇᴌᴀ– Sᴀᴍ Onᴇᴌᴀ ♦2023年11月13日 19:53:55 +00:00Commented Nov 13, 2023 at 19:53
-
\$\begingroup\$ The answer I wrote to a similar question would also apply here. In particular, you might find it easier to eliminate all the intermediate allocations if you use an array-slice datatype. \$\endgroup\$Davislor– Davislor2023年11月14日 23:45:32 +00:00Commented Nov 14, 2023 at 23:45
3 Answers 3
Three things pop out to me:
- This is not a stable sort.
Probably does not matter, as you currently only support integers. - This only support integer sorts.
If you look at the standard qsort you will see it takes a function that allows you to sort elements of any type. Would be nice to extend your function in that direction (stable sort becomes more important). - Memory management.
Looks correct, but not optimal.
The trouble is that you do a lot ofmalloc()
/free()
operations. You could allocate enough space at the top level and re-use that space in all your recursive calls. Just make sure different branches of the recursion use difference sections of the single allocated buffer.
-
\$\begingroup\$ Looks perfectly stable to me. \$\endgroup\$vnp– vnp2023年11月13日 23:55:35 +00:00Commented Nov 13, 2023 at 23:55
-
\$\begingroup\$ @Martin Agree on all 3, very good, points. \$\endgroup\$chux– chux2023年11月14日 03:42:35 +00:00Commented Nov 14, 2023 at 3:42
-
\$\begingroup\$ @vpn
if (left_array[l] >= right_array[r]) { sorted_array[i++] = right_array[r++];
If the values are equal, then it will take it from the right-hand side rather than left. This means you can re-order values that compare equal. Thus it is not a stable sort. \$\endgroup\$Loki Astari– Loki Astari2023年11月14日 17:14:13 +00:00Commented Nov 14, 2023 at 17:14
Is there any other good practices that I've missed to follow in my try?
Sort stability
Agree with @Martin York code has stable sort concerns.
A stable sort keeps the array elements in the same order when the values are equal.
This occurs when 2 array elements differ in binary content, yet compare equal. With type int
, that only happens with near extinct non-2's complement values +0 & -0. Yet it remains a useful consideration for future code development beyond int
.
I think left_array[l] >= right_array[r]
--> left_array[l] > right_array[r]
would fix the stability. Perhaps more is needed.
Future: Name space
merge()
deserves to be static
for at a later time, when merge()
, merge_sort()
are part of a larger code, only merge_sort()
needs public exposure.
Future: Includes
The set posted is OK, yet I'd move the includes needed only for main()
to after merge()
, merge_sort()
.
Missing error checking for malloc()
success
Size 0 almost OK
merge_sort(destination_maybe_as_null, source_maybe_as_null, 0)
looks OK except after fixing malloc()
checks, might need review.
restrict
Code fails if sorted_array
overlaps array
. Use restrict
to indicate caller should not do that.
// void merge_sort(int *sorted_array, const int *array, const size_t size)
void merge_sort(int * restrict sorted_array, const int * restrict array, const size_t size)
Future: Handle in place sort
// Return 0 on success
// Non-zero on allocation failure.
int merge_sort_int(size_t element_count, int a[element_count]);
Minor: sizeof
usage
Consider sizing to the object rather than the type. It is easier to code right, review and maintain.
// int *left_sorted = malloc(sizeof(int) * middle);
int *left_sorted = malloc(sizeof left_sorted[0] * middle);
and 11 other places.
Minor: avoid unneeded type changes
sizeof(array) / sizeof(int)
is a size_t
.
// for (int i = 0; i < sizeof(array) / sizeof(int); i++) {
for (size_t i = 0; i < sizeof(array) / sizeof(int); i++) {
Minor: Simplification
// int *sorted_array = malloc(sizeof(int) * (sizeof(array) / sizeof(int)));
int *sorted_array = malloc(sizeof array);
Soapbox: Not a big fan of const
with small functions
Why void merge_sort(int *sorted_array, const int *array, const size_t size)
instead of full const
with void merge_sort(int *const sorted_array, const int * const array, const size_t size)
as pointers sorted_array
and array
do not change either.
For me, with short functions, I like:
void merge_sort(int *sorted_array, const int *array, size_t size);
It is a style issue. Code to your group's standard.
Code risks invoking undefined behavior:
None of the calls to malloc()
have their return values checked.
#if 0
int *right_sorted = malloc(sizeof(int) * (size - middle));
#else
/* If the type of right_sorted changes, we won't need to make any changes
* to the function call. It is easier to maintain this way.
*/
int *right_sorted = malloc(sizeof *right_sorted * (size - middle);
#endif
Explore related questions
See similar questions with these tags.