I have two questions about my code:
- Are there any performance improvements in this implementation or generally (
register int
?) that I could make? - What could I improve in my coding style?
void mergesortArray(int data[], int amount){
if(amount == 1) return;
//Precomputing values
int sizeint = sizeof(int);
int amountLeft = amount / 2;
int amountRight = (amount % 2 == 0) ? amountLeft : (amountLeft + 1);
//Splitting the array in right and left
int *left = calloc(amountLeft, sizeint);
int *right = calloc(amountRight, sizeint);
if(left == NULL || right == NULL) return;
//Copying the splitted content
memcpy(left, data, amountLeft * sizeint);
memcpy(right, data + amountLeft, amountRight * sizeint);
//Recursive sorting the splitted arrays
mergesortArray(left,amountLeft);
mergesortArray(right,amountRight);
//Merging the numbers
int *pos1 = &left[0];
int *pos2 = &right[0];
int i = 0;
for(i = 0; i < amount; i++) {
if(*pos1 <= *pos2) {
data[i] = *pos1;
if (pos1 == &right[amountRight - 1])
break;
if(pos1 == &left[amountLeft - 1])
pos1 = &right[amountRight - 1];
else
pos1++;
}
else {
data[i] = *pos2;
if(pos2 == &right[amountRight - 1])
pos2 = &left[amountLeft - 1];
else
pos2++;
}
}
}
3 Answers 3
The register
keyword will do nothing. Current day compilers are smart enough to find a good register allocation themselves without you running in its way.
Why calloc
? you initialize the values through memcpy
anyway so no need for the zero initialization, malloc
will work fine. Speaking of you need to free what you allocate otherwise you'll get a memory leak.
You should only need to allocate a single extra array for the merge.
When the amount left to sort is small (amount < 5
) you can switch over to another sort that is more efficient for small numbers like insertion sort.
Finally brace position on your else is a bit odd:
if(*pos1 <= *pos2) {
}
else {
}
I'd expect either the opening braces to also be on its own line or else to be on the same as the closing brace of the if.
-
\$\begingroup\$ What do you mean exactly by "The amount left to sort"? \$\endgroup\$iPh1ps99– iPh1ps992015年11月17日 12:15:14 +00:00Commented Nov 17, 2015 at 12:15
You can optimize calloc
s away by allocating a buffer of equal size and contents as the input array data
at the very beginning of sorting. After that, you keep alternating the roles of the two arrays (data
and the buffer). See what I mean:
static void mergesortArrayImprovedImpl(int* source,
int* target,
size_t offset,
size_t length)
{
if (length < 2) return;
size_t left_range_length = length >> 1;
// Divide.
mergesortArrayImprovedImpl(target,
source,
offset,
left_range_length);
mergesortArrayImprovedImpl(target,
source,
offset + left_range_length,
length - left_range_length);
// Conquer.
size_t left_index = offset;
size_t left_index_bound = offset + left_range_length;
size_t right_index = left_index_bound;
size_t right_index_bound = offset + length;
size_t target_index = offset;
while (left_index < left_index_bound && right_index < right_index_bound)
{
target[target_index++] = source[left_index] < source[right_index] ?
source[left_index++] :
source[right_index++];
}
memcpy(target + target_index,
source + left_index,
sizeof(int) * (left_index_bound - left_index));
memcpy(target + target_index,
source + right_index,
sizeof(int) * (right_index_bound - right_index));
}
void mergesortArrayImproved(int data[], size_t amount)
{
if (amount < 2) return;
int* aux = malloc(sizeof(int) * amount);
memcpy(aux, data, sizeof(int) * amount);
mergesortArrayImprovedImpl(aux, data, 0, amount);
free(aux);
}
I leave profiling the two implementations, however, to you.
As usual, I recommend random pausing as a way to see what takes the most time.
That said (I hate saying "that said" because it implies guessing is a good thing - it's not) here's my strong suspicion:
A call to calloc
only takes a line of code. Does that mean it's cheap?
If you single-step it in the disassembly window you'd better bring lunch, because it will take quite a while.
So I suspect the samples will show that as your "bottleneck".
You don't really need any extra memory (as @ratchet said) except another temporary array of the same size as the original.
You can do your splitting in the original array.
Do the merge into the temporary array, and copy back to the original array.
That way you malloc
only once.
(And you should probably free
it when you're done.)
And also as @ratchet said, when you get down to small amount
, an unrolled bubble sort would probably be faster.
Example:
if (amount == 1){
} else if (amount == 2){
if (data[0] > data[1]) swap(data[0], data[1]);
} else if (amount == 3){
if (data[0] > data[1]) swap(data[0], data[1]);
if (data[0] > data[2]) swap(data[0], data[2]);
if (data[1] > data[2]) swap(data[1], data[2]);
} else if (amount == 4){
if (data[0] > data[1]) swap(data[0], data[1]);
if (data[0] > data[2]) swap(data[0], data[2]);
if (data[0] > data[3]) swap(data[0], data[3]);
if (data[1] > data[2]) swap(data[1], data[2]);
if (data[1] > data[3]) swap(data[1], data[3]);
if (data[2] > data[3]) swap(data[2], data[3]);
} else {
... your code ...
}
BTW: You could say:
int amountRight = amount - amountLeft;
-
\$\begingroup\$ Do you know
switch
? \$\endgroup\$Deduplicator– Deduplicator2015年11月19日 03:13:07 +00:00Commented Nov 19, 2015 at 3:13 -
\$\begingroup\$ @Deduplicator: Sure, but the only purpose of
switch
is to give the compiler permission to generate a jump table, and it only does that if the number of branches is sufficiently large, and the branch codes are compact. Otherwise it just generates an if-ladder. \$\endgroup\$Mike Dunlavey– Mike Dunlavey2015年11月19日 03:15:35 +00:00Commented Nov 19, 2015 at 3:15
register
modifier is that the address of the variable cannot be acquired. I.E. a total waste of effort. \$\endgroup\$calloc()
twice, but there is no calls tofree()
so this algorithm will leak memory like a sieve. \$\endgroup\$free()
for each of the pointers from the two local calls tocalloc()
just before returning from the recursion level \$\endgroup\$calloc()
andfree()
by sorting/merging the sub arrays in place rather than copying them and sorting the copies \$\endgroup\$