The task was as follows:
Write a function that will merge the contents of two sorted (ascending order) arrays of type double values, storing the result in an array output parameter (still in ascending order). The function should not assume that both its input parameter arrays are the same length but can assume that one array does not contain two copies of the same value. The result array should also contain no duplicate values.
Hint: When one of the input arrays has been exhausted, do not forget to copy the remaining data in the other array into the result array. Test your function with cases in which (1) the first array is exhausted first, (2) the second array is exhausted first, and (3) the two arrays are exhausted at the same time (i.e., they end with the same value). Remember that the arrays input to this function must already be sorted.
Please rate my solution.
I would have worked with "sizeof", but this concept has not been introduced in the book from where this task is taken.
#include <stdio.h>
#define LENGTH1 3
#define LENGTH2 4
int merge(int *output_array, int* array1, int* array2, int length1, int length2); //return counter
int main()
{
int array1[LENGTH1] = {-2, -1, 0},
array2[LENGTH2] = {0, 3, 4, 6},
array3[LENGTH1+LENGTH2] = {0};
int counter = merge(array3, array1, array2, LENGTH1, LENGTH2);
for(int i = 0; i < counter; i++)
printf("[%d]%d\n", i, array3[i]);
return 0;
}
int merge(int *output_array, int *array1, int *array2, int length1, int length2)
{
int i = 0, //array1 counter
j = 0, //array2 counter
k = 0, //output counter
counter = 0;
while(i < length1 && j < length2) //break when either array is fully checked
{
if(array1[i] < array2[j])
{
output_array[k] = array1[i];
i++;
k++;
counter++;
}
else if(array1[i] == array2[j]) //if element is same move to the next element in both arrays
{
output_array[k] = array1[i];
i++;
j++;
k++;
counter++;
}
else
{
output_array[k] = array2[j];
j++;
k++;
counter++;
}
}
while(i >= length1 && j < length2) //enter if second array is not fully checked
{
output_array[k] = array2[j];
j++;
k++;
counter++;
}
while(j >= length2 && i < length1) //enter if first array is not fully checked
{
output_array[k] = array1[i];
i++;
k++;
counter++;
}
return counter;
}
7 Answers 7
The implementation does not conform to the specification:
The specification explicitly mentioned "arrays of type double
values", yet you have used int
everywhere.
Extract the common code out of the branches:
In merge()
, these statements:
k++;
counter++;
run in every branch. Consider taking them out and executing them unconditionally.
Place main()
at the end of the source file:
This has the benefit that a separate prototype need not be provided for all the functions before main()
.
Array declarations:
- You need not specify the size when an initializer-list is present. Let the compiler determine the number of elements.
- Whilst it is always a good practice to initialize your variables, zeroing an array, especially a large one, is of no benefit when the next thing the program does it overwrite it.
I'd also call it size
, and not LENGTH
. LENGTH
is usually associated with
strings, not arrays (Okay, John, strings are char
arrays, but you get the point). As an example, C++'s std::array
have a size
member function, but no length
member function.
Choose more descriptive identifiers:
int counter = merge(array3, array1, array2, LENGTH1, LENGTH2);
What is counter
? What is a counter
of? As merge
returns the size of the final sorted array, why not name it to final_size
, total_size
, total_elems
, merged_count
, or something similar? As is, counter
is too vague.
In merge()
, you have k
and counter
, which both hold the same value. Eliminate one of them, and name it to one of the afore-mentioned identifiers.
Use standard exit status codes:
Prefer using EXIT_SUCCESS
and EXIT_FAILURE
from <stdlib.h>
instead of magic numbers like 0, 1, -1, et cetera.
Use const
for arguments that are not modified:
int merge(int *output_array, int* array1, int* array2, int length1, int length2); //return counter
is written better as:
int merge(int *output_array, const int* array1, const int* array2, int length1, int length2); //return counter
This signals that array1
and array2
are not modified by this function. Adding the const
qualifier to length1
and length2
is not useful (some people prefer to albeit, I am not looking for a debate) as these parameter are copies, and any modification made to them will not be reflected in the calling code. In other words, const
is pointless when the argument is passed by value since you will not be modifying the caller's object.
But now the line has become too long and the comment (which is not really useful) is not visible.
We could rewrite it as:
/* Returns the size of the final array. */
int merge(int *output_array,
const int* array1,
const int* array2,
int length1,
int length2);
But why does this return an int
? The specification does not mention anything about a return value.
Use static
keyword for array parameters:
In C99 and above, you can do array[static size]
in function parameters to specify that array
must have at least size
items.
A compiler has then the right to assume that argument is not nullptr
and therefore it could perform some optimizations.
For this program:
int main(void)
{
merge(10, nullptr, 20, nullptr, nullptr);
int array1[5];
int array2[10];
int array3[12];
merge(10, array1, 15, array2, array3);
int *p = nullptr;
merge(5, array1, 10, array2, p);
}
GCC 14.1 outputs:
<source>: In function 'main':
<source>:60:5: warning: argument 2 null where non-null expected [-Wnonnull]
60 | merge(10, nullptr, 20, nullptr, nullptr);
| ^~~~~
<source>:4:8: note: in a call to function 'merge' declared 'nonnull'
4 | size_t merge(size_t size1,
| ^~~~~
<source>:60:5: warning: argument 4 null where non-null expected [-Wnonnull]
60 | merge(10, nullptr, 20, nullptr, nullptr);
| ^~~~~
<source>:4:8: note: in a call to function 'merge' declared 'nonnull'
4 | size_t merge(size_t size1,
| ^~~~~
<source>:60:5: warning: argument 5 null where non-null expected [-Wnonnull]
60 | merge(10, nullptr, 20, nullptr, nullptr);
| ^~~~~
<source>:4:8: note: in a call to function 'merge' declared 'nonnull'
4 | size_t merge(size_t size1,
| ^~~~~
<source>:66:5: warning: 'merge' reading 40 bytes from a region of size 20 [-Wstringop-overread]
66 | merge(10, array1, 15, array2, array3);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<source>:66:5: note: referencing argument 2 of type 'const int[]'
<source>:66:5: warning: 'merge' reading 60 bytes from a region of size 40 [-Wstringop-overread]
<source>:66:5: note: referencing argument 4 of type 'const int[]'
<source>:66:5: note: referencing argument 5 of type 'int[]'
<source>:4:8: note: in a call to function 'merge'
4 | size_t merge(size_t size1,
| ^~~~~
<source>:69:5: warning: argument 5 null where non-null expected [-Wnonnull]
69 | merge(5, array1, 10, array2, p);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<source>:4:8: note: in a call to function 'merge' declared 'nonnull'
4 | size_t merge(size_t size1,
| ^~~~~
<source>:66:5: warning: 'array1' may be used uninitialized [-Wmaybe-uninitialized]
66 | merge(10, array1, 15, array2, array3);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<source>:4:8: note: by argument 2 of type 'const int[static]' to 'merge' declared here
4 | size_t merge(size_t size1,
| ^~~~~
<source>:62:9: note: 'array1' declared here
62 | int array1[5];
| ^~~~~~
<source>:66:5: warning: 'array2' may be used uninitialized [-Wmaybe-uninitialized]
66 | merge(10, array1, 15, array2, array3);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<source>:4:8: note: by argument 4 of type 'const int[static]' to 'merge' declared here
4 | size_t merge(size_t size1,
| ^~~~~
<source>:63:9: note: 'array2' declared here
63 | int array2[10];
| ^~~~~~
<source>:69:5: warning: 'array1' may be used uninitialized [-Wmaybe-uninitialized]
69 | merge(5, array1, 10, array2, p);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<source>:4:8: note: by argument 2 of type 'const int[static]' to 'merge' declared here
4 | size_t merge(size_t size1,
| ^~~~~
<source>:62:9: note: 'array1' declared here
62 | int array1[5];
| ^~~~~~
<source>:69:5: warning: 'array2' may be used uninitialized [-Wmaybe-uninitialized]
69 | merge(5, array1, 10, array2, p);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<source>:4:8: note: by argument 4 of type 'const int[static]' to 'merge' declared here
4 | size_t merge(size_t size1,
| ^~~~~
<source>:63:9: note: 'array2' declared here
63 | int array2[10];
| ^~~~~~
Compiler returned: 0
whilst Clang 18.1.0 outputs:
<source>:60:5: warning: null passed to a callee that requires a non-null argument [-Wnonnull]
60 | merge(10, nullptr, 20, nullptr, nullptr);
| ^ ~~~~~~~
<source>:5:24: note: callee declares array parameter as static here
5 | const int array1[static size1],
| ^ ~~~~~~~~~~~~~~
<source>:60:5: warning: null passed to a callee that requires a non-null argument [-Wnonnull]
60 | merge(10, nullptr, 20, nullptr, nullptr);
| ^ ~~~~~~~
<source>:7:24: note: callee declares array parameter as static here
7 | const int array2[static size2],
| ^ ~~~~~~~~~~~~~~
<source>:60:5: warning: null passed to a callee that requires a non-null argument [-Wnonnull]
60 | merge(10, nullptr, 20, nullptr, nullptr);
| ^ ~~~~~~~
<source>:8:18: note: callee declares array parameter as static here
8 | int output_array[static size1 + size2])
| ^ ~~~~~~~~~~~~~~~~~~~~~~
3 warnings generated.
Compiler returned: 0
and x64-64 icx 202400 outputs:
<source>:60:5: warning: null passed to a callee that requires a non-null argument [-Wnonnull]
60 | merge(10, nullptr, 20, nullptr, nullptr);
| ^ ~~~~~~~
<source>:5:24: note: callee declares array parameter as static here
5 | const int array1[static size1],
| ^ ~~~~~~~~~~~~~~
<source>:60:5: warning: null passed to a callee that requires a non-null argument [-Wnonnull]
60 | merge(10, nullptr, 20, nullptr, nullptr);
| ^ ~~~~~~~
<source>:7:24: note: callee declares array parameter as static here
7 | const int array2[static size2],
| ^ ~~~~~~~~~~~~~~
<source>:60:5: warning: null passed to a callee that requires a non-null argument [-Wnonnull]
60 | merge(10, nullptr, 20, nullptr, nullptr);
| ^ ~~~~~~~
<source>:8:18: note: callee declares array parameter as static here
8 | int output_array[static size1 + size2])
| ^ ~~~~~~~~~~~~~~~~~~~~~~
3 warnings generated.
Compiler returned: 0
GCC detects all 3 wrong use cases, whilst Clang and ICX only detect the first one. A win for GCC.
MSVC does not support this C99 feature (who is suprised?), but MinGW gcc 13.1.0 and MinGW clang 16.0.2 have the same output as GCC and Clang as shown above.
Moreover, this also documents the function better. As the array must have at least size
items, it also means that a nullptr
is not a valid value for this argument.
It also eliminates the need of having to document that the function would invoke undefined behavior if nullptr
s are passed in.
Note that this feature is not usable for functions like memcpy
because arrays of void
are illegal, or for return values. A GCC extension, __attribute__((nonnull))
allows parameters to be marked as non-null. This extension is also supported by Intel's compiler and Clang. A Clang extension allows _Nullable
, _Nonnull
and _Null_unspecified
. This extension is not supported by GCC. A GCC extension, __attribute__((returns_nonnull))
can be used to specify that a function returns a nonnull value. This extension is also supported by Clang and Intel's compiler. I await an _Optional
type in the next revision of the C Standard.
After adding static
, our final prototype becomes:
/* Returns the size of the final merged array. */
size_t merge(size_t size1,
const int array1[static size1],
size_t size2,
const int array2[static size2],
int output_array[static size1 + size2]);
Note that I have renamed length1
and length2
to size1
and size2
, and have also changed their type to size_t
, because array sizes are not signed. I have also changed the return type to size_t
to be consistent.
output_array
has space for at least size1 + size2
for the case that there are no duplicates in the sorted arrays.
I'd also replace the while
loops:
while(i >= length1 && j < length2) //enter if second array is not fully checked
{
output_array[k] = array2[j];
j++;
k++;
counter++;
}
while(j >= length2 && i < length1) //enter if first array is not fully checked
{
output_array[k] = array1[i];
i++;
k++;
counter++;
}
with for
loops:
for (; j < length2; ++j, ++k, ++counter) //enter if second array is not fully checked
{
output_array[k] = array2[j];
}
for (; i < length1; ++i, ++k, ++counter) //enter if first array is not fully checked
{
output_array[k] = array1[i];
}
I find that this expresses the intent more clearly now, and the code is also less verbose.
Though, i
, j
, and k
do not hold any meaning to me. These too should be replaced with descriptive identifiers.
-
1\$\begingroup\$
merge
should return asize_t
too, for consistency. \$\endgroup\$Matthieu M.– Matthieu M.2024年05月28日 09:30:21 +00:00Commented May 28, 2024 at 9:30 -
\$\begingroup\$ @MatthieuM. Code updated. \$\endgroup\$Madagascar– Madagascar2024年05月28日 13:50:47 +00:00Commented May 28, 2024 at 13:50
-
\$\begingroup\$ I would move the length/size parameters to be just after their corresponding arrays instead of at the end for easier reading of the code. I might also change the returned size to be an out parameter using pointer indirection to make it more uniform. \$\endgroup\$penguin359– penguin3592024年05月28日 20:12:29 +00:00Commented May 28, 2024 at 20:12
-
1\$\begingroup\$ Apparently, I mentally checked out when I got to the static section as I've not seen that C99 language feature before. No, your last function prototype looks looks good. I was just recommending that they be bundled together unlike the prototype offered just before that section. You've already applied my comment as I intended if not wrote. :-) \$\endgroup\$penguin359– penguin3592024年05月28日 21:07:24 +00:00Commented May 28, 2024 at 21:07
-
1\$\begingroup\$ Note that MSVC does not support
[static N]
array parameters, and GCC, Clang and ICX all interpret[N]
array parameters as meaningN
or more, so making that change would only make the code less portable. You might defineAT_LEAST
to expand tostatic
, conditionally. \$\endgroup\$Davislor– Davislor2024年06月01日 08:48:16 +00:00Commented Jun 1, 2024 at 8:48
General Comments
It would be helpful if you mentioned what book you are using, and which version of C you are targeting. It might also be helpful if you mentioned whether you were programming on Linux or Windows.
If the program were taking user input, the input arrays should be sorted before they are merged. You would want a function for array sorting.
The program might be easier to read, write and maintain if the merge and the sorting were separate. The merge could be performed first and then the sort could be performed after, these would be two separate functions. In this instance the performance might be better as well.
Variable Declarations
A recommended programming practice to declare the variable as needed. In C the language doesn't provide a default initialization of the variable so variables should be initialized as part of the declaration. For readability and maintainability each variable should be declared and initialized on its own line. The method of variable declarations that you are currently using where the line ends in a comma rather than a semicolon will make the code more difficult to maintain.
C Array Declarations
There is no need to use a size in C array variable declarations if the array is being assigned values in the declaration. In this particular instance there is no need to assign all zeros to the output array since all values will be over written.
Instead of:
int array1[LENGTH1] = {-2, -1, 0},
array2[LENGTH2] = {0, 3, 4, 6},
array3[LENGTH1+LENGTH2] = {0};
This might be better:
double input_1[] = {-2, -1, 0};
double input_2[] = {0, 3, 4, 6};
double output[LENGTH1+LENGTH2];
Code Organization
Function prototypes are very useful in large programs that contain multiple source files, and that in case they will be in header files. In a single file program like this it is better to put the main()
function at the bottom of the file and all the functions that get used in the proper order above main()
. Keep in mind that every line of code written is another line of code where a bug can crawl into the code.
-
\$\begingroup\$ Re: "merge, then sort"... Will then have to scan & remove possible duplicates from combined array. OP's problem statement suggests both the "pre-merge" arrays contain sorted and unique values. \$\endgroup\$Fe2O3– Fe2O32024年05月28日 03:07:48 +00:00Commented May 28, 2024 at 3:07
-
2\$\begingroup\$ Re: "array declarations". OP's version ensures all 3 arrays are compatible. Suggestion to "let compiler count elements" (using
arr[ ] = ...
, then hardwired tokensLENGTHx
risks a mismatch that could lead to buffer overrun and the lawless wilds known as UB. Also, the cost of initialising the destination array is nothing when compared to the cost of NOT practising the habit of initialising variables when they're declared. Bad recommendation, imo... \$\endgroup\$Fe2O3– Fe2O32024年05月28日 03:12:08 +00:00Commented May 28, 2024 at 3:12 -
1\$\begingroup\$ @Fe2O3 This problem would be more real world if sizeof() was used to determine the length of each input array and then calloc() sufficient space. A better test of the code would be if the input array's were unsorted. The comma separated declarations are a bug waiting to happen. The code is already not using at least one location in the output array due. I agree that there is the possibility of overflow in my suggestion, but not as the code was written. \$\endgroup\$2024年05月28日 12:08:16 +00:00Commented May 28, 2024 at 12:08
-
\$\begingroup\$ In fact, it's all a bit of a minefield. OP's version (
arr[ LEN ] = ...
) risks careless definition of too few initialisers in one/both arrays. Result: multiple padding zeros, breaking the rule that there are no duplicate values in source arrays AND all array values are in ascending order... Part of the lot of a beginner is to have to "step on the Lego pieces scattered on the floor in the dark" just like all who've gone before... Cheers! :-) \$\endgroup\$Fe2O3– Fe2O32024年05月28日 23:32:25 +00:00Commented May 28, 2024 at 23:32
The task was to merge the contents of two sorted arrays of type double
.
In your solution you use int
type.
You also use a variable counter
, but in your code k
and counter
always have the same value.
So counter
is not useful, just return k
.
i
, j
& k
is for sure short name, but is kinda hard to follow.
You also know length is always >= 0, so you can use unsigned int
or size_t
for type.
As say by pacmaninbw, is recommended to use size_t
because he will have the proper word size for the system and that will improve performance.
In
while(i >= length1 && j < length2)
and
while(j >= length2 && i < length1)
You don't need to check if i >= length1
or j >= length2
cause that always the case if the other's condition (j < length2
or i < length1
) is true.
The code is clear overall. The comments are quite well used, which is good. Yet, a few things can be improved.
- First, if your manual has already covered
const
, you can use that on most of variables. - The macros are not well suited for this use. Don't get me wrong, they work fine here.
A
array1_length
int variable makes more sense, since you pass the lengths to yourmerge
function. Also theLENGTH
macros refer to the size of object within the code which can lead to more errors when you modify it. Macros are better suited for values that are quite independent (error codes for example). - Are you already familiar with
continue
concept? If so you should them in yourwhile
loop, it simplifies a little the code because you avoid theelse
. It is also clearer for the reader that there is nothing happening after theif
block is completed.
(削除) Advice I - a correctness bug (削除ここまで)
If your merge
works on source array with duplicate array components, it will omit (some?) duplicates. (See Advice III.)
Advice II - idiomatic C
I suggest you define all the array indices as of type size_t
. Also, you could switch the "lonely" tail while
loops to memcpy
. Finally, the array lengths may be passed via size_t
.
(削除) Advice III - alternative implementation (削除ここまで)
#include <stdio.h>
#include <string.h>
#define LENGTH1 3
#define LENGTH2 4
void merge_v2(int* target_array, int* source_array_1, int* source_array_2, size_t len1, size_t len2);
int main()
{
int array1[LENGTH1] = { 0, 2, 4 },
array2[LENGTH2] = { 1, 3, 4, 5 },
array3[LENGTH1 + LENGTH2]; // No need to initialize to zero all the array components: it will be overwritten by the merve_v2.
merge_v2(array3, array1, array2, LENGTH1, LENGTH2);
puts("merge_v2 result:");
for (int i = 0; i < LENGTH1 + LENGTH2; i++)
printf("[%d]%d\n", i, array3[i]);
puts("merge result:");
int array4[LENGTH1] = { 0, 2, 4 },
array5[LENGTH2] = { 1, 3, 4, 5 },
array6[LENGTH1 + LENGTH2];
int cnt = merge(array6, array4, array5, LENGTH1, LENGTH2);
for (int i = 0; i < cnt; i++)
printf("[%d]%d\n", i, array6[i]);
return 0;
}
void merge_v2(int* target_array, int* source_array_1, int* source_array_2, size_t len1, size_t len2)
{
size_t target_index = 0;
size_t source_index_1 = 0;
size_t source_index_2 = 0;
while (source_index_1 != len1 && source_index_2 != len2)
{
// Stable sort, but does not make difference for ints:
target_array[target_index++] =
source_array_2[source_index_2] < source_array_1[source_index_1] ?
source_array_2[source_index_2++] :
source_array_1[source_index_1++];
}
memcpy(&target_array[target_index], &source_array_1[source_index_1], sizeof(int) * (len1 - source_index_1));
memcpy(&target_array[target_index], &source_array_2[source_index_2], sizeof(int) * (len2 - source_index_2));
}
int merge(int* output_array, int* array1, int* array2, int length1, int length2)
{
int i = 0, //array1 counter
j = 0, //array2 counter
k = 0, //output counter
counter = 0;
while (i < length1 && j < length2) //break when either array is fully checked
{
if (array1[i] < array2[j])
{
output_array[k] = array1[i];
i++;
k++;
counter++;
}
else if (array1[i] == array2[j]) //if element is same move to the next element in both arrays
{
output_array[k] = array1[i];
i++;
j++;
k++;
counter++;
}
else
{
output_array[k] = array2[j];
j++;
k++;
counter++;
}
}
while (i >= length1 && j < length2) //enter if second array is not fully checked
{
output_array[k] = array2[j];
j++;
k++;
counter++;
}
while (j >= length2 && i < length1) //enter if first array is not fully checked
{
output_array[k] = array1[i];
i++;
k++;
counter++;
}
return counter;
}
Which prints:
merge_v2 result:
[0]0
[1]1
[2]2
[3]3
[4]4
[5]4
[6]5
merge result:
[0]0
[1]1
[2]2
[3]3
[4]4
[5]5
Edit
After being notified about the specs of this post, I had to rework my answer.
Advice IV - better naming
Instead of i
, j
and k
, I would deploy target_index
, source_index_1
, source_index_2
.
Advice V - size_t
I would declare the return type of merge
as size_t
. Also, I would define the arguments int length1
and int length2
as size_t len1
and size_t len2
.
Advice VI - alternative implementation: take two
#include <stdio.h>
#include <string.h>
#define LENGTH1 3
#define LENGTH2 4
size_t merge_v2(int* target_array, int* source_array_1, int* source_array_2, size_t len1, size_t len2);
int main()
{
int array1[LENGTH1] = { 0, 2, 4 },
array2[LENGTH2] = { 1, 3, 4, 5 },
array3[LENGTH1 + LENGTH2]; // No need to initialize to zero all the array components: it will be overwritten by the merve_v2.
size_t counter_v2 = merge_v2(array3, array1, array2, LENGTH1, LENGTH2);
puts("merge_v2 result:");
for (int i = 0; i < counter_v2; i++)
printf("[%d]%d\n", i, array3[i]);
puts("merge result:");
int array4[LENGTH1] = { 0, 2, 4 },
array5[LENGTH2] = { 1, 3, 4, 5 },
array6[LENGTH1 + LENGTH2] = { 0 };
int cnt = merge(array6, array4, array5, LENGTH1, LENGTH2);
for (int i = 0; i < cnt; i++)
printf("[%d]%d\n", i, array6[i]);
return 0;
}
size_t merge_v2(int* target_array, int* source_array_1, int* source_array_2, size_t len1, size_t len2)
{
size_t target_index = 0;
size_t source_index_1 = 0;
size_t source_index_2 = 0;
size_t counter = 0;
while (source_index_1 != len1 && source_index_2 != len2)
{
// Stable sort, but does not make difference for ints:
if (source_array_1[source_index_1] < source_array_2[source_index_2])
{
target_array[target_index++] = source_array_1[source_index_1++];
}
else if (source_array_1[source_index_1] > source_array_2[source_index_2])
{
target_array[target_index++] = source_array_2[source_index_2++];
}
else
{
target_array[target_index++] = source_array_1[source_index_1++];
source_index_2++;
}
counter++;
}
counter += len1 - source_index_1;
counter += len2 - source_index_2;
memcpy(&target_array[target_index], &source_array_1[source_index_1], sizeof(int) * (len1 - source_index_1));
memcpy(&target_array[target_index], &source_array_2[source_index_2], sizeof(int) * (len2 - source_index_2));
return counter;
}
int merge(int* output_array, int* array1, int* array2, int length1, int length2)
{
int i = 0, //array1 counter
j = 0, //array2 counter
k = 0, //output counter
counter = 0;
while (i < length1 && j < length2) //break when either array is fully checked
{
if (array1[i] < array2[j])
{
output_array[k] = array1[i];
i++;
k++;
counter++;
}
else if (array1[i] == array2[j]) //if element is same move to the next element in both arrays
{
output_array[k] = array1[i];
i++;
j++;
k++;
counter++;
}
else
{
output_array[k] = array2[j];
j++;
k++;
counter++;
}
}
while (i >= length1 && j < length2) //enter if second array is not fully checked
{
output_array[k] = array2[j];
j++;
k++;
counter++;
}
while (j >= length2 && i < length1) //enter if first array is not fully checked
{
output_array[k] = array1[i];
i++;
k++;
counter++;
}
return counter;
}
... which outputs:
merge_v2 result:
[0]0
[1]1
[2]2
[3]3
[4]4
[5]5
merge result:
[0]0
[1]1
[2]2
[3]3
[4]4
[5]5
Overall, I would rate your solution as adequate - and certainly good for someone who has not yet reached sizeof
. This is a question I use in job interviews and in a short, stressful period, something of about your level is the most that can be expected even from fairly experienced programmers.
Harith has already made many good comments, though I disagree about using static as that makes the code no longer valid as C++. I find that it is common to use both languages and therefore handy if your simple utilities are compatible with both. But I would put the array and its size adjacent in the parameter list as they go together. And do not replace "length" with "size". The number of items is a length, the storage space they occupy (usually in bytes) is a size. Many bugs have been caused by confusing these two. It is perfectly fine and idiomatic to return 0 from main().
Naming is important. Each name should be a good mnemonic (in contrast to being an attempt to avoid having to look at the declaration or documentation). In this code array1
can be considerably improved. It tells us that it is an array, which is not useful as when we see in each use (e.g. array1[i]
) that it is obviously an array. What is much more useful is a reminder of its purpose, so the names of the arrays in merge() could be in1
, in2
, and out
.
It's fine to use a very short name like i
in a very short area of code (such as you have main()), but in merge(), when we see it being used it does not really remind us of its meaning. I would suggest pos1
, pos2
, and outpos
instead of i
, j
, and k
. The name counter
is too generic. What is it counting? It would be better named something like numout
or even nout
as using 'n' like that is a common idiom. Of course, in this case (as another answer has mentioned) it is redundant as it always hold the same value as k
.
A valuable principle is Don't Repeat Yourself (DRY). In the first while loop, the code tests whether both arrays have elements, and when that test fails, these tests get repeated afterwards. It would probably be better to write the loop as a forever (for (;;)
) loop with clauses to cover the possibilities. This gives five cases in the loop instead of three.
Moving on to more advanced matters, since this is written in C, it can use the language more idiomatically to make the code shorter and clearer. Instead of
out[outpos] = in1[in1pos];
in1pos++;
outpos++;
we can write:
out[outpos++] = in1[in1pos++];
This has the added advantage in this function that the case of consuming from both arrays stands out more as the other cases do not have isolated increments.
And for even more advanced use of C, remember that, while the data is stored in an array, the merge() function gets pointers which are variables and can be advanced as elements are consumed. A first attempt at this would replace:
out[outpos++] = in1[in1pos++];
with
*out++ = *in1++;
--length1;
++outlen;
This removes the need for separate variable to hold the current position in each array, but leaves pairs of variables (e.g. out
and outlen
) which must always be modified together and it is easy to forget one when doing the other. This can be avoided by using only pointers. At the start of the function declare:
int *in1end = in1 + length1;
and replace the test for exhaustion:
if (in1pos < length1) {
with
if (in1 < in1end) {
In this case, if you want to return the length of the result (which you do, despite it not being in the requirements, since it's a quantity that cannot be otherwise known to the caller), you need to keep a record of the initial value of out
. Maybe make the parameter be called output
and declare:
int *out = output; // current position in the output
returning:
return out - output;
Putting everything together, gives the following:
#include <stddef.h>
#include <stdio.h>
#define LENGTH1 3
#define LENGTH2 4
size_t merge(int *out, const int *in1, size_t len1, const int *in2, size_t len2);
int main()
{
int test_in1[LENGTH1] = {-2, -1, 0},
test_in2[LENGTH2] = {0, 3, 4, 6},
test_out[LENGTH1+LENGTH2] = {0};
int outlen = merge(test_out, test_in1, LENGTH1, test_in2, LENGTH2);
for(int i = 0; i < outlen; i++) {
printf("[%d]%d\n", i, test_out[i]);
}
return 0;
}
size_t merge(int *output, const int *in1, size_t len1, const int *in2, size_t len2)
{
int *out = output; // current position in output
const int *const in1end = in1 + len1;
const int *const in2end = in2 + len2;
for (;;) {
if (in1 == in1end) {
while (in2 < in2end) *out++ = *in2++;
return out - output;
} else if (in2 == in2end) {
while (in1 < in1end) *out++ = *in1++;
return out - output;
} else if (*in1 < *in2) {
*out++ = *in1++;
} else if (*in1 > *in2) {
*out++ = *in2++;
} else { // identical elements
*out++ = *in1++;
in2++;
}
}
}
-
\$\begingroup\$ Curious why you included
<stddef.h>
. C99 (N1256), C11 (N1570) and C17 (N2176) section 7.21.1 all confirm thatstdio.h
declaressize_t
. \$\endgroup\$Madagascar– Madagascar2024年06月02日 23:04:48 +00:00Commented Jun 2, 2024 at 23:04 -
\$\begingroup\$ For C++ and MSVC, one can define
AT_LEAST
to expand tostatic
, conditionally, as suggested by @Davislor under my answer. \$\endgroup\$Madagascar– Madagascar2024年06月02日 23:06:39 +00:00Commented Jun 2, 2024 at 23:06 -
1\$\begingroup\$ stddef,h was originally instinct, but on reflection, if you put the merge function first in the file (with stddef included) it is self-contained. Then you can append #ifdef TEST, the stdio include, and main() with the test cases, giving you a minimal utility when built normally, but a complete self testing unit when built with -DTEST \$\endgroup\$c19– c192024年06月06日 01:33:56 +00:00Commented Jun 6, 2024 at 1:33
You've got a lot of separate branches at separate levels of indentation. Reversing the approach to "fill an array from two sorted arrays" makes this code a lot more readable, in my opinion.
#include <stdio.h>
#include <stdbool.h>
#define LENGTH1 3
#define LENGTH2 4
int merge(int *it_out, const int* it1, const int* it2, const int length1, const int length2); //return counter
int main()
{
const int array1[LENGTH1] = {-2, -1, 0};
const int array2[LENGTH2] = {0, 3, 4, 6};
int array3[LENGTH1+LENGTH2] = {0};
int counter = merge(array3, array1, array2, LENGTH1, LENGTH2);
for(int i = 0; i < counter; i++)
printf("[%d]%d\n", i, array3[i]);
return 0;
}
int merge(int *it_out, const int* it1, const int* it2, const int length1, const int length2)
{
const int* const it1_end = it1+length1;
const int* const it2_end = it2+length2;
int* const it_out_end = it_out + length1 + length2;
while (true) {
const bool it1_has_value = it1 != it1_end;
const bool it2_has_value = it2 != it2_end;
if (!it1_has_value && !it2_has_value) {
break;
}
if (!it1_has_value) {
*it_out = *it2;
it2++;
}
else if (!it2_has_value) {
*it_out = *it1;
it1++;
}
else if (*it1 == *it2) {
*it_out = *it1;
it1++;
it2++;
}
else if (*it1 < *it2) {
*it_out = *it1;
it1++;
}
else {
*it_out = *it2;
it2++;
}
it_out++;
}
return it_out_end - it_out;
}
The other comments of actually using double and using size_t still stand.
Also, addding some consts, especially in your function signature, can help catching errors later. It will help you make sure you don't accidentally alter input arrays.
-
\$\begingroup\$ This solution doesn't eliminate duplicates. I think it's also quite efficient in the case that one array is consumed long before the other array. It does a lot of work that is repetitive once one array is consumed. \$\endgroup\$Sinc– Sinc2024年05月29日 14:03:21 +00:00Commented May 29, 2024 at 14:03
-
\$\begingroup\$ @Sinc thanks, I misread the no duplicates. Fixed that. And yes, it does a lot of extra work if the array lengths are heavily skewed. But often that matters way less than the readability, especially when people start out in this field. Prevent premature optimization. \$\endgroup\$Daniël van den Berg– Daniël van den Berg2024年05月30日 11:59:42 +00:00Commented May 30, 2024 at 11:59
-
1\$\begingroup\$ Y'know Daniel... Two things... 1) Not many readers actually copy/paste/compile the code posted as answers. Sometimes I do. This "fixed" code produces a very unremarkable result... You may wish to try it yourself. 2) This is "Code Review", not "Stack Overflow". The focus is to review the OP's code, providing guidance; not a forum to post your own version of a solution to the OP's problem statement. Sorry about my DV, but, y'know... \$\endgroup\$Fe2O3– Fe2O32024年05月30日 13:25:51 +00:00Commented May 30, 2024 at 13:25
sizeof
useful here? To eliminate theLENGTH
macros? Or do you mean something else? \$\endgroup\$double
and notint
, consider the pitfalls of assuming equality between them. For example: floating-point-gui.de/errors/comparison It may be useful for your test cases to include floating points rather than integers. \$\endgroup\$