I have written logic to find the intersection of two arrays. Please review my logic and guide me if my logical approach is not correct.
#include<stdio.h>
void intersect(int a1[][2], int a2[][2],int row, int col)
{
int i = 0, j= 0, x = 0, y = 0;
for(i = 0; i <row ; i++)
{
for(j = 0 ; j < col ; j++)
{
for(x = 0; x < row ; x++)
{
for(y = 0; y < col ; y++)
{
if(a1[i][j] == a2[x][y])
printf("%d\t",a1[i][j]);
}
}
}
}
}
int main()
{
int arr1[2][2]={{2,5},{6,8}};
int arr2[2][2]={{1,2},{8,8}};
int row = (sizeof(arr1)/sizeof(arr1[0]));
int col = (sizeof(arr1[0])/sizeof(arr1[0][1]));
intersect(arr1,arr2,row,col);
}
2 Answers 2
Several useful suggestions have been made by toolic, but here are a few more.
int
vs. size_t
When dealing with sizes of arrays, it's more appropriate to use size_t
than int
as a type. Remember to use the correct %zu
format specifier if printing these values.
Algorithm
For the data sizes you're dealing with, your approach is unlikely to have significant issues. However, as you scale your data, a loop within a loop within a loop within a loop within a loop is O(n4) complexity. This is not ideal at all.
row | col | comparisons |
---|---|---|
2 | 2 | 16 |
2 | 3 | 36 |
4 | 4 | 256 |
10 | 10 | 10,000 |
50 | 50 | 6,250,000 |
100 | 100 | 100,000,000 |
Below I've demonstrated a way to improve this substantially. We flatten each array into dynamically allocated storage and then use qsort
to sort these arrays. The sorting has complexity O(n * log n). In terms of complexity, constants factor out, so it doesn't really matter that we're doing it twice (once for each input array).
Once we have these sorted, we only need to do a linear or O(n) complexity loop through them to find intersected values. We'll use two indices into the arrays and use them to advance separately through the two sorted arrays.
A side benefit of this approach is that we don't intersect more than once. intersecting {2, 5, 6, 8}
and {1, 2, 8, 8}
will only yield {2, 8}
and can never yield {2, 8, 8}
because the 8 is present twice in one and only once in the other. We would likely expect the same result whether calling intersect(arr1, arr2, row, col)
or intersect(arr2, arr1, row, col)
#include <stdio.h>
#include <stdlib.h>
int intcmp(const void *a, const void *b) {
return *(int *)a - *(int *)b;
}
void intersect(int a[][2], int b[][2], size_t row, size_t col) {
size_t overall_size = row * col;
int *c = malloc(overall_size * sizeof(int));
int *d = malloc(overall_size * sizeof(int));
if (!c || !d) {
fprintf(stderr, "Memory allocations failure!\n");
return;
}
for (size_t i = 0; i < row; i++) {
for (size_t j = 0; j < col; j++) {
c[i * col + j] = a[i][j];
d[i * col + j] = b[i][j];
}
}
qsort(c, overall_size, sizeof(int), intcmp);
qsort(d, overall_size, sizeof(int), intcmp);
size_t i = 0;
size_t j = 0;
while (i < overall_size && j < overall_size) {
if (c[i] == d[j]) {
printf("Common element: %d\n", c[i]);
i++;
j++;
}
else if (c[i] < d[j]) {
i++;
}
else {
j++;
}
}
free(c);
free(d);
}
int main(void) {
int arr1[2][2] = {{2, 5}, {6, 8}};
int arr2[2][2] = {{1, 2}, {8, 8}};
size_t row = sizeof(arr1) / sizeof(arr1[0]);
size_t col = sizeof(arr1[0]) / sizeof(arr1[0][1]);
intersect(arr1, arr2, row, col);
}
Hard-coding
You've hard-coded 2
as the column dimension of your input arrays, but the function also takes a column parameter. This is less than ideal.
By changing the parameter order we can side step this issue.
void intersect(size_t row, size_t col, int a[row][col], int b[row][col])
We could now easily change main
to:
int main(void) {
int arr1[2][3] = {{2, 5, 1}, {6, 8, 3}};
int arr2[2][3] = {{1, 2, 9}, {8, 8, 9}};
size_t row = sizeof(arr1) / sizeof(arr1[0]);
size_t col = sizeof(arr1[0]) / sizeof(arr1[0][1]);
intersect(row, col, arr1, arr2);
}
Array initialization
Speaking of main
, there's no need to specify the row dimension when declaring your arrays in this case. You could simply write:
int main(void) {
int arr1[][3] = {{2, 5, 1}, {6, 8, 3}};
int arr2[][3] = {{1, 2, 9}, {8, 8, 9}};
size_t row = sizeof(arr1) / sizeof(arr1[0]);
size_t col = sizeof(arr1[0]) / sizeof(arr1[0][1]);
intersect(row, col, arr1, arr2);
}
Your compiler will deduce the correct dimension. This does mean that if you get it wrong, your compiler also won't slap you on the wrist if the row dimensions of the two arrays don't match.
-
1\$\begingroup\$ You could always
static_assert(row == sizeof arr2 / sizeof *arr2)
etc. if you want your compiler to help diagnose dimensional mismatch. \$\endgroup\$Toby Speight– Toby Speight2025年06月06日 07:04:01 +00:00Commented Jun 6 at 7:04
Layout
The indentation of the code in the intersect
function is inconsistent
and misleading. I used a free online C code formatter to add consistency.
The new indentation is shown below. The formatter also adds consistency for
spacing around operators.
Documentation
The code should have a comment block at the top to summarize its purpose. For example:
/*
Calculate the intersection of two arrays and print the results.
*/
The intersect
function should have its own comment describing the
types of the inputs and the output. The comment can also
describe the algorithm.
Simpler
This line:
int i = 0, j = 0, x = 0, y = 0;
is simpler as:
int i, j, x, y;
Since the for
loops initialize these variables, there is no need to
initialize them in the declaration line, too.
In fact, the iterator variables can be locally scoped to each for
loop, as will be shown below.
Output
In the main
function, consider printing a newline after the call
to intersect
:
printf("\n");
Otherwise, when I run the code in my Unix shell, the output is merged with the the next shell prompt.
Also, consider giving the intersect
function a return type
other than void
. It could return a single formatted string or
a list of numeric values to allow the caller to format as desired.
Here is new code with some of the suggestions above:
#include<stdio.h>
void intersect(int a1[][2], int a2[][2], int row, int col) {
for (int i = 0; i < row; i++) {
for (int j = 0; j < col; j++) {
for (int x = 0; x < row; x++) {
for (int y = 0; y < col; y++) {
if (a1[i][j] == a2[x][y]) {
printf("%d\t", a1[i][j]);
}
}
}
}
}
}
int main() {
int arr1[2][2] = {
{
2,
5
},
{
6,
8
}
};
int arr2[2][2] = {
{
1,
2
},
{
8,
8
}
};
int row = (sizeof(arr1) / sizeof(arr1[0]));
int col = (sizeof(arr1[0]) / sizeof(arr1[0][1]));
intersect(arr1, arr2, row, col);
printf("\n");
}
a1
appears ina2
. Just convert the arrays to lists of numbers, sort them, and proceed. \$\endgroup\$intersect()
to modify in place. As usual, more memory allows for faster search. \$\endgroup\$