Calculate intersection of two arrays

Question 1

I have written logic to find the intersection of two arrays. Please review my logic and guide me if my logical approach is not correct.

#include<stdio.h>
void intersect(int a1[][2], int a2[][2],int row, int col) 
{
 int i = 0, j= 0, x = 0, y = 0;
for(i = 0; i <row ; i++)
{
 for(j = 0 ; j < col ; j++)
 {
 for(x = 0; x < row ; x++)
 {
 for(y = 0; y < col ; y++)
 {
 if(a1[i][j] == a2[x][y]) 
 printf("%d\t",a1[i][j]);
 }
 }
 }
}
}
int main()
{
 int arr1[2][2]={{2,5},{6,8}};
 int arr2[2][2]={{1,2},{8,8}};
 int row = (sizeof(arr1)/sizeof(arr1[0]));
 int col = (sizeof(arr1[0])/sizeof(arr1[0][1]));
 intersect(arr1,arr2,row,col);
}

Question 2

The complexity of your approach is quite significant. You can improve it by, for example, sorting the two arrays first

Question 3

@BillalBEGUERADJ A "sorting the two arrays first " is not trivial in this case . The array elements themselves are themselves arrays. It is akin to sorting complex numbers and then matching either real,imaginary part.

Question 4

@chux "sorting the two arrays first" is exactly trivial in this case. The code simply searches for any place where a value in a1 appears in a2. Just convert the arrays to lists of numbers, sort them, and proceed.

Question 5

@AustinHastings Fair enough. I guess it becomes a question of how much space to replicate the array - or allow intersect() to modify in place. As usual, more memory allows for faster search.

Question 6

Could be nice to have a definition of an 'intersection'. For example, what I'd like to know is whether for repeated values should they be paired one-to-one or each-to-each? For example should a pair and three of the same digit make a two-pairs or a six-pairs 'intersection'? The code given would go the latter way, but I think I would rather prefer the former...

Question 7

Several useful suggestions have been made by toolic, but here are a few more.

`int` vs. `size_t`

When dealing with sizes of arrays, it's more appropriate to use size_t than int as a type. Remember to use the correct %zu format specifier if printing these values.

Algorithm

For the data sizes you're dealing with, your approach is unlikely to have significant issues. However, as you scale your data, a loop within a loop within a loop within a loop within a loop is O(n⁴) complexity. This is not ideal at all.

row	col	comparisons
2	2	16
2	3	36
4	4	256
10	10	10,000
50	50	6,250,000
100	100	*100,000,000*

Below I've demonstrated a way to improve this substantially. We flatten each array into dynamically allocated storage and then use qsort to sort these arrays. The sorting has complexity O(n * log n). In terms of complexity, constants factor out, so it doesn't really matter that we're doing it twice (once for each input array).

Once we have these sorted, we only need to do a linear or O(n) complexity loop through them to find intersected values. We'll use two indices into the arrays and use them to advance separately through the two sorted arrays.

A side benefit of this approach is that we don't intersect more than once. intersecting {2, 5, 6, 8} and {1, 2, 8, 8} will only yield {2, 8} and can never yield {2, 8, 8} because the 8 is present twice in one and only once in the other. We would likely expect the same result whether calling intersect(arr1, arr2, row, col) or intersect(arr2, arr1, row, col)

#include <stdio.h>
#include <stdlib.h>
int intcmp(const void *a, const void *b) {
 return *(int *)a - *(int *)b;
}
void intersect(int a[][2], int b[][2], size_t row, size_t col) {
 size_t overall_size = row * col;
 int *c = malloc(overall_size * sizeof(int));
 int *d = malloc(overall_size * sizeof(int));
 if (!c || !d) {
 fprintf(stderr, "Memory allocations failure!\n");
 return;
 }
 for (size_t i = 0; i < row; i++) {
 for (size_t j = 0; j < col; j++) {
 c[i * col + j] = a[i][j];
 d[i * col + j] = b[i][j];
 }
 }
 qsort(c, overall_size, sizeof(int), intcmp);
 qsort(d, overall_size, sizeof(int), intcmp);
 size_t i = 0;
 size_t j = 0;
 while (i < overall_size && j < overall_size) {
 if (c[i] == d[j]) {
 printf("Common element: %d\n", c[i]);
 i++;
 j++;
 }
 else if (c[i] < d[j]) {
 i++;
 }
 else {
 j++;
 }
 }
 free(c);
 free(d);
}
int main(void) {
 int arr1[2][2] = {{2, 5}, {6, 8}};
 int arr2[2][2] = {{1, 2}, {8, 8}};
 size_t row = sizeof(arr1) / sizeof(arr1[0]);
 size_t col = sizeof(arr1[0]) / sizeof(arr1[0][1]);
 intersect(arr1, arr2, row, col);
}

Hard-coding

You've hard-coded 2 as the column dimension of your input arrays, but the function also takes a column parameter. This is less than ideal.

By changing the parameter order we can side step this issue.

void intersect(size_t row, size_t col, int a[row][col], int b[row][col])

We could now easily change main to:

int main(void) {
 int arr1[2][3] = {{2, 5, 1}, {6, 8, 3}};
 int arr2[2][3] = {{1, 2, 9}, {8, 8, 9}};
 size_t row = sizeof(arr1) / sizeof(arr1[0]);
 size_t col = sizeof(arr1[0]) / sizeof(arr1[0][1]);
 intersect(row, col, arr1, arr2);
}

Array initialization

Speaking of main, there's no need to specify the row dimension when declaring your arrays in this case. You could simply write:

int main(void) {
 int arr1[][3] = {{2, 5, 1}, {6, 8, 3}};
 int arr2[][3] = {{1, 2, 9}, {8, 8, 9}};
 size_t row = sizeof(arr1) / sizeof(arr1[0]);
 size_t col = sizeof(arr1[0]) / sizeof(arr1[0][1]);
 intersect(row, col, arr1, arr2);
}

Your compiler will deduce the correct dimension. This does mean that if you get it wrong, your compiler also won't slap you on the wrist if the row dimensions of the two arrays don't match.

Question 8

You could always static_assert(row == sizeof arr2 / sizeof *arr2) etc. if you want your compiler to help diagnose dimensional mismatch.

Question 9

Layout

The indentation of the code in the intersect function is inconsistent and misleading. I used a free online C code formatter to add consistency. The new indentation is shown below. The formatter also adds consistency for spacing around operators.

Documentation

The code should have a comment block at the top to summarize its purpose. For example:

/*
Calculate the intersection of two arrays and print the results.
*/

The intersect function should have its own comment describing the types of the inputs and the output. The comment can also describe the algorithm.

Simpler

This line:

int i = 0, j = 0, x = 0, y = 0;

is simpler as:

int i, j, x, y;

Since the for loops initialize these variables, there is no need to initialize them in the declaration line, too.

In fact, the iterator variables can be locally scoped to each for loop, as will be shown below.

Output

In the main function, consider printing a newline after the call to intersect:

 printf("\n");

Otherwise, when I run the code in my Unix shell, the output is merged with the the next shell prompt.

Also, consider giving the intersect function a return type other than void. It could return a single formatted string or a list of numeric values to allow the caller to format as desired.

Here is new code with some of the suggestions above:

#include<stdio.h>
void intersect(int a1[][2], int a2[][2], int row, int col) {
 for (int i = 0; i < row; i++) {
 for (int j = 0; j < col; j++) {
 for (int x = 0; x < row; x++) {
 for (int y = 0; y < col; y++) {
 if (a1[i][j] == a2[x][y]) {
 printf("%d\t", a1[i][j]);
 }
 }
 }
 }
 }
}
int main() {
 int arr1[2][2] = {
 {
 2,
 5
 },
 {
 6,
 8
 }
 };
 int arr2[2][2] = {
 {
 1,
 2
 },
 {
 8,
 8
 }
 };
 int row = (sizeof(arr1) / sizeof(arr1[0]));
 int col = (sizeof(arr1[0]) / sizeof(arr1[0][1]));
 intersect(arr1, arr2, row, col);
 printf("\n");
}

Chris Chris 3,3191 gold badge7 silver badges29 bronze badges · Answer 1 · 2025-06-04 17:31:27Z

Several useful suggestions have been made by toolic, but here are a few more.

`int` vs. `size_t`

When dealing with sizes of arrays, it's more appropriate to use size_t than int as a type. Remember to use the correct %zu format specifier if printing these values.

Algorithm

For the data sizes you're dealing with, your approach is unlikely to have significant issues. However, as you scale your data, a loop within a loop within a loop within a loop within a loop is O(n⁴) complexity. This is not ideal at all.

row	col	comparisons
2	2	16
2	3	36
4	4	256
10	10	10,000
50	50	6,250,000
100	100	*100,000,000*

Below I've demonstrated a way to improve this substantially. We flatten each array into dynamically allocated storage and then use qsort to sort these arrays. The sorting has complexity O(n * log n). In terms of complexity, constants factor out, so it doesn't really matter that we're doing it twice (once for each input array).

Once we have these sorted, we only need to do a linear or O(n) complexity loop through them to find intersected values. We'll use two indices into the arrays and use them to advance separately through the two sorted arrays.

A side benefit of this approach is that we don't intersect more than once. intersecting {2, 5, 6, 8} and {1, 2, 8, 8} will only yield {2, 8} and can never yield {2, 8, 8} because the 8 is present twice in one and only once in the other. We would likely expect the same result whether calling intersect(arr1, arr2, row, col) or intersect(arr2, arr1, row, col)

#include <stdio.h>
#include <stdlib.h>
int intcmp(const void *a, const void *b) {
 return *(int *)a - *(int *)b;
}
void intersect(int a[][2], int b[][2], size_t row, size_t col) {
 size_t overall_size = row * col;
 int *c = malloc(overall_size * sizeof(int));
 int *d = malloc(overall_size * sizeof(int));
 if (!c || !d) {
 fprintf(stderr, "Memory allocations failure!\n");
 return;
 }
 for (size_t i = 0; i < row; i++) {
 for (size_t j = 0; j < col; j++) {
 c[i * col + j] = a[i][j];
 d[i * col + j] = b[i][j];
 }
 }
 qsort(c, overall_size, sizeof(int), intcmp);
 qsort(d, overall_size, sizeof(int), intcmp);
 size_t i = 0;
 size_t j = 0;
 while (i < overall_size && j < overall_size) {
 if (c[i] == d[j]) {
 printf("Common element: %d\n", c[i]);
 i++;
 j++;
 }
 else if (c[i] < d[j]) {
 i++;
 }
 else {
 j++;
 }
 }
 free(c);
 free(d);
}
int main(void) {
 int arr1[2][2] = {{2, 5}, {6, 8}};
 int arr2[2][2] = {{1, 2}, {8, 8}};
 size_t row = sizeof(arr1) / sizeof(arr1[0]);
 size_t col = sizeof(arr1[0]) / sizeof(arr1[0][1]);
 intersect(arr1, arr2, row, col);
}

Hard-coding

You've hard-coded 2 as the column dimension of your input arrays, but the function also takes a column parameter. This is less than ideal.

By changing the parameter order we can side step this issue.

void intersect(size_t row, size_t col, int a[row][col], int b[row][col])

We could now easily change main to:

int main(void) {
 int arr1[2][3] = {{2, 5, 1}, {6, 8, 3}};
 int arr2[2][3] = {{1, 2, 9}, {8, 8, 9}};
 size_t row = sizeof(arr1) / sizeof(arr1[0]);
 size_t col = sizeof(arr1[0]) / sizeof(arr1[0][1]);
 intersect(row, col, arr1, arr2);
}

Array initialization

Speaking of main, there's no need to specify the row dimension when declaring your arrays in this case. You could simply write:

int main(void) {
 int arr1[][3] = {{2, 5, 1}, {6, 8, 3}};
 int arr2[][3] = {{1, 2, 9}, {8, 8, 9}};
 size_t row = sizeof(arr1) / sizeof(arr1[0]);
 size_t col = sizeof(arr1[0]) / sizeof(arr1[0][1]);
 intersect(row, col, arr1, arr2);
}

Your compiler will deduce the correct dimension. This does mean that if you get it wrong, your compiler also won't slap you on the wrist if the row dimensions of the two arrays don't match.

You could always static_assert(row == sizeof arr2 / sizeof *arr2) etc. if you want your compiler to help diagnose dimensional mismatch.

toolic toolic 14.5k5 gold badges29 silver badges203 bronze badges · Answer 2 · 2025-06-04 10:12:03Z

Layout

The indentation of the code in the intersect function is inconsistent and misleading. I used a free online C code formatter to add consistency. The new indentation is shown below. The formatter also adds consistency for spacing around operators.

Documentation

The code should have a comment block at the top to summarize its purpose. For example:

/*
Calculate the intersection of two arrays and print the results.
*/

The intersect function should have its own comment describing the types of the inputs and the output. The comment can also describe the algorithm.

Simpler

This line:

int i = 0, j = 0, x = 0, y = 0;

is simpler as:

int i, j, x, y;

Since the for loops initialize these variables, there is no need to initialize them in the declaration line, too.

In fact, the iterator variables can be locally scoped to each for loop, as will be shown below.

Output

In the main function, consider printing a newline after the call to intersect:

 printf("\n");

Otherwise, when I run the code in my Unix shell, the output is merged with the the next shell prompt.

Also, consider giving the intersect function a return type other than void. It could return a single formatted string or a list of numeric values to allow the caller to format as desired.

Here is new code with some of the suggestions above:

#include<stdio.h>
void intersect(int a1[][2], int a2[][2], int row, int col) {
 for (int i = 0; i < row; i++) {
 for (int j = 0; j < col; j++) {
 for (int x = 0; x < row; x++) {
 for (int y = 0; y < col; y++) {
 if (a1[i][j] == a2[x][y]) {
 printf("%d\t", a1[i][j]);
 }
 }
 }
 }
 }
}
int main() {
 int arr1[2][2] = {
 {
 2,
 5
 },
 {
 6,
 8
 }
 };
 int arr2[2][2] = {
 {
 1,
 2
 },
 {
 8,
 8
 }
 };
 int row = (sizeof(arr1) / sizeof(arr1[0]));
 int col = (sizeof(arr1[0]) / sizeof(arr1[0][1]));
 intersect(arr1, arr2, row, col);
 printf("\n");
}

Stack Exchange Network

Calculate intersection of two arrays

2 Answers 2

`int` vs. `size_t`

Algorithm

Hard-coding

Array initialization

Layout

Documentation

Simpler

Output

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Calculate intersection of two arrays

2 Answers 2

int vs. size_t

Algorithm

Hard-coding

Array initialization

Layout

Documentation

Simpler

Output

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions

`int` vs. `size_t`