Finding duplicates in arrays

Question 1

I'm given two sorted arrays, arr1 and arr2. They are either the same length, or arr2 is longer than arr1. I implemented a method, findDuplicates, that returns an array of all numbers that appear in both arr1 and arr2. The output array should be sorted in ascending order.

I know there are better solutions than the one I came up with. What I'm trying to understand is the space complexity of the solution I came up with.

static int[] findDuplicates(int[] arr1, int[] arr2) {
    // your code goes here
    HashMap<Integer, Integer> table = new HashMap<>();
    ArrayList<Integer> returnlist = new ArrayList<Integer>();
    int arr1Length = arr1.length;
    int arr2Length = arr2.length;
    // this is for the first array
    for(int i=0; i<arr1Length; i++){
        int n = arr1[i];
        table.put(n, 1);
    }
    for(int i=0; i<arr2Length; i++){
        int n = arr2[i];
        if(table.containsKey(n)){
            returnlist.add(n);
        }
    }
    int[] arr = new int[returnlist.size()];
    for(int i=0; i<=returnlist.size()-1; i++){
        arr[i] = returnlist.get(i);
    }
    return arr;
} // end of method

Since I created a temporary HashMap and an ArrayList, and then another array to return the duplicates, is my space complexity \$O(n^3)\$? I know it's greater than \$O(n)\$. If someone can help me understand this better, I would really appreciate it.

Question 2

Why \$\mathcal O(n^3)\$? Since you are creating a list plus a map plus an array, I only see \$\mathcal O(n+n+n)\$, which is equivalent to \$\mathcal O(n)\$.

Question 3

@RolandIllig okay, that's what I was confused about. I thought it would be O(n^3) because I created three extra data structures to store the inputs in. It's not about the data structures but the size of the input. Thanks.
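
A rough accounting makes this concrete. Assume \$m\$ is arr1.length and \$n\$ is arr2.length (symbols introduced here only for illustration). The map holds at most \$m\$ entries, the list holds at most \$n\$ entries because each element of arr2 is added at most once, and the returned array is the same size as the list. The extra space is therefore at most \$m + n + n = m + 2n\$, which is \$\mathcal O(m + n)\$. Since \$m \le n\$ in this problem, that simplifies to \$\mathcal O(n)\$. Sizes of structures allocated side by side add up; they would only multiply if the structures were nested, for example if every map entry held its own list of size \$n\$.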

Question 4

I think this is a slightly tweaked algorithm. I am not using any temporary lookup structure (like a HashMap or HashSet) to find the duplicates. The only temporary variable is the List<Integer> duplicate, which I convert to an int array at the end; you could also return the List<Integer> directly, depending on your use case.

The code below only works if arr1 and arr2 are sorted arrays, as per your problem statement.

static int[] findDuplicates(int[] arr1, int[] arr2) {
    List<Integer> duplicate = new ArrayList<>();
    int[] largeArray;
    int[] smallArray;
    if(arr1.length == arr2.length) {
        largeArray = arr1;
        smallArray = arr2;
    } else {
        largeArray = arr1.length > arr2.length ? arr1 : arr2;
        smallArray = arr1.length < arr2.length ? arr1 : arr2;
    }
    // The index into the large array is never reset, so each array is scanned once.
    int largeArrayIndex = 0;
    for(int smallArrayIndex = 0; smallArrayIndex < smallArray.length; smallArrayIndex++) {
        for(; largeArrayIndex < largeArray.length; largeArrayIndex++) {
            if(largeArray[largeArrayIndex] == smallArray[smallArrayIndex]) {
                duplicate.add(largeArray[largeArrayIndex]);
                largeArrayIndex++;
                break;
            } else if(smallArray[smallArrayIndex] < largeArray[largeArrayIndex]) {
                break;
            }
        }
        // Stop once the large array is exhausted (>= rather than >, which could never be true).
        if(largeArrayIndex >= largeArray.length) {
            break;
        }
    }
    return duplicate.stream().mapToInt(Integer::intValue).toArray();
}
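
For comparison, the same idea is often written as a single loop with two indices, which avoids having to decide which array is larger. This is only a sketch under the same assumption that both inputs are sorted in ascending order; the class name SortedIntersection and the method name findCommon are placeholders chosen here, not names from the question.

```java
import java.util.ArrayList;
import java.util.List;

class SortedIntersection {
    // Returns the values present in both sorted arrays, in ascending order.
    // Assumes both inputs are already sorted ascending.
    static int[] findCommon(int[] a, int[] b) {
        List<Integer> common = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;            // a[i] is too small to match anything left in b
            } else if (a[i] > b[j]) {
                j++;            // b[j] is too small to match anything left in a
            } else {
                common.add(a[i]);
                i++;
                j++;
            }
        }
        return common.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

Each step advances at least one index, so the loop runs at most a.length + b.length times, and the only extra storage is the result list.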

Question 5

static int[] findDuplicates(int[] arr1, int[] arr2) {

"Duplicates" is probably not the best name here. You would expect a duplicate to occur twice in the same collection of values or objects, not at least once in each of two collections. Maybe findPairs would be better, although it is still not perfect.

You might consider using an interface like List or even Iterable for the parameter types, to be able to support different types of collections instead of just arrays.
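
As a rough illustration of that suggestion, a List-based signature could look like the sketch below. The class name ListBasedIntersection and the method name findCommon are hypothetical, and the sketch keeps the hash-based lookup from the question rather than changing the algorithm.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ListBasedIntersection {
    // Accepts any List<Integer> (ArrayList, LinkedList, Arrays.asList(...), ...).
    // The result keeps the order of the second list, so it is sorted if that list is.
    static List<Integer> findCommon(List<Integer> first, List<Integer> second) {
        Set<Integer> seen = new HashSet<>(first);
        List<Integer> common = new ArrayList<>();
        for (Integer value : second) {
            if (seen.contains(value)) {
                common.add(value);
            }
        }
        return common;
    }
}
```

A caller with plain arrays would have to box them first, which is the trade-off of moving away from int[] parameters.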

HashMap<Integer, Integer> table = new HashMap<>();

Do you really need a HashMap? It seems like you always use 1 as value and actually just want the keys. Use HashSet in that case. Call it uniqueValuesFirstArray or similar, instead of table. Use a name that tells you what you use it for.

ArrayList<Integer> returnlist = new ArrayList<Integer>();

Call the list result, or better pairwiseOccurring or duplicates, because that is what you are saving in it. The name should not describe what the code will do with it later, e.g. returning it.

int arr1Length = arr1.length;
int arr2Length = arr2.length;

This is unnecessary. There is no reason to save these values; just access them directly.

// this is for the first array
for(int i=0; i<arr1Length; i++){
    int n = arr1[i];
    table.put(n, 1);
}

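This comment is redundant. The code already tells you that it is for the first array, even more so if you improve your variable names and access the array's length directly, like for (int i=0; i < array1.length; i++). Note the spaces that I added to make it more readable.

Saving the value in a variable and then accessing it is unnecessary. Just do table.put(arr1[i], 1);.

If you use HashSet instead of HashMap, you can replace the whole loop with a single expression. Note that new HashSet<>(Arrays.asList(array1)) only works when array1 is an Integer[]; for an int[], Arrays.asList produces a List<int[]> and the expression does not compile. For an int[], box the values with a stream instead: Set<Integer> uniqueValuesFirstArray = Arrays.stream(array1).boxed().collect(Collectors.toSet());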
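
One caveat when building the set from an int[]: Arrays.asList does not box primitives, so it treats the whole array as a single element. The sketch below shows that behaviour next to a stream-based conversion that does produce a Set<Integer>. The class name BoxingDemo and its method names are made up for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class BoxingDemo {
    // Arrays.asList sees an int[] as ONE element of type int[],
    // so the resulting list always has size 1.
    static int wrappedSize(int[] values) {
        List<int[]> wrapped = Arrays.asList(values);
        return wrapped.size();
    }

    // Boxes each int and collects the distinct values into a Set<Integer>.
    static Set<Integer> toSet(int[] values) {
        return Arrays.stream(values).boxed().collect(Collectors.toSet());
    }
}
```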

for (int i=0; i < arr2Length; i++) {
    int n = arr2[i];
    if (table.containsKey(n)) {
        returnlist.add(n);
    }
}

This is what the indentation and whitespace in this part of your code should look like.

As above, you do not need the variable n. Just use the value from the array directly.

If you change HashMap to HashSet, replace containsKey with contains.

int[] arr = new int[returnlist.size()];
for(int i=0; i<=returnlist.size()-1; i++){
    arr[i] = returnlist.get(i);
}

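This can be done more easily. Be aware that returnlist.toArray(arr) only works when arr is an Integer[]; it does not compile for an int[], because toArray needs an array of objects. To get an int[], replace the for-loop with returnlist.stream().mapToInt(Integer::intValue).toArray();. It is shorter and arguably more readable.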
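
Putting the suggestions from this review together, the method might end up looking something like the sketch below. It assumes int[] inputs as in the question; the class name ReviewedSolution and the names findCommonValues and commonValues are placeholders picked here, not names from the original post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class ReviewedSolution {
    // Returns every value of arr2 that also occurs in arr1.
    // The output follows the order of arr2, so it is ascending when arr2 is sorted.
    static int[] findCommonValues(int[] arr1, int[] arr2) {
        Set<Integer> uniqueValuesFirstArray =
                Arrays.stream(arr1).boxed().collect(Collectors.toSet());
        List<Integer> commonValues = new ArrayList<>();
        for (int value : arr2) {
            if (uniqueValuesFirstArray.contains(value)) {
                commonValues.add(value);
            }
        }
        // Unbox the list into the int[] the caller expects.
        return commonValues.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

The space usage is unchanged from the original: one set, one list and one result array, each linear in the input size.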

Pavan Kumar Jorrigala · Accepted Answer · 2018-02-10 19:51:06Z

