Finding a Particular Missing Array Element: O(n+k) Complexity

Question 1

I have to create an algorithm which returns the largest missing element in an array of size n, which ranges from values 1-k.

Example: If the array contained 1,1,3,1,3 then k=3, n=5, and the return should be 2.

If time complexity isn't taken into account, this is relatively straightforward; simply iterate through the array searching for k-1, k-2,...2, and the first time a value isn't found, return that value (or -1 if no such value exists). However, this will be a loop inside a loop, resulting in O(n*k) complexity.

I'm specifically asked to achieve O(n+k) in my result, and while I'm new to algorithm design, I believe this means my algorithm is not allowed to contain any nested loops whatsoever. None of the array sorting methods I've learned have linear worst-case times, so I can't use those, and I can't see any other way to keep n and k separate in all segments.

I've been stuck on this concept for several hours now. I'm not necessarily looking for a full answer, but if anyone could point me in the right direction, or even explain how I should be approaching O(n+k) complexity in a problem like this, I'd be very grateful.

Question 2

C# sample code with comments:

int k = 3;
int[] data = new int[] {1, 1, 3, 1, 3};
// Step 1: collecting items presented
// O(n): we have to scan "data" array == O(n), adding item - n * O(1) == O(n)
// O(n) + n * O(1) == O(n) + O(n) == O(n) for the step 1
HashSet<int> present = new HashSet<int>(data);
// Step 2: scan 1..k items and find items missing:
// O(k) - 1..k scan + k * O(1) - checking each item
// O(k) + k * O(1) == O(k) + O(k) = O(k) for the step 2 
var missing = Enumerable
 .Range(1, k) // O(k) - scan 
 .Where(item => !present.Contains(item)) // k * O(1) == O(k) 
 .ToList(); // materialize 
// Test
Console.Write(string.Join(" ", missing));

Finally, we have step 1 followed by step 2:

O(n) + O(k) = O(n + k)

Edit: Step 2 neither Linq nor lambda, but good old for loop:

for (int i = 1; i <= k; ++i) // O(k) - scan
 if (!present.Contains(i)) // O(1) - check
 Console.WriteLine(i);

Question 3

Is it possible not to use hash? I am curious.

Question 4

@javaLover: you can use a boolean array [1..k] instead of hash: just mark k-th item with true if item is present

Question 5

Oh, no, another cheat XD.

Question 6

This answer seems hopelessly too C# specific (on a question that didn't specify a language) - lambdas are far from a universal concept and make this answer much less readable to the non-C# audience than if you were just to have looped over manually. An explicit explanation of the approach outside of code also tends to make the answer much easier to understand much faster.

Question 7

@Dukeling: Thank you! I totally agree that Linq may be difficult to read esp. for those who don't use C#; I've edited the answer.

Question 8

You could use another array to store the occurrences without using hashset, though it would waste some memory,

public int maxMissingElement(final int arr[], final int max) {
boolean[] holder = new boolean[max + 1];
for (int i = 0; i < arr.length; i++) {
 holder[arr[i]]=true;
}
for (int i = max - 1; i >= 0; i--) {
 if (!holder[i]) {
 return i;
 }
}
return -1;
}

Time complexity: O(n+k)

Space complexity: O(k)

Question 9

Thank you for the responses everyone! I've selected this as the most helpful to me because it avoids hashing, but learned from and have upvoted all of them.

Question 10

Actually an array is just an implementation of the generic hash set concept. This answer isn't much different, though I still think it's a great answer for it's compactness.

Question 11

First thing that comes to my mind is keeping a hash set of size k. A hash set has the property that it is created in linear-k time, and looking up data is in O(1) time.

This means that you could create a hash set (or dictionary) with a flag for each value of k. So for k = 3 you'd get a set (0 -> false, 1 -> false, 2 -> false, 3 -> false).

Then you could simply walk through the array of data (of size n) in linear-n time, and for each value simply set the corresponding record in your hash set to true.

After this process, you've ended up with a hash set that has records set to true only for the values that exist in the data set.

At the end, simply walk through the hash set once and keep an object called result for instance. Each time you find a value that is false (i.e. did not exist in the data array) you compare it to the result. If the value is bigger than result, you set result to that value.

At the end of the cycle over the hash set, the value in result will be the biggest missing value in the data set.

So is it O(k + n)?

We created the hash set = O(k)
We walked over the data set, each time checking the hash set = O(n) * O(1)
We walked over the hash set, comparing and updating a variable = O(k) * O(1)

We add up these time complexities as follows:

 O(k) + O(n)O(1) + O(k)*O(1)
 = 2 * O(k) + O(n) = O(k) + O(n)
 = O(k + n).

Note

Keeping a hash set to me sounds practical but not really elegant. I'm sure that there exists a way to do this without the hash set, but it meets the constraints of the problem.

score 3 · Accepted Answer · 2017-05-30 07:58:26Z

C# sample code with comments:

int k = 3;
int[] data = new int[] {1, 1, 3, 1, 3};
// Step 1: collecting items presented
// O(n): we have to scan "data" array == O(n), adding item - n * O(1) == O(n)
// O(n) + n * O(1) == O(n) + O(n) == O(n) for the step 1
HashSet<int> present = new HashSet<int>(data);
// Step 2: scan 1..k items and find items missing:
// O(k) - 1..k scan + k * O(1) - checking each item
// O(k) + k * O(1) == O(k) + O(k) = O(k) for the step 2 
var missing = Enumerable
 .Range(1, k) // O(k) - scan 
 .Where(item => !present.Contains(item)) // k * O(1) == O(k) 
 .ToList(); // materialize 
// Test
Console.Write(string.Join(" ", missing));

Finally, we have step 1 followed by step 2:

O(n) + O(k) = O(n + k)

Edit: Step 2 neither Linq nor lambda, but good old for loop:

for (int i = 1; i <= k; ++i) // O(k) - scan
 if (!present.Contains(i)) // O(1) - check
 Console.WriteLine(i);

@javaLover: you can use a boolean array [1..k] instead of hash: just mark k-th item with true if item is present
This answer seems hopelessly too C# specific (on a question that didn't specify a language) - lambdas are far from a universal concept and make this answer much less readable to the non-C# audience than if you were just to have looped over manually. An explicit explanation of the approach outside of code also tends to make the answer much easier to understand much faster.
@Dukeling: Thank you! I totally agree that Linq may be difficult to read esp. for those who don't use C#; I've edited the answer.

CollectivesTM on Stack Overflow

Finding a Particular Missing Array Element: O(n+k) Complexity

3 Answers 3

Note

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

CollectivesTM on Stack Overflow

3 Answers 3

Note

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related