2

What algorithm might find a missing integer in O(n) time, from an array?

Say we have an array A with elements in a value-range {1,2,3...2n}. Half the elements are missing so length of A = n.

E.g:

A = [1,2,5,3,10] , n=5

Output = 4

asked Feb 22, 2017 at 22:22
9
  • Is this homework? Commented Feb 22, 2017 at 22:24
  • No, i study physics. This is spare time. I found the question from a homework set online tho... Commented Feb 22, 2017 at 22:26
  • Possible duplicate of Missing integer variation - O(n) solution needed Commented Feb 22, 2017 at 22:26
  • 1
    @hanko How much extra storage is permitted : e.g. it is simple to have an array of flags - as shown in the answer below. But it is not obvious how to do with this only O(1) extra storage. Commented Feb 22, 2017 at 22:30
  • 1
    I have a solution that genuinely uses O(1) extra storage and O(N) time, without stealing bits from the array. It's a variation of quickselect. Commented Feb 23, 2017 at 1:27

3 Answers 3

8

The smallest missing integer must be in the range [1, ..., n+1]. So create an array of flags, all initially false, indicating the presence of that integer. Then an algorithm is:

  1. Scan the input array, setting flags to true as you encounter values in the range. This operation is O(n). (That is, set flag[A[i]] to true for each position i in the input array, provided A[i] <= n.)
  2. Scan the flag array for the first false flag. This operation is also O(n). The index of the first false flag is the smallest missing integer.

EDIT: O(n) time algorithm with O(1) extra space:

If A is writable and there are some extra bits available in the elements of A, then a constant-extra-space algorithm is possible. For instance, if the elements of A are signed values, and since all the numbers are positive, we can use the sign bit of the numbers in the original array as the flags, rather than creating a new flag array. So the algorithm would be:

  1. For each position i of the original array, if abs(A[i]) < n+1, make the value at A[abs(A[i])] negative. (This assumes array indexes are based at 1. Adjust in the obvious way if you are using 0-based arrays.) Don't just negate the value, in case there are duplicate values in A.
  2. Find the index of the first element of A that is positive. That index is the smallest missing number in A. If all positions are negative, then A must be a permutation of {1, ..., n} and hence the smallest missing number is n+1.

If the elements are unsigned, but can hold values as high as 4 n + 1, then in step 1, instead of making the element negative, add 2 n + 1 (provided the element is <= 2 n) and use (A[i] mod (2n+1)) instead of abs(A[i]). Then in step 2, find the first element < 2 n + 1 instead of the first positive element. Other such tricks are possible as well.

answered Feb 22, 2017 at 22:26
13
  • That is an obvious answer but requires O(n) extra storage. Any way to do this with o(1) extra storage (i'm not sure). Commented Feb 22, 2017 at 22:28
  • Would i not need to sort the array first with this method? Commented Feb 22, 2017 at 22:30
  • 3
    @hanko: No, you wouldn't. What makes you think that you would? Commented Feb 22, 2017 at 22:30
  • @javadba - Not that I can think of, off the top of my head. Commented Feb 22, 2017 at 22:31
  • 1
    @AnT - Well, you can use other tricks, such as adding 2n+1 to a value in the original array in lieu of setting a flag (and using A[i] mod 2n+1 instead of abs(A[i])). Since n is variable, there's likely to be extra bits somewhere for any particular n. If not, then of course the O(1)-space algorithm can't be implemented. Commented Feb 23, 2017 at 0:14
5

You can do this in O(1) additional space, assuming that the only valid operations on the array is to read elements, and to swap pairs of elements.

First note that the specification of the problem excludes the possibility of the array containing duplicates: it contains half of the numbers from 1 to 2N.

We perform a quick-select type algorithm. Start with m=1, M=2N+1, and pivot the array on (m + M)/2. If the size of the left part of the array (elements <= (m+M)/2) is less than (m + M)/2 - m + 1, then the first missing number must be there. Otherwise, it must be in the right part of the array. Repeat on the left or right side accordingly until you find the missing number.

The size of the slice of the array under consideration halves each time and pivoting an array of size n can be done in O(n) time and O(1) space. So overall, the time complexity is 2N + N + N/2 + ... + 1 <= 4N = O(N).

answered Feb 23, 2017 at 1:18
6
  • NIce answer: use the characteristics of the problem to find what appears to be an optimal solution. Commented Feb 23, 2017 at 1:27
  • Nice approach, but I think the algorithm needs some tweaking. Suppose n = 4 and A is [1, 2, 3, 4]. Then on the first iteration, m = 1, M = 9, and the first pivot will be on 5. This will produce the partition [1, 2, 3, 4], []. Since the left part has 4 elements, which is less than (m + M)/2 - m + 1 = 5, the first missing element is supposedly in [1, 2, 3, 4]. But it's not. Commented Feb 23, 2017 at 1:38
  • @TedHopp The left half should contain 5 elements (elements <= 5), but it only contains 4. You're right the left half doesn't actually contain the missing element, but it would do if it weren't missing! Commented Feb 23, 2017 at 2:02
  • Yeah, I guess on the first iteration, one will have to select the left part of necessity. The first pivot value is n+1 and there are only n elements in the array. Because of this, though, I don't think the complexity analysis formula is quite right (at least for the coefficient; it's still O(n)). Commented Feb 23, 2017 at 3:09
  • Damn, this is great. Thank you. Commented Feb 23, 2017 at 16:52
1

An implementation of Paul Hankin's idea in C++

#include <iostream>
using namespace std;
const int MAX = 1000;
int a[MAX];
int n;
void swap(int &a, int &b) {
 int tmp = a;
 a = b;
 b = tmp;
}
// Rearranges elements of a[l..r] in such a way that first come elements 
// lower or equal to M, next come elements greater than M. Elements in each group
// come in no particular order.
// Returns an index of the first element among a[l..r] which is greater than M.
int rearrange(int l, int r, int M) {
 int i = l, j = r;
 while (i <= j)
 if (a[i] <= M) i++;
 else swap(a[i], a[j--]);
 return i;
}
int main() {
 cin >> n;
 for (int i = 0; i < n; i++) cin >> a[i];
 int L = 1, R = 2 * n;
 int l = 0, r = n - 1;
 while (L < R) {
 int M = (L + R) / 2; // pivot element
 int m = rearrange(l, r, M); 
 if (m - l == M - L + 1) 
 l = m, L = M + 1;
 else
 r = m - 1, R = M;
 }
 cout << L;
 return 0;
}
answered Feb 23, 2017 at 2:54

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.