First missing positive integer in linear time and constant space

Question 1

This problem was asked by Stripe.

Given an array of integers, find the first missing positive integer in linear time and constant space. In other words, find the lowest positive integer that does not exist in the array. The array can contain duplicates and negative numbers as well.

For example, the input [3, 4, -1, 1] should give 2. The input [1, 2, 0] should give 3.

You can modify the input array in-place.

class DailyCodingProblem5 {
 public static void main(String args[]) {
 int[] arr = { 3, 4, -1, 1 };
 int res = solution(arr);
 System.out.println(res);
 int[] arr2 = { 1, 2, 0 };
 res = solution(arr2);
 System.out.println(res);
 }
 private static int solution(int[] arr) {
 int n = arr.length;
 int i = 0;
 for (i = 0; i < n; i++) {
 int val = arr[i];
 if (val <= 0 || val > n)
 continue;
 while (val != arr[val - 1]) {
 int nextval = arr[val - 1];
 arr[val - 1] = val;
 val = nextval;
 if (val <= 0 || val > n) {
 break;
 }
 }
 }
 for (i = 0; i < n; i++) {
 if (arr[i] != i + 1) {
 return i + 1;
 }
 }
 return n+1;
 }
}

How can i improve the above solution? Is there any improvements i can have in my code ?

Question 2

Testing for { 1, 2, 3, 4, 5 } and similar arrays return 5, which is not missing.

Question 3

I have voted to close this since your algorithm doesn't work. Please come back when you have fixed it. (What I can tell you right away, though, is that your algorithm can never finish in linear time if you have nested loops).

Question 4

@vnp updated the code to return return n+1;. It works now

Question 5

I appreciate code comments in general. I'd like to see an argument that "the rearrangement loop" runs in linear time.

Question 6

Not I want to be enlightened how it works: I'd like to see an argument that "the rearrangement loop" runs in linear time in the code - easiest in comment form. (You can't put it there in this question, as considerable time has passed since posting and, more compellingly, answering started.)

Question 7

It would be good to have some comments in the code explaining why (a) it works; (b) it takes linear time.

 int i = 0;
 for (i = 0; i < n; i++) {

I would prefer to use for (int i = 0; ...) and similarly for the other loop over i: they don't need to use the same variable, and keeping scopes as small as possible helps understanding and maintenance. But if you prefer to keep the variable in the wider scope, there's no need to initialise it twice. Either of

 int i = 0;
 for (; i < n; i++) {

or

 int i;
 for (i = 0; i < n; i++) {

works fine.

 if (val <= 0 || val > n)
 continue;
 while (val != arr[val - 1]) {
 int nextval = arr[val - 1];
 arr[val - 1] = val;
 val = nextval;
 if (val <= 0 || val > n) {
 break;
 }
 }

It's a bit inelegant to apply the range test twice. A simple refactor gives

 while (val > 0 && val <= n && val != arr[val - 1]) {
 int nextval = arr[val - 1];
 arr[val - 1] = val;
 val = nextval;
 }

 }
 return n+1;
 }

For consistency I would add whitespace around +. The blank line makes more sense to me before the return rather than after it, and I would also add a blank line separating the two loops.

The fact that I'm nitpicking whitespace like this is a good sign: I haven't found anything which I consider to be a major problem.

Question 8

"It would be good to have some comments in the code explaining why (a) it works" even better if it's not comments but self-documenting code that explains how it works. Better variable names and extracting expressions into variables can do that. A comment can still serve to explain the linear timing, question

Question 9

In this case, I won't believe that refactoring could explain why the code works unless I see it. If you think you can do it, feel free to post the code in an answer and ping me.

Peter Taylor Peter Taylor 24.4k1 gold badge49 silver badges94 bronze badges · Accepted Answer · 2019-02-19 08:59:00Z

It would be good to have some comments in the code explaining why (a) it works; (b) it takes linear time.

 int i = 0;
 for (i = 0; i < n; i++) {

I would prefer to use for (int i = 0; ...) and similarly for the other loop over i: they don't need to use the same variable, and keeping scopes as small as possible helps understanding and maintenance. But if you prefer to keep the variable in the wider scope, there's no need to initialise it twice. Either of

 int i = 0;
 for (; i < n; i++) {

or

 int i;
 for (i = 0; i < n; i++) {

works fine.

 if (val <= 0 || val > n)
 continue;
 while (val != arr[val - 1]) {
 int nextval = arr[val - 1];
 arr[val - 1] = val;
 val = nextval;
 if (val <= 0 || val > n) {
 break;
 }
 }

It's a bit inelegant to apply the range test twice. A simple refactor gives

 while (val > 0 && val <= n && val != arr[val - 1]) {
 int nextval = arr[val - 1];
 arr[val - 1] = val;
 val = nextval;
 }

 }
 return n+1;
 }

For consistency I would add whitespace around +. The blank line makes more sense to me before the return rather than after it, and I would also add a blank line separating the two loops.

The fact that I'm nitpicking whitespace like this is a good sign: I haven't found anything which I consider to be a major problem.

"It would be good to have some comments in the code explaining why (a) it works" even better if it's not comments but self-documenting code that explains how it works. Better variable names and extracting expressions into variables can do that. A comment can still serve to explain the linear timing, question
In this case, I won't believe that refactoring could explain why the code works unless I see it. If you think you can do it, feel free to post the code in an answer and ping me.

Stack Exchange Network

First missing positive integer in linear time and constant space

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

First missing positive integer in linear time and constant space

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions