$\begingroup$

Consider the problem of finding, for a given input array, the longest subarray with at most two different values.

For example:

Input: [3,3,3,1,2,1,1,2,3,3,4]
Ans = 5, the longest subarray would be [1,2,1,1,2].
Input: [1,2,3,2,2]
Ans = 4, the longest subarray would be [2,3,2,2].

Below is a dynamic programming solution to this problem (in Python, hopefully it's easy to read) using a sliding window that holds a "valid subarray" (the subarray of elements between indices $i$ and $j$ always holds at most two distinct values).

I read on e.g. LeetCode that this solution has a runtime complexity of $O(N)$, where $N$ is the length of the input array, but that's not immediately clear to me since we have two nested loops over $i$ and $j$ with $0\leq i\leq j\leq N$.

Why is the worst-case runtime complexity of this solution $O(N)$ and not $O(N^2)$?

Here's the DP solution in question with those nested loops holding a subarray between $i$ and $j$:

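    import collections

    def longest_subarray_holding_two_diff_values(input_array):
        ans = i = 0  # i is the left edge of the window
        count = collections.Counter()  # value -> occurrences inside the window
        for j, x in enumerate(input_array):  # j is the right edge of the window
            count[x] += 1
            # Shrink from the left until at most two distinct values remain.
            while len(count) >= 3:
                count[input_array[i]] -= 1
                if count[input_array[i]] == 0:
                    del count[input_array[i]]
                i += 1
            ans = max(ans, j - i + 1)
        return ans
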
asked Apr 16, 2020 at 0:44
$\endgroup$
  • $\begingroup$ Not everybody can read python. Is it possible to rewrite your algorithm for people who are not python experts? $\endgroup$ Commented Apr 16, 2020 at 8:22

1 Answer
$\begingroup$

Here is how the sliding window algorithm works (unfortunately, I don't understand your code, so I can't say whether this is the same algorithm).

We keep track of two pointers $i,j$, with the following properties: the subarray $A[j],\ldots,A[i]$ contains exactly two values, and it is maximal with respect to $j$ (that is, either $j = 0$ or $A[j-1],\ldots,A[i]$ contains three values). We also keep track of the two values in question $a,b$, and of their last appearance $k_a,k_b$. Finally, we keep track of the longest valid subarray seen so far.

In the initialization phase, we scan the array until we see two different values; if the array is constant, then the answer is the length of the array.

At steady state, we take a peek at $A[i+1]$. If $A[i+1] \in \{a,b\}$, we update $k_a$ or $k_b$ accordingly, and simply increase $i$. If $A[i+1] \notin \{a,b\}$, then we do two things. First, we update the length of the longest valid subarray seen so far (comparing it to $i-j+1$, the length of the current window). Second, suppose that $A[i] = a$ (the case $A[i] = b$ is symmetric); then we set $j = k_b+1$, set $b = A[i+1]$, set $k_b = i+1$, and increment $i$.

Finally, when $i$ reaches the last index of the array, we update the length of the longest valid subarray once more (comparing it to $i-j+1$), and output the result.
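The steps above can be sketched in Python roughly as follows. This is only one possible reading of the description, assuming 0-based indexing; the function name `longest_two_values` and the variable `best` are illustrative choices, and an empty array is assumed to give 0.

```python
def longest_two_values(A):
    """Length of the longest subarray of A with at most two distinct values."""
    n = len(A)
    if n == 0:
        return 0  # assumption: empty input has answer 0

    # Initialization: scan until a second distinct value appears.
    i = 0
    while i + 1 < n and A[i + 1] == A[0]:
        i += 1
    if i + 1 == n:
        return n  # constant array: the whole array is valid

    a, k_a = A[0], i          # first value and its last position
    b, k_b = A[i + 1], i + 1  # second value and its last position
    j = 0                     # left end of the window A[j..i]
    i += 1
    best = 0

    # Steady state: peek at A[i+1] and extend or restart the window.
    while i + 1 < n:
        x = A[i + 1]
        if x == a:
            k_a = i + 1
        elif x == b:
            k_b = i + 1
        else:
            best = max(best, i - j + 1)
            # The value at A[i] survives; the other one is dropped.
            if A[i] == a:
                j = k_b + 1
                b, k_b = x, i + 1
            else:
                j = k_a + 1
                a, k_a = x, i + 1
        i += 1

    # Final update once i is the last index.
    return max(best, i - j + 1)


print(longest_two_values([3, 3, 3, 1, 2, 1, 1, 2, 3, 3, 4]))  # 5
print(longest_two_values([1, 2, 3, 2, 2]))                    # 4
```

Every branch of the main loop does a constant amount of work and then increments `i`, which is what the running-time claim relies on.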

As you can see, each iteration performs $O(1)$ operations and advances $i$ by one, so the algorithm runs in $O(n)$ time.
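For readers who want to check the same bound on the Python code from the question: there the inner `while` loop only ever moves the left index `i` forward, and `i` never passes `j`, so its body can run at most $N$ times in total over the whole run, not $N$ times per outer iteration. A minimal sketch that counts those steps (the name `longest_with_step_count` and the counter `inner_steps` are illustrative additions, not part of the original code):

```python
import collections


def longest_with_step_count(input_array):
    """Same sliding window as in the question, plus a count of inner-loop steps."""
    ans = i = 0
    inner_steps = 0  # total number of times the while-loop body runs
    count = collections.Counter()
    for j, x in enumerate(input_array):
        count[x] += 1
        while len(count) >= 3:
            count[input_array[i]] -= 1
            if count[input_array[i]] == 0:
                del count[input_array[i]]
            i += 1
            inner_steps += 1  # each step moves i forward exactly once
        ans = max(ans, j - i + 1)
    return ans, inner_steps


data = [3, 3, 3, 1, 2, 1, 1, 2, 3, 3, 4]
ans, steps = longest_with_step_count(data)
print(ans, steps <= len(data))  # 5 True
```

Since `inner_steps` always equals the final value of `i`, the outer and inner loops together do at most about $2N$ steps, which is $O(N)$.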

answered Apr 16, 2020 at 8:31
$\endgroup$
