Consider the problem of finding, for a given input array, the longest subarray with at most two different values.
For example:
Input: [3,3,3,1,2,1,1,2,3,3,4]
Ans = 5, the longest subarray would be [1,2,1,1,2].
Input: [1,2,3,2,2]
Ans = 4, the longest subarray would be [2,3,2,2].
Below is a dynamic programming solution to this problem (in Python; hopefully it's easy to read) using a sliding window that holds a "valid subarray" (the subarray of elements between indices $i$ and $j$ always holds at most two values).
I read on e.g. LeetCode that this solution has a runtime complexity of $O(N)$, where $N$ is the length of the input array, but that's not immediately clear to me, since we have two nested loops over $i$ and $j$ with $0\leq i\leq j < N$.
Why is the worst-case runtime complexity of this solution $O(N)$ and not $O(N^2)$?
Here's the DP solution in question with those nested loops holding a subarray between $i$ and $j$:
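    import collections

    def longest_subarray_holding_two_diff_values(input_array):
        ans = i = 0
        # count maps each value in the window input_array[i..j] to its multiplicity
        count = collections.Counter()
        for j, x in enumerate(input_array):
            count[x] += 1
            # shrink the window from the left until it holds at most two values
            while len(count) >= 3:
                count[input_array[i]] -= 1
                if count[input_array[i]] == 0:
                    del count[input_array[i]]
                i += 1
            ans = max(ans, j - i + 1)
        return ans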
Not everybody can read Python. Is it possible to rewrite your algorithm for people who are not Python experts? – Yuval Filmus, Apr 16, 2020 at 8:22
1 Answer
Here is how the sliding window algorithm works (unfortunately, I don't understand your code, so can't say whether this is the same algorithm).
We keep track of two pointers $i,j$, with the following properties: the subarray $A[j],\ldots,A[i]$ contains exactly two values, and it is maximal with respect to $j$ (that is, either $j = 0$ or $A[j-1],\ldots,A[i]$ contains three values). We also keep track of the two values in question $a,b$, and of their last appearance $k_a,k_b$. Finally, we keep track of the longest valid subarray seen so far.
In the initialization phase, we scan the array until we see two different values; if the array is constant, then the answer is the length of the array.
At steady state, we take a peek at $A[i+1]$. If $A[i+1] \in \{a,b\}$, we update $k_a$ or $k_b$, and simply increase $i$. If $A[i+1] \notin \{a,b\}$, then we do two things. First, we update the value of the longest valid subarray seen so far (comparing it to $i-j+1$, the length of the current window). Second, suppose that $A[i] = a$ (the case $A[i] = b$ is symmetric); then we set $j = k_b+1$, since every element after the last appearance of $b$ equals $a$, set $b = A[i+1]$, set $k_b = i+1$, and increment $i$.
Finally, when $i$ reaches the last index of the array, we update the value of the longest valid subarray (comparing it to $i-j+1$), and output the result.
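For readers who prefer code, here is a minimal Python sketch of the procedure described above, under a few assumptions of mine. The function name `longest_two_value_subarray` and the dictionary `last` are hypothetical: `last` plays the role of $k_a, k_b$ by mapping each of the (at most two) current values to the index of its last appearance. The initialization phase is folded into the main loop, and for simplicity the best length is compared against $i-j+1$ at every step rather than only when a third value shows up.

```python
def longest_two_value_subarray(A):
    """Length of the longest contiguous run of A with at most two distinct values."""
    last = {}  # value -> index of its most recent appearance (at most two keys)
    j = 0      # left end of the current window A[j..i]
    best = 0
    for i, x in enumerate(A):
        if x not in last and len(last) == 2:
            # x is a third value. Keep the value at A[i-1] and drop the other one:
            # everything after the dropped value's last appearance equals A[i-1].
            keep = A[i - 1]
            (drop,) = (v for v in last if v != keep)
            j = last[drop] + 1
            del last[drop]
        last[x] = i
        best = max(best, i - j + 1)
    return best


print(longest_two_value_subarray([3, 3, 3, 1, 2, 1, 1, 2, 3, 3, 4]))  # 5
print(longest_two_value_subarray([1, 2, 3, 2, 2]))                    # 4
```

Each loop iteration does a constant number of dictionary operations on a dictionary with at most two keys, and the left end $j$ jumps directly to its new position instead of being advanced step by step.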
As you can see, this algorithm performs $O(1)$ operations per iteration, so runs in $O(n)$ time.
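A related observation may help with the nested-loop variant in the question, where the left pointer advances one step at a time inside an inner `while`. The left pointer never moves backwards and never passes the right pointer, so the inner loop body can run at most $N$ times over the whole execution, not $N$ times per outer iteration. The sketch below is the question's code with one added counter; the function name `longest_with_step_count` and the variable `inner_steps` are my own additions for illustration.

```python
import collections


def longest_with_step_count(input_array):
    """Return (answer, total number of inner-loop iterations)."""
    ans = i = 0
    inner_steps = 0  # counts every execution of the inner while body
    count = collections.Counter()
    for j, x in enumerate(input_array):
        count[x] += 1
        while len(count) >= 3:
            count[input_array[i]] -= 1
            if count[input_array[i]] == 0:
                del count[input_array[i]]
            i += 1
            inner_steps += 1  # always equals i, which never exceeds j
        ans = max(ans, j - i + 1)
    return ans, inner_steps


data = [3, 3, 3, 1, 2, 1, 1, 2, 3, 3, 4]
ans, steps = longest_with_step_count(data)
print(ans, steps, steps <= len(data))
```

Since `inner_steps` equals the final value of `i`, the total work is at most $N$ outer iterations plus at most $N$ inner iterations, i.e. $O(N)$ overall.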