Say you have an array of integers like [1, 2, 3, 4, 5, 6]. The problem is to break the array up into the smallest number of sub-arrays, where each sub-array satisfies the following requirement:
- sub-array.first_integer and sub-array.last_integer must have a common divisor that is not 1.
So for [7, 2, 3, 4, 5, 6] the answer would be 2, because you can break it up into [7] and [2, 3, 4, 5, 6], where 2 and 6 have a common divisor of 2 (which is > 1, so it meets the requirement).
You can assume the array can be huge but the numbers are not too big. Is there a way to do this in n or n*log(n) time? I think n^2 is possible with DP and caching, but I'm not sure how to do it faster.
Here is my current attempt with a BFS, but it's still N^2:
https://onlinegdb.com/ryZwkYnA8
Edit: OK, I think I found an improvement, which is a polynomial times the square root of a pseudo-polynomial: n * sqrt(magnitude of the values). It seems much faster; it basically builds an adjacency list, and can be done with BFS, DP, and caching all the same. Still, I wonder if anything better is available?
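For reference, the adjacency-list idea from the edit could be sketched roughly like this (illustrative Python only; `divisor_buckets` is a made-up name and this is not the code behind the link):

```python
from collections import defaultdict

def divisor_buckets(arr):
    """Group array positions by each divisor > 1 of their value.

    Two positions can be the endpoints of the same sub-array iff they
    share a bucket, which is the adjacency structure the edit describes.
    Enumerating divisors by trial division costs O(sqrt(value)) per
    element, hence roughly n * sqrt(M) work overall.
    """
    buckets = defaultdict(list)
    for i, v in enumerate(arr):
        d = 2
        while d * d <= v:
            if v % d == 0:
                buckets[d].append(i)
                if v // d != d:
                    buckets[v // d].append(i)
            d += 1
        buckets[v].append(i)  # v divides itself (inputs are assumed > 1)
    return buckets
```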
- How does the first subarray satisfy the condition? There is no divisor $> 1$ of 1ドル$. – Jakube, Jul 3, 2020 at 9:24
- Also, what do you mean by smallest way? Do you mean the smallest number of subarrays in the decomposition? – Jakube, Jul 3, 2020 at 9:26
- "How does the first subarray satisfy the condition? There is no divisor >1 of 1" My bad, 1 will not be in the input. Edited. – Dave, Jul 3, 2020 at 9:51
- Smallest way = how to split the original array into as small a number of subarrays as possible, where all the subarrays satisfy the requirement. I updated to include my current attempt, which is a working N^2. – Dave, Jul 3, 2020 at 9:52
1 Answer
In my answer I will assume that all numbers are at most $M$ big, i.e. $A[i] \le M$.
First, some comments about both of your approaches.
Computing a GCD takes $O(\log M)$ time, so your first approach is actually $O(N^2 \log M)$. I also don't think that your second approach is better than $O(N^2)$. For instance, assume that all the numbers in the first half of the array are equal. Then each of those numbers incurs at least $\Theta(N)$ effort per divisor, which means that the complexity for such a test case is something like $O(N^2 \sqrt M)$.
Let's first discuss a very trivial (but slower) dynamic programming solution, on which I will later base a better one.
Let's define the function $f$ such that $f(i)$ is the smallest number of subarrays into which you can split the prefix of size $i$ of the array (the first $i$ numbers). Also let $f(i) := \infty$ if it is not possible to split that prefix, and $f(0) := 0$.
It's easy to see that you can compute $f(i)$ using the recursion:
$$f(i) = \min_{\substack{1 \le j \le i\\ \gcd(A[i], A[j]) > 1}} f(j-1) + 1$$
This formula is straightforward to implement with dynamic programming, and has complexity $O(N^2 \log M)$:
f(0) = 0
for i = 1 to N:
    f(i) = infinity
    for j = 1 to i:
        if gcd(A[i], A[j]) > 1:
            f(i) = min(f(i), f(j-1) + 1)
print f(N)
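Translated directly into Python (the function name `min_split` and the 0-based indexing are my own), this quadratic DP might look like:

```python
import math

def min_split(A):
    """Quadratic DP: f[i] = fewest valid sub-arrays covering A[0:i]."""
    N = len(A)
    INF = float("inf")
    f = [INF] * (N + 1)
    f[0] = 0
    for i in range(1, N + 1):
        for j in range(1, i + 1):
            # A sub-array from position j to i is valid iff its first
            # and last elements share a divisor > 1.
            if math.gcd(A[i - 1], A[j - 1]) > 1:
                f[i] = min(f[i], f[j - 1] + 1)
    return f[N]
```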
Now to a better approach. We need two mathematical facts that will help us:
- If two numbers have a common divisor greater than one, then they have a common prime factor.
- A number $\le M$ has at most $\log M$ prime factors.
Let the set of prime factors of $x$ be $P(x)$.
Then we can also rewrite the recursion for $f$ as:
$$f(i) = \min_{p \in P(A[i])} \left( \min_{\substack{1 \le j \le i\\ p ~|~ A[j]}} f(j-1) \right) + 1$$ In other words, if $A[i] = 20 = 2^2 \cdot 5$, then we look at all previous numbers that are divisible by 2ドル$ and take the minimum of $f(j-1)$, and at all previous numbers that are divisible by 5ドル$ and take the minimum of $f(j-1)$. The optimal value $f(i)$ will be one more than the smaller of these two minima.
If we define $g(p, i)$ as the minimum of $f(j-1)$ with $p ~|~ A[j]$ and 1ドル \le j \le i$, then the formula simplifies to:
$$f(i) = \min_{p \in P(A[i])} g(p, i) + 1$$
We can actually apply some sort of dynamic programming to the function $g$ as well. We store the values in a hash table. First we initialize $g(p) = \infty$ for every possible prime factor $p$, and whenever an $f(j-1)$ value becomes available, we update $g(p)$ for every $p \in P(A[j])$.
This means, after we update $f(i)$, we only need to update $O(\log M)$ different values of $g$.
This gives us the following algorithm:
# initialize g
for all p:
    g(p) = infinity
# set the first value f(0)
f(0) = 0
# update g
for p in P(A[1]):
    g(p) = min(g(p), f(0))
for i = 1 to N:
    # first compute f(i)
    f(i) = infinity
    for p in P(A[i]):
        f(i) = min(f(i), g(p) + 1)
    # and then update g
    if i < N:
        for p in P(A[i+1]):
            g(p) = min(g(p), f(i))
print f(N)
It's easy to see that this algorithm runs in $O(N \log M)$ time.
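As a sketch, assuming a non-empty array with all values greater than 1, the algorithm could be written in Python like this (`min_split_fast` is my own name; the loop folds the $g$ updates into the main pass, and prime factors are found by plain trial division here):

```python
from collections import defaultdict

def prime_factors(x):
    """Distinct prime factors of x by trial division."""
    ps = set()
    d = 2
    while d * d <= x:
        if x % d == 0:
            ps.add(d)
            while x % d == 0:
                x //= d
        d += 1
    if x > 1:
        ps.add(x)
    return ps

def min_split_fast(A):
    """O(N log M) DP: g[p] = min f(j-1) over positions j seen so far with p | A[j]."""
    INF = float("inf")
    g = defaultdict(lambda: INF)
    f_prev = 0          # f(0) = 0
    f_i = INF
    for v in A:
        ps = prime_factors(v)
        for p in ps:
            g[p] = min(g[p], f_prev)   # the current element contributes f(i-1)
        f_i = min((g[p] for p in ps), default=INF) + 1
        f_prev = f_i
    return f_i
```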
A shorter variation, with the exact same approach and complexity, would be:
for all p:
    g(p) = infinity
f(0) = 0
for i = 1 to N:
    f(i) = infinity
    for p in P(A[i]):
        g(p) = min(g(p), f(i-1))
        f(i) = min(f(i), g(p) + 1)
print f(N)
The only thing left to discuss is how we get the prime factorization of each number. There are lots of different approaches. For instance, if $M$ is not too big, you can run the Sieve of Eratosthenes in $O(M \log M)$ and, during the computation, store a prime factor for each number $\le M$. This allows computing the prime factorization of each number in $O(\log M)$. Another option is to compute the prime factors on the fly, which takes an additional $O(N \sqrt{M})$ time if you use the typical trial division up to the square root.
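The sieve-based option can be sketched as follows: store, for every number up to $M$, its smallest prime factor, then peel factors off one prime at a time (the names `build_spf` and `prime_factors_spf` are mine):

```python
def build_spf(M):
    """spf[x] = smallest prime factor of x, for all 2 <= x <= M."""
    spf = list(range(M + 1))
    for p in range(2, int(M ** 0.5) + 1):
        if spf[p] == p:                    # p is prime
            for m in range(p * p, M + 1, p):
                if spf[m] == m:            # not yet marked by a smaller prime
                    spf[m] = p
    return spf

def prime_factors_spf(x, spf):
    """Distinct prime factors of x in O(log x) divisions using the table."""
    ps = set()
    while x > 1:
        p = spf[x]
        ps.add(p)
        while x % p == 0:
            x //= p
    return ps
```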