I am working on a project and came across the following problem I have to solve.
Imagine we have a time series data Ts
which is an array of pair, say each element is (Xi, Yi)
. The length of this array is constant N
but is updated regularly. Each update consists of two steps,
- We remove
Ts[0]
- Add a new pair
(X, Y)
to the end ofTs
Essentially like an online update to a sliding window of size N
.
Now here comes the problem. At any given times (could be after an update or before), we are given a nonzero input integer A
and B
. The goal is to find such i
between 0
and N-1
inclusive where the term (A-Xi)/(B-Yi)
is maximized.
Another constraints I would like to mention here,
(X0, X1, X2, ... X(N-1))
is non-decreasing,Xi
is non-zero non-negative integer for alli
.(Y0, Y1, Y2, ... Y(N-1))
is non-decreasing,Yi
is non-zero non-negative integer for alli
.A
andB
are both non-zero non-negative integer.- For any query, we can safely assume
B > Yi
for every possiblei
. Technically for the original problem,B
could be equal to the lastY(N-1)
. But this is a boundary condition and we would just eliminateN-1
in this case. - We can also safely assume
A > Xi
for every possiblei
. Unlike constraint (4). It is not possible forA - Xi
to be 0 in any case.
A naive approach would be to iterate through all N
possibilities and easily find the solution (This is what I am doing). But I am thinking of a way to optimize this in the case of very large N
. A query could come at any time for however many number of times, for example,
- Query, Update, Query (1000 times), Update. Or could also be,
- Query, Update, Query, Update, ...
Is there any room for optimization here? I am interested to see a proof from more mathematical standpoint if there is none as well.
1 Answer 1
First, apply the standard reduction: to solve a ratio maximization problem (sometimes called "fractional programming"), binary search on the solution $t$:
$$ \max_x \frac{f(x)}{g(x)} \geq t \iff \max_x f(x) - t g(x) \geq 0 \\ \text{assuming } g(x) > 0 \; \forall x. $$
Applying that to this problem,
$$ \begin{align} \max_i (A - X_i) - t (B - Y_i) &= \max_i A - X_i - t B + t Y_i \\ &= \max_i (-X_i + t Y_i ) + (A - t B) \end{align} $$
so it is sufficient to maintain a data structure on $X, Y$ to answer queries of form $Q_{X,Y}(t) := \max_i (-X_i + t Y_i)$.
When data is considered as 2-D points $\{(X_i,Y_i)\}$, a query is a linear function maximization. It is possible to maintain a convex hull of dynamic points to support such query in $O(\log N)$ time using $O(N)$ space [1].
Let $M$ be the maximum integer value of input then we only need $O(\log M)$ iterations of the binary search. Therefore, the final runtime is $O(\log N \log M)$ per query, and $O(\log N)$ time per update.
(Time complexity doesn't change but it is possible to use a simpler incremental convex hull data structure by utilizing the sliding window update. It may also be possible to use the non-decreasing constraint (thinking of Convex hull trick) but I'm not sure.)
- [1] Brodal, Gerth Stølting, and Riko Jacob. "Dynamic planar convex hull." The 43rd Annual IEEE Symposium on Foundations of Computer Science (2002).
-
$\begingroup$ Thanks, this looks very promising. Could you please elaborate on what t represents here? $\endgroup$askyfullofstars– askyfullofstars2024年02月14日 13:17:54 +00:00Commented Feb 14, 2024 at 13:17
-
$\begingroup$ @askyfullofstars $t$ is a parameter used in binary search. It represents a potential maximum value for the ratio being maximized, to check if a solution exists above this target value. $\endgroup$pcpthm– pcpthm2024年02月14日 14:00:49 +00:00Commented Feb 14, 2024 at 14:00
-
$\begingroup$ I am still trying to wrap my head around this concept of t here. If t is given, then it makes sense that we are able to run a O(log N) query by looking at X and Y as 2-D points. But I guess since t is also a free parameter, are we saying it is bounded by max(A, B)? Hence we can do a binary search to locate the most optimal t? $\endgroup$askyfullofstars– askyfullofstars2024年02月15日 02:39:07 +00:00Commented Feb 15, 2024 at 2:39
-
$\begingroup$ Yes, we can do binary search to locate the most optimal $t$ because the condition is monotonic. The optimal value is always in the range $[0, A],ドル so we can use this range for the initial bounds of the binary search. The optimal value can be a non-integer, but $~ 2 \log_2 B$ precision is enough because differences between different ratios are $\gt 1/(B^2)$. $\endgroup$pcpthm– pcpthm2024年02月15日 10:55:30 +00:00Commented Feb 15, 2024 at 10:55
-
$\begingroup$ doesn't that imply the most optimal t will always be A if we are maximizing maxi(−Xi+tYi) ? $\endgroup$askyfullofstars– askyfullofstars2024年02月15日 12:58:29 +00:00Commented Feb 15, 2024 at 12:58
Explore related questions
See similar questions with these tags.
abs(A+Xi / B-Yi)
to be maximized, orA+Xi / B-Yi
?? $\endgroup$