We are given an array $A[1..n]$ of integers, and an array of 3-tuples known as queries. The query tuples $(i,j,v)$ denote additions of an integer $v$ to the subarray of $A[i..j]$. I'm interested in the query time of $A[k]$ for 1ドル \leq k \leq n$ between the queries.
For example, let $A = [1,2,3,4],ドル and let the queries be $Q = [[1,2,5],[2,3,6],[1,3,10]]$. After the second query has been processed, say I want to find the value of $A[3],ドル which would be 3ドル+5+6=14$. Does an algorithm exists to do this in less than linear time?
-
2$\begingroup$ What should be linear time? The execution of a query (why is a query an array of queries?), or the lookup of a value? And linear in what? The size of $A,ドル the size of $Q$ or the intervals in $Q$? $\endgroup$Ainsley H.– Ainsley H.2013年05月04日 23:40:11 +00:00Commented May 4, 2013 at 23:40
2 Answers 2
Yes, there are good data structures for this. The running time to handle $q$ queries and $\ell$ lookups will be $O((q+\ell) \lg n)$ time in total, or in other words, $O(\lg n)$ time per query and $O(\lg n)$ time per lookup. This can be much faster than linear time in general.
Let $A_\text{orig}[1..n]$ denote the original value of the array $A$. We're going to build a data structure to represent the array $\Delta[1..n]$ of integers, where $\Delta[i]$ holds the amount that $A[i]$ has been increased by (given the queries so far). Thus, to compute $A[i],ドル it suffices to look up $A_\text{orig}[i]$ and $\Delta[i]$ and return their sum.
The question is, how do we maintain the $\Delta[]$ array in a way that we can efficiently look up $\Delta[i]$ at any index $i,ドル and that we can efficiently update $\Delta$ in response to a particular query?
If we represent $\Delta$ as an array, then lookups take $O(1)$ time, but updates (handling a query) may take $\Theta(n)$ time, which is no good.
A better solution is to represent $\Delta$ using an interval tree. With the proper data structures, each lookup will take $O(\lg n)$ time, and each update (processing each query) will take $O(\lg n)$ time. This should give an efficient algorithm.
An interval tree holds a set of non-overlapping intervals. In our case, the intervals will be consecutive ranges of array indices for which $\Delta$ holds the same value. We will augment the data structure to hold the value of $\Delta$ for each such interval.
For example, initially the interval tree could be represented as $[1,n] \mapsto 0,ドル to represent that all indices in the interval $[1,n]$ yield the value 0; so $\Delta[i]=0$ for all $i$. If we now receive the query $(1,2,5),ドル then $\Delta$ is updated to the interval tree $[1,2] \mapsto 5, [3,n] \mapsto 0$. If we then receive the query $(2,5,6),ドル then the interval tree should be updated to $[1,1] \mapsto 5, [2,2] \mapsto 11, [3,5] \mapsto 6, [6,n] \mapsto 0$.
We can store the interval tree using a binary search tree. If we have $m$ intervals, then there will be at most 2ドルm$ unique endpoints of the intervals. We store these endpoints in a binary search tree. Each leaf node represents a single endpoint. We will augment the nodes with a value: if a node represents the integer $a$ (an array index) and the next-largest integer endpoint is $b,ドル then the value $v$ at the node for $a$ means that the interval tree contains $[a,b-1] \mapsto v$.
Note that each query can be processed in $O(\lg n)$ time. That's because a query might possibly split two existing intervals in the tree into two sub-intervals. Each split operation can be handled in $O(\lg n)$ time by inserting one node into the binary search tree.
Similarly, lookups can be handled in $O(\lg n)$ time, by traversing the binary search tree to find the largest node that is less than or equal to the index you're looking up.
This provides a sub-linear time solution to your problem. Processing $q$ queries and $\ell$ lookups to the array takes $O((q+\ell) \lg n)$ time in total.
If start - end + 1
is equal to the number of elements in the array, then you have to touch every element. This is linear in the size of the array.
add(array, start, end, increment)
{
if(increment == 0) return;
for(i = start; i <= end; ++ i)
{
array [i] += increment;
}
}
add_all(array, queries)
{
for(q in queries)
{
add(array, q.start, q.end, q.increment);
}
}
You can use an interval tree (a red-black tree data structure augmented to handle intervals). Querying the ith element is equivalent to finding queries whose intervals overlap with i (say $q_1,\dots,qn$) and returning the sum of array[i] and the increments of those queries, i.e. $\text{array}[i] + q_1.\text{increment}+\dots+q_n.\text{increment}$.
Finding overlapping intervals takes $O(\log n)$ time. If the number of queries is small, then accessing the ith element takes $O(\log n)$ time.
-
$\begingroup$ The problem is that the problem will ask me about the value of the i-th element between queries , not after them. $\endgroup$mohammed essam– mohammed essam2013年05月04日 17:17:46 +00:00Commented May 4, 2013 at 17:17
-
$\begingroup$ I did not get that. Please state the problem correctly (copy it verbatim if you have it). $\endgroup$mrk– mrk2013年05月04日 17:19:04 +00:00Commented May 4, 2013 at 17:19
-
$\begingroup$ I said "at any given time" , that means even between queries $\endgroup$mohammed essam– mohammed essam2013年05月04日 17:20:37 +00:00Commented May 4, 2013 at 17:20
-
1$\begingroup$ @mohammedessam You also say "compute the array", which can make the reader believe you want to update the values of the whole array, and not just query a value. $\endgroup$Juho– Juho2013年05月04日 17:27:02 +00:00Commented May 4, 2013 at 17:27
-
1$\begingroup$ @saadtaame Yeah, but I think this is a nice practical solution. You can also use some additional tricks depending on the structure of the queries. For example, sometimes an interval might subsume another. $\endgroup$Juho– Juho2013年05月04日 17:51:42 +00:00Commented May 4, 2013 at 17:51