Sublinear query time for the $i$th element of an array after some additions?

Question 1

We are given an array $A[1..n]$ of integers, and an array of 3-tuples known as queries. The query tuples $(i,j,v)$ denote additions of an integer $v$ to the subarray of $A[i..j]$. I'm interested in the query time of $A[k]$ for 1ドル \leq k \leq n$ between the queries.

For example, let $A = [1,2,3,4],ドル and let the queries be $Q = [[1,2,5],[2,3,6],[1,3,10]]$. After the second query has been processed, say I want to find the value of $A[3],ドル which would be 3ドル+5+6=14$. Does an algorithm exists to do this in less than linear time?

Question 2

What should be linear time? The execution of a query (why is a query an array of queries?), or the lookup of a value? And linear in what? The size of $A,ドル the size of $Q$ or the intervals in $Q$?

Question 3

Yes, there are good data structures for this. The running time to handle $q$ queries and $\ell$ lookups will be $O((q+\ell) \lg n)$ time in total, or in other words, $O(\lg n)$ time per query and $O(\lg n)$ time per lookup. This can be much faster than linear time in general.

Let $A_\text{orig}[1..n]$ denote the original value of the array $A$. We're going to build a data structure to represent the array $\Delta[1..n]$ of integers, where $\Delta[i]$ holds the amount that $A[i]$ has been increased by (given the queries so far). Thus, to compute $A[i],ドル it suffices to look up $A_\text{orig}[i]$ and $\Delta[i]$ and return their sum.

The question is, how do we maintain the $\Delta[]$ array in a way that we can efficiently look up $\Delta[i]$ at any index $i,ドル and that we can efficiently update $\Delta$ in response to a particular query?

If we represent $\Delta$ as an array, then lookups take $O(1)$ time, but updates (handling a query) may take $\Theta(n)$ time, which is no good.

A better solution is to represent $\Delta$ using an interval tree. With the proper data structures, each lookup will take $O(\lg n)$ time, and each update (processing each query) will take $O(\lg n)$ time. This should give an efficient algorithm.

An interval tree holds a set of non-overlapping intervals. In our case, the intervals will be consecutive ranges of array indices for which $\Delta$ holds the same value. We will augment the data structure to hold the value of $\Delta$ for each such interval.

For example, initially the interval tree could be represented as $[1,n] \mapsto 0,ドル to represent that all indices in the interval $[1,n]$ yield the value 0; so $\Delta[i]=0$ for all $i$. If we now receive the query $(1,2,5),ドル then $\Delta$ is updated to the interval tree $[1,2] \mapsto 5, [3,n] \mapsto 0$. If we then receive the query $(2,5,6),ドル then the interval tree should be updated to $[1,1] \mapsto 5, [2,2] \mapsto 11, [3,5] \mapsto 6, [6,n] \mapsto 0$.

We can store the interval tree using a binary search tree. If we have $m$ intervals, then there will be at most 2ドルm$ unique endpoints of the intervals. We store these endpoints in a binary search tree. Each leaf node represents a single endpoint. We will augment the nodes with a value: if a node represents the integer $a$ (an array index) and the next-largest integer endpoint is $b,ドル then the value $v$ at the node for $a$ means that the interval tree contains $[a,b-1] \mapsto v$.

Note that each query can be processed in $O(\lg n)$ time. That's because a query might possibly split two existing intervals in the tree into two sub-intervals. Each split operation can be handled in $O(\lg n)$ time by inserting one node into the binary search tree.

Similarly, lookups can be handled in $O(\lg n)$ time, by traversing the binary search tree to find the largest node that is less than or equal to the index you're looking up.

This provides a sub-linear time solution to your problem. Processing $q$ queries and $\ell$ lookups to the array takes $O((q+\ell) \lg n)$ time in total.

Question 4

If start - end + 1 is equal to the number of elements in the array, then you have to touch every element. This is linear in the size of the array.

add(array, start, end, increment)
{
 if(increment == 0) return;
 for(i = start; i <= end; ++ i)
 {
 array [i] += increment;
 }
}
add_all(array, queries)
{
 for(q in queries)
 {
 add(array, q.start, q.end, q.increment);
 }
}

You can use an interval tree (a red-black tree data structure augmented to handle intervals). Querying the ith element is equivalent to finding queries whose intervals overlap with i (say $q_1,\dots,qn$) and returning the sum of array[i] and the increments of those queries, i.e. $\text{array}[i] + q_1.\text{increment}+\dots+q_n.\text{increment}$.

Finding overlapping intervals takes $O(\log n)$ time. If the number of queries is small, then accessing the ith element takes $O(\log n)$ time.

Question 5

The problem is that the problem will ask me about the value of the i-th element between queries , not after them.

Question 6

I did not get that. Please state the problem correctly (copy it verbatim if you have it).

Question 7

I said "at any given time" , that means even between queries

Question 8

@mohammedessam You also say "compute the array", which can make the reader believe you want to update the values of the whole array, and not just query a value.

Question 9

@saadtaame Yeah, but I think this is a nice practical solution. You can also use some additional tricks depending on the structure of the queries. For example, sometimes an interval might subsume another.

D.W. ♦D.W. 168k22 gold badges233 silver badges509 bronze badges · Accepted Answer · 2013-05-06 03:36:28Z