I'm currently studying recurrences for divide and conquer algorithms (CLRS chapter 4) and I am struggling to understand a slight change that was made to the latest (4th) edition of the book.
The recurrence relation is defined , which resolves to the following:
CLRS Ch. 4 Recursion Tree Solution
In the previous version of the book and a lot of places I see online, the limit to the summation is instead , which makes sense since you only include the combine cost for every level except the last. But now the sum just , which considering starts at 0 and ends at , means is included for all levels including leaves. The definition of a tree's height was also changed to be (which includes the root) whereas it used to be just .
Can somebody explain why these changes were made? I must be misunderstanding something here.
For instance, for a binary tree with just 2 elements with recurrence defined as:
where are constants, I get . If I use the solution formula, I get because of the extra sum so the math does not match.
1 Answer 1
TLDR
Can somebody explain why these changes were made? I must be misunderstanding something here.
The changes were made to reflect programmatic* consistency in the sense that if your input size n was odd and dividing factor b was even, your recursive program may not consistently enter the base case (for every possible odd input size) through the specified condition: if n == 1; due to floating point arithmetic. However, the use of the floor function takes care of this.
* Nevertheless, CLRS is a computer science book, and not a core mathematics book. Thus, one ought to maintain a mental context of programming (in addition to the mathematical context) when studying the examples presented in the book.
Detailed Explanation
The recurrence relation is defined T(n) = a*T(n/b) + f(n), which resolves to the following:
T(n) = Θ(n^log(a, b) + ∑j=0→log(n,b): a^j * f(n/b^j)
By "resolves to [...]", I hope you're referring to the fact that the bounds (i.e. asymptotic complexity) of the original recurrence correspond to the said resolution, as opposed to proclaiming the value of the recurrence is equivalent to the value of the resolution, because the latter would be inaccurate.
Secondly, as pointed out in the comments, the summation limit was changed to floor(log_b n) not log_b n. Therefore, it is logically conducive to declare the height of the recursion tree as floor((log_b n) + 1) since the LHS of the resolution takes care of only the last level (leaves) of the tree, while the RHS accounts for every other level - just one step apart.
And lastly:
For instance, for a binary tree with just 2 elements with recurrence defined as:
T(1) = c1 and T(n) = 2T(n/2) + n(c2)
where c1,c2 are constants, I get T(2) = 2(c1) + 2(c2). If I use the solution formula, I get T(2) = 2(c1) + 4(c2) because of the extra sum so the math does not match.
Please, understand that what you call the "solution formula" is: (1) a provisional framework for determining the bounds of recurrence relations, serving as proof to a preceding lemma (specifically Lemma 4.2 of CLRS 4th Ed., p.108) in the relevant book section, and not an explicit almighty resolution (exact value evaluation) to all recurrences of that form; (2) a precedent to the Master Theorem, which if you've learned by now, tells you what kind of recurrences it can acceptably determine the bounds for.
Thusly, you cannot simply plug in values into the provisional framework, for the sole purpose of determining the exact evaluation output of your "binary tree" recurrence as you seem to be doing here. To determine the bounds of this new recurrence, you ought to simply follow the framework presented by the authors, which is the tree method.
Here's a solution using the back substitution method instead (the tree method is left as an exercise to reinforce your understanding; your answer must be the same as this).
Given T(n)=2T(n/2)+nc_2 with conditions already stated above:
T(n/2)=2T(n/4)+(n/2)c_2
T(n)=2(2T(n/4)+(n/2)c_2)+n_c2
T(n)=4T(n/4)+nc_2+nc_2
T(n/4)=2T(n/8)+(n/4)c_2
T(n)=4(2T(n/8)+(n/4)c_2)+nc_2+nc_2
T(n)=8T(n/8)+nc_2+nc_2+nc_2
kth term -> T(n)=(2^k)*T(n/2^k)+k(nc_2)
At some point (the base case), T(n/2^k) meaning that n/2^k=1 <-> k=log_2 n
Substituting these values in kth term, we obtain:
T(n)=2^(log_2 n)(c_1)+(log_2 n)(nc_2)
T(n)=nc_1+(log_2 n)(nc_2)
The bounds of the recurrence is simply big omega(nlogn).
Trouble viewing equations? See here.
Comments
Explore related questions
See similar questions with these tags.
log(n,b), they changed it tofloor(log(n,b)).