19
$\begingroup$

Wikipedia says union by rank without path compression gives an amortized time complexity of $O(\log n),ドル and that both union by rank and path compression gives an amortized time complexity of $O(\alpha(n))$ (where $\alpha$ is the inverse of the Ackerman function). However, it does not mention the running time of path compression without union rank, which is what I usually implement myself.

What's the amortized time complexity of union-find with the path-compression optimization, but without the union by rank optimization?

D.W.
169k23 gold badges236 silver badges519 bronze badges
asked Oct 23, 2015 at 23:32
$\endgroup$
1
  • 5
    $\begingroup$ Note that $\alpha(n)$ is the inverse of the Ackerman function, not 1ドル/A(n, n))$. Here "inverse" means the inverse as a function, not the reciprocal: i.e., if $f(n)=A(n,n),ドル $\alpha(n)=f^{-1}(n),ドル not 1ドル/f(n)$. $\endgroup$ Commented Oct 24, 2015 at 1:14

2 Answers 2

9
$\begingroup$

Seidel and Sharir proved in 2005 [1] that using path compression with arbitrary linking roughly on $m$ operations has a complexity of roughly $O((m+n)\log(n))$.

See [1], Section 3 (Arbitrary Linking): Let $f(m,n)$ denote the runtime of union-find with $m$ operations and $n$ elements. They proved the following:

Claim 3.1. For any integer $k>1$ we have $f(m, n)\leq (m+(k−1)n)\lceil \log_k(n) \rceil$.

According to [1], setting $k = \lceil m/n \rceil + 1$ gives $$f(m, n)\leq (2m+n) \log_{\lceil m/n\rceil +1}n$$.

A similar bound was given using a more complex method by Tarjan and van Leeuwen in [2], Section 3:

Lemma 7 of [2]. Suppose $m \geq n$. In any sequence of set operations implemented using any form of compaction and naive linking, the total number of nodes on find paths is at most $(4m + n) \lceil \log_{\lfloor 1 + m/n \rfloor}n \rceil$ With halving and naive linking, the total number of nodes on find paths is at most $ (8m+2n)\lceil \log_{\lfloor 1 + m/n \rfloor} (n) \rceil $.

Lemma 9 of [2]. Suppose $m < n$. In any sequence of set operations implemented using compression and naive linking, the total number of nodes on find paths is at most $ n + 2m \lceil \log n\rceil + m$.

[1]: R. Seidel and M. Sharir. Top-Down Analysis of Path Compression. Siam J. Computing, 2005, Vol. 34, No. 3, pp. 515-525.

[2]: R. Tarjan and J. van Leeuwen. Worst-case Analysis of Set Union Algorithms. J. ACM, Vol. 31, No. 2, April 1984, pp. 245-281.

answered May 1, 2019 at 2:17
$\endgroup$
6
$\begingroup$

I don't know what the amortized running time is, but I can cite one possible reason why in some situations you might want to use both rather than just path compression: the worst-case running time per operation is $\Theta(n)$ if you use just path compression, which is much larger than if you use both union by rank and path compression.

Consider a sequence of $n$ Union operations maliciously chosen to yield a tree of depth $n-1$ (it is just a sequential path of nodes, where each node is the child of the previous node). Then performing a single Find operation on the deepest node takes $\Theta(n)$ time. Thus, the worst-case running time per operation is $\Theta(n)$.

In contrast, with the union-by-rank optimization, the worst-case running time per operation is $O(\log n)$: no single operation can ever take longer than $O(\log n)$. For many applications, this won't matter: only the total running time of all operations (i.e., the amortized running time) will matter, not the worst-case time for a single operation. However, in some cases the worst-case time per operation might matter: for instance, reducing the worst-case time per operation to $O(\log n)$ might be useful in an interactive application where you want to make sure that no single operation can cause a long delay (e.g., you want a guarantee that no single operation can cause the application to freeze for a long time) or in a real-time application where you want to ensure that you will always meet the real-time guarantees.

answered Oct 24, 2015 at 1:22
$\endgroup$
4
  • $\begingroup$ If we have an offline use case, in the malicious example, wouldn't building the tree cost O(N) anyway. i.e. O(N) + O(N) is still O(N) ? $\endgroup$ Commented Oct 11, 2021 at 9:33
  • 1
    $\begingroup$ @robertking, sorry, I don't understand what you are referring to or how it connects to my answer. $\endgroup$ Commented Oct 11, 2021 at 16:17
  • $\begingroup$ if you have N union operations, and then one find operation on the deepest node, the total cost will be 2N which is still O(N). Also, if you decided to do an additional N-1 find operations, the rest of the find operations would cost O(1), bringing the total cost to N + N + N-1 = 3N. This seems like it could be faster than with ranking? or am i missing something $\endgroup$ Commented Oct 11, 2021 at 22:06
  • 1
    $\begingroup$ @robertking, I apologize, but I don't yet see how that's relevant to my answer. You are talking about the worst-case time of a sequence of operations; I am talking about the worst-case time of a single operation. $\endgroup$ Commented Oct 12, 2021 at 6:41

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.