Wikipedia says union by rank without path compression gives an amortized time complexity of $O(\log n),ドル and that both union by rank and path compression gives an amortized time complexity of $O(\alpha(n))$ (where $\alpha$ is the inverse of the Ackerman function). However, it does not mention the running time of path compression without union rank, which is what I usually implement myself.
What's the amortized time complexity of union-find with the path-compression optimization, but without the union by rank optimization?
-
5$\begingroup$ Note that $\alpha(n)$ is the inverse of the Ackerman function, not 1ドル/A(n, n))$. Here "inverse" means the inverse as a function, not the reciprocal: i.e., if $f(n)=A(n,n),ドル $\alpha(n)=f^{-1}(n),ドル not 1ドル/f(n)$. $\endgroup$D.W.– D.W. ♦2015年10月24日 01:14:37 +00:00Commented Oct 24, 2015 at 1:14
2 Answers 2
Seidel and Sharir proved in 2005 [1] that using path compression with arbitrary linking roughly on $m$ operations has a complexity of roughly $O((m+n)\log(n))$.
See [1], Section 3 (Arbitrary Linking): Let $f(m,n)$ denote the runtime of union-find with $m$ operations and $n$ elements. They proved the following:
Claim 3.1. For any integer $k>1$ we have $f(m, n)\leq (m+(k−1)n)\lceil \log_k(n) \rceil$.
According to [1], setting $k = \lceil m/n \rceil + 1$ gives $$f(m, n)\leq (2m+n) \log_{\lceil m/n\rceil +1}n$$.
A similar bound was given using a more complex method by Tarjan and van Leeuwen in [2], Section 3:
Lemma 7 of [2]. Suppose $m \geq n$. In any sequence of set operations implemented using any form of compaction and naive linking, the total number of nodes on find paths is at most $(4m + n) \lceil \log_{\lfloor 1 + m/n \rfloor}n \rceil$ With halving and naive linking, the total number of nodes on find paths is at most $ (8m+2n)\lceil \log_{\lfloor 1 + m/n \rfloor} (n) \rceil $.
Lemma 9 of [2]. Suppose $m < n$. In any sequence of set operations implemented using compression and naive linking, the total number of nodes on find paths is at most $ n + 2m \lceil \log n\rceil + m$.
I don't know what the amortized running time is, but I can cite one possible reason why in some situations you might want to use both rather than just path compression: the worst-case running time per operation is $\Theta(n)$ if you use just path compression, which is much larger than if you use both union by rank and path compression.
Consider a sequence of $n$ Union operations maliciously chosen to yield a tree of depth $n-1$ (it is just a sequential path of nodes, where each node is the child of the previous node). Then performing a single Find operation on the deepest node takes $\Theta(n)$ time. Thus, the worst-case running time per operation is $\Theta(n)$.
In contrast, with the union-by-rank optimization, the worst-case running time per operation is $O(\log n)$: no single operation can ever take longer than $O(\log n)$. For many applications, this won't matter: only the total running time of all operations (i.e., the amortized running time) will matter, not the worst-case time for a single operation. However, in some cases the worst-case time per operation might matter: for instance, reducing the worst-case time per operation to $O(\log n)$ might be useful in an interactive application where you want to make sure that no single operation can cause a long delay (e.g., you want a guarantee that no single operation can cause the application to freeze for a long time) or in a real-time application where you want to ensure that you will always meet the real-time guarantees.
-
$\begingroup$ If we have an offline use case, in the malicious example, wouldn't building the tree cost O(N) anyway. i.e. O(N) + O(N) is still O(N) ? $\endgroup$Rusty Rob– Rusty Rob2021年10月11日 09:33:27 +00:00Commented Oct 11, 2021 at 9:33
-
1$\begingroup$ @robertking, sorry, I don't understand what you are referring to or how it connects to my answer. $\endgroup$2021年10月11日 16:17:15 +00:00Commented Oct 11, 2021 at 16:17
-
$\begingroup$ if you have N union operations, and then one find operation on the deepest node, the total cost will be 2N which is still O(N). Also, if you decided to do an additional N-1 find operations, the rest of the find operations would cost O(1), bringing the total cost to N + N + N-1 = 3N. This seems like it could be faster than with ranking? or am i missing something $\endgroup$Rusty Rob– Rusty Rob2021年10月11日 22:06:23 +00:00Commented Oct 11, 2021 at 22:06
-
1$\begingroup$ @robertking, I apologize, but I don't yet see how that's relevant to my answer. You are talking about the worst-case time of a sequence of operations; I am talking about the worst-case time of a single operation. $\endgroup$2021年10月12日 06:41:16 +00:00Commented Oct 12, 2021 at 6:41
Explore related questions
See similar questions with these tags.