Consider the Merge-sort algorithm, modified as follows:
Suppose we extend it to divide the input array into $m$ not necessarily equal parts and, after sorting each of them, merge them. Can we still conclude that the running time will be $O(n\log n)$?
I think the running time can be $O(n^2)$, for the following reason:
If the algorithm divides the array into three sections such that the first two sections contain only a constant number of elements and the third section contains almost all of the elements, and it keeps dividing in this way at every level of recursion, then the recursion has linear depth with a linear-cost merge at each level, so the running time will be $O(n^2)$.
- You are giving no justification for the $O(n^2)$ behavior, except "I think". – user16034, Mar 13, 2023 at 8:12
- There seems to be at least one word missing from the supposition. – greybeard, Mar 13, 2023 at 8:49
- Yes, if I split $n$ numbers into 1ドル$ / 1ドル$ / $n-2$ numbers then the runtime will be $\Theta(n^2)$. But that would be utterly stupid. – gnasher729, Mar 13, 2023 at 9:08
2 Answers
The linearithmic behavior of MergeSort is due to the fact that the sizes of the subsets decrease geometrically rather than arithmetically, so that the number of passes over the whole data is logarithmic.
You might very well roll a modified MergeSort that subdivides into fractions $\frac12,\frac13,\frac16$ and keeps the same complexity.
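To see why, note the recurrence $T(n) = T(n/2) + T(n/3) + T(n/6) + \Theta(n)$: the fractions sum to 1ドル$, so each level of the recursion tree does $\Theta(n)$ total work, and each fraction is bounded away from 1ドル$, so the depth is $\Theta(\log n)$, giving $\Theta(n\log n)$ overall. Here is a minimal Python sketch of such a three-way MergeSort (the rounding of the cut points and the pairwise merging are my own illustrative choices):

```python
def merge(a, b):
    # standard stable linear-time merge of two sorted lists
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]

def merge_sort_uneven(xs):
    # split into roughly 1/2, 1/3 and 1/6 of the input; every part
    # holds at most about half the elements, so the recursion depth
    # is O(log n) and the total running time O(n log n)
    n = len(xs)
    if n <= 1:
        return list(xs)
    i, j = n // 2, n // 2 + n // 3  # cut points for the three parts
    return merge(merge(merge_sort_uneven(xs[:i]), merge_sort_uneven(xs[i:j])),
                 merge_sort_uneven(xs[j:]))

data = [5, 3, 8, 1, 9, 2, 7, 6]
assert merge_sort_uneven(data) == sorted(data)
```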
But if we pursue your idea of handling subarrays of constant size, the archetypal algorithm that you obtain is StraightInsertionSort (recursively splitting $n$ into sizes $n-1$ and 1ドル$), which notoriously has $O(n^2)$ behavior, because the number of passes is linear.
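Written as a recurrence, with a linear-cost pass at each level, this degenerate split gives
$$T(n) = T(n-1) + \Theta(n) = \Theta\!\left(\sum_{k=1}^{n} k\right) = \Theta(n^2),$$
in contrast with the logarithmic depth of the geometric splits above.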
You get a similar situation when comparing dichotomic search and linear search in a sorted array.
The question leaves some room for interpretation, so I apologize in advance if my answer misses the mark.
You can in fact merge (sort) $m \le n$ sorted "runs" in at most $\mathcal{O}(n\log n)$ time. This is one of the many great ideas that went into the TimSort algorithm. The original design document is Tim Peters' listsort.txt, which ships with the CPython source.
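As a quick illustration (a heap-based $m$-way merge rather than TimSort's pairwise strategy), merging $m$ runs with $n$ elements in total costs $O(n\log m) \subseteq O(n\log n)$; Python's standard heapq.merge does exactly this:

```python
import heapq

# m sorted runs of arbitrary, unequal sizes (illustrative data)
runs = [[1, 4, 9], [2, 3], [0, 5, 6, 7, 8]]

# heapq.merge keeps a heap of the current head of each run, so
# producing all n elements costs O(n log m)
merged = list(heapq.merge(*runs))
assert merged == sorted(x for run in runs for x in run)
```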
One of the ideas of TimSort is to look for already sorted subsequences (Tim Peters dubbed them runs). This of course may lead to runs of vastly different size, just like You proposed. In order to merge the runs efficiently, TimSort delays the merging, putting the runs on a stack, looking for opportunities to merge sequences of roughly equal size. To prevent the stack from growing super-logarithmically, TimSort ensures that the sizes of the runs on the stack (top to bottom) grow at least as fast as the Fibonacci sequence.
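A minimal sketch of that stack discipline, with runs stored as plain Python lists (CPython's real implementation keeps (base, length) pairs, merges in place, and adds galloping; the names below are my own):

```python
import heapq

def two_way_merge(a, b):
    # stable linear-time merge of two sorted runs
    return list(heapq.merge(a, b))

def merge_collapse(runs):
    # Restore TimSort's invariants on the run stack.  With A, B, C the
    # three topmost runs, it enforces |A| > |B| + |C| and |B| > |C|,
    # so run lengths grow at least as fast as the Fibonacci numbers
    # and the stack stays O(log n) deep.
    while len(runs) > 1:
        if len(runs) >= 3 and len(runs[-3]) <= len(runs[-2]) + len(runs[-1]):
            if len(runs[-3]) < len(runs[-1]):
                # merge A with B, keeping C on top
                b = runs.pop(-2)
                a = runs.pop(-2)
                runs.insert(-1, two_way_merge(a, b))
            else:
                # merge B with C
                b, c = runs.pop(-2), runs.pop()
                runs.append(two_way_merge(b, c))
        elif len(runs[-2]) <= len(runs[-1]):
            b, c = runs.pop(-2), runs.pop()
            runs.append(two_way_merge(b, c))
        else:
            break  # invariants hold; wait for the next run to be pushed
```

Each newly detected run is pushed and merge_collapse is called; once the input is exhausted, a final pass merges whatever remains on the stack into a single sorted run.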
So if You use such a "Fibonacci" stack for merging, then You can even handle arbitrary splits during recursion (provided Your programming language and/or runtime supports call stacks of size $\mathcal{O}(n)$). Of course it is not exactly merge sort anymore...