3
$\begingroup$

I'm trying to understand sorting algorithms. And I'm thinking about a totally different way to tackle sorting: Use array indexes to sort a given set of integers.

Suppose We want to sort a list of positive integers (for now). Let's call it toSortArray

  • Create a boolean array that has a length equal to maximum integer in toSortArray + 1. Lets call it boolArray. All items in the array are false initially.
  • Loop through toSortArray and for each integer item, set the boolean at index item true: boolArray[toSortArray[i]] = true;
  • Loop through boolArray and if boolArray[i] is true, set the first element of toSortArray to i

Here is the pseudo-code if the algorithm is not clear:

 var boolArray = new bool[toSortArray.Max() + 1];
 for (int i = 0; i < toSortArray.Length; i++)
 {
 boolArray[toSortArray[i]] = true;
 }
 var index = 0;
 for (int i = 0; i < boolArray.Length; i++)
 {
 if (boolArray[i])
 {
 toSortArray[index] = i;
 index++;
 }
 }

First, Does every body agree that time complexity is $O(N)$?

Second, Am I missing something about this algorithm (As I don't think I'm the first one thinking about it)?

Caleb Stanford
7,3282 gold badges29 silver badges50 bronze badges
asked Oct 29, 2019 at 13:21
$\endgroup$

1 Answer 1

4
$\begingroup$

This is algorithm called bucket sort, integer sort, counting sort, or Pigeonhole sort (naming is a bit inconsistent across sources and/or different names refer to slight variations of the same idea).

Its time complexity is $O(m)$ (assuming distinct values) where $m$ is the largest among the $n$ integers you are trying to sort. Your algorithm takes $O(n)$ time as long as $m=O(n)$, for example if you know you that you want to sort a large-enough subset of the integers from 1ドル$ to $n$.

If the integers are not distinct, you can replace booleans with counters to obtain an algorithm with complexity $O(m + n)$. In general, you can sort arbitrary items with integer keys by keeping an array of lists, where the $i$-th list will contain all items with key $i$. The output is just concatenation of these lists (and hence this sorting algorithm is stable as long as you append new elements at the end of the lists).

If you apply the above (stable) algorithm multiple times on your input, where the $j$-th execution sorts the integers w.r.t. the $j$-th least significant digit you obtain Radix Sort.

For example, in base 10ドル$, you'd need $\lceil \log_{10} m \rceil$ iterations, thus obtaining an algorithm with time complexity $O( n \log m )$, which is in $O(n \log n)$ even when $m$ is exponentially larger in $n$ (on a unit-cost RAM machine). For a general base $b$, the time needed would be $O((b+n) \log_b m) = O((b+n) \frac{\log m}{\log b})$, which is optimized (in an asymptotic sense) by choosing $b = \Theta(n)$, resulting in a time complexity of $O(n \frac{\log m}{\log n})$.

answered Oct 29, 2019 at 13:38
$\endgroup$
1
  • $\begingroup$ Hmmm counting sort. Good to know $\endgroup$ Commented Oct 30, 2019 at 14:27

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.