I have just proved that for every $\alpha, \beta>0 : (\log n)^\alpha=O(n^\beta)$.
Now, given an array of $n$ elements, I want to find an efficient comparison based algorithm for finding the $\log n$ largest elements in the array, and returning them in sorted order.
I would appreciate your help on the best way to solve this problem, and on how to think when encountering such a question.
- You can omit the requirement $\alpha \gt 0$. (zkutch, Apr 25, 2021 at 10:57)
- It just makes the statement stronger; take $\alpha > 1$. (nir shahar, Apr 25, 2021 at 11:00)
- Thanks, but this is not what I asked... I have proved the above and should use it. (Math4me, Apr 25, 2021 at 11:11)
2 Answers
One possibility (though probably not the simplest way) is:
- Transform the array into a max-heap;
- find (without extracting) the $\log n$ largest elements in the heap.
The first step can be done in $O(n)$. For the second step, let's go into a bit more detail:
- The largest element is found among one element (it's the root);
- the second largest must be found among two elements (the two children of the root);
- in general, the $k$-th largest element must be found among $k$ elements, because each time you select an element to be the largest, you remove it from the candidates and add its two children.
That means that after having found the $k$ largest elements, finding the $(k+1)$-th is done by searching for the maximum among $k+1$ candidates, so it takes $O(k)$ time. Since we want to find at most $\log n$ elements, the total search takes $O((\log n)^2)$ time, and by the property you proved, this is $O(n)$.
That means the total complexity of the algorithm is $O(n)$, which is optimal since you need to examine every element of the array.
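To make the candidate-set argument concrete, here is a sketch of this approach in Python. The function name `log_n_largest` and the explicit in-place heapify are my own choices; the candidate scan follows the answer's $O(k)$ linear search exactly.

```python
import math

def log_n_largest(arr):
    """Return the ceil(log2 n) largest elements of arr in decreasing order."""
    n = len(arr)
    k = max(1, math.ceil(math.log2(n)))
    heap = list(arr)

    # Step 1: build a max-heap in place with the standard sift-down
    # heapify, which runs in O(n) overall.
    def sift_down(i, size):
        while True:
            largest = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < size and heap[c] > heap[largest]:
                    largest = c
            if largest == i:
                return
            heap[i], heap[largest] = heap[largest], heap[i]
            i = largest

    for i in range(n // 2 - 1, -1, -1):
        sift_down(i, n)

    # Step 2: find the k largest without extracting. The candidates
    # are heap indices whose parent has already been output; each
    # selection removes one candidate and adds its two children.
    result = []
    candidates = [0]  # start with the root, the overall maximum
    for _ in range(k):
        # Linear scan over at most k+1 candidates: O(k) per step,
        # O((log n)^2) = O(n) in total.
        best = max(candidates, key=lambda i: heap[i])
        candidates.remove(best)
        result.append(heap[best])
        for c in (2 * best + 1, 2 * best + 2):
            if c < n:
                candidates.append(c)
    return result
```

The invariant that makes step 2 correct is the one stated above: the next largest element is always a child (in the heap) of some already-selected element, so it is always in the candidate set.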
Edit: Actually, I realize (that confirms what I initially said) that there is a simpler way to do it:
Use the quickselect algorithm to find the $\log n$-th largest element (no plural here) of the array; this takes $O(n)$ expected time. By doing so in place, the $\log n$ largest elements end up in consecutive positions at the end of the array. You can then sort those $\log n$ elements in time $O(\log n \log \log n) \subset O((\log n)^2) \subset O(n)$.
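A sketch of this quickselect-based variant in Python (the function name, the Lomuto partition scheme, and the random pivot choice are my own; any in-place quickselect works):

```python
import math
import random

def top_log_n_sorted(arr):
    """Return the ceil(log2 n) largest elements of arr in decreasing order,
    via in-place quickselect followed by sorting only the selected tail."""
    a = list(arr)
    n = len(a)
    k = max(1, math.ceil(math.log2(n)))
    target = n - k  # after selection, a[target:] holds the k largest

    def partition(lo, hi):
        # Lomuto partition around a randomly chosen pivot
        # (random pivot gives O(n) expected time overall).
        pivot_i = random.randint(lo, hi)
        a[pivot_i], a[hi] = a[hi], a[pivot_i]
        pivot = a[hi]
        store = lo
        for i in range(lo, hi):
            if a[i] < pivot:
                a[store], a[i] = a[i], a[store]
                store += 1
        a[store], a[hi] = a[hi], a[store]
        return store

    # Iterative quickselect: narrow the search range until position
    # `target` holds its final sorted value.
    lo, hi = 0, n - 1
    while lo < hi:
        p = partition(lo, hi)
        if p == target:
            break
        elif p < target:
            lo = p + 1
        else:
            hi = p - 1

    # Only the k selected elements need sorting: O(k log k) = O(n).
    return sorted(a[target:], reverse=True)
```

Note that quickselect is $O(n)$ only in expectation with a random pivot; the deterministic median-of-medians pivot rule gives a worst-case $O(n)$ bound if that is required.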
- Excellent! Thank you so much, that was very helpful! I appreciate it! :D (Math4me, Apr 25, 2021 at 13:25)
- Can you please explain more about the quickselect option? (I know this sorting algorithm.) (Math4me, Apr 25, 2021 at 14:16)
- @Math4me It is quite long to explain, so I suggest you read the Wikipedia article I linked in my answer. Note that the quickselect algorithm is not the same as quicksort (though they have similarities). (Nathaniel, Apr 25, 2021 at 14:25)
- No need for an edit tag, see here: cs.meta.stackexchange.com/q/657/472 (Juho, Apr 25, 2021 at 16:31)
Further to what Nathaniel said, you can think of this as a simple modification of quicksort, where you ignore any partitions which fall outside the range of interest.
Given:
algorithm quicksort(A, lo, hi) is
if lo < hi then
p := partition(A, lo, hi)
quicksort(A, lo, p - 1)
quicksort(A, p + 1, hi)
Then this algorithm sorts the top $k$ elements in order (taking partition to order in decreasing order, so that the largest elements come first), leaving the other elements in an unknown order:
algorithm quicksort_k(A, k, lo, hi) is
if lo < hi then
p := partition(A, lo, hi)
quicksort_k(A, k, lo, p - 1)
if k > p then
quicksort_k(A, k, p + 1, hi)
The expected running time is $O(n \log k)$.
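The pseudocode above can be transcribed directly into Python. With 0-indexed arrays the test `k > p` becomes `p < k - 1`; the partition here compares with `>` so the $k$ largest elements land, sorted, in the first $k$ positions:

```python
def partition(A, lo, hi):
    """Lomuto partition of A[lo..hi] around A[hi], in decreasing order
    (the '>' comparison puts larger elements first)."""
    pivot = A[hi]
    store = lo
    for i in range(lo, hi):
        if A[i] > pivot:
            A[store], A[i] = A[i], A[store]
            store += 1
    A[store], A[hi] = A[hi], A[store]
    return store

def quicksort_k(A, k, lo, hi):
    """Sort A[0:k] in place (here: the k largest, in decreasing order);
    the remaining elements are left in an unspecified order."""
    if lo < hi:
        p = partition(A, lo, hi)
        quicksort_k(A, k, lo, p - 1)
        if p < k - 1:  # 0-indexed form of the pseudocode's 'k > p'
            quicksort_k(A, k, p + 1, hi)
```

For example, after `quicksort_k(A, 3, 0, len(A) - 1)`, the slice `A[:3]` holds the three largest elements in decreasing order.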
The same trick works with any sort algorithm that is conceptually based on partitioning, such as MSD radix sort.