Does an algorithm exist that finds the maximum of an unsorted array in O(log n) time?
12 Answers
This question is asked a lot (is this a popular CS homework question or something?) and the answer is always the same: no.
Think about it mathematically. Unless the array is sorted, there is nothing to "cut in half" to give you the log(n) behavior.
Read the question comments for a more in-depth discussion (which is probably way out of the question's scope anyhow).
-
What about a bitonic array? How do we find the bitonic number (the max number) in an array with log(n) complexity? – Prasanna, May 1, 2020 at 19:28
-
Obviously the question refers to parallel computation, in which context it is certainly possible. – Gregory Morse, Feb 15, 2021 at 1:44
Consider this: without visiting every element, how do you know that some element you haven't visited isn't larger than the largest you have found so far?
It's not possible to do this in O(log(N)). It is O(N) in the best/worst/average case, because one would need to visit every item in the array to determine whether it is the largest one or not. The array is unsorted, which means you cannot cut corners.
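For reference, here is a minimal sketch of that unavoidable linear scan (the method name is my own, not from the question):

public static int max(int[] arr) {
 // Every element must be visited once: any element we skip
 // could be the maximum, hence the O(N) lower bound.
 int best = arr[0]; // assumes a non-empty array
 for (int i = 1; i < arr.length; i++) {
 if (arr[i] > best) {
 best = arr[i];
 }
 }
 return best;
}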
Even in the case of parallelisation, this cannot be done in less than O(N), because Big-O notation doesn't care how many CPUs one has or what the frequency of each CPU is. It is abstracted from these specifics to give a crude estimate of the problem.
Parallelisation can be neglected here because the time spent dividing the job can be considered equal to the time of sequential execution, since constants are disregarded. The following are all the same:
O(N) = O(Const * N) = O(N / Const) = O(N + Const) = O(N - Const)
On the other hand, in practice, divide-and-conquer parallel algorithms can give you some performance benefit, so the code may run a little faster. Fortunately, Big-O doesn't deal with this fine-grained level of analysis.
-
Big-O analysis certainly deals with divide-and-conquer complexity. And if your parallelism is n/2 comparisons simultaneously, then it reduces to O(log n) complexity. The fine-grained points are totally irrelevant if n is large enough and you actually had such an absurdly large number of processors; you would get practical time savings. Dividing the job is potentially a one-time task, while the computation could be repeated many times. It's not particularly realistic to have huge numbers of dedicated processors waiting to solely do this task, but it's theoretically and practically possible. – Gregory Morse, Feb 15, 2021 at 1:51
-
@GregoryMorse assuming it is possible to have an N-core supercomputer, one can assume it is possible to divide the task between the cores, so that each core executes some computation task. That would make this an O(1) task, only if we didn't then have to conquer, i.e. compare the computed values against each other to figure out the largest. Comparison is an O(1) operation, but it needs to be repeated N times, as there is no way to not-compare something and still get a max value. Thus O(log n) is impossible for this task, irrespective of the number of cores. – oleksii, Feb 15, 2021 at 15:54
-
This is incorrect; the conquer part is O(log n) steps. The tournament-winner style algorithm means each step needs half as many comparisons. So first compare n/2 in O(1), then n/4 in O(1), ... until 1 remains: the max. It's the archetypal log n algorithm. – Gregory Morse, Feb 16, 2021 at 16:14
-
Actually the tournament algorithm applies EXACTLY to unsorted arrays. [4, 1, 0, 22, 7, 3, 5] -> [4, 22, 7, 5] -> [22, 7] -> [22]. It doesn't matter where the max is: [22, 4, 1, 0, 7, 3, 5] -> [22, 1, 7, 5] -> [22, 7] -> [22], and [4, 1, 0, 7, 3, 5, 22] -> [4, 7, 5, 22] -> [7, 22] -> [22]. I suggest taking a formal algorithms course; it is where I learned about this. The odd number at any stage is a "bye" to the next round of the tournament, just a ceiling function ultimately in the recurrence relation. Remember we are talking about parallelism here! Of course this is right for the non-parallel ... – Gregory Morse, Feb 22, 2021 at 18:14
-
@GregoryMorse One cannot simply use tournament selection here, as one needs to build the tournament first, and building it is O(n). I am happy to leave it here: you think you are right, so be it. – oleksii, Feb 22, 2021 at 20:46
No. You will have to iterate through the array at least once.
No. It's O(n). In the worst case all members of the array have to be visited and compared.
Of course NOT. Suppose there's still an element which you haven't compared with any other element; then there is no guarantee that this element is not the maximum.
Now suppose that your comparison graph (vertices for elements, edges for comparisons) has more than one component. In this case you must add an edge (in the best case, between the maxima of two components), since a connected graph on n vertices needs at least n-1 edges. So we can see that at least n-1 comparisons MUST be done.
O(log n) implies you won't even have to read the whole array, as that would be O(n); that's not really possible for an unsorted array, since you can't be assured an element is the maximum without comparing it to all other elements. O(n) is the best you can have to get the absolute maximum, traversing the array only once. If you only want an approximation, you can randomly pick elements and take the maximum of those, which reads fewer than n elements. Still, O(log n) is just not possible for an unsorted array.
There is an algorithm better than O(N):
You just pick a random element from array and assume it's the largest.
No, seriously: if you have an array of normally distributed numbers, and you need not the largest number but some number close to the largest, then you can, say, make N/10 random picks and choose the largest of those. For a normal distribution the chances of finding a number close enough to the largest are pretty high. Or you may get lucky and even find the largest, but you won't know for sure whether you found it or not.
I think this approach may be useful in some cases. For example, if you need to group your numbers into buckets but don't want to read the whole array, you can take a random 10% sample and make buckets based on the max value of that sample, plus one extra bucket for numbers above that sample's max. And that should be good enough.
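A minimal sketch of that sampling idea (the method name and the 10% ratio are illustrative assumptions, not a fixed prescription):

public static int approxMax(int[] arr) {
 // Sample ~10% of the elements at random and take their maximum.
 // The result is only an approximation: for normally distributed data it
 // tends to land close to the true maximum, but there is no guarantee.
 java.util.Random rnd = new java.util.Random();
 int samples = Math.max(1, arr.length / 10);
 int best = Integer.MIN_VALUE;
 for (int i = 0; i < samples; i++) {
 best = Math.max(best, arr[rnd.nextInt(arr.length)]);
 }
 return best;
}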
Yes, we can do that, but your array must be a mountain array (values strictly increase to a single peak and then strictly decrease). Here is an example function:
// Binary search for the peak of a mountain array: O(log n).
public int peakIndexInMountainArray(int[] arr) {
 int s = 0; // left bound of the search range
 int e = arr.length - 1; // right bound of the search range
 int mid = s + (e - s) / 2; // overflow-safe midpoint
 while (s < e) {
 if (arr[mid] < arr[mid + 1]) {
 s = mid + 1; // still ascending: the peak is to the right
 } else {
 e = mid; // descending: the peak is mid or to the left
 }
 mid = s + (e - s) / 2;
 }
 return mid; // s == e == index of the peak
}
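For example (my own test data), peakIndexInMountainArray(new int[]{1, 3, 5, 4, 2}) returns 2, the index of the peak value 5.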
This is very old, but I don't agree with the answers given. YES, it can be done, with parallel hardware, in logarithmic time.
Time complexity would be O(log(n) * log(m)), where n is the quantity of numbers to compare and m is the size of each number in bits. However, hardware size would be O(n * m).
The algorithm would be:
1. Compare numbers in pairs. Time for this is O(log(m)), and size is O(n * m), using carry look-ahead comparators.
2. Use the result of step 1 to multiplex both inputs of step 1. Time for this is O(1), and size is O(n * m).
3. Now you have an array half the initial size; go to step 1. This loop is repeated log(n) times, so total time is O(log(n) * log(m)), and total size is O(n * m).
Adding some more MUXes you can also keep track of the index of the largest number, if you need it, without increasing the complexity of the algorithm.
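For illustration, here is a sequential software sketch of that pairwise reduction (the method name is my own; in hardware all comparisons within a round happen simultaneously, which is where the log(n) round count comes from):

public static int tournamentMax(int[] arr) {
 int[] round = arr.clone(); // don't clobber the caller's array
 int len = round.length;
 while (len > 1) {
 // One "round": in hardware these comparisons run in parallel,
 // so each round costs constant comparator time and halves the array.
 for (int i = 0; i < len / 2; i++) {
 round[i] = Math.max(round[2 * i], round[2 * i + 1]);
 }
 if (len % 2 == 1) {
 round[len / 2] = round[len - 1]; // odd element gets a bye
 }
 len = (len + 1) / 2;
 }
 return round[0]; // after ceil(log2(n)) rounds, the max remains
}

For the example traced in the comments above, tournamentMax(new int[]{4, 1, 0, 22, 7, 3, 5}) goes through [4, 22, 7, 5], then [22, 7], then [22], and returns 22.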
-
It is also possible to do it in O(1), but that is not practical; hardware size would grow very fast with n and m. – alx - recommends codidact, Apr 1, 2018 at 0:16
If you use N processors, it can be done in O(log N) time. But the work complexity is still O(N).
If using N^2 processors, you can reduce the time complexity to O(1) by applying the Usain Bolt algorithm: compare all N^2 pairs simultaneously, and the element that loses no comparison is the maximum.
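Simulated sequentially, that idea looks like this sketch (my own illustration; on one CPU it is of course O(N^2) — the O(1) only holds in the CRCW PRAM model where all N^2 comparisons really run at once):

public static int allPairsMax(int[] arr) {
 // Do every pairwise comparison "at once" and mark the losers;
 // the element that is never beaten is the maximum.
 boolean[] beaten = new boolean[arr.length];
 for (int i = 0; i < arr.length; i++) {
 for (int j = 0; j < arr.length; j++) {
 if (arr[j] > arr[i]) {
 beaten[i] = true; // arr[i] lost at least one comparison
 }
 }
 }
 for (int i = 0; i < arr.length; i++) {
 if (!beaten[i]) {
 return arr[i]; // the unbeaten element is the maximum
 }
 }
 throw new IllegalArgumentException("empty array");
}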
I think using a segment tree could be helpful; you could achieve O(log N) cost per max query, after an O(N) build.
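A minimal range-maximum segment tree sketch (my own illustration, not from the answer): since the build already reads every element, it is O(N), so this only pays off when you run many max queries afterwards, each in O(log N).

public class MaxSegmentTree {
 private final int n;
 private final int[] tree;

 // Build: O(N). Leaves hold the array; each parent holds the max
 // of its two children.
 public MaxSegmentTree(int[] arr) {
 n = arr.length;
 tree = new int[2 * n];
 System.arraycopy(arr, 0, tree, n, n);
 for (int i = n - 1; i >= 1; i--) {
 tree[i] = Math.max(tree[2 * i], tree[2 * i + 1]);
 }
 }

 // Maximum over the half-open range [l, r): O(log N).
 public int queryMax(int l, int r) {
 int best = Integer.MIN_VALUE;
 for (l += n, r += n; l < r; l /= 2, r /= 2) {
 if ((l & 1) == 1) best = Math.max(best, tree[l++]);
 if ((r & 1) == 1) best = Math.max(best, tree[--r]);
 }
 return best;
 }
}

new MaxSegmentTree(arr).queryMax(0, arr.length) then returns the overall maximum.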
-
... k > n. (1) Treat k as some constant and thus ignore it. (2) Use the k notation explicitly. I have never seen any book/article analyzing an algorithm assuming the number of cores k = f(n) for some function f (besides constant, of course). If someone did, please reference me to this source and I'll revert my comment.
-
... O(n), so you end up with O(n log n). Brent's theorem may help with some algorithm cascading here (the proof is nontrivial), but maybe I've misunderstood the concept. See uni-graz.at/~haasegu/Lectures/GPU_CUDA/Lit/reduction.pdf, slide 30.