Median in unsorted array in O(n)

In this above link, median in unsorted array is given. please anyone can tell me how is it O(n) solution.
On the other hand it looks like O(n*n) in worst case.
Even if we use randomise fun still then we have O(n*logn) in worst case???

1 Like

Let see the perfect case of this algorithm. Let say that we do logN steps until we find the k-th element in the sorted array. For eatch step we have to itterate form the left to the right bound of some interval [l, r] to rearange the elements. But the interval length is twice smaller in eatch step so the total number of itterations will be N+N/2+N/4+N/8+…+1 that is something like number of nodes in a segment tree over an array with N elements so it is 2*N-1, so the complexity of the algorithm is O ( N ). Yes you are right that it is O (n^2) in wrost case, but if you use randomise you’ll have O ( N ) time just as described above.

For partitioning the array ,the rough median of an unsorted array can be found in O(N) using the median of 5’s algorithm. It involves splitting the array into groups of 5 and finding the median of this group of 5 elements. The median of 5 elements can be found in at most 6 comparisons(How? Think!). Once you find the medians of all the groups, you again recursively divide these medians into groups of 5 and keep finding the medians until you reach the exact median. You can prove that this method is O(N) though the constant factor is too large for practical implementation. I think this algorithm is coded in the second answer on the link given above.You can look at Wikipedia page for details.

Now coming to the randomized implementation ,it has a worst case time of O(N^2) but on an average it runs in O(N). When you partition the array and recursively apply the algorithm, the equation for this will be something like T(N)=T(N/2) + O(N) assuming you split the array in half every time. T(N) = T(N/2)+ c * N where c is a constant. Therefore T(N)= c * N + c * (N/2) + c * (N/4)+ … = c * N * (1+ 1/2 + 1/4 + 1/8 + …) = c * N * 2 which is O(N).

The main argument that you must have against this proof is that the array will not be partitioned into equal halves every time as i have taken here. However if you assume that you are eliminating around 30% of the array every time instead of 50% above (i.e. 3:7 split instead of 5:5) ,you can prove that the algorithm will still be O(N) by using the recursive equation T(N) = T(7 * N /10) + O(N). (7/10 because i removed 30% and recursed on the remaining 70%).There are other better proofs for this but i think this will give you a rough idea about why we say it is O(N) average.

//