LEBOBBLE - Editorial






Probability, Expectation, Binary Index Tree/Segment Tree


You’re given an array of integers B. An array is created out of this array as following : A[i] is equal to B[i] + d with probability P[i], and it is equal to B[i] with probability 1-P[i]. Your task is to find out expected number of swaps bubble sort will make on array A.


Number of swaps made by bubble sort on sequence is number of inversions - so we’re to find the expected number of inversions. By linearity of expectation, we have to find out sum of probabilities of (i,j) forming an inversion for all i, j. This can be done using a data-structure like Binary Index Tree/Segment tree in O(N logN) time.


Let’s try to understand how many swaps would Bubble sort make on a general
sequence. In ith pass, ith largest element of the sequence goes in the ith last position and order of all other elements doesn’t change. Let’s focus on the largest element for now. Once the largest element reaches end of the array, it is not
involved in any other swaps later. If this element was at ith position initially
in how many swaps was it involved? It is easy to see that this number is N-i-1.
By same logic for other elements, for each element, number of swaps it is involved in is equal to the number of places it is left to its position in the sorted array. This is nothing but
number of inversions this element creates and so total number of swaps is equal
to total number of inversions in array.

More formally number of inversions is the number of pairs (i,j) where A[i] > A[j]
and i < j. So essentially we’ve to calculate the expected number of pairs (i,j)
with i < j and A[i] > A[j] in the array A. Now let’s see how to calculate expectation E :

Let’s define an indicator variable X(i,j) = 1 if i and j create an inversion
else 0.
So it is clear from definition that E = Expectation of ( Sum over all i and j where i < j { X(i, j) } )
Using linearity of expectation : E = Sum over all pairs i and j where i < j { Expectation of ( X(i, j) }

Now what is Expectation of X(i, j) (denoted as E(i,j) from now on) ?

E(i, j) = Prob (X(i, j) = 1) * 1 + Prob( X(i, j) = 0) * 0 which is equal to Prob (X(i, j) = 1) . i.e the probability that ith element and jth element form an inversion.

Approach 1:
Move over i from 1 to N, move j from i+1 to N. There are two possibilities for A[i] : either B[i] or B[i] + d. Similarly there are two possibilities for jth element and so total there are 4 possibilities for the (A[i], A[j]) pair. For each possibility we know the probability of its occurence (product of individual probabilities) and we also know if A[i] > A[j] or not. So we can find E(i,j) in constant time. This approach is O(N2) and is probably slow for the given constraints.

Approach 2:
We need to speed up our algorithm to O(N log N) or faster. There is indeed a way out - by using segment tree or binary indexed tree. Let’s try to understand the high level idea. Say you’ve processed first i elements of array and in the process have also constructed all possible numbers that could belong to A[1…i] each with probability of its occurrence as well. Now we see x = B[i+1]. With probability P[i+1], A[i+1] would be x + d. So we find out sum of probabilities of occurrence of all those elements which are greater than x + d and are generated in first i elements of A. We multiply the probability by P[i+1] and add that to the answer. Similarly with probability (1-P[i]), A[i+1] would be B[i+1]. We can again find out sum of probabilites of numbers generated by first i elements greater than B[i], multiply it by (1-P[i]) and add it to answer.

Pseudo code of the algorithm:

ans = 0
S = {}   // S is map of integers to probabilities

for i = 1 to N
  ans += P[i] * ( query(S, B[i] + d) )  // query(S, x) gives : sum over y > x { S[y] }
  ans += (1- P[i]) * query( S, B[i])
  S[B[i] + d] += P[i]
  S[B[i]] += 1 - P[i]

A binary index tree or segment tree could be used after cordinate compression for this algorithm.
Refer to setter/tester code for a better understanding.

Note : Beware of precision issues. One could completely avoid usage of floating point numbers by interpreting probabilities as integers and later dividing by 10000.


Can be found here.


Can be found here.


Simple mergesort “inversion count” technique can also be used here coupled with binary search.
Complexity: O(n*(logn)^2)


TESTER’S SOLUTION and SETTER’S SOLUTION is absolutly same.is this kind of error or there is only one solution?


Could anyone explain the inversion count / merge sort method. I was reading this one :


I can’t wrap my head around the use of l_sum and the temp arrays.

1 Like

suppose at any point of time you have left array -> {5,6} and right array -> {1,7}. Lets call element from left as “e1” and from right array as “e2”. Now there are four cases for a pair (e1,e2) to be considered.

  1. (e1,e2)
  2. (e1+d,e2)
  3. (e1,e2+d)
  4. (e1+d,e2+d)

Find the probability for each of these cases to be an inversion (and thus causing a swap). Ex: for 1) You don’t need to consider each pair separately, rather for each e2, use binary search on left array to find its correct sorted order position and see how many elements e1 now form inversion pair with e2.
Extend it for other cases :slight_smile:

1 Like

way with Merge Sort -

No: of Inversion in an array = No: of Swaps in Bubble Sort(give it a thought).

lets take the second sample test case given

4 7
5 6 1 7
25 74 47 99

now we have to find the expected value. expected value means E[x]=p1x1 +p2x2+…
where x1,x2,… are the values that random variable x can take and p1,p2,… are the probability values that X can take values from x1,x2,…
so here what i did… instead of counting inversion in each array possible and multiplying by probability of happening of that array…i counted probability of inversion in array

i took another array C,size of C is double that of A or B. it have values a,a+d for all “a” belongs to array B that is given as goes by…

here it have C:{ 5,12,6,13,1,8,7,14 }

I also saved values of probability of happening of each element in another array or u may use STL pair(C++) here. so array prob: {75,25,26,74,53,47,1,99}

now using the approach of counting inversion through Merge sort… we pass both these arrays in merge sort function… here what we are doing is we are assigning weight to every element in array C of which inversion we are going to talk about and calculating probability of each inversion
so divide the array into 2 parts.

Left Part- 5 12 6 13
. only inversion here is 12,6. so we multiply probability values of the element in this pair…and saved it in ANS.

Right Part- 1,8,7,14. only inversion here is 8,7. so we multiply probability values of the element in this pair too…and add to ANS…

if u can recall method of merge sort… we also sort each half every time it get dealt with…
(Make sure that. since value of array C changes place (or it get sorts ).so, its corresponding value in prob array must be gets changed to its new place)

so now is the step of merging..
now array is 5,6,12,13,1,7,8,14.

we put left index(i) on 5… i.e value of i=0. similarly right index(j) is on 1. i.e value of j=4.
mid =(0+7)/2 = 3.


while(i<=mid-1 && j<=7)
,whenever C[i]>C[j]… means this is an inversion… since left part of this array is sorted so… all element in left from i to mid-1 also form an inversion… so add all the product of probability to ANS.
. but remember to sort after this merge step…(that’s what “temp is used for”).

L_sum or Left_sum is used to store the sum all product of probabilities before the step of merge.(So that we don’t have to sum them at processing time in while loop since, that can increase complexity).

Lastly, the answer will be in ANS… for this case 1.6049 :slight_smile:

@rohanag- i hope i have cleared your doubt :slight_smile:


you have, thanks @shashank_jain ! :slight_smile:

@shashank_jain : +1 for such a nice Explanation.

1 Like

What a clever approach. Hats off! Thank you.

1 Like

I’ll repeat here what I wrote under the practice problem statement as it is more likely to be seen. For some reason, the problem statement for both the practice problem and the actual competition problem is just the letter “p”, so I’m wondering if it’s possible to restore the problem statement :). Thanks!

The original problem statement was:

Little Elephant loves bubble sorting.
Bubble sorting for any array A of n integers works in the following way:
var int i, j; 
for i from n downto 1 
for j from 1 to i-1 
if (A[j] > A[j+1]) 
swap(A[j], A[j+1]) 
You are given an array B of n integers. Then the array A is created using array B as following : for each i (1 <= i <= n), we set Ai = Bi + d with the probability Pi, otherwise Ai = Bi.
Help Little Elephant to find the expect number of swaps that bubble sorting will make when the array A is sorted with the above bubble sorting algorithm.

First line of the input contains single integer T - the number of test cases. T test cases follows. First line of each test case contains pair of integers n and d. Next line of each test case contains n integers - array B. Next line contains n integers - array P.

In T lines print T real numbers - the answers for the corresponding test case. Please round all numbers to exactly 4 digits after decimal point.

1 <= T <= 5
1 <= n <= 50000
1 <= Bi, d <= 10^9
0 <= Pi, <= 100

2 7
4 7
50 50
4 7
5 6 1 7
25 74 47 99

1 Like

Thank you!