PROBLEM LINK:
Author: Sergey Nagin
Tester: Shiplu Hawlader
Editorialist: Praveen Dhinwa
DIFFICULTY:
HARD
PREREQUISITES:
combinatorics, binary indexed trees, sqrt decomposition.
PROBLEM:
You are given n points. Color of i^th point is color[i]. Each pair of points with same color is joined by an arc of that color. You need to
find total number of intersections between the different colored arcs.
EXPLANATION
Let c[x] denote the number of occurrences of color x in the array color.
Suppose you calculate total number of pair of arcs, call it V1.
You also calculate total number of intersections between pair of arcs with same colored arcs, call it V2.
Calculating V2
For calculating V2, we will iterate over each color and we will try to find number of intersection for each color.
Suppose the current color is X. Let us say that we have Y points having color equal to X. Then we can observe that total number of intersections
will be C(n, 4), where C represents binomial coefficient.
Proof:
You can write a brute force over small value of n and searching the sequence on OEIS will yield you the required formula.
Though the reasoning behind the formula is not difficult too. An Intersection of two arcs corresponds to selection of
4 points i, j, k, l such that k lies between i and j exclusive (i < j and k > i and k < j) and l > j. So we will select 4 different indices in from 1 to n and the
indices will correspond to i, j, k, l when they are sorted in increasing order. So total number of ways of selecting 4 indices/ values from 1 to n is C(n, 4).
Calculating V1
The pair of arcs can be in these 3 possible ways.
- ABBA. Here the arc going from A to A is including the arc from B to B. In other words, the arcs going from B to B is being included
by the arc going from A to A. Call such number of pair of arcs S1. - AABB. Here the arc going from A to A is completely separate (excluded arcs) from the arc from B to B. Call its value S2.
- ABAB. Here the arc going from A to A is intersecting with the arc from B to B. Call its value S3.
Total sum of the 3 values will be V1. ie V1 = S1 + S2 + S3.
We can easily find V1 as V1 is the total number of pairs of arcs between different colors.
For two fixed pair of colors x, y total number of pair of arcs such that one of the arc has color x and other y will be
(c[x] * (c[x] - 1) / 2 * c[y] * (c[y] - 1) / 2).
So V1 will be sum(c[x] * (c[x] - 1) / 2 * c[y] * (c[y] - 1) / 2, 1 <= x, y <= N && x < y).
The sum could be easily found by maintaining prefix sum of c[y] * (c[y] - 1) / 2.
Please note that in any of pseudo codes, no modulo operation is shown, You should be careful about modulo operation while writing your code.
Pseudo code
// Calculation of V1.
long long V1 = 0, prefSum = 0;
for (int Num = 1; Num <= 100000; Num++){
Ans += prefSum * (c[Num] * (c[Num] - 1) / 2);
// now update prefSum.
prefSum += c[Num] * (c[Num] - 1) / 2;
}
Note that answer of our problem is S3. As we have already found V1, we simply need to find S1 and S2, Answer of our problem
will be S3 = V1 - S1 - S2.
Calculating S1
Recall definition of S1.
In case of ABBA, Here the arc going from A to A is including the arc from B to B. In other words, the arcs going from B to B is being included
by the arc going from A to A. Call such number of pair of arcs S1.
Assume that we fix a color x such that c[x] > sqrt(n). We can consider this number equal to 300 (As 300 * 300 = 9 * 10^4) throughout our analysis. So fix
color x such that c[x] > 300.
When c[x] > 300 and c[y] > 300
So we will count number of arcs which are included by our color x.
Now let us fix some color y. Let us say that we have some places p[1], p[2], , p[L] such that col[p[i]] = y.
We can easily know number of colors between p[1] and p[2], between p[2], p[3] etc. For that we can maintain a list of positions for each color where
the color occurs, then by doing a binary search, we can easily find the number of colors between any range [L, R].
Now we need to find number of colors y that include our color x. So we will mark all the points like blocks. e.g. between p[1] and p[2], we will
call it first block, between p[2] and p[3] second and so on.
So if we have two points in block l and r, number of ways will be l * (L-r).
but as the number of pairs (l, r) is huge, we can not calculate this number directly, we need to find some way to calculate this quicker.
if l = r, then it means that we are in the same block numbered l. We can simply use formula Q[l] * (Q[l] - 1) / 2 where Q[l] denotes number
of points x in the block l as it is simply the number of pairs in then block l.
When l < r, our formula becomes Q[l] * Q[r].
Summing over each pairs, we get the entire formula: sum(Q[l] * Q[r] * l * (L - r) s.t. l < r) + sum(Q[l] * (Q[l] - 1) * l * (L - r)).
We can compute the first part of the formula in linear time. We can rewrite the expression as sum(Q[r] * (L - r) * sum(Q[l] * l s.t. l < r)).
The inner summation sum(Q[l] * l s.t. l < r) can be computed using prefix sum.
When c[x] > 300 and c[y] <= 300
Now consider the case when c[x] > 300 and c[y] <= 300. We will try all such x. Let us count D[i] denoting number of points with color x in interval
1 to i inclusive ([1, i]). Number of intervals containing it are (D[infinity] - D[r]) * D[l] where l and r are its sides.
We can calculate these values similar to above part easily.
Pseudo code
// Calculation of S1 when c[x] > 300, c[y] <= 300.
long long S1 = 0;
for (int x = 1; x <= 100000; x++)
if (c[x] > 300){
prefSum[0] = 0;
for (int i = 1; i <= n; i++){
prefSum[i] = prefSum[i - 1];
if (a[i] == x) prefSum[i]++;
}
for (int y = 1; y <= 100000; y++){
if (c[y] <= 300){
long long Ssum = 0;
// Note that array positions[i] is same as described above.
for (int i = 0; i < positions[y].size(); i++){
Ans = (Ans + Ssum * (prefSum[n] - prefSum[positions[y][i]]));
Ssum = (Ssum + prefSum[positions[y][i]]);
}
}
}
}
When c[x] and c[y] both are <= 300.
We can simply use scan line algorithm to find the value.
In the following code, at i^th iteration, we are counting number of included pairs of arcs in which one of the pairs has its left end-point at the current
index. We will use binary indexed tree to query and update the value for the current index.
Pseudo code
long long S1 = 0;
for (int i = 1; i <= n; i++)
if (c[a[i]] <= 300) {
int j = positions[a[i]].size() - 1;
int intervals = 0;
while (positions[a[i]][j] > i){
intervals++;
S1 += bit_query(positions[a[i]][j]);
bit_update(positions[a[i]][j], -1);
j--;
}
bit_update(i, intervals);
j--;
intervals = 0;
while (j >= 0){
intervals++;
bit_update(positions[a[i]][j], -1);
j--;
}
bit_update(i, intervals);
}
// now we will remove pairs of kind AAAA.
for (int i = 1; i <= n; i++)
if (c[a[i]] <= 300)
S1 -= nCr(c[a[i]], 4);
Calculating S2
Now we need to count pairs of arcs of kind AABB. Note that we can do this by using a simple dynamic programming algorithm.
Let us define dp[i] = number of pairs of excluded arcs such that right arc has its right end point at the current index i.
Let us assume that we are currently as position i, let us say color at position i is x. Suppose j 1 , j 2 , j k
be the positions < i where color is same as x. So we can chose one of these indices as left point of current/right arc and our index i will act as
right end point of current/right arc.
So dp[i] = (number of arcs from 1 to j 1 - 1) + (number of arcs from 1 to j 2 - 1) + … +
(number of arcs from 1 to j k - 1).
Now only thing that we need to find is how to calculate number of arcs from 1 to j.
This part is very easy and can be done by a simple dp.
DP[i] = DP[i - 1] + cnt of number of indices j such that a[j] = a[i] (both i and j have same color).
So dp[i] = DP[j 1 - 1] + DP[j 2 - 1] + . . + DP[j k - 1].
Now we can use binary indexed tree to compute this value faster. Overall complexity of this computation will be O(n log n).
Using previous formulas, we can easily compute S3 which will be our answer of the problem.
Complexity
The complexity of the algorithm is N * sqrt(N) or 300 * N (as we have fixed value 300 instead of sqrt(N)), which will easily pass in the time limit.
Note
If some of the sections of the editorial is not clear to you, please ask for more explanation, I will add it on tomorrow (tuesday).