SEAARC - Editorial

PROBLEM LINK:

Practice
Contest

Author: Sergey Nagin
Tester: Shiplu Hawlader
Editorialist: Praveen Dhinwa

DIFFICULTY:

HARD

PREREQUISITES:

combinatorics, binary indexed trees, sqrt decomposition.

PROBLEM:

You are given n points. Color of i^th point is color[i]. Each pair of points with same color is joined by an arc of that color. You need to
find total number of intersections between the different colored arcs.

EXPLANATION

Let c[x] denote the number of occurrences of color x in the array color.

Suppose you calculate total number of pair of arcs, call it V1.
You also calculate total number of intersections between pair of arcs with same colored arcs, call it V2.

Calculating V2

For calculating V2, we will iterate over each color and we will try to find number of intersection for each color.
Suppose the current color is X. Let us say that we have Y points having color equal to X. Then we can observe that total number of intersections
will be C(n, 4), where C represents binomial coefficient.

Proof:
You can write a brute force over small value of n and searching the sequence on OEIS will yield you the required formula.

Though the reasoning behind the formula is not difficult too. An Intersection of two arcs corresponds to selection of
4 points i, j, k, l such that k lies between i and j exclusive (i < j and k > i and k < j) and l > j. So we will select 4 different indices in from 1 to n and the
indices will correspond to i, j, k, l when they are sorted in increasing order. So total number of ways of selecting 4 indices/ values from 1 to n is C(n, 4).


Calculating V1

The pair of arcs can be in these 3 possible ways.

  • ABBA. Here the arc going from A to A is including the arc from B to B. In other words, the arcs going from B to B is being included
    by the arc going from A to A. Call such number of pair of arcs S1.
  • AABB. Here the arc going from A to A is completely separate (excluded arcs) from the arc from B to B. Call its value S2.
  • ABAB. Here the arc going from A to A is intersecting with the arc from B to B. Call its value S3.

Total sum of the 3 values will be V1. ie V1 = S1 + S2 + S3.

We can easily find V1 as V1 is the total number of pairs of arcs between different colors.
For two fixed pair of colors x, y total number of pair of arcs such that one of the arc has color x and other y will be
(c[x] * (c[x] - 1) / 2 * c[y] * (c[y] - 1) / 2).

So V1 will be sum(c[x] * (c[x] - 1) / 2 * c[y] * (c[y] - 1) / 2, 1 <= x, y <= N && x < y).

The sum could be easily found by maintaining prefix sum of c[y] * (c[y] - 1) / 2.

Please note that in any of pseudo codes, no modulo operation is shown, You should be careful about modulo operation while writing your code.

Pseudo code

// Calculation of V1.
long long V1 = 0, prefSum = 0;
for (int Num = 1; Num <= 100000; Num++){
    Ans += prefSum * (c[Num] * (c[Num] - 1) / 2);
	// now update prefSum.
    prefSum += c[Num] * (c[Num] - 1) / 2;
}

Note that answer of our problem is S3. As we have already found V1, we simply need to find S1 and S2, Answer of our problem
will be S3 = V1 - S1 - S2.


Calculating S1

Recall definition of S1.
In case of ABBA, Here the arc going from A to A is including the arc from B to B. In other words, the arcs going from B to B is being included
by the arc going from A to A. Call such number of pair of arcs S1.

Assume that we fix a color x such that c[x] > sqrt(n). We can consider this number equal to 300 (As 300 * 300 = 9 * 10^4) throughout our analysis. So fix
color x such that c[x] > 300.


When c[x] > 300 and c[y] > 300
So we will count number of arcs which are included by our color x.

Now let us fix some color y. Let us say that we have some places p[1], p[2], , p[L] such that col[p[i]] = y.
We can easily know number of colors between p[1] and p[2], between p[2], p[3] etc. For that we can maintain a list of positions for each color where
the color occurs, then by doing a binary search, we can easily find the number of colors between any range [L, R].

Now we need to find number of colors y that include our color x. So we will mark all the points like blocks. e.g. between p[1] and p[2], we will
call it first block, between p[2] and p[3] second and so on.
So if we have two points in block l and r, number of ways will be l * (L-r).
but as the number of pairs (l, r) is huge, we can not calculate this number directly, we need to find some way to calculate this quicker.

if l = r, then it means that we are in the same block numbered l. We can simply use formula Q[l] * (Q[l] - 1) / 2 where Q[l] denotes number
of points x in the block l as it is simply the number of pairs in then block l.

When l < r, our formula becomes Q[l] * Q[r].

Summing over each pairs, we get the entire formula: sum(Q[l] * Q[r] * l * (L - r) s.t. l < r) + sum(Q[l] * (Q[l] - 1) * l * (L - r)).

We can compute the first part of the formula in linear time. We can rewrite the expression as sum(Q[r] * (L - r) * sum(Q[l] * l s.t. l < r)).
The inner summation sum(Q[l] * l s.t. l < r) can be computed using prefix sum.

When c[x] > 300 and c[y] <= 300

Now consider the case when c[x] > 300 and c[y] <= 300. We will try all such x. Let us count D[i] denoting number of points with color x in interval
1 to i inclusive ([1, i]). Number of intervals containing it are (D[infinity] - D[r]) * D[l] where l and r are its sides.

We can calculate these values similar to above part easily.

Pseudo code

// Calculation of S1 when c[x] > 300, c[y] <= 300.
long long S1 = 0;
for (int x = 1; x <= 100000; x++)
    if (c[x] > 300){
       prefSum[0] = 0;
       for (int i = 1; i <= n; i++){
           prefSum[i] = prefSum[i - 1];
           if (a[i] == x) prefSum[i]++;
       }

       for (int y = 1; y <= 100000; y++){
           if (c[y] <= 300){
				long long Ssum = 0;
				// Note that array positions[i] is same as described above.
				for (int i = 0; i < positions[y].size(); i++){
					Ans = (Ans + Ssum * (prefSum[n] - prefSum[positions[y][i]]));
					Ssum = (Ssum + prefSum[positions[y][i]]);
				}
           }
       }
    }

When c[x] and c[y] both are <= 300.
We can simply use scan line algorithm to find the value.
In the following code, at i^th iteration, we are counting number of included pairs of arcs in which one of the pairs has its left end-point at the current
index. We will use binary indexed tree to query and update the value for the current index.

Pseudo code

long long S1 = 0;
for (int i = 1; i <= n; i++)
    if (c[a[i]] <= 300) {
		int j = positions[a[i]].size() - 1;
		int intervals = 0;
		while (positions[a[i]][j] > i){
			 intervals++;
			 S1 += bit_query(positions[a[i]][j]);
			 bit_update(positions[a[i]][j], -1);
			 j--;
		}
		bit_update(i, intervals);

		j--;
		intervals = 0;
		while (j >= 0){
			 intervals++;
			 bit_update(positions[a[i]][j], -1);
			 j--;
		}
		bit_update(i, intervals);
    }

// now we will remove pairs of kind AAAA.
for (int i = 1; i <= n; i++)
	if (c[a[i]] <= 300) 
		S1 -= nCr(c[a[i]], 4);

Calculating S2

Now we need to count pairs of arcs of kind AABB. Note that we can do this by using a simple dynamic programming algorithm.

Let us define dp[i] = number of pairs of excluded arcs such that right arc has its right end point at the current index i.

Let us assume that we are currently as position i, let us say color at position i is x. Suppose j 1 , j 2 , j k
be the positions < i where color is same as x. So we can chose one of these indices as left point of current/right arc and our index i will act as
right end point of current/right arc.

So dp[i] = (number of arcs from 1 to j 1 - 1) + (number of arcs from 1 to j 2 - 1) + … +
(number of arcs from 1 to j k - 1).

Now only thing that we need to find is how to calculate number of arcs from 1 to j.
This part is very easy and can be done by a simple dp.
DP[i] = DP[i - 1] + cnt of number of indices j such that a[j] = a[i] (both i and j have same color).

So dp[i] = DP[j 1 - 1] + DP[j 2 - 1] + . . + DP[j k - 1].
Now we can use binary indexed tree to compute this value faster. Overall complexity of this computation will be O(n log n).


Using previous formulas, we can easily compute S3 which will be our answer of the problem.

Complexity
The complexity of the algorithm is N * sqrt(N) or 300 * N (as we have fixed value 300 instead of sqrt(N)), which will easily pass in the time limit.

Note
If some of the sections of the editorial is not clear to you, please ask for more explanation, I will add it on tomorrow (tuesday).

AUTHOR’S SOLUTION:

Author’s solution
Tester’s solution

5 Likes

Wow, fast editorial!

Look at my O(N^2),

my O(N^2) is amazing…

Let’s just calculate the number of pairs of (strictly) intersecting arcs, together with identical colours. We can split the colours into f “common” ones, with > K occurences each, and “rare” ones, with <= K occurences each and forming a arcs in total. Let’s consider intersecting arcs (u,v) and (x,y) such that u < x < v < y.

For pairs of (rare,rare)-colored arcs: move v from left to right, and keep a BIT X[] such that X[x] denotes the number of points y such that y > v and A[x] = A[y]. It’s clear that the number of arcs intersecting (u,v) is X[u+1…v-1]; we can afford to try all such pairs and calculate the answer for them. We also need to update the BIT; notice that before incrementing v, we need to decrement X[u] of all u < v such that A[u] = A[v]; that can also be done in a straightforward way. Time complexity: O(a log N).

For pairs of (common,common)-colored arcs, we’ll just iterate v from left to right and store V[a][b]: the number of pairs (i,j) such that i < j < v, A[i] = a, A[j] = b. Also, we’ll have arrays L[], R[], where L[c], R[c] denote the number of y > v such that A[y] = c (c common). The number of intersecting pairs of arcs with given v is now the sum of R[c]*V[A[v]][c] over all common c. Updating L[], R[] is easy; when incrementing v, V[c][A[v]] should be increased by L[c] (for all c). This works in O(Nf) time - we do O(f) operations for each v.

Now, the interesting part comes with pairs of (common,rare)-colored arcs; (rare,common) can be solved in the same way by reversing A[]. Let’s do exactly the same thing as we did with (rare,rare)s, but without instant updating or a BIT - instead, we’ll just have an array Y[]. Instead, we’ll keep and update the previously mentioned array R[], and a list of common colours that have changed since the last update (which happened when v = v0). If v > v0+L or at least K common colours have changed, then we’ll update Y: if A[x < v] was common, then Y[x] = R[A[x]] (in all other cases, it’s 0), and afterwards, we’ll convert the array into prefix sums.

The number of arcs intersecting (u,v) is now Y[v-1]-Y[u], but with a correction: plus R[A[max(v0,u+1)…v-1]] (arcs starting after the last update) minus the number of arcs of common colour starting in [u+1…v0-1] and ending in [max(v0,u+1)…v-1] (the ones that were counted in Y[v-1]-Y[u] but aren’t actually intersecting (u,v)).

Let’s add another array now: prefix sums for all common colours: S[c][i]-S[c][j-1] is the number of times colour c appears in A[j…i].

All pairs (u,v) with u >= v0, can be handled separately in O(LN) time - by iterating i from v to v0 and calculating the number of arcs that start between i and v and go over v (by adding R[A[i]] to that number). Then, we have some more (p) arcs with given v, and for each of them, we just add the same number - sum of R[A[v0…v-1]]; this is also O(LN) overall.

Here comes the asymptotically slow part. For each arc (u,v) (u < v0) and each common colour c that’s changed since the last update, subtract (S[v]-S[v0])*(S[v0-1]-S[u]) from the answer. This is the number of common-colored arc endpoints in [v0+1,v] times the number in [u+1,v0-1], which is the second mentioned correction from Y[v-1]-Y[u]. It works in O(aK).

Asymptotically, a = O(NK) and f=O(N/K), plus there are O(N/L) updates in O(N) each in the last part, so we get an algorithm that works in O(NK log N + N^2/K + NK^2 + LN + N^2/L). Okay, this isn’t exactly quadratic, but close - for optimal K = O(N^(1/3)) and L=N^(1/2)), it’s O(N^(5/3)) with an awful constant. But asymptotics isn’t all IRL.

My optimal set of constants was: K=55, L=400 - since the (common,common) part of the algorithm is the fastest and the (common,rare) is the slowest, we’re trying to make f large and K,a small, so K should be much less than sqrt(N) for optimal performance. (It turns out that f is up to around 800 in this case, and it’s reasonable to expect f a bit smaller than worst-case. The most significant bit, to be precise :D) Also, we can afford larger L.

Still, any K from 20 to 80 passes comfortably. In that sense, the test data are very weak - for K = 20, there could be 5000 common colours, which would cause my algorithm huge memory (and time) issues, but nothing like that happened. It’s clear that there were no tests in which many colours would be “moderately common” - with around 50 occurences, for example.

It’s funny how I solve Codechef problems. “Expected O(N log N)? I’ll pass with O(N sqrt N). Expected O(N log^2 N)? I’ll pass with O(N sqrt N log N), and if the expected complexity is O(N sqrt N log N), then I’ll pass with O(N^1.67).” :smiley:

6 Likes

@xellos0, thank you for your effort, bro

Already submitted a solution, but got WRONG ANSWER. So, need help to check if my understanding of the problem is right or not. Please tell me how many pairs of arc with different color intersect in the test case below:

18

5 6 7 8 9 1 2 3 4 7 8 9 1 2 3 6 5 4

my code gives 23 arc … i also want to now that my code is correct or not.

That is not tester’s code, that is kevinsogo’s(http://www.codechef.com/viewsolution/4022433). Please put the proper tester’s code.

1 Like

Please help me to understand the problem correctly. The editorial seems to be a bit complicated for me and could not understand. As the test case submitted earlier was too long, new smaller test cases have been framed. Please tell me how many pairs of arc with different color intersect in the test cases given below:

Case A:

1 1 1 1 1

Case B:

1 1 2 1 2 1

Case C:

1 1 2 2 1 1

As per my understanding of the problem, the answers should be 0, 3 & 0 in the cases A, B & C respectively.

I have submitted kevinsogo’s code to test and editorialist posted it here, will update it soon.

Yeah, you are correct.

problem link of both practice and contest is pointing to other problem. please correctify this.

@shiplu : Thank you for your clarification. The solution to this problem was developed based on the above logic(Cases A, B & C) and a test case of 18 numbers as given below:

5 6 7 8 9 1 2 3 4 7 8 9 1 2 3 6 5 4

Got the answer (by writing down all the possible combinations) as 23.
While testing the code, got the answer as 23 again.
But on submission, i got WRONG ANSWER.
(http://www.codechef.com/viewsolution/4151170)

So, i request you to please help me to identify what could be the problem.

And when do you plan on doing that?

2 Likes