PROBLEM LINK:
Author: Jatin Yadav
Tester: Triveni Mahatha
Editorialist: Adarsh Kumar
DIFFICULTY:
Hard
PREREQUISITES:
Fast Fourier Transform, Centroid decomposition, Properties of Expectation
PROBLEM:
You are given a tree with N nodes. On each of the next N-1 days, you remove one of the remaining edges to create a forest. The strength of a forest is defined to be the sum of squares of the sizes of its connected components.
You need to compute the expected value of the strength of the resulting forest after day i for all 0 \le i < n, modulo 1000000007.
EXPLANATION:
Lets try to find the expression for E[x^2,i]. Here E[x^2,i] represent the expected value of the strength of the resulting forest after day i.
Say, after i days, there is a component of size x consisting of vertices (u_1,u_2,...,u_x). We know that, it will contribute x^2 to the answer with some probability. Now, what can be the other way to look at it?
We can decompose x^2 in the following fashion, where each f(u_k)=1:
$E[x^2,i] = E[(\sum_{k=1}^x f(u_k))^2,i]$\Rightarrow E[x^2,i] = E[\sum_{k=1}^x f(u_k)^2,i] + 2E[\sum_{a=1}^x\sum_{b=a+1}^xf(u_a)f(u_b),i]
\Rightarrow E[x^2,i] = E[x,i] + 2E[\sum_{a=1}^x\sum_{b=a+1}^xf(u_a)f(u_b),i]
\Rightarrow E[x^2,i] = n + 2E[u-v,i]
Now, the main problem reduces to finding the expected number of ordered pairs (u,v) which are connected after i days, indicated by E[u-v, i]. This can be solved using linearity of expectation. Basically, we need to find what is the probability that u-v are connected after i days? If we sum this probability for all the ordered pairs, we are done.
What is the probability that u,v are in the same connected component after i days?
For u,v to be in same component all the edges on the shortest path from u to v must not be removed in i days. Let the number of edges between u and v on their shortest path be k (i.e. distance between u and v is k). Then the probability P(u-v,i) can be computed as:
$P(u-v,i) = \frac{\binom{n-1-i}{k}(k!)((n-1-k)!)}{(n-1)!}$Notice that, above expression for probability only depends on the distance between u and v. If we can somehow compute number of ordered pairs at distance k as D[k] for all 1 \le k < n. We will take D[0]=0. Then, we can write the formula for expectation as:
$E[u-v,i] = \sum_{k=0}^{n-1-i}D[k]\frac{\binom{n-1-i}{k}(k!)((n-1-k)!)}{(n-1)!}$Counting the number of ordered pairs at distance k for all 1 \le k < n.
This is a standard problem and can be done in O(nlog^2n) with the help of centroid decomposition and FFT. Following are the step for doing this:
- Build the centroid tree of given tree.
- Traverse each node of centroid tree. When on a particular node (say P):
- You need to count distance between two nodes present in different subtrees. For doing the same:
- Build a polynomial for each of the subtree. Polynomial should contain $\sum_{\text{d $\in$ subtree}} x^{\text{distance between P and d}}$.
- Add all the polynomial created for all the subtree and square it. This will overcount those pairs which are present in same subtrees.
- Subtract the square of each of the subtree polynomial to remove overcounting.
- Still we have every pair counted two times. So divide by two, all the coefficient of polynomial.
- We now have a polynomial in which coefficient of $x^k$ represents the number of paths of length $k$ passing through this node and present in different subtrees.
- You need to count distance between P and all other nodes.
- Traverse all the subtree and update the count naively.
- You need to count distance between two nodes present in different subtrees. For doing the same:
Computing the value of E[u-v,i] modulo 1000000007 for all 0 \le i < n.
Naive computation of E[u-v,i] for all 0 \le i < n will take O(n^2) time, which is not feasible here. Lets take a closer look at the formula for E[u-v,i] now:
$E[u-v,i] = \sum_{k=0}^{n-1-i}D[k]\frac{\binom{n-1-i}{k}(k!)((n-1-k)!)}{(n-1)!}$\Rightarrow E[u-v,i] = \frac{(n-1-i)!}{(n-1)!} \sum_{k=0}^{n-1-i}D[k]\frac{(n-1-k)!}{(n-1-i-k)!}
\Rightarrow E[u-v,i] = \frac{(n-1-i)!}{(n-1)!} H(i)
where,
$H(i)=\sum_{k=0}^{n-1-i}D[k]\frac{(n-1-k)!}{(n-1-i-k)!}$\Rightarrow H(i)=\sum_{k=0}^{n-1-i}(D[k] (n-1-k)! )\left ( \frac{1}{(n-1-i-k)!} \right )
Observe that H(i) for all 0 \le i < n can now be computed from convolution of polynomial A(x) and B(x) where:
$A(x) = (D[0] (n-1)!) x^0 + (D[1] (n-2)!) x^1 + ... (D[r] (n-1-r)!) x^r + ... + (D[n-1] 0!) x^{n-1}$, andB(x) = \left(\frac{1}{0!}\right) x^0 + \left(\frac{1}{1!}\right) x^1 + ... \left(\frac{1}{r!}\right) x^r + ... + \left(\frac{1}{(n-1)!}\right) x^{n-1}
Value of H(i) is the coefficient of x^{n-1-i} in A(x)*B(x) modulo 1000000007.
Multiplication of two polynomials with arbitrary modulo.
We have to multiply two polynomials and then output result coefficients modulo M which is not necessary of the kind c · 2^k + 1. Coefficients in multiplication can be up to O(M^2n) which usually can’t be handled precisely enough by floating point types. You can have a look at this pdf, in order to solve this problem efficiently in O(nlogn).
Time Complexity:
O(nlog^2n)