CLERVT - Editorial

meooow · April 16, 2018, 12:44am

PROBLEM LINK:

Practice
Contest

Author and Editorialist: Soumik Sarkar
Tester: Sarthak Manna

DIFFICULTY:

MEDIUM

PREREQUISITES:

Euler tour, Dynamic programming

PROBLEM:

There is a tree with N vertices, and a set of vertices G. For every subset G' of G, find the maximum number of edges which can be removed while leaving each vertex in G' connected to vertex 1. Calculate the XOR-sum of all these values.

EXPLANATION:

Consider the tree to be rooted at 1. Given a set G imagine you have removed all edges are not necessary for the vertices in G to remain connected to 1. What will this graph look like?

Of course the graph will remain a connected tree. Also, all the leaves will belong to G. Some elements of G may lie in the interior of the tree as well. If asked to find the number of edges in this tree, you would say the answer is |G| - 1. However, in a roundabout way I can also claim that counting the edges on an Euler tour of this tree will give me exactly twice the answer.

Let it be that during the Euler tour I start from the root and happen to visit the vertices in G in the order G_1, G_2, G_3, ... G_K. I can say the total length of the Euler tour is dist(1, G_1) + dist(G_1, G_2) + dist(G_2, G_3) + .... + dist(G_K, 1). As mentioned before, this is exactly twice the number of edges in the tree, but the values of all these terms are the same as in the original tree. So if we know dist(G_i, G_j) for each pair (i, j) and also the “Euler tour order” of the vertices in G, we can calculate the answer in \mathcal{O}(K)!

We can get the Euler tour order with one dfs from the root. And then we can proceed to make K dfs from each vertex in G to calculate the distance matrix for each pair of vertices in G. To get the distance matrix it is also possible to use faster methods, but not necessary.

However applying this procedure in this form is too slow as it requires \mathcal{O}(K) time (K/2 on average) for each subset of G.

To optimize it, we can notice that we can use the answer for one subset to calculate that of another. Let us denote by f(G) the value dist(1, G_1) + dist(G_1, G_2) + ... dist(G_{x-1}, G_{x}) where G_1 to G_x are the elements of G in Euler tour order. Then clearly f(G) = f(G \setminus G_x) + dist(G_{x-1}, G_{x}). Now we can apply dynamic programming using bitmasks to denote subsets and calculate the answer for each subset in constant time and XOR them together.

ans = N - 1
for mask in [1..2^K]:
    x = last set bit in mask
    if x is the only set bit in mask:
        dp[mask] = dist(1, G[x])
        ans = ans XOR (N - 1 - dp[mask])
    else:
        y = second last set bit in mask
        dp[mask] = dp[mask \ y] + dist(G[y], G[x])
        ans = ans XOR ((N - 1 - (dp[mask] + dist(G[x], 1)) / 2)

Note: The solution requires finding the last (or first) 2 set bits in an integer which can be done quickly in GCC using __builtin_ctz. If you instead find the set bits by looping over each bit position it appears to take \mathcal{O}(K) time for each subset but with a little thought it can be shown that the total operations required over all subsets remains \mathcal{O}(2^K).

Total complexity is \mathcal{O}(NK + 2^K).

ALTERNATE BASIS FOR SOLUTION:

Instead of calculating the answer as half of dist(1, G_1) + dist(G_1, G_2) + ... + dist(G_K, 1) one can also claim that the answer is exactly dist(lca(1, G_1), G_1) + dist(lca(G_1, G_2), G_2) + ... + dist(lca(G_{K-1}, G_K), G_K), where lca(u, v) is the lowest common ancestor of u and v. The rest of the process can be done based on this as well.

ALTERNATE SOLUTION:

Instead of storing answers to “What is the tree size for a particular subset?” we can instead store “How many subsets have a particular tree size”? This is a better way because the number of unique sizes is bounded by N-1, whereas the number of subsets is much greater.

Let f(i, d) be the number of subsets that have G_i as last element and tree size exactly equal to d. Then for any G_j for j > i, the value of f(i, d) directly contributes to f(j, d + dist(lca(G_i, G_j), G_j)). Here dynamic programming can be used in the “forward” manner, to get a solution in \mathcal{O}(NK^2).

Solution by @uwi: 18276897
Solution by @filyan: 18287756

AUTHOR’S AND TESTER’S SOLUTION:

Author’s solution can be found here
Tester’s solution can be found here.