TESTERS - Editorial

pragrame · February 13, 2013, 10:08pm

Problem Link:

Practice

Contest

Difficulty:

Hard

Pre-requisites:

Divide And Conquer, Number Theory

Problem:

Define a beautiful number as one whose sum of divisors squared is odd. You are given a tree, in which each node has a number. A path from node X to node Y is beautiful iff the number of beautiful nodes is atleast the number of ugly (non-beautiful) nodes along the path. The length of a path is the number of nodes on the path. Find the total length of all beautiful paths over all pairs (X, Y) of nodes. Do not consider (X, Y) distinct from (Y, X) in this count.

Explanation:

Determining whether a number is beautiful or not

Let us first find out how to determine whether a number is beautiful or not. So, given the number n, what is the value of the sum of its divisors squared? Generally, most functions that are sums of functions of the divisors are arithmetic functions (i.e. f(mn) = f(m)f(n) when m and n are coprime). You could make this guess for this function too, and then prove it.

So, if n = p^k, then its divisors are 1, p, p², . . ., p^k. The sum of its divisors squared is now 1 + p² + p⁴ + … + p^2k. If p == 2, then this the sum of these numbers is definitely odd. If p was an odd prime, then the sum of these numbers is odd iff k is even.

Now, when n = ∏(p_i^a_i). The sum of squares of p_i^a_i is ∑_j=0^a_i (p_i^2j). Product of all these when expanded gives you exactly one square term for each divisor. Hence, f(n) = prod(f(p_i^a_i)).

Now, f(n) is odd iff f(p_i^a_i) is odd. This means that for every odd divisor of n, its exponent must be even for n to be beautiful.

By prime factorizing numbers upto 1.5 * 10^6, one can check in O(1) time whether a given number is beautiful or not.

Finding the total length of all beautiful paths

Now, we are left with finding the total length of all beautiful paths.

Since this is a tree, let us pick a “root” vertex R and check what is the total length of beautiful paths passing through R. We then apply Divide and Conquer: by removing R and considering the subproblems of each remaining connected component separately, and add all the answers up.

So, we have now rooted the tree at R, and each node is either beautiful or not. Let R’s children be x1, x2, …, xk.

Now, suppose you are at a node y in the tree. Ask yourself what is the sum of lengths of beautiful paths which start at y, and pass through R?

It is easy to find the beauty of the path from R to each node in the tree (inclusive of both). You are currently at y, which has a path-beauty of b(y). Let us say the child of R which is y’s ancestor is xi. You need to ask, what all nodes are there in the subtrees rooted at x1, x2, …, xi-1, whose path-beauty value is >= -b(y)+b(R). This is because, any path from y to such a node would go through R, and would have a total beauty of b(y) + b(other-node) - b(R).

What you really need is sum of the lengths, and the number of such nodes, not quite “which nodes”. Therefore, let us keep an abstract array len[] in which we wish len[b] = length of paths “seen so far” whose beauty = b. By “seen so far”, I mean that if you are node y which is a descendant of xi, then the values stored only correspond to nodes whose ancestors are x1, x2, …, xi-1. Now, you are at node y whose beauty is b(y), and you need to ask, what is sum len[j] for j >= -b(y) + b(R). We similarly have an abstract array cnt[] where cnt[b] = the number of nodes whose beauty is b.

The answer to the question “what is the sum of lengths of beautiful paths which start at y, and pass through R?” is “sum len[j] for j >= -b(y)+b(R)” + (length of path from y to R, exclusive of R) * “sum cnt[j] for j >= -b(y)+b(R)”. What I have done is just counted the length of each path by breaking it up as a path from R to the other node + a path from y to R.

This can easily be accomplished by constructing a BIT over the abstract arrays len[] and cnt[]. Thus, so far we get the following algorithm. Assume that when you are at a node which is a child of x1, that R’s beauty value is already seen.


ans = 0
Root the tree at R	// at this step, compute depths and beauties of all nodes
BITLen.increment(-b(R), 1)
BITCnt.increment(-b(R), 1)
if(b(R) == 1) ans++	// the path (R, R) is beautiful, of length 1
for i from 1 to k
	ans += do(k)
return ans

do(xi)
ret = 0
For each node y in subtree at xi, 
	ret += BITLen.query(-b(y) + b(R)) + (len(y)-1)*(BITCnt.query(-b(y) + b(R))
// Now, insert all the y's values into BITLen and BITCnt.
For each node y in subtree at xi,
	BITLen.increment(b(y), len(y))
	BITCnt.increment(b(y), 1)
return ret

We are now nearly done. After we have solved the problem for R, we now remove R and solve the problem for each of the remaining connected components.

If we were to analyze the time complexity of a single one of these runs, we would get O(n log n). If we were to arbitrarily choose R however, then for cases like the tree being a path, if we may end up taking time nlogn + (n-1)log(n-1) + (n-2)log(n-2) + … Which is atleast as bad as O(n^2).

The problem is at each step, the subtree size is very large. If we run a dfs to count the size of subtrees, then we can choose the vertex R intelligently. This approach has been used by the setter.

The overall time complexity is now O(N log^2 N)

Setter’s Solution:

Can be found here

Tester’s Solution:

Can be found here

anton_lunyov · February 13, 2013, 10:40pm

Actually beautiful numbers are exactly squares and double squares, which can be seen from the above formula for sum of the squares of divisors by analyzing parities of prime factors and exponents. So numbers at the vertexes could be made up to 10^18 easily

Also it should be noted that DFS here fails due to stack overflow error (at least at C++) because of quite large bound on N. So to get AC I runs BFS to order vertexes by their depth.

BTW, I have completely different solution. I will update this answer in a while.

anton_lunyov · February 13, 2013, 10:51pm

Both solutions are not available.

aircube · February 13, 2013, 11:45pm

Actually, there is O(NlogN) solution, but it`s so hard to explain =).

Upd.

Alright, solve problem using divide and conquer. Called function that caclulate answer for some tree with root x – F(x). See on tree with root v, this sons are u1, u2, …, uk.
Then F(v) = F(u1) +F(u2) + … + F(uk) + something else. Generaly, something else are paths that starts in some vertex p from subtree with root ui, and ends in some vertex q from subtree with root uj (i != j).

Lets vertex number 1 - is R.

Lets dq(v) - count of ones subtract count of zeros on path from v to R.

Lets umax - some of ui, with maximum depth.

In the end of function F(v), also return Treap, that contains information about all paths from R to some vertex in subtree of v.
Element of Treap contains following fields:

dq - balance of path
cnt - count of paths with balance dq
summary length of paths

This information give us to calculate “something else”.

We do following in the vertex v:

Make Treap of umax is Treap of v
foreach ui (ui != umax) do
1. iterate on all elements Treap of ui and do query to Treap of v,
2. add all elements Treap of ui to Treap of v

It`s possible to proof, this algorithm makes O(N) iterations, and one iteration takes O(logN) because of Treap.
Main idea is

Number of elements of Treap of v is no more than 2(Height of subtree + 1)* with root v.
Proof by induction.

lutyj · February 14, 2013, 3:16am

I used setrlimit from <sys/resource.h> to increase stack size.

rubanenko · February 14, 2013, 3:24am

We would be very thankful if you try)

mugurelionut · February 14, 2013, 3:59am

The fact that “we can choose the vertex R intelligently” (mentioned at the end of the editorial) is actually called centroid decomposition and is a standard technique for many problems on trees. The centroid is that node for which the size of the largest subtree connected to it is minimum.

I also have a very different solution (however it has the same time complexity as the official one - O(N*log^2(N))), based on the concept of heavy paths (actually, only heavy children - the heavy paths themselves are not used). I will try to post it soon (when I find some time) in a separate comment.

I would be very interested in knowing aircube’s O(N*log(N)) solution.

aircube · February 14, 2013, 5:31am

Firstly, i try to understand author`s solution, maybe it is also O(NlogN).

evandrix · February 14, 2013, 7:55am

sample

rubanenko · February 14, 2013, 5:20pm

Unfortunetly, it is not O(NlogN). My solution will work exactly O(Nlog^2(N)) on chain of length N, for example.

mugurelionut · February 14, 2013, 6:15pm

Here is the general idea of my O(N * log^2(N)) solution. First, let’s pick a root (say vertex number 1) - then the tree is rooted. Afterwards, we will traverse the tree bottom-up. For each node i we will store all the paths starting at that node - and for each path we will have its length and its beauty (we add +1 for a beautiful node and -1 for a non-beautiful node on the path): let’s call this set of paths Paths(i).

When we reach a node i we already have the sets Paths(j) computed for each child j of i. We are interested in computing the set Paths(i) and the total length of the beautiful paths passing through the node i. First we have a new path consisting of the node i itself (which may be beautiful or not). Then, we will choose the child j of i with the largest number of nodes in its subtree (this will be the heavy child of i). Paths(i) will actually be initialized to Paths(j) (as a pointer, not by copying the elements), which will later be updated.

In order to update Paths(i) properly we will need to increase by 1 the length of each path in it and we will need to increase the beauty of each path in it by the beauty of node i. This will be achieved as follows: for each set Paths(k) we will also have two extra values: extralen(k) and extrabeauty(k) which are values added to each of the paths in the set (besides that, each path also has its own length and beauty within the set).

Then we will insert the single-node path consisting of the node i into Paths(i). Afterwards we will consider all the other children of the node i (except the heavy child j). Let’s assume that we now reached the child k. We will consider every path from the set Paths(k), will compute its current length and beauty (considering that the path was extended up to node i) and then we will compute the sum of the lengths of all the paths from Paths(i) which can be extended with the current path in order to form a beautiful path (this is simply a range query over the set Paths(i), in order to consider the paths having a beauty larger than a given threshold).

After considering all the paths from Paths(k) we will insert all these paths into Paths(i).

Let’s analyze the time complexity of this solution, assuming that the sets Paths(i) are implemented as balanced binary trees (I used treaps in my implementation). Inserting and searching in such a tree takes O(log(N)) time. Traversing all the elements of such a tree can be performed in linear time (proportional to the number of elements in the tree).

All that is left is to see how many times insertions and searches are performed on these sets. For each node i we perform an insertion and a search operation for each path starting at i and having its other endpoint NOT in the heavy child. It’s not difficult to see that the total number of such operations is O(N * log(N)). Whenever we reach a node i all the interesting paths are defined by their other endpoint p. Whenever a path with the other endpoint p is moved to the heavy child of the current node i, the size of the subtree at least doubles (compared to the previous node i’ when it was considered). Thus, a path with the other endpoint at a node p is moved at most log(N) times (at the other times it is already in the set of paths of the heavy child and, thus, it is not moved at all).

pragrame · February 14, 2013, 7:06pm

Links have been fixed.

mugurelionut · February 14, 2013, 9:33pm

@aircube: I don’t see any divide-and-conquer in your solution. Anyway, apart from that, your solution seems quite similar to mine (try to have a look at it), except that for me umax is the vertex with the largest number of nodes in its subtree (not the one with the maximum depth). If the treap you compute for every node v contains the paths from v to every node in v’s subtree then for sure there aren’t just O(N) iterations (there would be O(N*log(N)) iterations if you choose umax=the vertex with maximum subtree size and O(N*sqrt(N)) if umax=the vertex with maximum depth).

aircube · February 14, 2013, 9:53pm

@mugurelionut: Do you know structure of the test where will be O(N*sqrt(N)) iterations for my solution? Btw, I added assert in my code, and there was no more than 1,500,000 iterations on tests.

aircube · February 14, 2013, 9:59pm

@mugurelionut: O(N) is simply to see on following problem: you are given a tree, and you are calculate number of paths of length K. I claim that it can be solved O(N) time with array, or O(NlogN) with balanced search tree.

mugurelionut · February 15, 2013, 12:21am

@aircube: 1,500,000 iterations seems closer to O(NlogN) than O(N), right?(is that the maximum number of operations with treaps in your solution?). It’s still possible that I did not fully understand your solution (because it’s not clear to me how many elements you may have in each treap). My observations only referred to the case when for each node v you may end up having in v’s treap one element for each node w located in v’s subtree (corresponding to the path from v to w)-that’s the case in my solution and I kind of automatically assumed that was the case with yours, too.

aircube · February 15, 2013, 12:27am

@mugurelionut “Number of elements of Treap of v is no more than 2*(Height of subtree + 1) with root v.” It depends only of height of subtree. I agree that number of iterations will be O(N * logN) if we keep in Treap elements for each vertex in subtree, and don`t merge vertex with equal dq.

“is that the maximum number of operations with treaps in your solution?”

It`s maximum number of operations on codechef’s tests.

mugurelionut · February 15, 2013, 12:34am

@aircube: The O(N * sqrt(N)) case that I was referring may occur for a tree with the following structure. Let’s say that we have K vertices on a path, numbered from 1 to K and say vertex 1 is the root. Add a path of 2 * (K-i) vertices below each vertex i (from 1 to K). Thus: node K will have depth 0, node K-1 will have depth 2, node K-2 will have depth 4, …, node 1 will have depth 2 * (K-1). The idea is that whenever you get to node i (between 1 and K-1) the maximum depth child will NOT be i+1.

mugurelionut · February 15, 2013, 12:54am

@aircube: OK, I finally understood your solution. My treaps actually contain pretty much the same information as yours. And I think you’re right: there are only O(N) iterations in your solution, as you said. That’s because the balance of a path starting at a node v and going down in its subtree can only take values between -depth(v) and +depth(v) (i.e. O(depth(v)) different values). My previous observations would be correct if the balance of a path could take arbitrary values (e.g. if the beauty of a node was not just -1 or +1, but would be given as part of the input).

sbhatewara · February 17, 2013, 8:31pm

can someone please give a little elaborate explanation of the tester’s solution … specially the lines 149-152 in the calcu() function …
I’ll be highly grateful …