Problem Link:
author: Roman Rubanenko
tester/editorialist: Utkarsh Lath
Difficulty:
Hard
Pre-requisites:
understanding of rooted trees, diameter of a tree, heavy-light composition, BIT / segtree
Problem:
Given a tree, you have to find its diameter. But thats not all of it. The tree initially consists of a single node, but it is extended by adding one leaf at a time. After adding each leaf, report the diameter of the tree
Explanation:
Expected complexity O( n log n ), or a not very slow O( n log2 n).
The Setter and Tester had more complicated solutions than some of those submitted during the contest. Therefore, I will first describe the simple solution:
Simpler solution
Let vertices v1 and v2 be the most distant vertices, forming diameter of the current tree(ties can be broken arbitrarily). If diameter of the tree should increase upon adding a leaf w, then the most distant pair is either w and v1, or w and v2.
Once the above fact has been proven, we simply need to maintain the most distant node v1 and v2. Upon addition of a node w, find its distance from v1 and v2, and update things if we have a better diameter. We only need to find the distance between two arbitrary nodes of our tree efficiently. This can be done by rooting the tree on vertex 0 and finding LCA of the two nodes, and reporting sum of distances of both nodes from LCA. Answering LCA queries is a very standard problem, and [this](http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=lowestCommonAncestor#Another easy solution in O(N logN, O(logN)) approach to find LCA is sufficient to solve our problem in O(n log n) time.
Because of known efficient algorithms for online construction of LCA data-structure, this solution can be used even in fully online setting, with worst case complexity O(n). We thank the contestants for coming up with this beautiful solution.
Proof that v1 or v2 is one end point of a diameter
Consider the path from v1 to v2, of length, say D. Diameter of new tree must be D+1. Also, w must be one endpoint of any pair of most distant vertices because otherwise diameter of original tree would also be D+1.
Pick any vertex v3 most distant from w, and consider the path from v3 to w. This path must intersect the path from v1 to v2, because the path from parent(w) to v3 forms a diameter of the original tree and so does the path between v1 and v2, and all diameters of a tree must pass through the centers of a tree.
The two paths w to v3 and v1 to v2 intersect as shown. We have
dist(v2, y) ≥ dist(v3, y) – otherwise v3-v1 is a better diameter of original tree
dist(v2, y) ≤ dist(v3, y) – otherwise w-v2 is a better diameter of new tree
⇒ dist(v2, w) = dist(v3, w), and v2-w is also a diameter of the new tree
Our original O(n log n) solution
Lets root the tree at vertex 0, and use height (u) to denote dist(0, u).
Also, use depth(u) = maxx descendant of u dist(x, u).
First, lets try to construct a naive solution for updating diameter on adding of tree.
add_node(w)
update(w, w, parent(w))
update(v, child, p)
d1 = dist(p, v) = height(v) - height(p)
d2 = max<sub>c' children of p, c' ≠ c</sub> depth(c')
diameter = max(d1+d2+1, diameter)
if(p != root)
update(v, p, parent(p))
The above algorithm goes along the path from v to root, and for each node, finds the depth of deepest descendant not along the path. Suppose I want to choose a particular path P from root to some vertex in the final tree, and make updates fast when a vertex on this path is added. This can be done by storing the following quantity at each node v.
value(v) = maxc child of v, c not on Path P depth© + 1 - height(v)
When adding a new vertex w ∈ P, the longest path starting from w has length height(w) + maxv ancestor of w value(v). This can be found efficiently (in O(log n) time) by maintaining either a BIT, or a segment tree for the path P.
Therefore, if we manage to maintain value(v) for all nodes on path P, we could efficiently update diameter when adding any node on the path P. To efficiently maintain value(v) for all nodes on P, whenever you add a node w ∉ P, find its nearest ancestor on path P, and update its value parameter in the BIT/segtree for path P.
Now that the problem is solved for a path, how to solve it for the entire tree ?
Heavy-light decomposition immediately comes to our mind. Heavy light decomposition(HLD) has been described very well in some of the previous editorials and outside 1, 2.
Any rooted tree can be completely split into a family F of paths such that path from any node to root intersects at most log n paths from the family. Therefore, when adding a new vertex v, you will have to query O(log n) paths for the quantity height(w) + maxv ancestor of w value(v). Since all the queries ask for maximum of value(v) in some prefix of the path, it can be done with maintaining a BIT. Also, we would need to update/query only O( log n) paths because of properties of HLD.
Special handling is needed when switching between paths.
Along with maxx ancestor of u value(x), we would need to account for deepest descendant of u along paths of type Q and R. maxx descendant of u on path P value(x) + 2 * height(x) - height(u) accounts for deepest leaf along paths of type Q. We could maintain another BIT for maintaining value’(x) = value(x) + 2 * height(x), which can be updated just like value(x), and query for maxsuffix of path P value’(x) can be efficiently answered by building a BIT.
Handling paths of type R is a bit tricky. We will need maxc child of u, c ≠ v depth©. This would make the complexity of algorithm O(max-degree) for each new vertex added. Can we make it just O(log n) ?
This is possible, and can be done by storing at each node the depth of deepest and second deepest child which does not lie on the path (i.e. vertex v stores maxc child of u, c ≠ v depth© and 2nd-maxc child of u, c ≠ v depth©). If the v is the deepest child of u, then maxc child of u, c ≠ v depth© is depth of second deepest child, otherwise it is depth of deepest child. This can also be efficiently updated on adding vertices in O(1) time per update. Addition of a new vertex will cause O(log n) such updates, because adding a vertex will affect this parameter for O(log n) nodes only - the points where the path is switched.
The overall complexity of this solution is O(n log2 n), and is perfectly acceptable. To get complexity down to O(n log n), one final observation is required:
When switching paths, it makes sense to go further up iff w is the deepest descendant of v. Because otherwise, depth of any ancestor of u does not change because of addition of w. Also, the diameter does not increase because in any diameter having w as an end point, w can be replaced by this deepest descendant.
Therefore, if we are querying(and updating) the path P, its cost can be taken from the increase in depth of node u. The overall complexity of our solution could be written as O(n + log n * ΣPath P in decomposition of tree depth of first node of path P). In HLD, the mysterious expression above is not guaranteed to be O(n log n), but it is hard to design test data to make the above quantity O(n log2 n) for Heavy light decomposition of tree. However, the above quantity can be easily proved to be O(n log n) if we use another decomposition scheme - where the deepest child is picked, instead of heaviest child as in HLD.
Behind the scene: The tester used the above O(n log n) approach, and the author used a similar but different O(n log2 n) solution. The Tester initially thought that the complexity of his solution was also O(n log2 n), but his solution was 10 times faster than setters inspite of applying all kinds of speedups in setters solution. Trying to hunt down the difference, the tester concluded that his complexity was actually better due to reasons given above.