 # SUBTREE - Editorial

MEDIUM

### PREREQUISITES

Graph Theory, Lowest Common Ancestor

### PROBLEM

You are given a tree G built on N vertices. There are several queries of the type

• K out of N vertices are marked
• Find the diameter D of the minimal subtree H that contains these K vertices
• For all pairs of vertices that are distance D apart in the H, find the length of the shortest edge that is shared by all the paths
• Print -1 if there is no such edge

### QUICK EXPLANATION

Firstly, you must build the tree G from the notation given in the problem statement.

This can be done by making a DFS from an assumed root node (you can assume it to be 1). In this new rooted tree, run another DFS to build the distances from the root to all the nodes. Let such an array be distance.

Next, build your LCA lookup data structure to make queries of the following type

• ancestor(x,h) = The ancestor of x that is 2h hops above
• shortest(x,h) = The length of the shortest edge on the path from ancestor(x,h) to x

shortest can be built in the same way ancestor is built in the classical LCA by the recursion

```shortest(x,h) = min( shortest(x,h-1), shortest( ancestor(x,h-1), h-1) )
```

Distance between any two nodes can be found using the LCA, and the dist table.

The diameter of the tree for the K vertices can be found conveniently in two steps

• Find the vertex v that is farthest from the first vertex in the set of K vertices
• Find the vertex u from the set of K vertices that is farthest from v

It is well known that this algorithm returns the two farthest vertices, and hence the diameter of the tree.

Let D be the length of the diameter of the tree.

### EXPLANATION

Now, there are some interesting properties of the Diameter that we must exploit. First of all

All diameters of the tree must pass through at least one vertex. This vertex is called the Center of the Tree.

This simplifies some things for us. We know that path(u,v) is a diameter of the tree. Now, if path(p,q) is also a diameter of the tree, where p, q are chosen from the set of all vertices minus { u, v }, then

• Either path(u,p) is also a diameter
• Or path(u,q) is also a diameter
• Or both

This is easy to prove as following

• Let c be the node from the intersection of nodes in path(u,v) and path(p,q), at the shortest distance from u
• Let a = distance(c,p)
• Let b = distance(c,q)
• Without loss of generality, let a ≤ b
• a + b = D
• Let x = distance(u,c)
• Let y = distance(c,v)
• x ≤ a
• otherwise x + b > D (which is not possible)
• Similarly, y ≤ b
• Adding the above two gives
• x + y ≤ a + b
• Or, x + y ≤ D
• But we know that x + y = D

Thus, the equality holds in both the cases (since none of the numbers are negative).

• x = a
• y = b

This means that path(u,q) is a diameter. And path(p,v) is a diameter.

This takes care of the case, to prove that path(u,p) or path(u,q) is a diameter. Since, we have proved that the larger one of them is in fact true. If in our proof a was equal to b, then we would have the situation where both the new paths are as well a diameter.

But in such a case a = b = c = d = D / 2. And path(u,v) and path(p,q) share no edges. This is indeed the only case for which we print -1. For all other cases, we will have at least 1 common edge.

In fact, after taking care of checking whether there is an answer or not, we can partition the set of leaves into two parts, say A and B.

• The pairwise distance between all vertices in A is < D
• Similarly, the pairwise distance between all vertices in B is < D
• The pairwise distance between a vertex in A and a vertex in B is equal to D

Now, you can find the length of the shortest edge as follows

```Let a = LCA( vertices in A )
Let b = LCA( vertices in B )

if a is ancestor of b
search for ancestor v of b
which is not ancestor of any node in A
shortest edge from b to parent of v, is the answer
else if b is ancestor of a
search for ancestor u of a
which is not ancestor of any node in B
shortest edge from a to parent of u, is the answer
else
shortest edge between a and b is the answer
```

Be careful to not perform any O(N) operation for any query. Keeping running time to O(K log N) for each query is the key to solving the problem. Along with building of the data structures in O(N log N) the overall complexity of the solution is O(N log N + sum-of-all(K) log N).

### SETTER’S SOLUTION

Can be found here.

### TESTER’S SOLUTION

Can be found here.

6 Likes

“But in such a case a = b = c = d = D /
2. And path(u,v) and path(p,q) share no edges. This is indeed the only case
for which we print -1. For all other
cases, we will have at least 1 common
edge.”

Did you mean a=b=x=y?
And I don’t understand how a=b implies no edges are common. I can think of many cases where this is not true.

why is it not shortest edge between a and b when one of a/b is ancestor of the other?

//