PROBLEM LINK:
Author: Yuyang Huang
Tester: Kevin Atienza
Editorialist: Ajay K. Verma
Russian Translator: Sergey Kulik
Mandarian Translator: Gedi Zheng
DIFFICULTY:
MEDIUM-HARD
PREREQUISITES:
PROBLEM:
Given a directed graph G = (V, E), and an entry vertex r, find the number of pairs of vertices (x, y) such that there exist paths r \longmapsto x and r \longmapsto y having no common vertex other than r.
QUICK EXPLANATION:
If two vertices x and y have a common dominator other than r, then this dominator will be present in all paths from r to x and from r to y. On the other hand, if the only common dominator of x and y is r, then we can find paths r \longmapsto x and r \longmapsto y, which share no vertex other than r. Hence, if we compute the dominator tree of G using the Lengauer and Tarjan algorithm, the task reduces to find the pair of vertices in the tree, which have a common ancestor other than r, and can be done in linear time.
EXPLANATION:
We are given a directed graph G = (V, E), and an entry vertex r. Now, for a given vertex x, consider all paths from r to x. Some vertices will occur in all these paths (e.g., r, x). These vertices are called the dominator of x. More formally, if a vertex d is present in all paths from r to x, then d is a dominator of x. The nodes r and x are the trivial dominators of x, other dominators of x are called non-trivial dominator.
It can be seen that being dominator is a transitive relationship, i.e., if x is a dominator of y, and y is a dominator of z, then x must also be a dominator of z. This is because any path r \longmapsto z must contain vertex y, however, the path from r to y must pass through x. Therefore, any path from r to z must contain the vertex x, i.e., x is a dominator of z.
On the other hand, if a vertex z has two dominators x and y, then x and y must have a dominator relationship between them. We can prove this using contradiction. Consider a path from r to z, this path must pass through both x and y, as they are dominator of z. Let us say that y occurs later in this path, i.e., there is a path \it{p} from y to z, which does not contain x. Now, if x is not a dominator of y, then there must be a path \it{q} from r to y, which does not contain x, if we concatenate the two paths \it{q} and \it{p}, we will have a path from r to z, which does not contain x. This is a contradiction, as x is a dominator of z. Hence, x must occur in every path from r to y, i.e., x is a dominator of y.
Based on these two facts, we can represent dominator relationship using a tree rooted at r. If x is a dominator of y, then there must be a path from x to y in this tree. This tree is called the dominator tree of G. The dominator of a vertex x will be all vertices lying in the path from r to x in the dominator tree. The immediate dominator of a node x is its parent in the dominator tree.
Next, we show how to check whether for two given vertices x and y, there exist paths r \longmapsto x and r \longmapsto y having no common vertex other than r.
Lemma: For two vertices x and y, there exist paths r \longmapsto x and r \longmapsto y having no common vertex other than r if and only if x and y have a common ancestor in the dominator tree other than r.
Proof: Assume that x and y have a common ancestor z \neq r in the dominator tree. This means z is a dominator of both x and y, i.e., both paths r \longmapsto x and r \longmapsto y must contain the vertex z.
On the other hand, let us assume that x and y have no common ancestor other than r. The dominators of vertex y are r = v_1, v_2, \cdots, v_k = y, in the order of distance from r in the dominator tree. Hence, any path from r to y, will be a concatenation of paths v_1 \longmapsto v_2, v_2 \longmapsto v_3, \cdots, v_{k - 1} \longmapsto v_k. There must be a path \it{p} from r to x, which does not contain the dominator v_2. Note that any vertex in the path v_2 \longmapsto v_k has v_2 as its dominator, hence the intersection of \it{p} and any path v_2 \longmapsto v_k must be empty, otherwise \it{p} would need to contain the dominator vertex v_2.
Now, there are two possible cases:
- when there is a unique path from v_1 to v_2 which is the edge (v_1, v_2). In this case we have path \it{p} and v_1 \mapsto v_2 \longmapsto v_k, which do not share any vertex other than r = v_1.
- In the second case, we have more than one path from v_1 to v_2. Since v_2 has no dominator other than v_1, we cannot disconnect v_1 from v_2 by removing only a single vertex other than v_1, v_2. Hence, according to Menger’s Theorem, there must be two vertex disjoint paths between v_1 and v_2 (excluding the endpoints). Let us call them \it{q_1} and \it{q_2}. If the intersection of \it{p} with any of the \it{q_1} or \it{q_2} is empty, then we already have two vertex disjoint paths \it{p} = r \longmapsto x and r \longmapsto y. Now if, \it{p} has a nonempty intersection with both \it{q_1} and \it{q_2}.
\it{p} \cap \it{q_1} = \{u_1, u_2, \cdots , u_m\}
\it{p} \cap \it{q_2} = \{w_1, w_2, \cdots, w_n\}
The two sets \{u_1, u_2, \cdots , u_m\} and \{w_1, w_2, \cdots, w_n\} are disjoint. Without loss of generality we can assume that w_n is closer to x than u_m in the path \it{p}. Let us denote the subpath r \longmapsto w_n of \it{q_2} by \it{pred}, and the subpath w_n \longmapsto x of \it{p} by \it{succ}. Now consider the following two paths:
r \longmapsto x = \text{concat} (\it{pred}, \it{succ})
r \longmapsto y = \text{concat} (\it{q_1}, v_2 \longmapsto y)
The two paths have no common vertex other than r. Hence, if x and y have no common dominator (except r), then we can find vertex disjoint paths r \longmapsto x and r \longmapsto y.
Algorithm:
Based on the above lemma, we can compute the number of bad pairs (i.e., the pairs which have no vertex disjoint paths from r), if we have the dominator tree available. This is the same as the number of pairs which have a common ancestor other than r. If the children of r in the dominator tree are v_1, v_2, \cdots v_k, and the size of the subtree rooted at these nodes are s_1, s_2, \cdots, s_k, then the number of bad pairs will be {s_1 \choose 2} + {s_2 \choose 2} + \cdots + {s_k \choose 2}. The number of good pairs will be N \choose 2 - number of bad pairs, where N is the number of vertices reachable from r in the graph G.
The sizes s_1, s_2, \cdots, s_k can be computed easily in linear time using a bottom up traversal of the tree. Hence, the only task that remains is to calculate the dominator tree of the graph.
Also, note that we do not need to compute the whole dominator tree, but only the sizes s_1, s_2, \cdots, s_k. In other words, we only need to compute the farthest ancestor (i.e., a dominator d, such that d has no non -trivial dominator other than r) of each vertex in the dominator tree. We call this node as the farthest dominator (denoted as \it{fdom}(.)) of a vertex. More formally,
\it{fdom}(r) = -1
\it{fdom}(x) = d, such that d dominates x, and d has no non-trivial dominators.
If all predecessors of x have the same farthest dominator y , then, this will also be the farthest dominator of x, otherwise the farthest dominator of x will be x itself.
We can group the nodes according to their farthest dominator, and these groups will represent the vertices in the subtrees rooted at v_1, v_2, \cdots, v_k, i.e., children of r in the dominator tree. Hence, we can compute the number of good pairs using the size of these groups.
Farthest Dominator in a DAG:
The computation of farthest dominator can be done easily by traversing the vertices in topological order (i.e., from root to leaves).
\it{fdom}(r) = -1
\it{fdom}(x) = d, if \forall y \in \text{pred}(x), \it{fdom}(y) = d
\it{fdom}(x) = x, otherwise.
This will compute the farthest dominator of all nodes in O (N + E) time.
Farthest Dominator in General Graph:
For the general graphs, we can compute all dominators of a vertex, and then pick the one which has no non-trivial dominator as its farthest dominator.
The computation of all nodes, which have x as their dominator, can be done in O (N + E) using a depth first search. The basic idea to start a depth first traversal from r, however, do not go beyond the node x. All the nodes, which are traversed, will not have x as their dominator, while all other nodes will have x as their dominator.
// In the end, S will contain all nodes which have // x as their dominator. S = {1, 2, ..., N}; void dfs(int curr, int x) { // curr is reachable, hence it cannot have x // as its dominator, remove it from the list. remove(S, curr); // Do not proceed after reaching x. if (curr == x) return; foreach successor y of curr { dfs(y, x); } } // This will compute the nodes whose dominator is x. dfs(r, x);
This approach will compute the dominators of all nodes in O (NE) time, and will work for the smaller subtasks.
Dominator Tree of Large Graphs:
For large graphs we can use the Lengauer and Tarjan algorithm to compute the dominator tree. This approach works in O ((N + E) \log N) time, and, can be improved to work in O (N + E) time using advanced data structures. However, in our problem the O ((N + E) \log N) will suffice.