TAPAIR - Editorial

PROBLEM LINK:

Practice
Contest

Author: Tuan Anh Tran Dang
Tester: Gerald Agapov
Editorialist: Jingbo Shang

DIFFICULTY:

Hard

PREREQUISITES:

DFS order

PROBLEM:

Given an undirected simple graph with N nodes and M edges, find how many pairs of edges are fatal. A pair of edges are fatal if and only if the graph will be disconnected after we removed them.

EXPLANATION:

First of all, a trivial case is that, the given graph is disconnected. In such case, the answer should be M * (M - 1) / 2.

Therefore, we only concern the connected graph. Because of the connectivity, we can find a spanning tree first. Then, all edge can be classified into two sets: (1) Tree-Edge, (2) Non-Tree-Edge.

If we remove two Non-Tree-Edges, obviously, the graph will be still connected. If any Tree-Edge is removed, the connectivity depends on whether there are still some Non-Tree-Edges connect the two parts separated by this Tree-Edge in the spanning tree.

Let’s have a example as following (same as the sample input). Since the spanning tree can be chosen in any way, we choose the DFS-Tree starting from node 1 for the convenience (we will state the convenience later).

In this example, we can see that there are two Non-Tree-Edges (labeled as 1 and 2). And then, for each Tree-Edge, it is labeled by a set of Non-Tree-Edges which are crossing the edge. That is, if we delete that Tree-Edge, the Non-Tree-Edges in that set will connect the two separated parts. For a given Non-Tree-Edge, it will take care of a path on the DFS-Tree. Furthermore, in the DFS-Tree, there are only “children --> ancestor” Non-Tree-Edges (i.e. back edges. This provides some conveniece).

Based on this label system, the forbidden pairs of edges are those edges with same label or some edge with empty set. For the first case, that is they depends on the same set of Non-Tree-Edges, after removing them together, the Non-Tree-Edges are not able to connect the three parts of tree (or the Non-Tree-Edge which is needed to connect the two parts are deleted). For the second case, edges with empty set as labels are all cuts, i.e. removing them will lead to disconnected state.

Therefore, our task is now focused on calculating the label of each edge. For a simple hash idea and the perfect (perfect because x xor x = 0) operation XOR, we can set a random number ranging from 1…2^64 - 1 to each Non-Tree-Edges. And then, represent a set of numbers as their XOR sum. That is, if a set is {1, 3}, the label will be treated as 2 (since 1 xor 3 = 2). Two sets are identical, if and only if their labels are same (high probability correct).

Finally, the only obstacle comes that how to XOR a number x to the edges between node u and v for all Non-Tree-Edges. And we need to query the XOR sum after all operations for all Tree-Edges. Thanks to the DFS-Tree we selected, without loss of generality, we suppose u is an ancestry of v. Denote prefix[u] as the XOR sum added to the path from root 1 to node u. Adding x to the path between u and v is equivalent to

prefix[u] = prefix[u] xor x;
prefix[v] = prefix[v] xor x

And after all Non-Tree-Edges are added to their corresponding paths on the DfS-Tree, we traverse the DFS-Tree again and can get the label on all Tree-Edges (it depends on the XOR sum of prefix[] in its sub-tree).

In the end, what we need is to count the number of pairs of labels, which are same or at least one zero. This could be done by hash too.

In summary, we have a O(N + M) algorithm to solve this problem with a high probability (due to the randomness to choose the Non-Tree-Edges’ labels). Because 2^64 is large enough comparing to the range of N and M, don’t worry about the probability. :slight_smile: You will get AC if you implemented correctly.

AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution can be found here.
Tester’s solution can be found here.

2 Likes

Our graph is connected. So first statement is unnecessary.

1 Like

I worked with the problem setter during the preparation of the contest and we also have fully deterministic solutions with O((N+M) * sqrt(N)) and O(N+M) * log(N)) time complexities (the time limit was set in order to allow both such types of solutions to pass). The solution described in the editorial is, however, the simplest in terms of implementation. I only wrote this comment in order to let you know that this is not the only known solution.

1 Like

This was a very nice problem, but the test cases for it are ridiculously bad. After seeing a clearly wrong solution among the accepted ones I tested top solutions with what in my mind is a simplest non-trivial case, which is a graph like this:

*--*--*--*
|  |  |  |
*--*--*--*

Here, the answer should be 7. However, half of the top 12 solutions fail on this case alone.

6 Likes

Here is once more nice test which fails at least one of accepted top solutions:

*-*-*-*
|x| |x|
*-*-*-*
2 Likes

I have an O(M log^2 N) deterministic solution:

  • traverse the graph by DFS; it’s then split into a set B of back (non-tree) edges and a spanning tree rooted at vertex 0
  • define S(i) as the subtree of vertex i and P(k,i) as the path going from vertex i upwards and containing 2^k vertices (including i); define parent[k][i] as the 2^k+1-th vertex on this path, which can be computed as
  • parent[0][i] == parent of i in the spanning tree
  • parent[k > 0][i] == parent[k-1][parent[k-1][i]]
  • define count[k][i] as the number of back edges crossing P(k,i), e.g. connecting a vertex V in S(i) and a vertex U (non-strictly) above parent[k][i]; define D[k][i] as the maximum of depths of U for all those edges
  • we see that deleted pairs of back edges don’t contribute to the answer; we also see that if at least one deleted edge is a bridge, that pair is always counted in the answer, so let’s try deleting a non-bridging tree edge between i and parent(i)
  • if we delete a back edge which doesn’t cross P(0,i), the graph stays connected (S(i) is connected and there’s a non-deleted back edge between S(i) and the rest of the graph); also, if we delete a non-bridging tree edge which is neither in S(i) nor on the path from i to the root, the graph stays connected for the same reason (applied to subtrees corresponding to both deleted edges)
  • if we delete a back edge crossing P(0,i) and count[0][i] > 1, there’s still another back edge from S(i), so so the graph also stays connected; if count[0][i] == 1, there are no such back edges, so this pair contributes 1 to the answer
  • w.l.o.g. let the other deleted edge be a non-bridging tree edge (j-parent(j)) in S(i); we have 3 cases:
  • there’s a back edge (of vertex i) starting in (S(i) without S(j)); along with the non-bridging properties of deleted vertices, it means that S(j) is connected to (G without S(j)) and S(i) is connected to (G without S(i)), so the whole graph is connected
  • all back edges of i start in S(j) and there’s a back edge of j which ends in (S(i) without S(j)); that means S(i) and (S(i) without S(j)) are connected, so S(i) is connected, and S(i) is connected to (G without S(i)), so those edges are not counted in the answer, either
  • all back edges of i start in S(j) and all back edges of j end strictly above i: S(i) without S(j) is disconnected from the rest of the graph, so those pairs are counted
  • in terms of count[][], D[][], it means that for all i: count[0][i] > 0 (if it’s zero, i is a bridge), we add 1 to the answer if count[0][i] == 1, and add 1 for every j in S(i) (of course, j != i) such that D[0][j] < depth(i) and j is (non-strictly) above the LCA of all back edges of i
  • firstly, we need to find LCA of bottom vertices of all back edges of i; to do so, define L[k][i] as the LCA of bottom vertices of all back edges crossing P(k,i); since we have parent[][], we can count LCA(i,j) in O(log N) time
  • we can find D[][], count[][] and L[][] all by the same algorithm: let’s iterate over all back edges b in B; b == (U-V) where depth(U) < depth(V); we’ll split the path [V->U) into disjoint subpaths P(k_i,v_i), and for each of them, set:
  • D[k_i][v_i] =max(D[k_i][v_i],depth(U))
  • count[k_i][v_i] =count[k_i][v_i]+1
  • L[k_i][v_i] =LCA(L[k_i][v_i],V)
  • we know that P(k,i) can be split into P(k-1,i) and P(k-1,parent[k-1][i]), and that by applying the operation D[k][i] =max(D[k+1][i],D[k][i]), D[k][parent[k-1][i]] =max(D[k+1][i],D[k][parent[k-1][i]]) and similarly with count[][] and L[][] with their corresponding operations (addition and LCA), all the edges which were counted in the arrays for given k+1 will also be counted in them for k, so by decreasing k and computing D[k],count[k],L[k] from those arrays for k+1, we can get D[0],count[0],L[0] to contain the values for all back edges crossing the paths P(0,i), e.g. back edges of S(i)
  • now, we want to count for each i, how many vertices j (j != i) there are on the path from L[0][i] to i, such that D[0][j] < depth(i) (if we set D[] to infinity for bridges, they’ll be excluded automatically); this is equivalent to counting the number of vertices i on the path from j to D[0][j], for which L[0][i] is in S(j), which is also equivalent to counting L[0][i] in S(j), for which i is on the path from j to D[0][j], or D[0][j] > depth(i) > depth(j)
  • let’s traverse the tree in pre-order and assign an interval I(i) to S(i), and make a list of i for which L[0][i] == v, for every vertex v; now, we just need to count (for each j) the elements of lists of I(j) in the interval (depth(j),D[0][j]); this task can be solved using sweepline + Fenwick tree (picture them as points in a plane and being asked to count the points in some rectangles)

Since k can be at most log(N)+1, we can count D[],count[] in O(1) time and L[] in O(log N) time per element; for each edge, we process at most k elements, the tree traversal works in O(N+M) and the last part with Fenwick tree in O(N log N), so the resulting complexity is O(M log^2 N). If we could do LCA in O(1), it’d go down to O(M log N).

EDIT: sorry if you find the post hard to read because of big chunks of text - I was trying to do some formatting, but it got erased when posted

4 Likes

Yes, it sounds like a good OMlogN) solution, but you may take the advantage of the DFS-order. In DFS tree, all the other edges are all back edges. It will help a lot to get things much easier.

I know, and I do. Do you see me taking about non-back edges anywhere?

Yes, you are right. But I think we can obtain an algorithm without log. :smiley:

… guilty

Sure, sure. I think we can start from this method to find a fully deterministic solution with linear time.

Not guilty :stuck_out_tongue:

2 Likes

UPD. THE SOLUTION BELOW IS WRONG!!!

I have completely different solution with complexity O(M * log M).
Note that the graph is connected, so N <= M+1 and we could not include it into complexity.

  • At first I divide the graph into biconnected components and count number of bridges, say B.
  • Just to clarify, biconnected component does not have articulation points.
  • Then due to obvious reasons the answer is
    C2(M) - C2(M-B) + Sum[CNT(cmp) : cmp is biconnected component]
    where CNT(cmp) is the number of bad pairs for the biconnected component cmp.
  • Now the challenge is to handle biconnected graph.
    Note that if we have subpath where each vertex has degree two,
    then every two edges in this path forms a bad pair.
  • Let’s convert this observation to working solution.
    At first assign to each edge a weight, which is initially 1.
  • In short, we will delete vertexes of degree 2 by replacing two edges V1 – V – V2 by one V1 – V2, update the answer and the weight W(U, V) of edge U – V will correspond to the number of accumulated edges from the initial graph.
  • In detail, when we have vertex V of degree two with neighbors V1 and V2, we add to the answer W(V, V1) * W(V, V2), since every pair of original edges (E1, E2) where E1 is “from” V1 – V and E2 is “from” V – V2 is bad (it separates V from remaining graph).
  • If V1 and V2 are the only remaining vertexes then, we add (W(V, V1) + W(V, V2)) * W(V1, V2) to the answer according to the observation above and return the answer.
  • Otherwise we delete edges V1 – V, V – V2 and add new edge V1 – V2 with weight equals to sum W(V, V1) + W(V, V2). But, if edge V1 – V2 was before we set this weight to zero. Indeed, if biconnected graph has two edges (U, V) and it has at least three vertexes then deleting one edge (U, V) will left graph biconnected, so it can’t be in any bad pair. Thus, setting it’s weight to zero is the right thing to do to not accumulate initial edges in this accumulated edge in further considerations.
  • If we end up with no vertexes of degree 2 (and thus all degrees are at least 3) the answer is zero [THIS IS WRONG for this test]
  • All this can be implemented very simple using associative array from edges to weights (e.g., stl map in C++)
  • Most probably we can get pure O(M) solution using linked lists in a smart way, but I don’t think much in this direction.

You can find implementation details in my solution.

2 Likes

Actually, this test was the basic test in mind for designing my solution posted below.

Not guilty either :smiley:

Beside the sample, I only tested on trees and cliques…

What if you don’t have vertices of degree 2? The answer still can be non-zero (see the test case below by lgsmitty).

3 Likes

Finally, the only obstacle comes that how to XOR a number x to the edges between node u and v for all Non-Tree-Edges. And we need to query the XOR sum after all operations for all Tree-Edges. Thanks to the DFS-Tree we selected, without loss of generality, we suppose u is an ancestry of v. Denote prefix[u] as the XOR sum added to the path from root 1 to node u. Adding x to the path between u and v is equivalent to

I have two doubts in the solution provided.

First : Can anyone please explain why there is a need to have prefix[u].

All I have understood is that all the back edges get random number between 0 and 2^64 - 1 and all the dfs-tree edges gets a label from whatever backedges is crossing. Now the fatal edges are those having same labels.

Second : How removing of dfs-tree edges makes the graph unconnected ?