PROBLEM LINK:
Author: Roman Rubanenko
Tester: Shiplu Hawlader
Editorialist: Praveen Dhinwa
DIFFICULTY:
Medium
PREREQUISITES:
Graph Theory, topological sort
PROBLEM:
Given a DAG(directed acyclic graph). Find out maximum number of edges that can be added into it so that it still remains a DAG. If there are multiple solutions which lead
to same maximum number of edges that can be added, then you should output the lexicographically smallest sequence of edges.
Explanation
Let’s go step by step, How to find count of maximum number of edges?
Oh, it is easy, You know that graph is a DAG, so there is some topological ordering of a graph.
We can add an edge from any vertex u to v (v lies after u in topological ordering). Note that if the the edge (u, v) already exists, then we won’t add it.
But wait, you have not proved that the these number of edges which you are adding will be maximum.
Yes, you are right.
Note that we can say that the edges which we have added, form a maximal set of edges. By maximal set of edges, I mean that if I add even a single more edge into the DAG,
then the it will no longer remain a DAG.
Let us prove this claim.
Let us look at the DAG after adding the edges in the above stated manner, Now the DAG contains an edge between each u to each v (v lies after u in topological ordering).
So we can not add any more edge which is in the direction of the topological ordering ie. we can not any edge (u, v) (u lying before v in topological ordering).
So We can only add the edges from (v, u) in the non-topological ordering.
But adding such an edge will create a cycle, because edge (u, v) is already there.
Let me also give a simpler proof (due to Shiplu):
First of all, one can show that there can’t be more than C(n,2) edges by pigeon hole principle.
Then we can show that any topological sort gives us these C(n,2) edges.
Nice idea, But still, you have only proved the maximal part, not the maximum part.
Oh, ok. I will now prove that the maximal set above generated is the indeed one of the maximum sets.
We will prove it by contradiction.
Let us assume that we have added edges in some other way such that (i.e. our DAG is not same as maximal DAG) it contains more edges than the above maximal set of edges.
Now, note that as graph is still a DAG, we can do a topological ordering of this. Note that in the topological order of graph, there can not be any edge from v to u (v lies after u) (from the definition of topological ordering).
Only edges we can have, lie in the topological direction. But then it can not have more edges than the maximal set, because maximal set also contains all the edges going in topological direction.
Hence it leads to contradiction.
Nice proof, I got it
Only thing remaining now is to figure out the lexicographically smallest part, I have some idea about it. What if I take any topological ordering and add edges in the topological direction.
Oh, this strategy is wrong. Consider the following DAG.
A valid topological sorting is 2 -> 3 -> 1. So you will try to add edges between (2, 3), (2, 1) and (3, 1). Edge between (3, 1) is already in the DAG.
So we will add edges between (2, 3) and (2, 1).
But there is a better way of adding edges, add edges (1, 2) and (3, 2). This sequence is lexicographically smaller than previous.
Yes, my bad. Let me propose another strategy. Start from the vertices with index = 0, (As the graph is DAG, there should be atleast one such vertex).
If there are more than one vertices having in-degree zero, pick the vertex having minimum number. Now from this vertex, add the edges to all the vertices in the current DAG.
Remove the vertex from the DAG and update the in-degree of the vertices which are attached to it.
Keep doing the same until all the vertices are taken.
Nice strategy, but alas. This is also wrong. Infact, it wont even work on the previous DAG.
First we will pick vertex 2(because 2 and 3 are with indeg = 0, but 2 is lower number than 3). Now we will add edges between (2, 1) and (2, 3).
Then we will remove 2.
Our new DAG will remain only of a single edge (3, 1). So we can not add any more edges.
But as said earlier, adding edges (1, 2) and (3, 2) is better than this.
I am out of idea, Please help me in finding the right strategy
Let us do similar thing as you did. But we will do it from the end to the beginning. Let’s take a vertex that has no outcoming edges as the last one (i.e one with the outDegree zero)
If there is more then one such vertices, then pick one with bigger number. Now add edges from all the vertices in the current graph to this vertex.
Then delete this vertex from the DAG, and keep doing this until the graph is empty.
I did not get much, Can you please take an example?
Of course, Let us consider the previous DAG.
Initially both 1 and 2 are vertices with outDegree zero. We will pick 2 because have bigger number than 1. Now we will add edge (1, 2), (3, 2).
Now we will delete vertex 2. Now the graph contains a single edge (3, 1). We will pick 1 (only having outDegree 0). We will try to add (3, 1), but it
is already there, so won’t re-add it.
But how to prove that this is the lexicographically smallest sequence of edges?
Okay, here we go. If we chose a vertex v with outdegree 0, we can add as many edges (u, v) where u is unused. So there is no chance of cycle.
Now we will prove the lexicographical part by contradiction.
Suppose that there are two vertices u and v ( u < v) with outdegree 0 and we are claiming that if we chose u before v than we will get lexicographically smallest sequence of edges.
Suppose there is another vertex w (u < w < v). Now there is a chance that the edge from w to u will be chosen but if we would have chosen v then there will be chance that the
u -> w will be chosen. If not, then there is no way to to chose it.
Nice proof, I got it. Now how to implement it, Can you please give some pseudo code?
Yes. Let us have its pseudo code.
Pseudo Code
// nextEdge[][] is two dimensional array of size n * n.
// nextEdge[u][v] denotoes whether we are adding an edge between u and v or not?
for (int i = 0; i < n; i++) {
// v will be vertex with outdegree 0 and having largest number.
// we will chose v, add edges from all other vertices to v (if not already added).
// then we will delete v.
int v = -1;
for (int j = n - 1; j >= 0; j--) {
if (!used[j] && outDegree[j] == 0) {
v = j;
break;
}
}
used[v] = true;
for (int j = 0; j < n; j++) {
if (j != v && !a[j][v] && !used[j]) {
newEdge[j][v] = true;
}
}
// delete v, Update the outDegree of all the neighbours of v.
for (int j = 0; j < n; j++) {
if (a[j][v]) outDegree[j]--;
}
}
Let us do the complexity analysis now
Complexity:
At each iteration, we first find the vertex v with outdegree 0, The finding step can be done in a single iteration of outDegree array.
Total number of operations taken in this step will be O(n).
Then we remove the vertex v and update its outDegree. This operation is also O(n).
So we are doing n such iterations. Each iteration takes O(n) time. So overall time complexity is O(n^2).