TWOCOMP - Editorial

dpraveen · June 16, 2014, 3:01pm

PROBLEM LINK:

Practice
Contest

Author: Sergey Kulik
Tester: Shiplu Hawlader
Editorialist: Praveen Dhinwa

DIFFICULTY:

Hard

PREREQUISITES:

trees, max flow, bipartite graphs.

PROBLEM:

There is a country with N cities. Road network in the country can be represented as a tree.
There are two companies A and B. A and B wants to build M1 and M2 paths respectively. A path is a collection of one or more roads.
For building a path, the company will be given a profit amount. You have to select subsets of paths from A and B in such a way that there should not be
two paths of different companies intersecting each other. Note that paths of same company might intersect each other.

Find out maximum possible amount of profit that the companies can have.

QUICK EXPLANATION

Note that N <= 10^5 but M1, M2 <= 700.
We will make a bipartite graph where left part contains M1 and right part contains M2 nodes. Each vertex in the left part corresponds to a
path of company A, Similarly Each vertex in the right part corresponds to a path of company B.
If there is a edge between vertex u in left part and vertex v in right part, it will mean that path corresponding to u and v
are intersecting. Hence both u and v vertices can not be chosen together.
For the given graph, the required answer is exactly equal to finding maximum weighted independent set in the above given bipartite graph.
Finding maximum weighted independent set in general graph is NP complete.
But you can solve it in polynomial time for bipartite graph.
Construction of Graph can be seen directly by the images shown below.
Use an optimized max flow algorithm to pass the time limits of the problem. Author’s solution uses push-relabel algorithm for max flow.

EXPLANATION

How to check whether two paths intersect or not?

Let us have a look at the following image.

Now let us say that we have two paths (s1, t1) and (s2, t2). We want to find whether they intersect or not?
First lets try to find possible intersection points of the given paths.

The end points of the paths. ie s1, t1, s2, t2.
lca of 4 possible (s, t) pairs ie lca(s1, t1), lca(s1, t2), lca(s2, t1), lca(s2, t2).

Now our problem reduces to how to detect whether a vertex lies on the given path or not.
Let (s, t) be the given path. We need to check whether a vertex p lies on the path or not.
Lets root the tree at root node being 1. Now without loss of generality assume that s is closer to root than t.

There can be two cases.
Case 1: p lies between the s and t. and lca(s, t) = s. For checking whether p lies between s and t
or not, we can simply check that s should be ancestor of p and p should be ancestor of t.

Case 2: p lies on one of the paths from s to lca(s, t) or t to lca(s, t).

How to find lca of a tree

Trivial algorithm for finding LCA can take time of height of the tree, which can go of O(N) in the worst case.

Please have a look at this amazing topcoder tutorial which discusses the problem in great detail and proposes many solutions.

Finding weighted independent set in a bipartite graph

Graph Construction
The bipartite graph has two parts, the left and the right. Left part contains M1 vertices which correspond to first company and right
part contains M2 vertices corresponding to second company. For a pair (u, v) such that u is from the left part and v is from the right part,
If the path corresponding to u and v intersect with each other, then we will add an edge between them. The vertex u will have cost corresponding to
profit of building the path corresponding to vertex u and similarly we will add costs for right vertex set.

Now you can realize that answer of our problem is finding the maximum weighted independent set of the graph. Maximum weighted independent set of a
graph is collection of vertices having maximum weight and no two vertices in the set have an edge between them.

Network construction
Now, we will convert into graph into a network, then we will relate the maximum weighted independent set with max flow in the below network.

Our network will have 4 parts. source, sink, and parts corresponding to A and B. Call the part corresponding to A left and that of B right.

From source to left part, the vertex v with weight w will have capacity of x.
From right part to sink node, the vertex v with weight w will have capacity of x.
If the two nodes x, y in the graph (where x is from left side of the graph (company A) and y from right side of the graph (company B))
have an edge between them, Then add an edge in the network with capacity of infinity between x and y in the network.

Please see the following image for more clarification.

Claim
maximum weighted independent set of the graph = The total sum of edges’ costs - Maximal flow in the network built.

Proof of the construction
By max-flow min-cut theorem, In a flow network, the maximum amount of flow passing from the source to the sink is equal to the minimum
capacity that, when removed in a specific way from the network, causes the situation that no flow can pass from the source to the sink.

Minimal cut denotes the minimal total sum of deleted edges’ costs whose deletion will cause source and sink to no-longer have a route between them.
Note that when we find the minimal cut, we will never delete the edges between left and right part, because their capacity (or the weight in the
min-cut problem) is infinity. Instead it will be better even to delete all the edges from the source to the left part, because the sum of
the finite number of finite numbers is less that infinity.

What does path the path here?
Let us consider a path in the flow diagram.
Source -> (Edge between source to left part corresponds to the first company’s route) -> intermediate edge (cost = infinity) ->
(edge between right part and sink that corresponds to the second company’s route) -> Destination (sink vertex).

We have already noted that intermediate edges will never be deleted.

Having that given, we need to make sure that if there is a pair of edges (denote it by (A, B) where A is the edge of the first company (source to left)
and B is the edge of the second company(right to sink)), we need to ensure that there is an intermediate edge between them because then
at least one - A or B is deleted, if there is a construction A -> intermediate edge -> B then A and B can not live together.
In the beginning of the construction of the network, we were adding such edges in case A’s and B’s corresponding routes intersect.

But what is more, As long as there is such a combination, there is a route between the source and the destination/sink and vice versa.

So now we conclude that we need to delete some edges in such a way that there will not be a way from the source to the sink in the constructed network.

This is a minimal cut problem and as noted in the starting, it is widely known that this weight equals to the maximal flow.

So finally what is the answer of our problem?

We need to delete edges with the minimal total cost.

If we delete edges with the total cost C, SumAllCosts - C will be the answer.

So the answer is: The total sum of edges’ costs - Maximal flow in the network built.

Complexity:
O(Max Flow + N * log n).

Here N * log n corresponds to time taken in detecting intersection of the paths.

To fit the solution in time, faster implementation of the max flow were required, Dinic algorithm can get TLE,
safe way to solve the problem is to use Push Relabel Algorithm.

HOMEWORK

Find a relation between independent set and vertex cover in bipartite graphs.
Learn about various max flow algorithms. Specifically learn about edmond karp, ford fulkerson, dinic, push relabel algorithm.

AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution
Tester’s solution

soda_beta7 · June 16, 2014, 3:12pm

can not see the picture

adijimmy · June 16, 2014, 3:48pm

@dpraveen sir am not getting the image

adijimmy · June 16, 2014, 3:49pm

same here

lsmll · June 16, 2014, 5:10pm

The tag “min-cost-flow” is wrong. It should be max flow.

akulsareen · June 16, 2014, 5:46pm

pictures are missing. please fix it.

albert96 · June 16, 2014, 5:53pm

Can’t you explain please this part of Author’s solution
for(i = 2; i < n; i++) {
lift(i);//<-
If I am not mistaken you must lift if the excess is positive but here you don’t know whether the excess is positive or not. And you will not find such k for which
if (fc[k][a[k][i]] > 0 && h[a[k][i]] + 1 < h[k]) h[k] = h[a[k][i]] + 1;
and h[k] = n + n; will stay n+n and later you will not be able to push flow to vertex k.
Where am I wrong ?
Thanks!

caioaao · June 16, 2014, 6:41pm

If the vertex p lies on the path (s, t) then lca(t, p) should be t and lca(s, p) should be s. If p does not lie on the path, then both the conditions can not be true simultaneously.

This doesn’t make any sense for me. Consider the following tree, with each line containing “node_id: parent(node_id)”:

1: - // root

2: 1

3: 2

4: 1

If s=4, t= 3, we have the following path: 3->2->1->4. Assuming we’d like to test if node 2 is in this path, we now have the following: s=4, t=3 and p=2. So, we have:

lca(s,p) = 1 != s
lca(t,p) = 2 != t

Both are different from s and t, so your explanation tells us that node 2 isn’t in the path between 4 and 3, which obviously isn’t true.

flaminrage · June 16, 2014, 7:13pm

@admin the problem stated that M1, M2 <= 700 not 300.

rsaha77 · June 16, 2014, 8:40pm

Editorial says that Dinic’s algo O(EV^2) can fail … In fact it should fail since, V = 700+700 = 1400 .

But Most of the AC solutions are implemented with Dinic’s algo …

I tried submitting my solution using simple Edmonds-Karp and got AC in 8:33 secs…!!

Does this mean that the test cases were weak…? Or Edmonds-Karp usually does well enough in practice despite it’s worst case runtime complexity of O(VE^2)…?

EDIT: I agree to kevinsogo’s comment. It’s pretty difficult to design strict max-flow cases for a bipartite graph. Especially for this particular problem where we get the B.P.Graph after finding intersections of path’s in a tree. That’s probably the reason Dinitz and even Edmonds-Karp algo was able to find maximum flow within the given time limit for this problem!

mugurelionut · June 16, 2014, 9:31pm

When I first read the problem I thought that maybe there is a more specific algorithm for finding the maximum weighted independent set in the bipartite graph, given that the bipartite graph is not arbitrary - it is the intersection graph of two sets of paths in a tree.

For instance, the following problem has a much better solution. If you have a single set of paths in the tree and you generate the graph of their intersections, you can find the maximum weight independent set of this graph in polynomial time (whereas you cannot find it in polynomial time in a general graph). Such graphs are called chordal graphs.

Thus, the graph from the problem is a kind of chordal bipartite graph, so I thought that maybe there is some special algorithm for it which is more efficient than using max-flow. I thought about it and I couldn’t find one, but it would have been nice if one existed.

kevinsogo · June 16, 2014, 11:40pm

I implemented Ford-Fulkerson with a mix of BFS (Edmonds-Karp) and DFS augmentation. I do DFS on the first (M1+M2)/2 augmentations, and BFS afterwards. Also, I try a different order of nodes to visit per augmentation, so I expect to usually get short augmenting paths.

This got accepted. I know this shouldn’t pass the time limit, but I guess it’s really hard to generate max flow cases to break such solutions, especially if our graph is bipartite with a specific structure (as mugurelionut explains).

The submission: http://www.codechef.com/viewplaintext/4097453

rsaha77 · June 17, 2014, 1:34am

The idea of using DFS augmentation for finding first V/2 augmented path and then using BFS augmentation is really nice!

beroul · June 17, 2014, 2:26am

I,too, don’t understand the following part

“If the vertex p lies on the path (s, t) then lca(t, p) should be t and lca(s, p) should be s. If p does not lie on the path, then both the conditions can not be true simultaneously.”.

For, if we choose p=lca(s,t) then the above formula gives “lca(t,p) = p = t = s = lca(s,p”

xellos0 · June 17, 2014, 2:37am

I don’t think there’s any point to making data that needs faster flow algorithms than Ford-Fulkerson. They’re so rarely required in competitions where you can’t use pre-written code that there’s no point in trying to implement them, it’s simpler to just use a code from somewhere else and source it.

rmg_91 · June 17, 2014, 3:41am

i can’t see the images

dpraveen · June 17, 2014, 8:54am

corrected, earlier the constraint was 300 which was later changed to 700. Thank you for informing !!

dpraveen · June 17, 2014, 9:08am

Hi, yes, you are right. Extremely sorry for writing it wrong.
I have changed the part in the editorial. I had covered only 1 case. May be image might have made it clear. I will upload the image in a while. Thank you for informing.

dpraveen · June 17, 2014, 9:10am

My dinic’s implementation failed, some people got accepted with dinic too. So I think same as kevinsogo.

dpraveen · June 17, 2014, 5:37pm

@akulsareen: fixed, ty for informing and sorry for being late