PROBLEM LINK:
Author: Maksym Bevza
Tester: Istvan Nagy
Editorialist: Xellos
DIFFICULTY:
EASY-MEDIUM
PREREQUISITES:
Maximum flow.
PROBLEM:
You’re given a directed acyclic graph. Find the minimum number of disjoint paths in it which cover all vertices.
QUICK EXPLANATION:
The answer is N minus the maximum flow in a suitable graph. The maxflow can be found using Ford-Fulkerson.
EXPLANATION:
This is a fairly well-known problem - formally, we’re finding the size of a minimum path cover of our DAG. You can find a lot about it with a simple search query.
Motivation: let’s split each vertex v into 2 parts (let’s call them v_{in} and v_{out}) connected by an edge v_{in}\texttt{-> }v_{out} such that all edges directed into v will now be directed into v_{in} and all edges directed out of v will be directed out of v_{out}. Intuitively, a path is something like a flow of 1 in this graph and the splitting guarantees that no two “flows” can cross the edge v_{in}\texttt{-> }v_{out} for any v, which is something like path disjointness. And the solution really reduces to finding a maximum flow.
We’ll make a new flow-graph G_f in a similar way: there are vertices v_{in} and v_{out} in G_f corresponding to each vertex v in G and also edges u_{out}\texttt{-> }v_{in} in G_f corresponding to each edge u\texttt{ -> }v in G, but no edges v_{in}\texttt{-> }v_{out}. Instead, we’ll have a source s, a sink t and edges s\texttt{ -> }v_{out} and v_{in}\texttt{-> }t for each v. All edges have capacities 1.
The answer is N minus the maximum flow F in G_f between s and t. Why it works isn’t so obvious, unless we look at the problem from a different perspective - if there are P paths covering our DAG, they will have N-P edges in total (a path with v vertices has v-1 edges and there are N vertices in P paths), and we’re computing the maximum number of edges which can be assigned to paths. Those edges are the ones through which we’re sending the flow.
Think about what the constraints on edges of G_f mean: edges s\texttt{ -> }v_{out} with capacity 1 say that we can only send a flow of 1 through one outcoming edge (that actually appears in the DAG) from vertex v, edges v_{in}\texttt{-> }t mean a similar thing for incoming edges. Since any path cover corresponds to a valid flow and vice versa, the max. number of edges assigned to paths will be F and the minimum number of paths P=N-F.
We could also view it as bipartite matching of edges as outcoming to one vertex and incoming to another, where each vertex can only have at most 1 incoming and at most 1 outcoming edge. In fact, maximum bipartite matching can always be found as a maximum flow in a graph constructed by appending a source to one part and a sink to the other with unit capacities of all edges.
The graph G_f has O(N) vertices, O(N+M) edges and the maximum flow between s and t is at most N. Therefore, Ford-Fulkerson runs on it in O(N(N+M)) time and O(N^2) memory per test case without any trouble.
By the way, this problem is the exact topic of study of Dilworth’s theorem, which says that the minimum number of paths is the same as the length of the longest path in the complementary (undirected) graph of G. Feel free to read more about it.
AUTHOR’S AND TESTER’S SOLUTIONS:
The author’s solution can be found here.
The tester’s solution can be found here.
The editorialist’s solution can be found here.