PROBLEM LINKS
DIFFICULTY
HARD
PREREQUISITES
Graph Theory, Matching, Maximum Flow
PROBLEM
You are given two strings A and B. Imagine a language which consists of |A| * |B|
unique words, where each word’s first letter is in A and the second letter is in B. For each word, you are given how many times it may be used to build an Article. Let this be denoted by C(w) for a word w.
An Article is a collection of Sentences. A Sentence consists of |A| words, such that
- Each letter from A is used exactly once.
- Each letter from B is used at most once.
Find the longest Article (most number of sentences) that can be achieved and print it.
EXPLANATION
Of course |A| ≤ |B| to be able to build any sentence.
The problem must be broken into two pieces
- Finding the maximum number of sentences that can be built in an Article - call it K.
- Finding the sentences themselves.
First Insight
If there are at least K sentences in the longest Article, then the following network has flow N * K
- There are |A| + |B| + 2 vertices (1 source and 1 sink extra)
- For all edges (source, a), where a belongs to A
- capacity( source, a ) = K
- For all edges (a, b), where a belongs to A and b belongs to B
- capacity( a, b ) = C(ab)
- For all edges (b, sink), where b belongs to B
- capacity( b, sink ) = K
It is intuitive that in such a network, the number of times each character in A is used is K and the number of times each character of B is used is less than or equal to K.
Corollary
If the maximum flow in the above network is N * K
then there are K proper sentences
Let us denote the flow between two vertices u and v as F(u, v). Now we know that if the maximum flow is N * K
then
- F(source, a) = K for all a in A
- F(b, sink) <= K for all b in B
Let us generate a bipartite graph G(A,B) such that the number of edges between a in A and b in B is equal to F(a,b).
We wish to partition the edges in G(A,B) into K matchings of size |A| each. If this is possible, then we have K proper sentences. The mapping between a matching and the sentence is easy to see.
On G(A,B) we can use the following algorithm to generate the matchings (we will prove why).
max_degree = K repeat K times 1. Let B' be the set of vertices in B whose degree is max_degree 2. build a matching of size |A| which contains all vertices in A contains all vertices in B' 3. remove all the edges in the matching from the graph 4. max_degree--
Step 2 in the above algorithm is actually always possible. Let us see why
(Material for the proofs below has been derived from the book Graphs: Theory and Algorithms by K. Thulasiraman, M. N. S. Swamy)
Lemma: In G(A,B), you can always build a matching of size |A|
- Degree of each vertex in A is max_degree
- Let X = a subset of A
- Let E = the edges incident on X
|E| = |X| * max_degree
- Let Y = all the distinct vertices in B that are in the neighbourhood of all the vertices in X
- Let E’ = edges incident on Y
-
|E'| <= |Y| * max_degree
, because no vertex in Y (or B, or A) has a degree more than max_degree - But,
|E'| >= |E|
, since E is a subset of E’ |Y| * max_degree >= |X| * max_degree
- Or,
|Y| >= |X|
Now, by Hall’s Marriage Theorem we know that there should always be a matching that saturates A.
Lemma: In G(A,B), you can always build a matching that contains all vertices in A and contains all vertices in B’
From the previous lemma, we can get two matchings M1 and M2 that saturate A and saturate B’ respectively. Both are built independently and may or may not have edges in common.
- Consider M, the union of M1 and M2
- Let X be the neighbourhood of A in B
- Let Y be the neighbourhood of B’ in A
- We can build a bipartite graph on
- Vertices in A
- Vertices in B’ union X
- Union of edges in M1 and M2
- degree of any vertex in graph = 1 or 2
- Hence the bipartite graph is made of mutually exclusive paths and circuits.
- Now consider a vertex b in B’ - X
- we have degree(b) = 1
A path from b may only end at a vertex in X - B’. Can you see why?
Hint
- In a path from b, when we arrive at any vertex in B’, we do so using an edge in M1
- This means, the degree of such a vertex should have to be 2 and the path can continue
Now,
- We can augment the matching M1 by using this path such that
- b is included
- the last vertex in the path is excluded.
This process for each vertex b in B’ - B converts M1 into a matching that saturates A as well as B’.
Thus, we will have a solution by repeatedly finding this matching - and removing the edges.
Le Implementation
You can calculate the value of K by performing a binary search on the value of K.
Build the graph for each sample of K and then test whether a flow of size |A| * K
exists or not. This test requires finding the flow as fast as possible. This calls for using a faster flow algorithm, such as Dinic’s or Push Relabel. Ford Fulkerson and Edmond Karp will certainly be too slow.
For rebuilding the sentences we can use the discussion to derive a matching that saturates A as well as B’. This will bound the complexity of our implementation by O(K*N*N*M)
. Unfortunately K can be quite large.
What if we collect similar sentences together?
We will have to choose the smallest F(u,v) for each edge (u,v) in the matching. Also, we will have to find the smallest delta through which we can go before a new vertex must be added to B’. The smaller one of both the value is the number of sentences that can be built by this matching.
After deleting the edges in the matching, we may have to run augmentations to rebuild the matchings that saturate A and B’ (and hence find the matching that saturates both A and B’).
- The only augmentations made to B’ would be adding more vertices to it, meaning we will augment for the new vertex that must be added to B’
- There are at most N vertices in B’ that will limit this
- The only augmentations made to A would be to handle deletion of edges as words get used up, meaning we will augment for the edge that is now deleted
- There are at most
N*M
words that will limit this
- There are at most
An augmentation takes at most N*M
steps. Hence the complexity of the reconstruction is O(N*M * N*M)
. It is worth noting that the constraint on the output size (less than 30000) blocks is superflous. We would always find an answer within that size!
EXERCISE
Try to find alternate ways of constructing a matching that saturates all the maximum degree vertices. A very interesting way using flows is possible by using pre-sinks. Using the Dinic’s algorithm, the matching can be found quickly. Also using collection trick described above works like an alternate solution.
Hint
- Design a graph with similar to the G(A,B) but with two additional sinks
- The sinks help enumerating the saturated vertices from B and the non-saturated vertices respectively.
- The graph is apt for optimiations due to Dinic’s Flow’s properties.
- See the tester’s solution for solution based on this approach.
SETTER’S SOLUTION
Can be found here. ( Link is broken. Will upload the file on this address shortly. )
TESTER’S SOLUTION
Can be found here.