LONGART - Editorial

gamabunta · March 13, 2013, 12:07am

PROBLEM LINKS

DIFFICULTY

HARD

PREREQUISITES

Graph Theory, Matching, Maximum Flow

PROBLEM

You are given two strings A and B. Imagine a language which consists of |A| * |B| unique words, where each word’s first letter is in A and the second letter is in B. For each word, you are given how many times it may be used to build an Article. Let this be denoted by C(w) for a word w.

An Article is a collection of Sentences. A Sentence consists of |A| words, such that

Each letter from A is used exactly once.
Each letter from B is used at most once.

Find the longest Article (most number of sentences) that can be achieved and print it.

EXPLANATION

Of course |A| ≤ |B| to be able to build any sentence.

The problem must be broken into two pieces

Finding the maximum number of sentences that can be built in an Article - call it K.
Finding the sentences themselves.

First Insight

If there are at least K sentences in the longest Article, then the following network has flow N * K

There are |A| + |B| + 2 vertices (1 source and 1 sink extra)
For all edges (source, a), where a belongs to A
- capacity( source, a ) = K
For all edges (a, b), where a belongs to A and b belongs to B
- capacity( a, b ) = C(ab)
For all edges (b, sink), where b belongs to B
- capacity( b, sink ) = K

It is intuitive that in such a network, the number of times each character in A is used is K and the number of times each character of B is used is less than or equal to K.

Corollary

If the maximum flow in the above network is N * K then there are K proper sentences

Let us denote the flow between two vertices u and v as F(u, v). Now we know that if the maximum flow is N * K then

F(source, a) = K for all a in A
F(b, sink) <= K for all b in B

Let us generate a bipartite graph G(A,B) such that the number of edges between a in A and b in B is equal to F(a,b).

We wish to partition the edges in G(A,B) into K matchings of size |A| each. If this is possible, then we have K proper sentences. The mapping between a matching and the sentence is easy to see.

On G(A,B) we can use the following algorithm to generate the matchings (we will prove why).

max_degree = K

repeat K times
    1. Let B' be the set of vertices in B
        whose degree is max_degree
    2. build a matching of size |A| which
        contains all vertices in A
        contains all vertices in B'
    3. remove all the edges in the matching
        from the graph
    4. max_degree--

Step 2 in the above algorithm is actually always possible. Let us see why

(Material for the proofs below has been derived from the book Graphs: Theory and Algorithms by K. Thulasiraman, M. N. S. Swamy)

Lemma: In G(A,B), you can always build a matching of size |A|

Degree of each vertex in A is max_degree
Let X = a subset of A
Let E = the edges incident on X
|E| = |X| * max_degree
Let Y = all the distinct vertices in B that are in the neighbourhood of all the vertices in X
Let E’ = edges incident on Y
|E'| <= |Y| * max_degree, because no vertex in Y (or B, or A) has a degree more than max_degree
But, |E'| >= |E|, since E is a subset of E’
|Y| * max_degree >= |X| * max_degree
Or, |Y| >= |X|

Now, by Hall’s Marriage Theorem we know that there should always be a matching that saturates A.

Lemma: In G(A,B), you can always build a matching that contains all vertices in A and contains all vertices in B’

From the previous lemma, we can get two matchings M₁ and M₂ that saturate A and saturate B’ respectively. Both are built independently and may or may not have edges in common.

Consider M, the union of M₁ and M₂
Let X be the neighbourhood of A in B
Let Y be the neighbourhood of B’ in A
We can build a bipartite graph on
- Vertices in A
- Vertices in B’ union X
- Union of edges in M₁ and M₂
degree of any vertex in graph = 1 or 2
- Hence the bipartite graph is made of mutually exclusive paths and circuits.
Now consider a vertex b in B’ - X
we have degree(b) = 1

A path from b may only end at a vertex in X - B’. Can you see why?

Hint

In a path from b, when we arrive at any vertex in B’, we do so using an edge in M₁
This means, the degree of such a vertex should have to be 2 and the path can continue

Now,

We can augment the matching M₁ by using this path such that
- b is included
- the last vertex in the path is excluded.

This process for each vertex b in B’ - B converts M₁ into a matching that saturates A as well as B’.

Thus, we will have a solution by repeatedly finding this matching - and removing the edges.

Le Implementation

You can calculate the value of K by performing a binary search on the value of K.

Build the graph for each sample of K and then test whether a flow of size |A| * K exists or not. This test requires finding the flow as fast as possible. This calls for using a faster flow algorithm, such as Dinic’s or Push Relabel. Ford Fulkerson and Edmond Karp will certainly be too slow.

For rebuilding the sentences we can use the discussion to derive a matching that saturates A as well as B’. This will bound the complexity of our implementation by O(K*N*N*M). Unfortunately K can be quite large.

What if we collect similar sentences together?

We will have to choose the smallest F(u,v) for each edge (u,v) in the matching. Also, we will have to find the smallest delta through which we can go before a new vertex must be added to B’. The smaller one of both the value is the number of sentences that can be built by this matching.

After deleting the edges in the matching, we may have to run augmentations to rebuild the matchings that saturate A and B’ (and hence find the matching that saturates both A and B’).

The only augmentations made to B’ would be adding more vertices to it, meaning we will augment for the new vertex that must be added to B’
- There are at most N vertices in B’ that will limit this
The only augmentations made to A would be to handle deletion of edges as words get used up, meaning we will augment for the edge that is now deleted
- There are at most N*M words that will limit this

An augmentation takes at most N*M steps. Hence the complexity of the reconstruction is O(N*M * N*M). It is worth noting that the constraint on the output size (less than 30000) blocks is superflous. We would always find an answer within that size!

EXERCISE

Try to find alternate ways of constructing a matching that saturates all the maximum degree vertices. A very interesting way using flows is possible by using pre-sinks. Using the Dinic’s algorithm, the matching can be found quickly. Also using collection trick described above works like an alternate solution.

Hint

Design a graph with similar to the G(A,B) but with two additional sinks
The sinks help enumerating the saturated vertices from B and the non-saturated vertices respectively.
The graph is apt for optimiations due to Dinic’s Flow’s properties.
See the tester’s solution for solution based on this approach.

SETTER’S SOLUTION

Can be found here. ( Link is broken. Will upload the file on this address shortly. )

TESTER’S SOLUTION

Can be found here.

anudeep2011 · March 13, 2013, 12:28am

“It is worth noting that the constraint on the output size (less than 30000) blocks is superflous”

Had i believed this, I would have solved this sum . Implemented everything as discussed. When submitted i got WA, and I felt it was because of that 30000 factor which probably m missing. So some other bug caused me WA.

mugurelionut · March 13, 2013, 1:36am

My solution is almost identical to what was presented in the editorial (in concept, at least). Just one observation. In the first part (where the flow is binary searched), a standard flow algorithm run from scratch each time is indeed too slow. However, what I did is the following: every time I got a positive answer for a flow value F (during the binary search), I copied the flow values of the graph edges. Then, when I tested whether a value F’>F is achievable, I started the max-flow algorithm from the flow values of the previous good value F (and not all the way from zero). Then I used just one of the standard max-flow algorithms with the following enhancement:

I incremented the flow on a set of maximal vertex-disjoint paths at each iteration, just like in the Hopcroft-Karp algorithm (of course, the flow was incremented by the maximum value possible along the path, not just by 1 unit)

However, for me, generating the bipartite matchings in time (i.e. solving the second part of the problem) was more difficult. Generating each matching from scratch was too slow in my solution. So, in the end, I used a standard max-flow algorithm for finding each matching, enhanced with;

the Hopcroft-Karp idea (increase the flow on a maximal set of vertex-disjoint paths after each iteration)
use an initial greedy for each matching (to try not to start the max-flow algorithm from zero)
reuse as much as possible the flow obtained at the previous matching: e.g. if an edge i-j (i from A and j from B) was in the previous matching and it still had to be used in some future matchings, then I would keep the flow on it, unless:
- j was not a “critical” node from B (i.e. it didn’t have to be part of each matching from now on) and there was some other “critical” node k from B which could not be matched by the initial greedy algorithm
- the max-flow algorithm redirected the flow on this edge

lxfind · March 15, 2013, 3:20am

A quick question, should those B - B’ and B’ - B be X - B’ and B’ - X ?

gamabunta · March 15, 2013, 3:27am

Yes

I have made that change.