## PROBLEM LINK:

## DIFFICULTY:

HARD

## PREREQUISITES:

Bipartite Graphs, Maximum Matching, Probability theory, Hall’s Theorem

## PROBLEM:

You need to find the expected size of the maximum matching for a given complete bipartite graph where each edge can be in the graph with some given probability.

## EXPLANATION:

Before reading the editorial please go through the links given above. Make sure you understand the concepts of bipartite graphs and maximum matching properly. Please go through this editorial slowly and carefully since the ideas presented here are not very intuitive.

Let’s look at the problem in hand. We need to find the * Expected Maximum Matching* which is defined as the sum of

**P(G) * MM(G)**for all bipartite graphs

**G = (U, V, E)**where

**U = {U[1], …, U[N]}**and

**V = {V[1], …, V[M]}**.

As mentioned in the problem statement, there are

**2**possible bipartite graphs in all and calculating the above sum of products will definitely exceed the time limit. So let’s think of a better approach.

^{N * M}One of the most important things when you deal with * Maximum Matching* is

**Hall’s Theorem**which says:

A bipartite graph

G = (U, V, E)has

matching of size|U|if and only if

|adj(S)| >= |S|

foreverysubsetSof

U. Hereadj(S) = {v in V : (u, v) in E– is the set of all different neighbors of vertexes from

for at least one u in S}S. Also by|S|we denote the number of elements in the setS.

Let’s define a couple of notations to being with:

For a given bipartite graph

G = (U, V, E)let’s call subsetSofUthat satisfies condition of Hall’s Theorem

asofHall subsetG. And let’s call

the system of all Hall subsets ofGas

of G. Note that any Hall system has an empty subset ofHall systemU(which we denote byE) as its element.

Now we move on to our solution. For **K** from **1** to **M** and **T = {HS _{1}, HS_{2}, …, HS_{R}}**, where

**HS**are some different subsets of

_{1}, HS_{2}, …, HS_{R}**U**define

**DP[K][T]**to be the probability that a bipartite graph

**G = (U, V(K), E)**, where

**V(K) = {V[1], …, V[K]}**, has its Hall system equal to

**T**. This means that for this graph we have all the vertices from

**U**in the first part but we consider only

**first K**vertices from

**V**in the second part. Using these two vertex sets, we can have many graphs. For each graph there will be various Hall subsets and they all together form the Hall system for that particular graph. So

**DP[K][T]**is the sum of all

**P(G)**for all graphs

**G = (U, V(K), E)**that has Hall system

**T**.

Clearly, when we know the Hall system of **G** we can easily find the maximum matching for it. It will be equal to the *size of the largest Hall subset of G* (it follows directly from the Hall’s Theorem if we will apply it to the subgraph of

**G**with parts

**S**and

**V**, where

**S**is subset of

**U**).

It is clear that **DP[M][T]** will be *the probability that original random bipartite graph defined in the problem statement has a Hall system equal to T*. So the answer to the problem will be the sum of

**DP[M][T] * MM(T)**for all different possible Hall systems

**T**. Here

**MM(T)**is the maximum matching of the graph with the Hall system

**T**which, as described above, will be the size of largest Hall subset.

Further note that it is convenient to define **DP[0][T]** as **1** for **T = {E}** and **0** otherwise. It is completely natural since when the bipartite graph **G** has empty second part then **T = {E}** is the only possible Hall system of such graph.

Now let’s try to see how we can calculate **DP[K+1][]** using **DP[K][]**.

When we consider **(K+1) ^{th}** vertex of

**V**, we should iterate over all

**2**subsets of

^{N}**U**(recall that

**U**has size

**N**). For each subset

**S = (u**of

_{1}, …, u_{L})**U**we add edges

**(u**to the graph, where

_{i}, v)**v = V[K+1]**is the current vertex in

**V**. The probability that exactly this set of edges will be added is equal to the product of

**P[x][K+1]**for x =

**u**multiplied by the product of all

_{1}, …, u_{L}**(1 – P[x][K+1])**for

**x != u**. (

_{1}, …, u_{L}**P**is the probability adjacency matrix given to us). Denote this probability as

**P[K+1][S]**, i.e.

**P[K+1][S]**is the probability of having

**adj(V[K+1]) = S**. Now assume that we have the Hall system

**T = {HS**and add edges

_{1}, HS_{2}, …, HS_{R}}**(u**for each

_{i}, v)**u**in

_{i}**S**. We will show that the Hall system of the new graph is unique and can be found easily by

**T**and

**S**. Denote it by

**next[T][S]**. Basically

**next[T][S]**represents

*the new Hall system that is formed from the graph with existing Hall system*.

**T**when we add a new vertex**v**from**V**for which**adj(v) = S**From basic properties of probability it follows that **DP[K+1][T1]** will be the sum of **P[K+1][S] * DP[K][T]** for all pairs **(T, S)** such that **T1 = next[T][S]**, i.e. the product of probability that **adj(v) = S**, when the vertex **v = V[K+1]** is added to the current graph **G = (U, V(K), E)**, and the probability that **G** has the Hall system **T**. Hence we can do

```
DP[K+1][next[T][S]] += P[K+1][S] * DP[K][T]
```

for all Hall systems **T** and all subsets **S** to compute correctly the values of **DP[K+1][T1]** for all Hall systems **T1**.

Now, for the given **T** and **S**, let’s look at how to calculate **next[T][S]**. Let us first consider the case when **|S| = 1**, i.e., **S = {U[i]}** for some **i** from **1** to **N**. That is, we add exactly one edge **(u, v)** to the current graph **G**, where **u = U[i]** and **v** is the new vertex that was not present before in **V**. Since **v** is a new vertex then **adj(W)** will be increased by one for each subset **W** of **U** having **u** in it and will be the same for all other subsets not having **u**. Now it is clear that **W** will be a Hall subset for the new graph if and only if **W – {u}** was the Hall subset in the old graph. Hence **next[T][S]** in this case is equal to **T + MOV(T, u)**, where **MOV(T, u) = {HS + {u} : HS in T}**, i.e. **MOV(T, u)** is obtained by adding **u** to each element of **T** (recall that elements of **T** are the subsets of **U**). So we know how to find **next[T][S]** when **|S| = 1** and know that we don’t need any other information about the graph to do this. When **S** is arbitrary we can apply the same arguments to see that **next[T][S]** is union of **T** and all **MOV(T, u)** for each **u** in **S**.

All these observations show that introduced by us function **DP[][]** is self-contained, that is, each value **DP[K][T]** can be calculated only using its previous values in quite reasonable time. But the most intriguing question is why the number of states in such dp is small enough to solve the problem in time. Consider the whole process of calculating of DP from the graph theory point of view. We start at the state **(0, {E})** and then for each state **(K, T)** we move to the states **(K+1, next[T][S])** for all subsets **S** of **U**. Let’s construct an oriented graph whose vertices are Hall systems and for each Hall system **T** it is connected by the oriented edge to the Hall systems **next[T][S]** for all subsets **S** of **U**. Now it is clear that the set of all Hall systems that will be involved in the whole process of DP calculation is exactly the set of all reachable vertices from the vertex **{E}** in this graph. Hence we can use some search on this graph to find all needed for DP Hall systems. In fact, it can be shown that in this way we will obtain all possible Hall systems. Surprisingly but the number of different Hall systems obtained by this way for a given **N** is very low. Table of first 6 values of this number is presented below:

N |
1 | 2 | 3 | 4 | 5 | 6 |

Number of Hall systems | 2 | 5 | 16 | 68 | 406 | 3762 |

So even for **N <= 6** this problem can be solved easily using this idea.

Now we have the entire approach with us. Let us go ahead and try to implement these ideas and view the solution in detail.

In the above explanation, we have seen several steps where we need to deal with **S**, arbitrary subset of **U**. We know that the maximum value of **N** (size of **U**) is **5** and hence, the maximum number of subsets of **U** can be **2 ^{5} = 32**. Thus, we can generate all the subsets using a mask of

**N**bits. For more about masks, go through the following link:

http://en.wikipedia.org/wiki/Mask_(computing)

Now, in the mask, if **i**^{th} bit is **1**, it means that the corresponding subset **S** of **U** contains the vertex **U[i-1]** (note that bits in mask are numbered starting from **0** but vertexes in **U** are numbered starting from **1**). And if it is **0**, the subset **S** does not contain this vertex. So each subset **S** of **U** can be represented by the binary representations of the number between **0** and **2 ^{N} – 1**. In future we will identify subsets of

**U**with their representations.

Next we move on to the representation of **T**, the Hall system. Let **T = {HS _{1}, HS_{2}, … , HS_{R}}** where

**HS**are some different subsets of

_{1}, HS_{2}, … , HS_{R}**U**. Each

**HS**can be represented by a number between

_{i}**0**and

**2**as we saw above. Hence we can identify

^{N}– 1**T**with a mask of length

**2**where

^{N}**j**

^{th}bit is

**1**if and only if the subset of

**U**with binary representation

**j**is in

**T**. More formally we can write

**T = 2**. Clearly the value of

^{HS1}+ 2^{HS2}+ … + 2^{HSR}**T**calculated in such way will always be less than

**2**and hence we can use 32bit integer type to deal with Hall systems.

^{(2N)}<= 2^{32}Now let’s consider these neat notations on example. Consider the case when **N = 2**. So **U = {U[1], U[2]}**. All subsets of **U** are **S[0] = {} = E**, **S[1] = {U[1]}**, **S[2] = {U[2]}**, **S[3] = {U[1], U[2]}**. And it is clear that subset **S[i]** has representation **i**. So we labeled subsets of **U** by numbers **0, 1, 2, 3** as was described above. It can be shown that there are 5 Hall systems in this case in all and they are:

Hall system | Its binary representation | Its decimal value |

{0} | 0001 | 1 |

{0, 1} | 0011 | 3 |

{0, 2} | 0101 | 5 |

{0, 1, 2} | 0111 | 7 |

{0, 1, 2, 3} | 1111 | 15 |

OK, let’s continue. One of the possible ways of solving the problem that can be used next is to simply define **DP[K]** as `map<unsigned int, double>`

(if you are using C++) or as any associative container if your language supports them. In the same way you can deal with **next[T][S]**. This approach allows you to write the whole solution without any further thinking just following this editorial. We don’t know whether it can get Accepted but the feeling is that it can.

More efficient way is to numerate all Hall systems by consecutive numbers starting from **0**. Then you can redefine **next[j][S]** as an index of the Hall system that will be obtained from the **j**^{th} Hall system according to the previous definition of **next[T][S]**. Of course **DP[K]** also can be represented as a usual array of size **H** where **H** is the total number of Hall systems. In this way the step of actual DP calculation will take only **O(M * 2 ^{N} * H)** operations. Indeed, for each

**K**from

**1**to

**M**, for each mask

**S**from

**0**to

**2**and for each index

^{N}– 1**j**of Hall system from

**0**to

**H – 1**we need to do

```
DP[K+1][next[j][S]] += P[K+1][S] * DP[K][j]
```

Where values of **P[K][S]** can be easily found by definition in **O(M * 2 ^{N} * N)** time.

The step of finding of all Hall systems with their values **next[T][S]** can be implemented in **O(H * 2 ^{N} * N * log H)** using some efficient associative container or even in

**O(H * 2**using hash table. See Tester’s solution as a reference for such implementation.

^{N}* N)## SETTER’S SOLUTION:

Will be uploaded soon.

## TESTER’S SOLUTION:

Can be found here.

### APPROACH:

The problem tester used the above solution to solve the problem.