GNUM - Editorial

devuy11 · July 14, 2014, 3:03pm

PROBLEM LINK:

Practice
Contest

Author: Lalit Kundu
Tester: Shang Jingbo and Gerald Agapov
Editorialist: Devendra Agarwal

DIFFICULTY:

HARD

PREREQUISITES:

Max-Flow and Maths.

PROBLEM:

You are given two arrays denoted by A₁,A₂…A_N and B₁,B₂…B_N.
Let us maintain two sets S₁ and S₂ which are empty initially. In one move , pick a pair of indexes (i, j) such that it’s already not present in S₁. Also, B_j > A_i and GCD(A_i,B_j) is not 1. Further,pick another pair of indices (p, q) such that it’s already not present in S₂. Also, B_p < A_q and GCD(A_q,B_p) is not 1. Also, GCD(A_q,B_p) should not be coprime to GCD(A_i,B_j). Then add both pair of indices to S₁ and S₂, respectively.

Find the largest number of moves that can be performed?

Explanation

Let us first find (i,j) pairs which satisfy the property that they are unique [ i.e. (i,j) is considered once ] and B_j > A_i and GCD(A_i,B_j) is not 1. Let us store gcd of each pair in vector U.

Let us now find (p,q) pairs which satisfy the property that they are unique [ i.e. (p,q) is considered once ] and B_p < A_q and GCD(A_q,B_p) is not 1. Let us store gcd of each pair in vector V.

You would have realized that we are not interested in the pair (i,j) but we are interested in their gcd.

So , we are given 2 arrays U and V , we need to find pairs ( U[x] , V[y] ) , such that GCD ( U[x] , V[y] ) is not 1 and x is choosen atmost once and similarly y is choosen atmost once. Our target is to maximise such pairs.

The above stated problem is indeed we need to solve. The condition “x is choosen atmost once” ensures that pair (i,j) is not repeated , and similarly for y. The condition “GCD ( U[x] , V[y] ) is not 1” ensures " GCD(A_q,B_p) should not be coprime to GCD(A_i,B_j)" is fulfilled.

This is a classical max-flow problem.Consider the nodes as (U + V + source + sink) and consider adding edges from source to each index of U with capacity 1 ,add edges from x to y if gcd of U[x] and V[y] is not 1 , and edges from each index of V to sink with capacity 1.

Calculating max-flow of this graph will give us the solution.

But Wait we are not done with the solution!

The Above solution is sure to get time out.

Reason
Size of the vector U in the worst case will be 400 * 400 , and the size of the vector V in worst case will be 400 * 400. So total number of edges between U and V can go upto 400 * 400 * 400 * 400 , which is too high.

I will urge readers to read the problem statement once again and make a small observation which will be the key to solve the problem.

Can we do some observation on the line “add an edge from x to y if gcd of U[x] and V[y] is not 1” ? .

Our only aim is to ensure that the gcd is not 1 , so we can introduce some dummy vertex , which will denote prime factor of some number from the array U or V.

What will be the bound on such vertices ?

Such vertices are bounded by 9* ( sizeof(U) + sizeof(V) ). You can quickly realise that the maximum prime factor a number <= 1e9 is 9 ( Reason : 2 * 3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 * 29 > 1e9 ).

How will one use such vertices to fullfill the aim of the problem ?

Let’s consider the graph ( Nodes , Edges ) where Nodes = U + V + source + sink + dummy vertices , and Edges = E₁ + E₂ + E₃ + E₄

E₁ : Edges from source to each index of U with capacity 1 [ same as described above ]
E₂ : Edges from each index of V to sink with capacity 1 [ same as described above ]
E₃ : Edges from index x of vector U to those dummy vertices which divides U[x] with Infinite capacity.
E₄ : Edges from those dummy vertices to index x of vector V , which dividex V[x] with Infinite capacity .

Number of Edges in this graph <= 400 * 400 + 9 * 400 * 400 + 9 * 400 * 400 + 400 * 400
Number of Nodes in this graph <= 1 + 400 * 400 + 9 * 400 * 400 + 9 * 400 * 400 + 400 * 400 + 1

You need to proof that the max-flow in this graph is indeed same as the max-flow in the previous graph.

AUTHOR’S and TESTER’S SOLUTIONS:

Author’s solution
Tester’s solution

mohit2104 · July 14, 2014, 3:15pm

Exact same method … still tle … I used Dinic’s algorithm for max flow . I tried a few times but left it afterwards as i was unsure about the running time complexity of max flow. Even with the dummy nodes my analysis was telling me that it will be a tle. How is the max flow complexity decided ? For me the worst case analysis for the above code would result in tle .

rsaha77 · July 14, 2014, 3:48pm

@mohit2104: Edge capacity of the resulting flow graph is 1. And Dinic has a stronger time bound for networks with unit capacity O(min(V^(2/3), E^(1/2))E). http://en.wikipedia.org/wiki/Dinic’s_algorithm.

I feel complexity part must also be explained properly in the Editorial!

Also Upper bound on number of nodes can be reduced:

Source = 1

Max. no of nodes in (SET I + SET II) = 400*400

Each element can be connected to at most 9 dummy nodes = 9 * 400 * 400

Destination = 1

Total: 1 + 400 * 400 + 9 * 400 * 400 + 1

Same thing can be done with number of edges!

mohit2104 · July 14, 2014, 3:55pm

ooh ! i wish i could have known it an hour before . I thought that its O(V^2*E).

Anyways thanks. I will read about it

caioaao · July 14, 2014, 5:41pm

@mohit2104 oh, you should’ve tried the push-relabel algorithm. I was sure about the number of vertices, so when i TLE’d I changed the max flow from Dinic’s to Push Relabel.

Unfortunatelly, here in codechef it happens a lot (problems that are solvable using max flow, but only if you use push-relabel).
push relabel works in O(V^3) in worst case, but in random graphs the expected complexity is way lower than this. From Stanford’s notebook:

It solves random problems with 10000 vertices and 1000000 edges in a few seconds, though it is possible to construct test cases that achieve the worst-case.

caioaao · July 14, 2014, 6:08pm

The dummy nodes aren’t necessary if you use a better implementation of Max Flow. This is because the number of distinct GCDs won’t be as big as 400*400. I’m not sure about my analysis, but here it goes:
The smallest distinct numbers to achieve the 400*400 distinct GCDs would have to be powers of 2 (the smallest prime). So, we’d need 400 different powers of two in each side: the biggest number in each set would be 2^400 => way bigger than the 1e9 limit, and would give us (400*400)/2, as A[i] < B[i] (first set of pairs) or A[i] > B[i] (second set of pairs).

Important: as I used only distinct GCDs as vertices for U and V sets, the capacity from the source to U vertices and from V to sink was the amount of pairs that had this GCD (easily done using a hashtable/map), so the construction of the graph actually ran in O(N^2 log(N^2)).

My solution got AC using edges from U directly to V and the Push Relabel algorithm: 4252168

elasolova35 · July 14, 2014, 6:41pm

We can greatly reduce the graph complexity by grouping pairs with same gcd on each side. Then the graph will become s to S1, where capacities will be # occurences of same gcds in S1. S1 to S2, again # occurences of gcd with gcd(s1,s2)>1.(where s1 in S1, s2 in S2. Capacity will be the # of occurences of s1. Then finally S2 to sink with capacities as # occurences of same gcds in S2. If you run Dinic’s on this graph instead of the original the max-flow is nearly instant. I think this solution is conceptialy easier that the prime solution.

caioaao · July 14, 2014, 6:46pm

i did this but Dinic TLE’d, had to use push relabel to make it work

studskull · July 14, 2014, 10:42pm

Is there something terribly slow in my push relabel?? I removed duplicate GCD’s and added dummy nodes. Still TLE’d .

void max_flow()
{

int i;
for(i=0;i<l;i++)
{
	h[i] = 0; 
	ef[i] = 0;
}
h[s] = l;
for(i =1; i<=l1; i++)
{
	ef[i] = cap[i][0];
	ef[0] -= cap[i][0];
}
t = l-1;
queue<int> q;
char* lbl = new char[l];
int u,v,m,lk;
for(i=0;i<l;i++)
	lbl[i] = 0;
for(i=1;i<=l1;i++)
{
	q.push(i);
	lbl[i] = 1;
}
while(!q.empty())
{
	u = q.front();
	m = -1;
	lk = e[u].size();
	for(i=0;i<lk;i++)
	{
		v = e[u][i];
		if(cap[u][v] > 0)
		{
			if(h[u]>h[v])
			{
				push(u,v);
				if((lbl[v] == 0) && (v!=0) && (v!=t))
				{
					lbl[v] = 1;
					q.push(v);
				}
			}
			else if(m==-1) m = h[v];
			else
				m = (m>h[v])?h[v]:m;
		}
	}
	if(ef[u])

		h[u] = 1+m;

	else
	{

		lbl[u] = 0;
		q.pop();

	}
}

}

studskull · July 14, 2014, 10:45pm

Just if you need to know… Random data n = 400 gives a graph with about 700 vertices and (i did not check, reading from the editorial) not more than 9*700 = 6300 edges, though in reality I feel the edges would be much less

xellos0 · July 14, 2014, 11:25pm

LOL everyone complains about their sophisticated flow algorithms and here I solve it using Ford-Fulkerson in O(N^4) memory and O(N^6) time per test case.

A[i] can be replaced by product of its prime divisors, similarly for B[i], without changing the result. Quite obvious.

Now, we construct a bipartite graph whose vertices are values of GCD of non-coprime A[i] and B[j] (in the left part: for A[i] < B[j], in the right one: for A[i] > B[j]) and edges of capacity infinity are between vertices corresponding to non-coprime values. We add a souce and sink and edges from the source to the left part / from the right part to the sink, with capacity equal to the number of pairs (A[i],B[j]) with this value of GCD.

Apparently, even if there can be O(N^2) vertices, there are less than V=1000 in worst case, and thus around E= several thousands of edges. We can find maxflow in this graph with an O(FE) Ford-Fulkerson flow algorithm, which is just “repeat BFS till it increases flow” (which is at most F=O(N^2), but it usually increases much faster, it’s reasonable to consider F approximately equal to V for real runtime calculations).

rishul_nsit · July 15, 2014, 12:17am

Can you please give an example? Sounds great but an example would exemplify the simplicity of the solution.

xellos0 · July 15, 2014, 12:36am

A={1,4,6}, B={2,3,8}

GCDs of A[i] < B[j]: GCD=1 3 times, GCD=2 1x, GCD=4 1x

GCDs of A[i] > B[j]: GCD=2 2x, GCD=3 1x

discard GCD=1, our graph without getting rid of prime powers is:

source -(capacity:1)- 2 -(capacity:inf)- 2 --(capacity:2)- sink
\ / /
(capacity:1)- 4 -(capacity:inf)/ 3 -(capacity:1)/

flow is 2 in here, corresponding to picking (4,8) (6,2) and (6,8) (4,2)

but when we get rid of powers, we have

source -(capacity:2)- 2 -(capacity:inf)- 2 -(capacity:2)- sink -(capacity:1)- 3

(it’s hard to draw it here)

caioaao · July 15, 2014, 2:09am

Well, that’s really good. too bad I didn’t come up with it before switching from Dinic to push relabel… Got AC using Dinic and discarting prime powers now

-OFF: thanks a lot for your comments, here and in codeforces! it’s helping me a lot in my studies (specially in codeforces, as the editorials are quite incomplete sometimes)

beroul · July 24, 2014, 4:10am

I use ford-fulkerson but get TLE in Java…
grmmphh!!!