EXGCD - Editorial

pragrame · December 24, 2012, 1:30am

Problem Link:

Difficulty:

Easy-Medium

Pre-requisites:

Number Theory: Principle of Inclusion-Exclusion, Moebius Function, Euler’s Totient Function, Modular Arithmetic

Problem:

You are given K intervals [A₀, B₀], [A₁, B₁], …, [A_K-1, B_K-1]. From each interval i, you randomly pick a number X_i. What is the expected value of the GCD of the picked numbers (X₀, X₁, …, X_K-1)? If the answer is expressed as a fraction P/Q, then output an integer N in [0, MOD-1] (MOD = 10^9+7) such that P + Q*N is divisible by MOD.

Preliminary Definitions / Lemmas etc:

You may skip whichever you’re familiar with.

Euler’s phi function. Phi(N) = |{n: n <= N and n co-prime to N, i.e. gcd(n, N) = 1}|

Formula for phi(N):
if N = ∏(p_i^a_i) (after prime factorization), then
phi(N) = ∏ p_i^a_i-1*(p_i-1)

Lemma: ∑_d|N (phi(d)) = N for all N.

A simple proof of the above:

Consider N fractions : 1/N, 2/N, …, N/N.

Rewrite these in reduced form: 1/N, …, 1/1.

Now, for each d|N, there will be phi(d) of the above fractions having denominator = d.

Hence the lemma follows, by counting in two ways, the same N fractions.

Moebius function mu(N):

mu(N) = 0 if p² | N for some prime p (i.e. if N is not “square-free”)

else (-1)^(#primes dividing N) otherwise.

Hence, both phi and mu can be calculated efficiently using the Sieve of Eratosthenes.

Lemma: Moebius inversion formula:

∑ _d|N (d * mu(N/d)) = phi(N).

Lemma: gcd(x1, x2, x3, …, xn) = g <=> gcd(x1/g, x2/g, x3/g, …, xn/g) = 1.

This can be seen by taking prime factorizations of xi’s and comparing LHS and RHS of both equations.

Inverse modulo prime:

Given a prime number ‘M’, to every number x in the range [1, M-1], you have a unique number y, denoted by x^-1, such that x * y = 1 (mod M).

Given prime M, and value x, x^-1 can be efficiently found using either of the following methods:

Euclidean algorithm: given x and M, find y and n such that x * y + M * n = 1. Now, by taking modulo M on both sides, we get that x^-1 = y (mod M)
Fermat’s little theorem: for any x not divisible by prime M, x^M-1 = 1 (mod M). This shows that x^M-2 = x^-1 (mod M). Further x^M-2 is calculated efficiently by repeated squaring in log(M) time.

Quick Explanation:

We try to find P = sum of gcd(x₀, x₁, …, x_K-1) over all possible values of x_i.

Also find Q = number of tuples (x₀, x₁, …, x_K-1) = product(B_i - (A_i - 1)).

With such P and Q, output N = -P * Q^-1 (mod 10^9+7).

The question boils down to finding P. Using principle of inclusion-exclusion (and concept of moebius function), get that P = ∑_{R, m, m*R <= min(B_i)}(R * mu(m) * ∏_{0<=i < K}(floor(B[i]/(m * R)) - floor((A[i]-1)/(m * R))

This is equivalent (using moebius inversion formula) to P = ∑_{Y <= min(B_i)} (phi(Y) * ∏_{0<=i < K}(floor(B[i]/Y) - floor((A[i]-1)/Y)).

Detailed Explanation:

Consider a K-dimensional array (lets call it GCD) where the value in the (x₀, x₁, x₂, …, x_K-1)'th cell is gcd(x₀, x₁, x₂, …, x_K-1). We are interested in finding the expected value of GCD[x₀][x₁]…[x_K-1] over all possible values of x₀, x₁, …, x_K-1. Since all x_i’s are picked randomly, the expected value is equal to the arithmetic mean. Finally, this question is equivalent to finding the sum of the values in this K-dimensional GCD array, since the mean is then just dividing this sum by the number of elements.

In fact, looking at P/Q, we can get P = sum of elements, and Q = number of elements. Let MOD = 10^9 + 7. We need to find N such that P + Q * N is divisible by MOD. In other words, P + Q * N = 0 (mod MOD). Since MOD is prime, we can find “Q^-1” modulo MOD, which gives us that N = -P * Q^-1 (mod MOD). Hence, finding P = sum of elements in GCD array modulo MOD, and Q = number of elements = ∏ (B_i - (A_i-1)) (modulo MOD), we would have enough information to get N. Here, Q can be calculated in O(K) time per test-case.

P is calculated in the following manner.
Firstly, let us try to solve the following intermediate question:
Given any arbitrary set of K intervals (not necessarily the A_i, B_i as desired in the question), denoted by (U_i, V_i] (intervals exclusive of U_i, inclusive of V_i), let us ask: How many points are there in the corresponding K-dim space : i’th coordinate lies in (U_i, V_i], whose gcd(point’s coordinates) is 1. Let us call this set of points as C(U, V). The answer to our intermediate question is now |C(U, V)|, where U and V specify the intervals’ endpoints we are interested in.

This helps us because

the number of points in (A_i - 1, B_i] intervals whose coordinates’ gcd is R (for arbitrary R) =

the number of points in (floor((A_i-1)/R), floor(B_i/R)] whose coordinates’ gcd is 1.

The above follows from the fact that gcd(x₀, x₁, x₂, …, x_K-1) = R <=> gcd(x₀/R, x₁/R, x₂/R, …, x_K-1/R) = 1. And the fact that

A_i - 1 < x_i <= B_i <=> floor((A_i - 1)/R) < x_i/R <= floor(B_i/R)

Thus, P = required sum = ∑ _{R > 0} (R * |C(floor((A_i-1)/R), floor(B_i/R))|)

We now proceed to calculate |C(U, V)|. Qualitatively speaking, from the universal set of points ARR(U, V), we remove those whose coordinates’ gcd > 1, to get C(U, V).

Let ARR(U, V) = {(x₀, x₁, …, x_K-1) : U[i] < x_i <= V[i]}.

Let G(U, V, D) = {(x₀, x₁, …, x_K-1) : D | gcd(x₀, x₁, …, x_K-1), U[i] < xi <= V[i]}

From the above 2 definitions, it is easily seen that

C(U, V) = ARR(U, V) \ UNION_{D > 1} G(U, V, D)
: from the set of all possible points, remove points whose gcd is divisible by any D > 1.

In fact, we need consider only D as being prime.

Indeed, UNION _{D > 1} G(U, V, D) = UNION _{prime D} G(U, V, D) since if the gcd of a point is D (> 1), then some prime p|D and that point belongs to the set G(U, V, p). Similarly if p|gcd(point’s coordinates), then that point belongs to the set G(U, V, p). Also, we only consider prime D <= min(V) since for prime D > min(V), G(U, V, D) is ∅.

Now, using the fact that if co-prime m and n are such that m|x and n|x, it is equivalent to (m * n)|x, we get that
given co-prime D1, and D2: G(U, V, D1) ∩ G(U, V, D2) = G(U, V, D1*D2).

The good thing about the set G(U, V, D) is that it is easy to count its size. In fact, |G(U, V, D)| = ∏ (floor(V[i]/D) - floor(U[i]/D)).

This is because (x₀, x₁, …, x_K-1) ∈ G(U, V, D)
<=> D|x_i, and U_i < x_i <= V_i.

|{x : (D|x and x <= n)}| = floor(n/D), which when applied above gives

|{x_i : D|x_i and U_i < x_i <= V_i}| = |{x_i : D|x_i and x_i <= V_i} \ {x_i: D|x_i and x_i <= U_i}| = |{x_i : D|x_i and x_i <= V_i}| - |{x_i: D|x_i and x_i <= U_i}| = floor(V_i/D) - floor (U_i/D).
Since x_i’s are chosen independently of other x_i’s, |G(U, V, D)| = ∏(floor(V_i/D) - floor(U_i/D)).

And now we are well equipped to apply Principle of Inclusion-Exclusion which gives us:
|C(U, V)| = ∑_{M = set of distinct primes} (-1)^|M| * |G(U, V, ∏ _{p ∈ M} p)|)
= ∑_{M = set of distinct primes} (-1)^|M| * (floor(V_i/∏ _{p ∈ M} p) - floor(U_i/∏ _{p ∈ M} p))

This gives us our first possible algorithm.


P = 0
For R from 1 to N(= min(B<sub>0</sub>, B<sub>1</sub>, …, B<sub>K-1</sub>)
	V<sub>i</sub> = B<sub>i</sub>/R, U<sub>i</sub> = (A<sub>i</sub>-1)/R
	for all m = product of distinct primes s.t. m <= min(floor(N/R))
		P += R * (-1)^(number of distinct primes in m) * ∏(floor(V<sub>i</sub>/m) - floor(U<sub>i</sub>/m))

The above algorithm takes time O(∑(floor(N/j)) * K) per test-case, which is O(N * logN * K) per testcase. This is too slow however. It can be sped up by noticing that product(floor(V_i/m) - floor(U_i/m)) = product (floor(B_i/(m * R)) - floor((A_i-1)/(m * R))). This product can be calculated once for all m * R product pairs in O(NK) time. This now gives us an algorithm which is O(N * logN + N * K) per test-case. This is fast enough to pass.

Also, note that in the above, (-1)^(number of distinct primes in m) = mu(m) where m = product of distinct primes. Also, since mu(m) = 0 if m is not the product distinct primes, the above can also be written as
P = ∑ _{m, R s.t. m * R <= N} (R * mu(m) * ∏(floor(B_i/(m * R)) - floor((A_i-1)/(m * R)))

mu(m) can be precomputed for all m <= 2*10^5 using Sieve.

Also, if you rewrite m * R = Y, then you can write ∑(R * mu(Y/R)) as coeff(Y), giving P = sum ( coeff(Y) * ∏(floor(B_i]/Y) - floor((A_i-1)/Y))).

Using the moebius inversion formula, you get that coeff(Y) is actually phi(Y). So using sieve, we can precompute in time O(N * log log N) the values of phi(i) for all i <= 2*10^5. This gives us a O(N * log log N + T * N * K) time algorithm.

Alternate approach:

This approach gives us directly P = ∑ ( phi(Y) * ∏(floor(B_i/Y) - floor((A_i-1)/Y))) using some clever symbolic manipulation.
Here, we use that n = ∑ _d|n (phi(d)).

sum(over A_i <= x_i <= B_i) GCD(x₀, x₁, x₂, …, x_K-1)

= sum(over A_i <= x_i <= B_i) of (sum(over d | GCD(x₀, x₁, x₂, …, x_K-1)) phi(d)

= sum(over A_i <= x_i <= B_i) of (sum(over d | all of x₀, x₁, x₂, …, x_K-1)) phi(d)

= sum(over d <= min(B_i)) of(sum (over d | x₀, and d | x₁, and d | x₂, …, and d | x_K-1 and A_i <= x_i <= B_i)) phi(d)

= sum(over d <= min(B_i)) of ∏(floor (B_i/d) - floor((A_i-1)/d)) * phi(d)

Author’s Solution:

Can be found here

Tester’s Solution:

Can be found here

mmaxio · December 24, 2012, 2:04am

Alternate approach is really smart. Also I think the fact that q is not divisible by MOD is implicitly used in solution, if MOD was less than 200000 BigIntegers would be probably needed.

pragrame · December 24, 2012, 2:11am

Yes, that is implicitly used. However, also note that if MOD was less than 200000, then there is a chance that we would not be able to find any N having the P + Q*N divisible by MOD.

anujkaliaiitd · December 24, 2012, 2:43am

Gassa gave another solution as follows:

"First calculate the number of tuples where each element (and therefore GCD) is divisible by 1,2,3,…,N = 200,000.

After that, the number of tuples with GCD being exactly 1,2,3,…,N can be found in an NLogN pass."

anton_lunyov · December 24, 2012, 5:05am

@anujkaliaiitd In fact, almost all contestants come up with this “another solution”. If nobody will describe it I will probably do this when I have time. In a few words the phrase “can be found in an NLogN pass” means dp of the form
dp[k] = cnt[k] - dp[2 * k] - dp[3 * k] - ... - dp[[N/k] * k]
where cnt[k] is the number of tuples from the first sentence.

MarioYC · December 24, 2012, 9:14am

The alternate approach can be optimized by going only through values where value of B[i] / d or (A[i] - 1) / d changes.

Getting complexity O(sqrt(min B_i) * K^2) per test case.

tornike4 · December 24, 2012, 3:24pm

got it now

the_c0der · January 3, 2013, 7:56pm

plz explain this “another approach” in detail

forthright · August 11, 2015, 7:05pm

Took me some time, but finally got how the alternative approach works. I saw the lemma before but this is the first time I saw its application. Really beautiful method to make solution faster by ln(n).