EXGCD - Editorial

Problem Link:





Number Theory: Principle of Inclusion-Exclusion, Moebius Function, Euler’s Totient Function, Modular Arithmetic


You are given K intervals [A0, B0], [A1, B1], …, [AK-1, BK-1]. From each interval i, you randomly pick a number Xi. What is the expected value of the GCD of the picked numbers (X0, X1, …, XK-1)? If the answer is expressed as a fraction P/Q, then output an integer N in [0, MOD-1] (MOD = 10^9+7) such that P + Q*N is divisible by MOD.

Preliminary Definitions / Lemmas etc:

You may skip whichever you’re familiar with.

Euler’s phi function. Phi(N) = |{n: n <= N and n co-prime to N, i.e. gcd(n, N) = 1}|

Formula for phi(N):
if N = ∏(piai) (after prime factorization), then
phi(N) = ∏ piai-1*(pi-1)

Lemma: ∑d|N (phi(d)) = N for all N.

A simple proof of the above:

Consider N fractions : 1/N, 2/N, …, N/N.

Rewrite these in reduced form: 1/N, …, 1/1.

Now, for each d|N, there will be phi(d) of the above fractions having denominator = d.

Hence the lemma follows, by counting in two ways, the same N fractions.

Moebius function mu(N):

mu(N) = 0 if p2 | N for some prime p (i.e. if N is not “square-free”)

else (-1)^(#primes dividing N) otherwise.

Hence, both phi and mu can be calculated efficiently using the Sieve of Eratosthenes.

Lemma: Moebius inversion formula:

d|N (d * mu(N/d)) = phi(N).

Lemma: gcd(x1, x2, x3, …, xn) = g <=> gcd(x1/g, x2/g, x3/g, …, xn/g) = 1.

This can be seen by taking prime factorizations of xi’s and comparing LHS and RHS of both equations.

Inverse modulo prime:

Given a prime number ‘M’, to every number x in the range [1, M-1], you have a unique number y, denoted by x-1, such that x * y = 1 (mod M).

Given prime M, and value x, x-1 can be efficiently found using either of the following methods:

  1. Euclidean algorithm: given x and M, find y and n such that x * y + M * n = 1. Now, by taking modulo M on both sides, we get that x-1 = y (mod M)
  2. Fermat’s little theorem: for any x not divisible by prime M, xM-1 = 1 (mod M). This shows that xM-2 = x-1 (mod M). Further xM-2 is calculated efficiently by repeated squaring in log(M) time.

Quick Explanation:

We try to find P = sum of gcd(x0, x1, …, xK-1) over all possible values of xi.

Also find Q = number of tuples (x0, x1, …, xK-1) = product(Bi - (Ai - 1)).

With such P and Q, output N = -P * Q-1 (mod 10^9+7).

The question boils down to finding P. Using principle of inclusion-exclusion (and concept of moebius function), get that P = ∑R, m, m*R <= min(Bi)(R * mu(m) * ∏0<=i < K (floor(B[i]/(m * R)) - floor((A[i]-1)/(m * R))

This is equivalent (using moebius inversion formula) to P = ∑Y <= min(B_i) (phi(Y) * ∏0<=i < K (floor(B[i]/Y) - floor((A[i]-1)/Y)).

Detailed Explanation:

Consider a K-dimensional array (lets call it GCD) where the value in the (x0, x1, x2, …, xK-1)'th cell is gcd(x0, x1, x2, …, xK-1). We are interested in finding the expected value of GCD[x0][x1]…[xK-1] over all possible values of x0, x1, …, xK-1. Since all xi’s are picked randomly, the expected value is equal to the arithmetic mean. Finally, this question is equivalent to finding the sum of the values in this K-dimensional GCD array, since the mean is then just dividing this sum by the number of elements.

In fact, looking at P/Q, we can get P = sum of elements, and Q = number of elements. Let MOD = 10^9 + 7. We need to find N such that P + Q * N is divisible by MOD. In other words, P + Q * N = 0 (mod MOD). Since MOD is prime, we can find “Q-1” modulo MOD, which gives us that N = -P * Q-1 (mod MOD). Hence, finding P = sum of elements in GCD array modulo MOD, and Q = number of elements = ∏ (Bi - (Ai-1)) (modulo MOD), we would have enough information to get N. Here, Q can be calculated in O(K) time per test-case.

P is calculated in the following manner.
Firstly, let us try to solve the following intermediate question:
Given any arbitrary set of K intervals (not necessarily the Ai, Bi as desired in the question), denoted by (Ui, Vi] (intervals exclusive of Ui, inclusive of Vi), let us ask: How many points are there in the corresponding K-dim space : i’th coordinate lies in (Ui, Vi], whose gcd(point’s coordinates) is 1. Let us call this set of points as C(U, V). The answer to our intermediate question is now |C(U, V)|, where U and V specify the intervals’ endpoints we are interested in.

This helps us because

the number of points in (Ai - 1, Bi] intervals whose coordinates’ gcd is R (for arbitrary R) =

the number of points in (floor((Ai-1)/R), floor(Bi/R)] whose coordinates’ gcd is 1.

The above follows from the fact that gcd(x0, x1, x2, …, xK-1) = R <=> gcd(x0/R, x1/R, x2/R, …, xK-1/R) = 1. And the fact that

Ai - 1 < xi <= Bi <=> floor((Ai - 1)/R) < xi/R <= floor(Bi/R)

Thus, P = required sum = ∑ R > 0 (R * |C(floor((Ai-1)/R), floor(Bi/R))|)

We now proceed to calculate |C(U, V)|. Qualitatively speaking, from the universal set of points ARR(U, V), we remove those whose coordinates’ gcd > 1, to get C(U, V).

Let ARR(U, V) = {(x0, x1, …, xK-1) : U[i] < x_i <= V[i]}.

Let G(U, V, D) = {(x0, x1, …, xK-1) : D | gcd(x0, x1, …, xK-1), U[i] < xi <= V[i]}

From the above 2 definitions, it is easily seen that

C(U, V) = ARR(U, V) \ UNIOND > 1 G(U, V, D)
: from the set of all possible points, remove points whose gcd is divisible by any D > 1.

In fact, we need consider only D as being prime.

Indeed, UNION D > 1 G(U, V, D) = UNION prime D G(U, V, D) since if the gcd of a point is D (> 1), then some prime p|D and that point belongs to the set G(U, V, p). Similarly if p|gcd(point’s coordinates), then that point belongs to the set G(U, V, p). Also, we only consider prime D <= min(V) since for prime D > min(V), G(U, V, D) is ∅.

Now, using the fact that if co-prime m and n are such that m|x and n|x, it is equivalent to (m * n)|x, we get that
given co-prime D1, and D2: G(U, V, D1) ∩ G(U, V, D2) = G(U, V, D1*D2).

The good thing about the set G(U, V, D) is that it is easy to count its size. In fact, |G(U, V, D)| = ∏ (floor(V[i]/D) - floor(U[i]/D)).

This is because (x0, x1, …, xK-1) ∈ G(U, V, D)
<=> D|xi, and Ui < xi <= Vi.

|{x : (D|x and x <= n)}| = floor(n/D), which when applied above gives

|{xi : D|xi and Ui < xi <= Vi}| = |{xi : D|xi and xi <= Vi} \ {xi: D|xi and xi <= Ui}| = |{xi : D|xi and xi <= Vi}| - |{xi: D|xi and xi <= Ui}| = floor(Vi/D) - floor (Ui/D).
Since xi’s are chosen independently of other xi’s, |G(U, V, D)| = ∏(floor(Vi/D) - floor(Ui/D)).

And now we are well equipped to apply Principle of Inclusion-Exclusion which gives us:
|C(U, V)| = ∑M = set of distinct primes (-1)^|M| * |G(U, V, ∏ p ∈ M p)|)
= ∑M = set of distinct primes (-1)^|M| * (floor(Vi/∏ p ∈ M p) - floor(Ui/∏ p ∈ M p))

This gives us our first possible algorithm.

P = 0
For R from 1 to N(= min(B<sub>0</sub>, B<sub>1</sub>, …, B<sub>K-1</sub>)
	V<sub>i</sub> = B<sub>i</sub>/R, U<sub>i</sub> = (A<sub>i</sub>-1)/R
	for all m = product of distinct primes s.t. m <= min(floor(N/R))
		P += R * (-1)^(number of distinct primes in m) * ∏(floor(V<sub>i</sub>/m) - floor(U<sub>i</sub>/m))

The above algorithm takes time O(∑(floor(N/j)) * K) per test-case, which is O(N * logN * K) per testcase. This is too slow however. It can be sped up by noticing that product(floor(Vi/m) - floor(Ui/m)) = product (floor(Bi/(m * R)) - floor((Ai-1)/(m * R))). This product can be calculated once for all m * R product pairs in O(NK) time. This now gives us an algorithm which is O(N * logN + N * K) per test-case. This is fast enough to pass.

Also, note that in the above, (-1)^(number of distinct primes in m) = mu(m) where m = product of distinct primes. Also, since mu(m) = 0 if m is not the product distinct primes, the above can also be written as
P = ∑ m, R s.t. m * R <= N (R * mu(m) * ∏(floor(Bi/(m * R)) - floor((Ai-1)/(m * R)))

mu(m) can be precomputed for all m <= 2*10^5 using Sieve.

Also, if you rewrite m * R = Y, then you can write ∑(R * mu(Y/R)) as coeff(Y), giving P = sum ( coeff(Y) * ∏(floor(Bi]/Y) - floor((Ai-1)/Y))).

Using the moebius inversion formula, you get that coeff(Y) is actually phi(Y). So using sieve, we can precompute in time O(N * log log N) the values of phi(i) for all i <= 2*10^5. This gives us a O(N * log log N + T * N * K) time algorithm.

Alternate approach:

This approach gives us directly P = ∑ ( phi(Y) * ∏(floor(Bi/Y) - floor((Ai-1)/Y))) using some clever symbolic manipulation.
Here, we use that n = ∑ d|n (phi(d)).

sum(over Ai <= xi <= Bi) GCD(x0, x1, x2, …, xK-1)

= sum(over Ai <= xi <= Bi) of (sum(over d | GCD(x0, x1, x2, …, xK-1)) phi(d)

= sum(over Ai <= xi <= Bi) of (sum(over d | all of x0, x1, x2, …, xK-1)) phi(d)

= sum(over d <= min(Bi)) of(sum (over d | x0, and d | x1, and d | x2, …, and d | xK-1 and Ai <= xi <= Bi)) phi(d)

= sum(over d <= min(Bi)) of ∏(floor (Bi/d) - floor((Ai-1)/d)) * phi(d)

Author’s Solution:

Can be found here

Tester’s Solution:

Can be found here


Alternate approach is really smart. Also I think the fact that q is not divisible by MOD is implicitly used in solution, if MOD was less than 200000 BigIntegers would be probably needed.

Yes, that is implicitly used. However, also note that if MOD was less than 200000, then there is a chance that we would not be able to find any N having the P + Q*N divisible by MOD.

Gassa gave another solution as follows:

"First calculate the number of tuples where each element (and therefore GCD) is divisible by 1,2,3,…,N = 200,000.

After that, the number of tuples with GCD being exactly 1,2,3,…,N can be found in an NLogN pass."

@anujkaliaiitd In fact, almost all contestants come up with this “another solution”. If nobody will describe it I will probably do this when I have time. In a few words the phrase “can be found in an NLogN pass” means dp of the form
dp[k] = cnt[k] - dp[2 * k] - dp[3 * k] - ... - dp[[N/k] * k]
where cnt[k] is the number of tuples from the first sentence.

1 Like

The alternate approach can be optimized by going only through values where value of B[i] / d or (A[i] - 1) / d changes.

Getting complexity O(sqrt(min B_i) * K^2) per test case.

1 Like

got it now

plz explain this “another approach” in detail


Took me some time, but finally got how the alternative approach works. I saw the lemma before but this is the first time I saw its application. Really beautiful method to make solution faster by ln(n).