SEQUAT2 - Editorial

kevinsogo · November 20, 2016, 9:42pm

PROBLEM LINK:

Author: Istvan Nagy
Tester: Kevin Atienza
Translators: Sergey Kulik (Russian), Team VNOI (Vietnamese) and Hu Zecong (Mandarin)
Editorialist: Kevin Atienza

DIFFICULTY:

Medium-Hard

PREREQUISITES:

Bitwise operations, sieve, factorization

PROBLEM:

Given A, B, C, N, compute all solutions (x,y) to the following equation:

xy = (x\lor y)(x\land y) + Ax + By + C

where \lor and \land are bitwise OR and AND, respectively.

Compute the sum of all x and sum of all y across all solutions (x,y) with 0 \le x, y \le N.

QUICK EXPLANATION:

Let n = x\land y, a = x-n and b = y-n. Then there’s a one-to-one mapping between solutions (x,y) and triples (a,b,n) such that:

a\land b = a\land n = b\land n = 0.
(a-B)(b-A) = (A+B)n+AB+C.

The mapping is given by (x,y) = (a+n,b+n).

Let L = (A+B)N+AB+C. Notice that (a-B)(b-A) \le L, and the smaller of (a-B) and (b-A) is \le \sqrt{L}.

To compute all solutions (a,b,n) such that (a-B) \le (b-A):

Enumerate all possible values of d := a-B up to \sqrt{L}.
Compute all numbers n such that (A+B)n+AB+C is divisible by d.
Let e = \frac{(A+B)n+AB+C}{d}. If e \le d, then try the solution (d+B, e+A, n).

Similarly, to compute all solutions (a,b,n) such that (a-B) > (b-A):

Enumerate all possible values of e := b-A up to \sqrt{L}.
Compute all numbers n such that (A+B)n+AB+C is divisible by e.
Let d = \frac{(A+B)n+AB+C}{e}. If e < d, then try the solution (d+B, e+A, n).

Using this, every solution will be generated exactly once, but you need to check each candidate (a,b,n) if it satisfies all conditions above.

EXPLANATION:

Reparameterization

The equation xy = (x\lor y)(x\land y) + Ax + By + C is very weird because of the bitwise operations. You don’t normally see those, which makes this problem unusual. Nevertheless, bitwise operations still behave somewhat well algebraically, so let’s see what we can do.

The first thing we note is that if two numbers a and b don’t share any bits, then (a\lor b) = a+b and (a\land b) = 0. But what if they don’t share any bits? Then the first one is incorrect, because it fails to take into account the bits shared by a and b. Specifically, these bits are counted twice in a+b but only once in (a\lor b).

But we can easily know which bits two numbers share: it’s simply (a\land b)! Thus, this gives us the following general identity:

(a\lor b) = a+b - (a\land b)

This is great, because it eliminates one nasty bitwise operator! Now we have one more left. Our equation is now:

xy = (x+y - (x\land y))(x\land y) + Ax + By + C

(You’ll notice that there are too many parentheses in these equations. This is because bitwise operations are that uncommon among math equations.)

If we let n = x\land y, then the equation becomes:

xy = (x+y - n)n + Ax + By + C

Now we’re getting somewhere. Notice that the bitwise operations are disappearing.

In fact, let’s take this idea further, and reparameterize the solutions so that our variables don’t have bits in common. Currently, we have the values (x,y,n), but these numbers may have bits in common. So let’s define new variables a = x-n and b = y-n to denote the bits unique to x and y, respectively. This way, for a triple (a,b,n), we have (a\land b) = (a\land n) = (b\land n) = 0 and

(a+n)(b+n) = ((a+n)+(b+n) - n)n + A(a+n) + By + C

Let’s manipulate this further:

\begin{aligned} (a+n)(b+n) &= ((a+n)+(b+n) - n)n + A(a+n) + B(b+n) + C \\\ (a+n)(b+n) &= (a+b+n)n + A(a+n) + B(b+n) + C \\\ ab+an+bn+n^2 &= an+bn+n^2 + A(a+n) + B(b+n) + C \\\ ab &= A(a+n) + B(b+n) + C \end{aligned}

Notice the nice cancellation going on! This is certainly easier to solve.

At this point though, it isn’t clear how to proceed. But if you assume for a while that n is a constant, then we notice that the equation becomes a conic section on the variables a and b, and it turns out that such equations are usually reducible to one of the following:

A linear diophantine equation
Generalized Pell equation
Factorization problem

Let’s see what we got by assuming n is constant and throwing all “constants” to the right side:

\begin{aligned} ab &= A(a+n) + B(b+n) + C\\\ ab - Aa - Bb &= (A+B)n + C \end{aligned}

Here, we can try to “complete the factorization”:

\begin{aligned} ab - Aa - Bb &= (A+B)n + C\\\ ab - Aa - Bb+AB &= (A+B)n + AB + C \\\ (a-B)(b-A) &= (A+B)n + AB + C \end{aligned}

This means we have reduced the problem to factorizing all numbers of the form (A+B)n + AB + C, where n \le N

Enumerating factors

Now we’re really getting somewhere. We want to enumerate the triples (a,b,n) satisfying all the following things:

(a-B)(b-A) = (A+B)n + AB + C.
a+n \le N and b+n \le N.
(a\land b) = (a\land n) = (b\land n) = 0.

The remaining two conditions can just be checked afterwards, so let’s focus on the first, i.e. enumerating factorizations of (A+B)n + AB + C.

This number is always positive, but the factors on the left side may be negative! Thankfully, we can exclude that possibility because if both factors are negative (and a, b \ge 0), then |(a-B)(b-A)| \le |AB| < (A+B)n + AB + C. Thus, we can focus on positive divisors.

Next, notice that the largest possible right-hand-side value is L := (A+B)N + AB + C. This means that the smaller factor among (a-B) and (b-A) is at most the square root of this number. But the square root of this number is a smallish number considering the problem constraints, which makes it feasible for enumeration!

Specifically, assume first that (a-B) \le (b-A). This means that (a-B) \le \sqrt{L}. So let’s enumerate all these possible values d := a-B. The next step is to enumerate all indices n \le N such that d divides (A+B)n + AB + C. Which such indices are there?

If we let P = A+B and Q = AB + C, then the following are equivalent:

\begin{aligned} d &\mid (A+B)n + AB + C \\\ d &\mid Pn + Q \\\ Pn &\equiv -Q \pmod{d} \end{aligned}

Let g = \gcd(P,d). Then if g doesn’t divide Q, this has no solutions. Otherwise, we can continue by letting P' = P/g, Q' = Q/g and d' = d/g:

\begin{aligned} Pn &\equiv -Q \pmod{d} \\\ P'n &\equiv -Q' \pmod{d'} \\\ n &\equiv -Q'(P')^{-1} \pmod{d'} \end{aligned}

This means that it’s easy to “loop” across all valid indices n! The following is an illustration on how this can be done, using a C+±style loop:

Let g = gcd(P,d)
Let x = modInverse(P/g,d/g)
for (n = (-Q'*x) % (d/g); n <= N; n += d/g) {
    ...
}

Having access to d and n, we can now compute a candidate solution (a,b,n) as:

a = B + d (from d = a-B)
b = A + \frac{(A+B)n+AB+C}{d}
n = n

(You should only count this if (a-B) \le (b-A) is true.)

Using this method, we can enumerate all candidate solutions (a,b,n) satisfying (a-B) \le (b-A).

We can enumerate candidates (a,b,n) satisfying the other inequality (b-A) < (a-B) similarly:

b = A + d
b = B + \frac{(A+B)n+AB+C}{d}
n = n

(You should only count this if (b-A) < (a-B) is true.)

Since every solution (a,b,n) determines a unique value (a-B)(b-A), this means each solution will be generated exactly once! We just need to filter out the candidates with the remaining requirements:

a+n \le N and b+n \le N.
(a\land b) = (a\land n) = (b\land n) = 0.

Running time

The question now is, how fast does this solution run?

Notice that for every number d \le \sqrt{L}, the number of candidates we’re considering is at most:

1 + \frac{N}{d/\gcd(A+B,d)}

Thus, the running time is proportional to the following sum:

\sum_{d\le \sqrt{L}} \left(1 + \frac{N}{d/\gcd(A+B,d)}\right)

Which can be simplified to:

\begin{aligned} T(N) &= O\left(\sum_{d\le \sqrt{L}} \left(1 + \frac{N}{d/\gcd(A+B,d)}\right)\right) \\\ &= O\left(\sqrt{L} + \sum_{g \mid A+B} \sum_{\substack{d\le \sqrt{L} \\\ \gcd(A+B,d) = g}} Ng/d\right) \\\ &\le O\left(\sqrt{L} + \sum_{g \mid A+B} \sum_{\substack{d\le \sqrt{L} \\\ g \mid d}} Ng/d\right) \\\ &= O\left(\sqrt{L} + \sum_{g \mid A+B} N\sum_{d\le \sqrt{L}/g} 1/d\right) \\\ &\le O\left(\sqrt{L} + \sum_{g \mid A+B} N\sum_{d\le \sqrt{L}} 1/d\right) \\\ &= O\left(\sqrt{L} + \sum_{g \mid A+B} N \log \sqrt{L}\right) \\\ &= O\left(\sqrt{L} + N \log L \cdot d(A+B)\right) \end{aligned}

where d is the divisor count function.

Thus, the time complexity is O\left(\sqrt{L} + N \log L \cdot d(A+B)\right). Note that this is a pretty loose bound, but it still gives an idea of how fast this runs.

Time Complexity:

O(\sqrt{L}+N \log L \cdot d(A+B)), where L = (A+B)N + AB + C, and d is the divisor count function.

AUTHOR’S AND TESTER’S SOLUTIONS:

setter
tester