Problem
Given 2 \leq n \leq 10^{500} and the value of the Euler totient function \varphi(n), print the prime factorization of n.
Small solution
In the smallest case you can easily succeed by just factoring n directly. The number is well within the range of 64-bit integers, and something like Pollard's rho algorithm should work fine.
Large solution
For large cases you'll be facing two main issues. How do I even handle these numbers? How does knowing the totient help in the factorization of n?
A simple answer to the first question is to use a language with built-in bigints such as Python, or painstakingly implement a decently high-performance bigint library yourself. I would recommend the former.
The second question comes down to a neat number theory theorem and some cleverness. Euler's theorem states that if n and a are coprime then

a^{\varphi(n)} \equiv 1 \pmod{n}.
Why is this useful? Let's say we pick a random a. If a^{\varphi(n)} \not\equiv 1 \pmod{n}, then a and n cannot be coprime, so \gcd(a,n) is a non-trivial factor of n — but that's very unlikely. What if the equivalence holds? Can we do something clever?
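Euler's theorem is easy to sanity-check numerically. A tiny sketch with made-up values (n = 15, so \varphi(15) = 2 \cdot 4 = 8), covering both branches:

```python
import math

# Sanity check of Euler's theorem with made-up values:
# n = 15 = 3 * 5, so phi(15) = phi(3) * phi(5) = 2 * 4 = 8.
n, phi = 15, 8

for a in range(2, n):
    if math.gcd(a, n) == 1:
        # Coprime base: Euler's theorem forces a^phi ≡ 1 (mod n).
        assert pow(a, phi, n) == 1
    else:
        # Otherwise gcd(a, n) is already a non-trivial factor.
        assert 1 < math.gcd(a, n) < n
```

Python's three-argument `pow` does modular exponentiation by squaring, which is what makes all the power computations below feasible at this scale.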
Cleverness
The end goal here is to find p, q \not\equiv 0 \pmod{n} such that pq \equiv 0 \pmod{n}; then \gcd(p,n) and \gcd(q,n) are non-trivial factors of n. Specifically we'll be looking for x \not\equiv \pm 1 \pmod{n} such that x^2 \equiv 1 \pmod{n}, since that can be rewritten as (x-1)(x+1) \equiv 0 \pmod{n}.
Rewrite \varphi(n) = 2^K t such that 2 does not divide t. We know that the equivalence holds for a^{2^K t}, but consider the smallest k \leq K such that it still holds. For n > 2 we have that \varphi(n) is even, so k \geq 1, and hence (letting q = 2^{k-1} t) we can write this as

(a^q - 1)(a^q + 1) = a^{2^k t} - 1 \equiv 0 \pmod{n}.
We're very close. We know that the equivalence did not hold for a^q (since we picked the smallest k that worked), so the first factor a^q - 1 is not divisible by n. We can't guarantee that the second factor a^q + 1 is not divisible by n, but that's easily checked and quite likely. If everything checks out, \gcd(a^q - 1, n) and \gcd(a^q + 1, n) will be non-trivial factors of n.
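Here is a sketch of that step in Python — my own code and names (`split_with_phi` is hypothetical), assuming n is odd and not a prime power; the iteration limit is arbitrary:

```python
import math
import random

def split_with_phi(n, phi, iters=64):
    """Try to split n into two non-trivial factors using phi(n).
    Assumes n is odd and not a prime power. Returns (p, q) with
    p * q == n, or None if every base failed."""
    # Write phi = 2^K * t with t odd.
    t, K = phi, 0
    while t % 2 == 0:
        t //= 2
        K += 1
    for _ in range(iters):
        a = random.randrange(2, n - 1)
        g = math.gcd(a, n)
        if g > 1:
            # Lucky: the base itself shares a factor with n.
            return g, n // g
        x = pow(a, t, n)
        if x == 1 or x == n - 1:
            continue  # this base never passes through a useful root
        # Square until we hit 1; the previous value x is then a
        # square root of 1 mod n other than 1 (and hopefully not -1).
        for _ in range(K):
            y = pow(x, 2, n)
            if y == 1:
                p = math.gcd(x - 1, n)
                if 1 < p < n:
                    return p, n // p
                break
            if y == n - 1:
                break  # bad luck: we only hit the trivial root -1
            x = y
    return None
```

For example, `split_with_phi(15, 8)` returns a pair multiplying to 15 (either `(3, 5)` or `(5, 3)` depending on the base drawn).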
So extracting factors of n boils down to trying new bases a until we either find a factor or decide that we probably can't reduce things further. Fortunately, we're quite likely to end up in the good case.
Some gotchas
The above does not work for even n or perfect powers (for an odd prime power p^k the only square roots of 1 modulo p^k are \pm 1, so no useful x exists). This is easily solved by extracting all twos at the beginning and checking whether we have a perfect power before trying to find an additional factor (this can be done by iterating over possible exponents and binary searching for the base).
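The perfect-power check can be sketched like this — a plain version of my own, trying each candidate exponent and binary-searching for the base, without any initial-estimate tricks:

```python
def is_perfect_power(n):
    """Return (a, k) with a**k == n and k >= 2, or None.
    For each candidate exponent k we binary search for the base;
    with big integers each power computation costs extra log factors."""
    for k in range(2, n.bit_length() + 1):
        # The base of an n = a^k satisfies a < 2^(bit_length/k + 1).
        lo, hi = 2, 1 << (n.bit_length() // k + 1)
        while lo <= hi:
            mid = (lo + hi) // 2
            p = mid ** k
            if p == n:
                return mid, k
            if p < n:
                lo = mid + 1
            else:
                hi = mid - 1
    return None
```

Note that the smallest exponent is found first, so a number like 3^{10} is reported as (3^5)^2.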
Summary
Extract any twos from n immediately.
def factor(n):
    if is_power(n):
        # number is a^k
        return k lots of factor(a)
    for some number of iterations:
        pick a base and try finding a factor as explained earlier
        if found p, q such that p * q == n:
            return factor(p) + factor(q)
    else:
        # Probably prime
        return {}

factor(n)
Time complexity
The time complexity is not at all trivial to derive, and I won't even try to do it properly. I will, however, give some insight into where the costs arise and why the limits are as they are.
(Please inform me if I’ve done something horrendous here and I’ll try to amend it, I threw this part together late at night.)
The computational cost for my implementation can be separated into a few parts.
factor
In the rough pseudo-code the factor method will be called at most O(\log n) times (proportional to the number of prime factors). A call factor(k) has two parts.
power
Finding the power (if one exists) was done using \log(k) binary searches, each of which computes powers internally. I kept the cost of the binary searches down with a decent initial estimate, but even ignoring that we're still dealing with something like \log(\cdot)^3 (one factor for the number of binary searches, one for exponentiation by squaring, one for multiplication of numbers at this scale).
This is where the limit in the problem statement matters: costs involving small polynomials in \log(\cdot) must be feasible to compute.
looking for a factor
Looking for a factor is in itself not that expensive: roughly O((\text{number of factors of } 2 \text{ in } \varphi(n)) \times \log(\cdot)^2). Again, the square is due to calculating powers and dealing with large numbers.
This cost is, however, incurred repeatedly until either a factor is found or a user-defined iteration limit L is hit. If this limit is too low you risk wrong answers, because you fail to decompose a number that actually was composite. If it is set too high you'll TLE because you're checking too much.
The actual cost of the whole process depends on how likely you are to find a good base, the number of factors in the number, and probably more costs I've forgotten about.
Assuming that we don't have to do many iterations before finding a factor, the complexity is something like O(L \times (\text{number of factors of } 2 \text{ in } \varphi(n)) \times \log(\cdot)^2).
Overall
We're probably looking at something like O(T L \log(\cdot)^3) or O(T L \log(\cdot)^4). The log part is a mix of different logs, some of which are considerably smaller than n. As mentioned, a detailed analysis is rather hard.