PROBLEM LINK:
Setter- Rajat De
Tester- Teja Vardhan Reddy
Editorialist- Abhishek Pandey
DIFFICULTY:
HARD
PRE-REQUISITES:
Mathematics, Randomized Solutions, Euler’s Totient Function and Carmichael’s function (please see relation between both of these ), Euler’s Theorem
PROBLEM:
Given N and \phi(N), factorize N, where N can be \le 10^{500}.
QUICK-EXPLANATION:
Key to AC- Realizing that you have to use randomization and exploring the properties of above-mentioned functions in this regard.
Strip N of any powers of 2 which might be present. Now, perform the following steps-
- Check if N is prime. Setter did that by simulating algorithm of step 3 \approx 30 times. If N couldn’t be factored even after that he concluded it with good accuracy that its prime.
- Check if N is power of prime via finding N'th root.
- Take a random number in between [2,n-1]. Call if x. Now check gcd(N,x), if its >1 then we found a factor by pure luck - not very likely. If its 1, then x and N are coprime. Use x^{\phi(N)}\equiv 1 \space mod \space N. Now, write \phi(N)=2^K*t and find smallest k \le K such that our equivalence holds ,i.e x^{2^k*t} \equiv 1 \space mod \space N (We want to find k \le K) . Since \phi(N) is even for all numbers greater than 2, we know that K \ge 1. Now, if found, then we get that x^{2^k*t} \equiv 1 \space mod \space N. Let a=x^{2^{k-1}*t}. We get a^2 \equiv 1 \space mod \space N \implies (a^2-1) \equiv 0 \space mod \space N \implies (a-1)*(a+1)\equiv 0 \space mod \space N. Check for gcd(a\pm 1,N) and proceed to factorise likewise.
EXPLANATION:
I will first provide “pre-readings” which would give you a fair idea of what the solution is going to look like. Dont worry if things dont seem familiar in first look, just have a read to see what is to be expected.
- This answer by spam…@nil.nil .
- Unofficial editorial by @algmyr (dont forget to thank and upvote!)
- The pre-requisite links should be visited to know relation between Euler’s Totient and Carmichael’s function.
With this, lets begin with the solution.
Full Solution-
a. Such Large input, Very Much Wow-
The very first thing which intrigues us is, that N \le 10^{500}, and \phi(N) is of around that value as well. So…what now? C++? Better no!
We will be doing LOT of computations in this problem, including fast exponentiation, finding N'th root, finding gcd etc. Its advisable to use a language with in-built support for such big numbers. Python, Pypy and then JAVA were three popular languages used for this, in decreasing order of popularity.
First I will tell the general algorithm on what to do, and then tell its counterpoints or weak points, and how to cover them.
b. General Algorithm-
We have a N which has nothing special, its just any random N which setter threw upon us. Now, we know its \phi(N), and thus know k \lambda(N) where \lambda(N) is Carmichael’s function. We will be making use of Carmichael’s function mostly.
Now, by good Euler’s theorem we know that-
a^{\phi(N)} \equiv 1 \ mod \ N provided our a and N are co-prime.
Now, lets have a look at Carmichael’s function as well. It states that-
\lambda(N) \ is \ defined \ as \ smallest \ m \ such \ that \ -
a^m \equiv 1 \ mod \ N for every positive integer between 1 and N co-prime to N.
Also, notice that the relation between \phi(N) and \lambda(N) is that \phi(N) is always a multiple of \lambda(N). Naturally, the Carmichael’s function seems to be more useful for our case.
One thing to note is that, factorizing a large number, completely by itself and no additional information, is a hard problem which is essence of cryptography and security. So, we need to use the given additional information cleverly to solve be able to solve the problem. But chances of deterministic algorithm seem slim, especially because we can’t even store primes so many as to factorize numbers with 500 digits, and even if we could they’d be too many that we would get TLE before we finish checking even half of them. Usually, this is a hint that we should try to explore for a randomised algorithm, which can give give answer with good probability/accuracy.
Okay, lets give a very trivial try to the problem now. The most randomised brute force way of doing the question would be, to pick any random number < N and see if it divides N. If it does, then yay, I found a factor, else try again. What are the chances for success? Less than 1\%… . We will require too many guesses this way…
Can we do something about the case when our guess is not a factor of N? If we can, then our randomized algorithm will work! Turns out, there is a very clever trick we can use.
Notice the Carmichael function - its always even! Also, it gives us a m such that a^m \equiv 1\ mod\ N, or, because m is even, let me rewrite it as a^{2k} \equiv 1\ mod \ N \implies \ a^{2k}-1 \equiv 0 \mod \ N \implies (a^k-1)*(a^k+1) \equiv 0 \ mod \ N (using (x^2-1)=(x+1)*(x-1)). While we get no clear picture just directly looking at it, it is enough to spark a curiosity to explore this approach further. Lets do it!
Now, since m (let me refer to Carmichael’s function \lambda(N) as m ) is even, let us write it in form -
m= 2^K *b where b is odd.
Lets take a random integer x in range [1,N-1]. If x divides N, then good for us! We found a factor, else we know for sure that x^m \equiv 1 \ mod \ N. Now, there one thing we should pay attention to. Carmichael’s function is a m such that every number in range [1,N-1] which is co-prime to N gives 1 as remainder on modding. But, for one particular value, x in our case, there can exist a smaller such value k such that x^k \equiv 1 \mod \ N. In other words, Carmichael’s function is not the smallest m for above equivalence to hold.
Keep in mind that we are exploring form of x^{2k} \equiv 1 \ mod \ N, and have currently represented Carmichael’s function m=2^K*b. Lets try to find the smallest value of m such that x^m \equiv 1\ mod\ N (The setter did this by stripping \phi(N) by powers of 2. Recall that \phi(N) is always a multiple of \lambda(N)). Lets call it q. Obviously, q is even, so let me say q=2k where k is an integer (pardon my frequent definition of variables XD). Now, look at x^{2k} and read carefully what equations I am going to write below.
x^{2k} \equiv 1 \ mod \ N
\implies (x^k+1)*(x^k-1) \equiv 0 \ mod \ N using the same derivation cited before.
Now, two cases can arise, one is bad, and one is good.
- Either (x^k+1) or (x^k-1) is a multiple of N. This will happen if x^k \equiv \pm 1\ mod\ N This is a bad case as no information can be derived from here .
- Neither (x^k+1) nor (x^k-1) is a multiple of N. This holds if x^k \not \equiv \pm 1\ mod \ N. This case is good, why? Because, NEITHER (x^k-1) is divisible by N, NOR (x^k+1) is divisible by N BUT THEIR PRODUCT IS DIVISIBLE BY N! This means that, there must be some factors of N common to N and (x^k \pm 1), and for factors present in (x^k-1), the complement factors must be present in (x^k+1) (as to give a multiple of N.). But thats not important. Whats important is that, N and (x^k \pm 1) have common factors! We an hence, find gcd(x^k \pm 1,N) to find the factors out. Lets call the common factor F. Now we recurse for N/F and F to get full prime factorization, and at the end of recursion we will be done!
Now, what are the chances of success here? That this bad case does not occur, i.e. we should not find x^k \equiv \pm 1\ mod \ N. Also, note that we found smallest value of power (which we called q earlier) so that x^q \equiv 1 \ mod \ N. And then we defined q=2k as q was even. We know that k < q, and hence chances of x^k \equiv 1 \ mod\ N are almost NIL unless x \equiv 1\ mod\ N from start (which is a very bad random choice for x). This was used to optimize the algorithm by many.
I hope that after the pre-reading, the way I explained the general algorithm was clear to you guys, if not then please let me know!
Now, lets analyze where our algorithm may fail and try to cover them up!
c. Reallllly nasty cases-
If N is prime, then what? All our life will go on factorizing something which is not factorizable. One solution of this is, to see that our algorithm has a good chance of success. So, we can use something like “If no successes found even after X tries, this means its not factorizable.” How does this work? Lets say, our algorithm had a 75\% chance of success, what s the probability it fails 30 times a row? Less than 1\%. So, we can heave a sigh of relief XD.
Another nasty case is, if N is power of prime. For this case, very less good values of x are possible. If not careful, we will call N prime by following above fix! Hence, before going to our general algorithm, we can first check if N is power of prime or not. Many contestants observed that highest power possible is log_2(N)=500*log_2(10) at max, which is not very high. So, for all powers p they binary searched the base a so that a^p \le N. At last if they found a^p =N, then handle this case likewise.
Alternatively, what setter did was, he tried getting p'th root for N for all p from 1 to log_3(N). Call this root as r. He now checked if r^p=N or not and handled this case.
Now, we have used “co-prime” word so much in general algorithm, theres another small nasty case. What if N is even? It wont be co-prime to at least half the values in range [1,N-1]. Not a big deal as we have a lot more chance of “guessing” a value of x such that gcd(x,N)>1 then, but it will save some time and recursion calls if we strip N of all powers of x immediately
With this, we are done with the idea of the question :). Hope you enjoyed the editorial so far!
Setter’s solution is attached for reference, also, I will try to give some interesting exercises for curious to see. Thats it from my end for now
SOLUTION
Click to view
from random import randint
from fractions import gcd
def prime(n , k):
if n <= 1:
return False
if n == 2:
return True
if n % 2 == 0:
return False
t = n - 1
s = 0
while t % 2 == 0:
t >>= 1
s += 1
for i in xrange(0 , k):
a = randint(2 , n - 1)
b = pow(a , t , n)
if b == 1:
continue
okay = False
for j in xrange(0 , s):
if b == n - 1:
okay = True
break
if b == 1:
return False
b = (b * b) % n
if not okay:
return False
return True
def getroot(n , k):
l = 3
r = 6
while pow(r , k) <= n:
l , r = r , r + r
while l < r:
mid = (l + r + 1) >> 1
if pow(mid , k) <= n:
l = mid
else:
r = mid - 1
return l
def factorize(n , phi):
if n <= 1:
return []
if n % 2 == 0:
s = 0
while n % 2 == 0:
s += 1
n >>= 1
res = factorize(n , phi)
res.append((2 , s))
return res
if prime(n , 30):
return [(n , 1)]
cur = 9
b = 2
while cur <= n:
p = getroot(n , b)
if pow(p , b) == n and prime(p , 30):
return [(p , b)]
b += 1
cur *= 3
s = 0
m = phi
while m % 2 == 0:
s += 1
m >>= 1
fac = 0
while fac == 0:
a = randint(2 , n - 1)
if gcd(a , n) > 1:
fac = gcd(a , n)
break
tmp = pow(a , m , n)
g = gcd(n , tmp + n - 1)
if g > 1 and g < n:
fac = g
break
for j in xrange(0 , s):
g = gcd(n , tmp + 1)
if g > 1 and g < n:
fac = g
break
tmp = (tmp * tmp) % n
lis1 = factorize(fac , phi)
res = []
for (p , k) in lis1:
s = 0
while n % p == 0:
n /= p
s += 1
res.append((p , s))
lis2 = factorize(n , phi)
for a in lis2:
res.append(a)
return res
t = input()
for i in xrange(0 , t):
n , phi = map(int , raw_input().split())
res = factorize(n , phi)
print len(res)
res.sort()
for (a , b) in res:
print a , b
Tester
Editorialist
Time Complexity=O(???) (for time ccomplexity, check @algmyr’s editorial in pre-reading section.)
Space Complexity=O(T.B.A.) (TBA are not variables…)
CHEF VIJJU’S CORNER
1. Hall of fame for noteworthy solutions-
- gorre_morre - Shoutout to GORRE-MORRE for an elegant solution. Hes one of the best guys to follow around if you’re hunting for code
- algmyr - You all saw it coming XD
- karolis_kusas - Best JAVA submission
- pwild - 0M memory AND best time
- taran_1407 - Second best solution overall, best solution python.
- ?????? - Who might take this place here? :o
2. Setter’s Notes
Click to view
Write \phi(n) = 2^s * m where m is odd , let a be a random integer between [2 , n-1] , if gcd(a , n) is nonzero , we found a nontrivial factor , otherwise we have that (a^m-1)(a^m+1)(a^2m+1)(a^4m+1)...(a^{2^{s-1}} + 1) = a^{\phi(n)} - 1 which is a multiple of n , but from among the factors on left , with probability \frac{3}{4} , none of them divide n , hence taking their gcd with n , we will find a factor of n with high probability , then we recursively factor x and \frac{n}{x} where x is the factor found (Note that we can use the same \phi(N) for factors as we dont really need \phi but any multiple of exponent of Z_n* which is a factor of \phi).
3. Prove the bad case. That is, prove that we cannot derive any information from (a^k-1)*(a^k+1)\equiv 0\ mod\ N if either one of them is a multiple of N.
4. Prove that \phi(N) and \lambda(N) are always even for N>2