PROBLEM LINK:
Author: Sergey Nagin
Tester: Sergey Kulik
Editorialist: Kevin Atienza
PREREQUISITES:
Modular inverses, long division
PROBLEM:
Given two large numbers A and B in base 7, and L, compute A / B \bmod 7^L (and write it also in base 7). It is guaranteed that B is a divisor of A.
QUICK EXPLANATION:
Compute the digits of C := A / B one by one, starting from the least significant. Each digit can be computed using inverses modulo 7, based only on the last nonzero digits of A and B, and once you extract each digit k for a place value, say, i, then you need to subtract B\cdot k7^i from A, in other words update A := A  B\cdot k7^i.
EXPLANATION:
At first glance, it seems the problem requires implementing some optimized arbitrary precision arithmetic code. One common way of doing generalpurpose division of large numbers is to use Newtonâ€™s method to compute to sufficient accuracy the inverse of the denominator using an iteration that uses only multiplication, and then multiplying it to the numerator. Multiplication can be done using standard techniques such as those based on fast Fourier transforms. However, these techniques are quite complex, and even if they might work, theyâ€™re in fact quite overkill since there is a much simpler solution to the problem at hand. This solution works in our case because of the guarantee that B is a divisor of A which turns out to be quite significant.
In the following, please keep in mind that whenever we deal with A, B, A/B, and their digits, weâ€™re working in base 7. We also introduce the following notation: For a number X in base 7, let X_i be its i th least significant digit (starting at i = 0). X_i can be extracted as follows: X_i = \lfloor X / 7^i \rfloor \bmod 7.
Since B divides A, let C := A / B be their (integer) quotient (thus, A = B\cdot C). We want to know the last L digits of C. Suppose we want to know just the last digit of C (denoted C_0). What can we say about it? Well, we know that A = B\cdot C, so it must be the case that the last digit of B multiplied by the last digit of C is equal to the last digit of A, modulo 7. In other words:
This is similar to the fact that, for instance, in base 10, the product of any number ending 6 and another number ending in 9 always ends in 4, which is equivalent to 6\cdot 9\bmod 10.
Whatâ€™s nice about this congruence is that C_0 can already be calculated! Itâ€™s simply:
where B_0^{1} denotes the modular inverse of B_0 modulo 7. Thus, we already have the last digit, without even looking at almost all other digits of A and B!
However, this only works if B_0 is not 0. If B_0 = 0, then you canâ€™t invert B_0 any more to get C_0. But in this case, we can still do something. If B_0 = 0, then automatically we have A_0 = 0, which means that we can cancel the trailing zeroes. Weâ€™ll be left with a new A and B with one less digit, but the last digit of this new B is the old B_1. If this is again a zero, then we can repeat this process, until the last digit of B is not a zero! So we can reduce every case to the case where B_0 \not= 0!
Now that we have C_0, how can we get the next digit, C_1? Since the last digit is C_0, we know now that C is of the form C'\cdot 7 + C_0 for some C', and that C_1 = C_0', so we want to get the last digit of C'. Letâ€™s see what we can do:
Letâ€™s define A' := (A  B\cdot C_0) / 7 to be the value at the left hand side.
Notice that the latter is the same form as the original problem, so somehow weâ€™ll be able to extract the last digit of C' if we have the value A'. But A' = (A  B\cdot C_0) / 7 can be computed in O(\log_7 B) time! (Note that B has \Theta(\log_7 B) digits.) Because remember that C_0 is just a single digit, and when subtracting B\cdot C_0 from A we only need to iterate through the digits of B, not A. Also, dividing by 7 is simply moving the digits one place to the right, which at first glance seems to take O(\log_7 A) time, but we can easily get around this. The following are a few ways:

We can represent A as an array list with the digits stored in decreasing significance (so the least significant digit is last), because popping from the end of an array list can be done in O(1) amortized time.

We can simply represent A as a pointer to a position to the array and a number representing its length. Then to pop from the end of A, simply decrement the length!
So now that we have this number, we can now get C_0' by recursively using the same method. Remember that C_0' is really C_1. To get the next few digits, we simply repeat this until we get the L digits we are looking for!
Optimizations
Since weâ€™re repeating this step L times, and each step takes O(\log_7 B) = O(\log B) time, the overall algorithm runs in O(L \log B) time, which could pass the time limit if implemented with a couple of optimizations in mind. Here, we enumerate some.
First, after ensuring that B_0 \not= 0, we may use only the last L digits of A and ignore the rest. This is because by analyzing our algorithm above, notice that we wonâ€™t ever have any use of all the higherorder digits of A. For instance, if we only want the last digit of C, then we only need the last digit of A. This reduces the amount of work to be done by about half, because the number of digits that need to be updated at each step decreases one by one.
Another way to look at it is the following:
If B \not\equiv 0 \pmod{7}, then there is a unique solution x to A \equiv B\cdot x \pmod{7^L}, so if A = B\cdot C, then it must be the case that C\bmod 7^L = x \bmod 7^L. Thus, we are really only solving for the solution to the congruence A \equiv B\cdot x \pmod{7^L}, and we only really need the values A and B modulo 7^L, i.e. the last L digits of A and B.
Second, we can further reduce the number of operations needed by operating in base 7^k rather than just base 7. To maximize the effect, choose k as large as possible such that 7^k is still below the signed 32bit limit (to not cause too much headache about overflow). In the editorialistâ€™s solution, k = 10 is used. This reduces the number of â€śdigitsâ€ť of A and B by a factor of k, and also reduces the number of â€śdigitsâ€ť we want, L, by a factor of k, so the running time becomes O((L \log B) / k^2) which is a good speedup! Now, the constant becomes higher because we now have to use larger int types, but still youâ€™ll find that this will give a noticeable improvement in running time.
Time Complexity:
O(L \log B)