# CHEFKNN - Editorial

Contest

Practice

Author: Bogdan Ciobanu

Tester: Triveni Mahatha

Hard

## PREREQUISITES:

Number Theoretic Transform, Combinatorics

## PROBLEM:

You are given an array of length N. Initially, each element of array is painted with color 0. You are given K turns. In i^{th} turn you can paint any non-empty subarray with color i, while overwriting any color already present. You need to answer how many distinct colorings of the final array are possible modulo 163577857.

## EXPLANATION:

Lets try to find the expression for F(K,N). Here F(K,N) will represent the answer to our given problem.

For doing the same, we will iterate over number of visible colors at the end. Say, number of visible colors are r in final array. Then, the number of ways of choosing r colors from K given colors will be \binom {K-1}{r-1} , because you need to select only r-1 colors from K-1 colors, as K^{th} color will always be visible.

Hence, we can write:

$F(K,N) = \sum_{r=1}^{K} \binom {K-1}{r-1} e(r,N)$

where, e(r,N) represents number of ways in which r colors can be visible finally in array of length N.

Finding an expression for e(r,N)

Lets try to think in reverse manner. Say r^{th} color occupy an interval of length x_1. Number of ways of doing the same will be N-x_1+1. Now, say (r-1)^{th} color occupy an interval of length x_2 in remaining N-x_1 places. You can notice that x_1 wont affect x_2. Basically, you can write:

$e(r,N) = \sum_{x_1=1}^{N-r+1} (N-x_1+1) * e(r-1,N-x_1)$

Try to expand the formula more and you will end up here,

$e(r,N) = \sum \limits_{x_1=1} \sum \limits_{x_2=1}... \sum \limits_{x_r=1} (N-x_1+1)(N-x_1-x_2+1)...(N-x_1-x_2..-x_r+1)$

If you observe it closely, you will find that it is nothing but sum of product of numbers taken r at a time from the set \{ 1,2,3,....N \}. Hence, we can say that e(r,N) is coefficient of x^{n-r} in polynomial P_N(x) where:

$P_N(x) = (x+1)(x+2)(x+3).....(x+N)$

All we are left is now computing the polynomial P_N(x) and we will be done.

Computation of P_N(x)

One can compute the product naively using brute force in order to pass these subtasks in O(N^2).

This subtask can be passed with an O(nlog^2n) solution using divide and conquer, along with NTT.

The task here is to compute the product of n polynomials of the form (x+i). Let A(x) be the product of the first half of (x+i)'s, and B(x) be the product of rest (x+i)'s. We compute A(x) and B(x) recursively, and then compute A(x)⋅B(x) by NTT. The time complexity is T(n)=2T(n/2)+O(nlogn), so T(n)=O(nlog^2n). This idea is illustrated by the following pseudocode:

MUL(L,R) // computes (x+L)(x+L+1)...(x+R)
if (L==R) then return (x+L)
mid = (L+R)/2
A = MUL(L,mid)
B = MUL(mid+1,R)
return A*B //A*B is computed by FFT


We will try to devise an O(nlogn) solution because O(nlog^2n) wont pass here because of strict Time Limit.

Informal way down to the solution

Let’s take a closer look at time complexity of divide and conquer solution:

$T(n) = T(left_{half}) + T(right_{half}) + O(merge)$

Since merge here is nlogn, T(n) comes out to be O(nlog^2n). But what if we can somehow compute right_{half} using left_{half} in O(merge) too. This way the recurrence relation will be:

$T(n) = T(left_{half}) + O(merge) + O(merge)$

\Rightarrow T(n) = T(n/2) + O(nlogn)

which eventually results in T(n) = O(nlogn) with some constant factor. So we are left with computing right_{half} using left_{half}.

Formally,

We can compute P_{2N}(x) using the identity:

$P_{2N}(x) = P_{N}(x) * P_{N}(x+N)$

Main problem here is, finding P_{N}(x+N) when P_{N}(x) is known to you. Say,

$P_N(x) = \prod \limits_{i=1}^N (x+i) = \sum_{i=0}^N c_i.x^i$ where all $c_i$'s are known to you. Then, $P_N(x+N) = \prod \limits_{i=1}^N (x+N+i) = \sum_{i=0}^N c_i.(x+N)^i$

\Rightarrow P_N(x+N) = \sum_{i=0}^N h_i.x^i

where,

$h_i = \sum \limits_{j=i}^N c_j . \binom{j}{i} . N^{j-i}$

\Rightarrow h_i = \frac{1}{i!} \sum \limits_{j=i}^N (c_j.j!)\left (\frac{n^{j-i}}{(j-i)!} \right )

$\Rightarrow h_i = \frac{1}{i!} . ($coefficient of x^{N-i} in A(x)B(x))
where,

$A(x) = \sum \limits_{i=0}^{N} (c_{N-i}.(N-i)!) . x^i$, and $B(x) = \sum \limits_{i=0}^{N} \left( \frac {N^i}{i!} \right) . x^i$

Now we have an algorithm to find P_N(x+N) in O(NlogN) using convolution of A(x) and B(x) when P_N(x) is known to us. Now let’s take a look at the pseudocode to solve this subtask:

MUL(N) // computes (x+1)(x+2)...(x+N) in O(NlogN)
if N==1:
return (x+1)
C = MUL(N/2)
H = convolute(A,B) // use C to obtain A
ANS = convolute(C,H)
if N is odd:
ANS *= (x+N) // naive multiplication will do - O(N)
return ANS


Only thing, that is left out is how to multiply two polynomials modulo 163577857. Notice that 163577857 = 39.2^{22} + 1, which is NTT friendly. This reduces pain for us. For those who are not familiar with NTT may refer this link.

### Optimizations

When you implement this algorithm naively, you might notice that it takes too much time to process an input for N=10^6. Unfortunately, the constant hidden in the O notation seems to be too high! The time complexity is correct, so we will need additional tricks to cut down this constant and be able to squeeze the solution into the time limit. Here are a few optimizations:

• Try reducing the mod operation as much as possible everywhere.
• Use array instead of vector for NTT. Basically, use a fast NTT code.
• Memory usage: Try not to use excess memory as it slows down the code. A runtime difference of upto 0.5s can incur between 130M and 70M, depending on other parameters.
• In addition to these, you may need to optimize your code in other ways like changing the data type if it still doesn't pass the time limit.

## Alternate Solution

This solution was not intended. However, 40\% of the solvers used this trick. The idea is to use the solution for subtask #3 to pass subtask #4 too. A naive implementation of subtask #3 for all the test cases leads to O(T.N.log^2N) solution. If we can remove the T part and do some optimization we will be able to pass subtask #4 too.

How to remove the T part? Let’s say that value of N over different test cases is represented by N_1, N_2, N_3, ..., N_T. We will take all input at once and sort the values of N in ascending order. Let n_i's represent the sorted sequence of N now. We will now call MUL(1,n_1), MUL(n_1+1,n_2), … , MUL(n_{T-1}+1,n_T). Now, we can use prefix multiplication of these terms to find the answer for all test cases. The time complexity of the same can be found approximately using following sum:

$O(N_1log^2N) + O((N_2-N_1)log^2N) + ..... + O((N_T-N_{T-1})log^2N) = O(Nlog^2N)$

## Time Complexity:

O(T.N.logN) or O(N.log^2N)

## AUTHOR’S AND TESTER’S SOLUTIONS

Setter’s solution

Tester’s solution

Editorialist’s solution

9 Likes

I didn’t understand this part.

Hence, we can say that e(r,N) is
coefficient of x^(n−r) in polynomial
PN(x) where

How did you come to this conclusion ?

When you got the formula for e(r, N), at that point one can directly notice that it resembles to formula for Stirling Numbers of First Kind.

Check my comment above. You can search for generating function for Stirling Numbers of First Kind. That polynomial is basically that.

1 Like

Actually we can prove the stirling numbers identity via differentiation too. I was playing with differentiating generating functions, and then I got a pattern that lead me to stirling numbers of the first kind.

2 Likes

I got stuck on this relation: F(n, k) = F(n - 1, k) + F(n, k - 1) + (n - 1) * F(n - 1, k - 1). And couldn’t come up with faster solution 2 Likes

We can also compute the DFT directly without any divide and conquer and then compute the inverse DFT.

1 Like

I saw a solution like that. Don’t remember if it was your’s, though. But I am interested in this solution. Can you please explain little more?

Just think of it as having to compute (w+1)…(w+n)(w+1)^(k-1) for n different integer values of w. This can be done efficiently if you precompute a! % mod for all a < mod. This takes about 1.5 seconds. The rest of the stuff was TLEing but fast NTT code + one optimization using Wilson’s will get AC. It is interesting that the setter kept mod as 1e8ish instead of the usual 9e8ish, seems that he anticipated such solutions and was allowing them to pass as well. I initially figured out the straightforward T(n) = T(n/2) + O(n log n) pretty easily but thought it had a runtime of O(n log^2 n) Yeah, I got this too, although I did not take part in the contest. But, I was stuck here. Tried the Generating function approach too after this, but again got stuck at solving a Partial Differential Equation. I just want to know, is this a wrong approach for this problem or a solution is available after this? Anyone?

Just some tricks to optimize mod operations, instead of actually writing c = (ab)%MOD where MOD = 163577857 , one should directly write c = (ab)%163577857 , this is because for the MOD of the form n2^k+1, compiler automatically optimizes the MOD for constant values in terms of some multiplications and additions. Also when you have a operation of this kind = c = (a-b+MOD)%MOD one can optimize it as c = (a-b) ; if(c<0) c+=MOD , but one can have even better version of this as c = (a-b) , c = (c-MOD(c>>63)) ,this basically takes only 4 primitive operations, and it does work better in many cases. Similar analogue exists for the operation c = (a+b)%MOD as c = (a+b) , if(c>=MOD) c-=MOD , can be also done by c = (a+b-MOD) , c = (c-MOD*(c>>63)), although this takes 5 primitive operations and may not always give a speedup.

Note : I assume a,b,c are of long datatype in Java, long long int datatype in C++

4 Likes

Can anyone prove how the coefficient of x^(n-r) in (x+1)(x+2)(x+3)…(x+n) is the sum of product of nos taken r at a time from the set {1,2,3,4,5…N}.
Your answer will be appreciated I am doing exactly what is stated in editorial for O(N*log^2 N) solution, but getting WA (was doing this even during contest, but was getting WA even there.)

1 Like

Found Bug: Overflow in NTT part 1 Like

Think of it this way. When you evaluate the polynomial product, you make n choices from each of the n terms of the form (x + k). From a single term, you choose either x or the number. You then multiply your choices together and get a part of the final product. If you make all possible 2^n selections and add them together, you get the final result.

Now out of these n terms if you choose the number r times, the rest n-r times you must choose x. In this case the result of multiplication is the product of those r numbers you chose, times x^{n-r}. So you can see that all possible selections with r numbers contribute to the coefficient of x^{n-r} in the final result.

You can also take a look at Vieta’s formulas, a more general rule.

Starting from the expression for e(r,N)
\begin{equation}
e(r,N)=\sum_{x=1}^{N-r+1}(N-x+1) \times e(r-1,N-x)
\end{equation}
write down expression for e(r,N+1), and manipulate
\begin{equation}
e(r,N+1) = \sum_{x=1}^{(N+1)-r+1}((N+1)-x+1) \times e(r-1,N+1-x)
\end{equation}
\begin{equation}
~~ = (N+1)e(r-1,N)+\sum_{x=2}^{(N+1)-r+1}((N+1)-x+1) \times e(r-1,N+1-x)
\end{equation}

\begin{equation}
~~ = (N+1)e(r-1,N)+\sum_{x=1}^{(N+1)-r}((N+1)-x) \times e(r-1,N-x)
\end{equation}

\begin{equation}
~~ = (N+1)e(r-1,N)+e(r,N)
\end{equation}

From here one can build a proof by induction argument.

I got the same relation as well. This can be thought of as moving in a grid with up, right and up-right moves allowed and whenever we move up-right, the x coordinate gets multiplied.

Have the test cases been updated after the contest to prevent Alternate solution submissions???

My submission is continuously getting TLE for last file only.
https://www.codechef.com/viewsolution/17848641

1 Like

Cool method to get P(x+N) from P(x). Thanks for the problem, learnt something new 