SERSUM - Editorial




Author: Trung Nguyen

Tester: Suchan Park

Editorialist: Suchan Park




Generating functions, Faulhaber’s formula, modular inverse, how to compute convolutions using FFT


The problem is short, so just take a look.



\begin{aligned} f(x, k) & = \sum_{n=1}^{N} (x + a_{n})^k \\ & = \sum_{n=1}^{N} \sum_{i=0}^{k} \binom{k}{i} a_n^{k-i} x^i \\ & = \sum_{i=0}^{k} \sum_{n=1}^{N} \binom{k}{i} a_n^{k-i} x^i \\ & = \sum_{i=0}^{k} \left\{\binom{k}{i} x^i \times \left (\sum_{n=1}^{N} a_n^{k-i}\right ) \right\} \end{aligned}

This form immediately suggests, that we might be able to make use of convolutions. If we let L_x(i) = \binom{k}{i} x^i and R(i) = \sum_{n=1}^{N} a_n^{i}, we can write the above equation as:

f(x, k) = \sum_{i=0}^{k} L_x(i) \cdot R(k-i)

However, the first problem we encounter is that x is not fixed, actually there are at most 10^{18} possible values of x. So let’s stop here, and try to write down the equation of g(t, k).

\begin{aligned} g(t, k) & = \sum_{x=0}^{t} f(x, k) \\ & = \sum_{x=0}^{t} \sum_{i=0}^{k} \left\{\binom{k}{i} x^i \times \left (\sum_{n=1}^{N} a_n^{k-i}\right ) \right\} \\ & = \sum_{i=0}^{k} \sum_{x=0}^{t} \left \{ \binom{k}{i} x^i \times \left (\sum_{n=1}^{N} a_n^{k-i}\right ) \right \} \\ & = \sum_{i=0}^{k} \left \{ \binom{k}{i} \sum_{x=0}^{t} x^i \times \left (\sum_{n=1}^{N} a_n^{k-i}\right ) \right \}\\ \end{aligned}

So, this also looks like a convolution. If we let L(i) = \binom{k}{i} \sum_{x=0}^{t} x^i and R(i) = \sum_{n=1}^{N} a_n^{i}, we can write g(t, k) as

g(t, k) = \sum_{i=1}^{k} L(i) \cdot R(k - i)

and this can be computed by FFT. Putting aside of dealing with weird modulo 10^9 + 7, the only thing left is to compute L and R.

Computing L

It is well-known to compute \binom{k}{i}. If we precompute all factorials and inverse of factorials modulo 10^9 + 7, \binom{k}{i} \equiv k! \cdot (i!)^{-1} \cdot \left((k-i)!\right)^{-1} \mod 10^9 + 7.

Let’s deal with the \sum_{x=0}^{t} x^i part. The form seems so well-known, so when we google it, we are able to find the solution - Faulhaber’s formula (I changed the names of variables to match with our problem)

\sum_{x=0}^{t} x^i = \frac{t^{i+1}}{i+1} + \frac{1}{2} t^i + \sum_{k=2}^{i} \frac{B_k}{k!}\frac{i!}{(i-k+1)!}t^{i-k+1}

where B_{k} are Bernoulli numbers. Suppose we don’t know what it is.

The first two terms are straightforward to calculate. We just need the modular inverses of 1, 2, \cdots, k. The rightmost term seems complicated, but if we assume we know the values of B_{k}, we can see, from k and i-k,

\begin{aligned} \sum_{k=2}^{i} \frac{B_k}{k!}\frac{i!}{(i-k+1)!}t^{i-k+1} & = i! \cdot \sum_{k=2}^{i} \frac{B_{\color{red}k}}{{\color{red}k}!}\frac{t^{{\color{red}i-k}+1}}{({\color{red}i-k}+1)!} \end{aligned}

are actually convolutions, with k = 0 and k = 1 excluded. So we can easily compute these values for all i, if we know the values of B_{k}.

What is Bernoulli numbers? This wikipedia page explains a lot of knowledge, but what we need is only the values. Take a look at the definition part. The recursive definition says:

B_m^+ = 1 - \sum_{k=0}^{m-1} \binom{m}{k} \frac{B^+_k}{m - k + 1}

where the + part is not important since we are interested in m \ge 2. (see ) If we compute Bernoulli numbers B_2, B_3, \cdots, B_k using this formula, the time complexity would be O(k^2), but since the time limit is 6 seconds and this has really small constant, some people managed to squeeze in the time limit. However, for the editorialist’s sake, we want something more faster.

And the “generating function” part gives a nice formula:

\frac{x}{e^x - 1} = \sum_{m=0}^\infty \frac{B^{-{}}_m x^m}{m!}

where, again, the - part is not important since we are interested in m \ge 2. This means, if we are able to compute the power series of the left hand side until x^{k}, we can easily compute B_m.

Since we want to write LHS in power series form, let’s do it right now:

\begin{aligned} \frac{x}{e^{x} - 1} & = \frac{x}{\sum_{i=0}^{\infty} \frac{x^i}{i!} - 1} \\ & = \frac{x}{\sum_{i=1}^{\infty} \frac{x^i}{i!}} \\ & = \frac{1}{\sum_{i=1}^{\infty} \frac{x^{i-1}}{i!}} \\ & = \frac{1}{\sum_{i=0}^{\infty} \frac{x^i}{(i+1)!}} \\ \end{aligned}

So the thing we need is to invert the power series. However, note that we only need to know the coefficients until x^{k}, which means we are interested in computing the power series of:

\frac{1}{\sum_{i=0}^{\color{red}k} \frac{x^i}{(i+1)!}}

which is now finite. How should we deal with this? The idea was introduced 4 years ago in Codeforces Round #250 Div.1 E, and you can find a nice explanation at the “There is an interesting algorithm which calculate the inverse of a power series F(z)” part.

Computing R

It seems really hard at first sight, but note that we know how to compute the inverse of a power series as many terms as we want.

In the world of power series,

\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots

as most of us learned in algebra class. Also,

\frac{1}{1-ax} = 1 + ax + a^2x^2 + a^3x^3 + \cdots


\begin{aligned} \sum_{n=1}^{N} \frac{1}{1-a_{n}x} = & \space{} (1 + 1 + \cdots 1) \\ & + (a_1 + a_2 + \cdots + a_N)x \\ & + (a_1^2 + a_2^2 + \cdots + a_N^2)x^2 \\ & + (a_1^3 + a_2^3 + \cdots + a_N^3)x^3 \\ & + \cdots \end{aligned}

So if we are able to sum up those inverses efficiently, we can extract the coefficients and get the answer. How could we do that? First, write down what we want to compute.

\sum_{n=1}^{N} \frac{1}{1-a_{n}x} = \frac{(1-a_{2}x)(1-a_{2}x)(1-a_{3}x)\cdots(1-a_{N}x) + (1-a_{1}x)(1-a_{3}x)\cdots(1-a_{N}x) + \cdots + (1-a_{1}x)(1-a_{2}x)(1-a_{3}x)\cdots(1-a_{N-1}x)}{(1-a_{1}x)(1-a_{2}x)(1-a_{3}x)\cdots(1-a_{N}x)}

The denominator looks simple, so let’s start from here. If we calculate this naively it takes O(N^2) time. So let’s try to narrow down the time complexity using Divide and conquer. It is really straightforward:

  • Problem: compute (1-a_{1}x)(1-a_{2}x)(1-a_{3}x)\cdots(1-a_{N}x).
  • Divide: Divide into two subproblems (1-a_{1}x)(1-a_{2}x)(1-a_{3}x)\cdots(1-a_{N/2}x) and (1-a_{N/2+1}x)(1-a_{N/2+2}x)(1-a_{N/2+3}x)\cdots(1-a_{N}x) and solve each of them.
  • Conquer: Multiply those two results using FFT.

The time complexity of this algorithm satisfies T(N) = 2T(N/2) + O(N \log N), which gives the solution T(N) = O(N \log^2 N) by Master’s Theorem.

Now, it is really tempting to solve the compute the whole thing using divide and conquer too. Let’s try:

  • Problem: compute \sum_{n=1}^{N} \frac{1}{1-a_{n}x} as power series.
  • Divide: Divide into two subproblems \sum_{n=1}^{N/2} \frac{1}{1-a_{n}x} = \frac{A}{B} and \sum_{n=N/2+1}^{N} \frac{1}{1-a_{n}x} = \frac{C}{D}
  • Conquer: \frac{A}{B} + \frac{C}{D} = \frac{AD+BC}{BD}. Both the numerator and denominator can be calculated using FFT.

The time complexity of this algorithm also satisfies T(N) = 2T(N/2) + O(N \log N), which therefore gives the solution T(N) = O(N \log^2 N) by Master’s Theorem. Note that this algorithm only gives the numerator and denominator in a polynomial form, so you have to additionally divide the numerator by the denominator using the algorithm above, in O(N \log N).

In conclusion, we managed to do several convolutions to compute the answer. The only thing left is how to deal with modulo 10^9 + 7. The most safe way is to do three NTTs using three different large FFT primes \approx 10^9, and then use Chinese Remainder Theorem to recover the answer. I believe this is not the important part and I’m in a rush, so I skip this part.

Implementation for this problem is not available, since the tester’s Kotlin solution is too slow (I’m terribly sorry :())