STRAIGHT - Editorial


Author: Abhishek Kanhar
Tester: Abhishek Sajwan
Editorialist: Abhishek Sajwan




FFT, Sqrt Decomposition


Key trick is to consider the arrays as polynomials. For example, the polynomial A becomes A[0]x0 + A[1]x1 + …. so on.
To execute queries of the first type, we can shift the array B to the corresponding index of array A that needs to be updated. For example, we have the query “1 4”, then we need to shift the polynomial B with x3 and then add the resulting polynomial to polynomial A.
But this solution will still take O(N2log(N)).

To make it more faster, we can split the queries into blocks of size sqrt(Qlog(Q)), and then update the array A at the end of each block.

Now we look at each block separately.

For queries of type 1 l, we now just store the indices l.

For queries of type 2 l r, we need the sum(l,r) of array A updated before this block and the sum of ranges array B according to the stored indices. All this can be done by pre-computing the presum values of arrays. The complexity of this operation is of the order of no. of stored indices which is O(sqrt(Qlog(Q))).

At the end of each block, we update the array A according to the stored indices. For this we need the convolution of array B and the the stored indices. We create the polynomial for stored indices and then do the convolution. This if done by FFT or NTT(with larger modulo).

Minor optmization - Use NTT instead of FFT, calculate transform of B only once in the problem

Expected Complexity - O(Nsqrt(Nlog(N)))

Author’s Solution can be found here

Tester’s Solution can be found here

Hi Abhishek,

Can you please explain in reference to brute force approach(posted below) which hardly takes O(queries*N) time as Why should we opt for fft approach. I couldn’t understand the complexities mentioned in this editorial “O(N2log(N))” and “sqrt(Qlog(Q))”. Please explain while considering me as a beginner in codechef, Your explanation will be very much helpful to me, Thanks in advance

n,k = map(int,input().split())
l = list(map(int,input().split()))
l1 = []
can_swap = [0]*n
for i in range(n):
    if i+k<n-1 or="" i-k="">0:
x = 0
for i in range(n):
    if can_swap[i] == 1:
        l[i] = l1[x]

Why the approach using cumulative sum is also showing TLE? Doesn’t it takes constant time for look up and finding the sum in given range
like c_sum[j] - c_sum[i];

Could anyone tell me why my code is showing wrong answer
thanks in advance.