### PROBLEM LINK:

**Author:** Pavel Sheftelevich

**Tester:** Sergey Kulik

**Editorialist:** Pawel Kacprzak

### DIFFICULTY:

Medium-Hard

### PREREQUISITES:

MO’s algorithm

### PROBLEM:

Given an array A of N non-negative integers, you have to handle M queries. For a single query [L, R], your task is to find the value of (S[2] - S[1])^2 + (S[3] - S[2])^2 + \ldots + (S[R - L + 1] - S[R - L])^2 where S is the **sorted** segment A[L, R].

### QUICK EXPLANATION:

Use MO’s algorithm to handle queries offline.

### EXPLANATION:

#### SUBTASK 1

In the first subtask, both N and M are quite small, up to 200, so the most straightforward solution works here. For each query [L, R], make a copy of the segment A[L, R], sort it, and compute the result using the equation given in the statement. Even with a quadratic-time sorting algorithm, the solution runs in O(M \cdot N^2) time, which is enough here.
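The brute force can be sketched as follows (identifiers like `brute_query` are illustrative, not taken from the setter's code; `std::sort` is used, which already gives the O(N \log N) sorting needed for subtask 2):

```cpp
#include <algorithm>
#include <vector>

// Brute force: copy A[L..R] (0-based, inclusive), sort the copy,
// and sum the squared differences of adjacent elements.
long long brute_query(const std::vector<long long>& A, int L, int R) {
    std::vector<long long> S(A.begin() + L, A.begin() + R + 1);
    std::sort(S.begin(), S.end());
    long long ans = 0;
    for (int i = 1; i < (int)S.size(); ++i) {
        long long d = S[i] - S[i - 1];
        ans += d * d;
    }
    return ans;
}
```

For example, on A = {3, 1, 4, 1, 5} the query [0, 4] sorts to 1, 1, 3, 4, 5 and yields 0 + 4 + 1 + 1 = 6.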

#### SUBTASK 2

In the second subtask, N and M can be up to 10^4. We can handle these constraints if we replace the quadratic-time sorting algorithm with an O(N \cdot \log N) one and implement the solution efficiently, using statically allocated memory. The total time complexity of this method is O(M \cdot N \cdot \log N).

#### SUBTASK 3

Here we have to deal with the original limits for N and M, i.e. N, M \leq 5 \cdot 10^4, but elements in the array A are not greater than 100. We can exploit that last constraint and develop the following solution.

Let pos[k] be the sorted vector of indices at which the value k occurs in A. There are at most 101 such vectors (one per value from 0 to 100), and we can compute all of them in O(N) time. Now, in order to handle a single query [L, R], we iterate over all possible values from 0 to 100 and check whether a particular value v occurs at least once at positions L to R. This can be done by a binary search on pos[v]. Then we compute the result using only a single occurrence of each value present in A[L, R]. This is sufficient, because if a value v occurs more than once in A[L, R], the additional occurrences have no impact on the result: they are consecutive elements of the sorted segment, so they contribute (v - v)^2 = 0 to the sum. The total time complexity of this method is O(N + M \cdot 100 \cdot \log N).
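A minimal sketch of this idea, assuming 0-based indices and values in [0, 100] (the struct and method names are illustrative):

```cpp
#include <algorithm>
#include <vector>

// pos[v] holds the sorted positions of value v in A. A query [L, R]
// collects the distinct values present in the range via binary search,
// then sums squared gaps between consecutive distinct values.
struct SmallValuesSolver {
    std::vector<std::vector<int>> pos;

    explicit SmallValuesSolver(const std::vector<int>& A) : pos(101) {
        for (int i = 0; i < (int)A.size(); ++i) pos[A[i]].push_back(i);
    }

    // Does value v occur anywhere in A[L..R]?
    bool occurs(int v, int L, int R) const {
        auto it = std::lower_bound(pos[v].begin(), pos[v].end(), L);
        return it != pos[v].end() && *it <= R;
    }

    long long query(int L, int R) const {
        long long ans = 0;
        int prev = -1;  // previous distinct value found in the range
        for (int v = 0; v <= 100; ++v) {
            if (!occurs(v, L, R)) continue;
            if (prev != -1) {
                long long d = v - prev;
                ans += d * d;
            }
            prev = v;
        }
        return ans;
    }
};
```

Duplicates are skipped entirely, which is exactly the (v - v)^2 = 0 argument above.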

#### SUBTASK 4

Now we are about to solve a more serious problem. We have N, M \leq 5 \cdot 10^4 and there are no extra restrictions on the values in A different than the original constraints.

At first sight, the problem looks really complicated. It is not clear how to derive the solution for one query from the solution for another, because the sums are computed in sorted order. However, we do not have to answer the queries one by one, i.e., online. Instead, we can compute and store the answers to all of them in any order and return the stored results at the end. This approach is called offline processing of queries. More specifically, we are going to use **MO's algorithm**. The general idea of this method is also explained here.

We are going to divide the array A into bins of size B; we will pick the most suitable B at the end of the analysis. Next, we distribute the queries among these bins as follows: the first bin contains the queries whose left endpoint lies in the range [0, B - 1], the second one the queries whose left endpoint lies in [B, 2 \cdot B - 1], and so on. We will answer the queries in each bin independently of the other bins, so for now let's focus on answering the queries from one bin, remembering that we will have to do it for all \lceil N / B \rceil bins.

Now we are about to answer the queries from one bin. The first observation is that the left endpoints of all these queries differ from one another by less than B. This fact was the motivation for distributing the queries into bins, and we will use it later.

First, we sort the queries in the bin by their right endpoints in ascending order, and process them one by one in this sorted order.
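The bucketing and sorting steps are usually fused into a single comparator: group by the bin of the left endpoint, then sort by the right endpoint inside each bin. A possible sketch (the `Query` struct and field names are illustrative):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Query {
    int L, R, idx;  // idx remembers the original position of the query
};

// Sort queries in MO's order: by bin of L, ties broken by R ascending.
void mo_sort(std::vector<Query>& queries, int N) {
    int B = std::max(1, (int)std::sqrt((double)N));  // bin size
    std::sort(queries.begin(), queries.end(),
              [B](const Query& a, const Query& b) {
                  if (a.L / B != b.L / B) return a.L / B < b.L / B;
                  return a.R < b.R;
              });
}
```

Keeping `idx` lets us store each answer at the query's original position, so the results can be printed in input order at the end.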

Next, we will need a data structure that lets us quickly insert and delete elements, as well as find the predecessor and successor of an element. Any balanced binary search tree is perfect here; if you are using C++, std::multiset is a good choice (a plain std::set is not enough, because the array may contain duplicate values and deleting one occurrence must not remove the others). Let's call our data structure S.

At the beginning, we insert all elements from the range of the first query into S and compute the answer for this query using the equation given in the statement. This takes O(N \cdot \log N) time, because the range can have up to N elements and inserting each one takes O(\log N), if we picked an appropriate implementation of S. The invariant here is that S contains all elements of the last processed query, and a variable **ans** holds the answer to that query.

We are going to update S, as well as **ans**, for the next query based on the values from the previous one. How? Consider any two adjacent queries in the sorted order, [L_i, R_i] and [L_j, R_j]. We know that R_j \geq R_i and that the absolute difference between L_i and L_j is less than B. Based on these two facts, we first insert into S all the elements from the range [R_i + 1, R_j], keeping **ans** up to date after each insertion. Similarly, if L_j < L_i, we insert into S all elements from the range [L_j, L_i - 1]; otherwise, we delete the elements from the range [L_i, L_j - 1]. Notice that we can perform both insertion and deletion while keeping **ans** up to date. This is easy to do: an insertion places a new element between two existing elements, so we subtract their squared difference from **ans**; then we add the squared difference between the new element and its predecessor, and do the same for its successor. Deletion is analogous. After all these updates, we have computed the answer to the current query.
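The update step can be sketched as follows. This is a minimal sketch, assuming the current window lives in a std::multiset (duplicates only contribute (v - v)^2 = 0 terms, so the argument above goes through unchanged); the `Window` struct name is illustrative:

```cpp
#include <cassert>
#include <iterator>
#include <set>

struct Window {
    std::multiset<long long> S;
    long long ans = 0;

    // Add x: remove the squared gap x splits, add the two gaps x creates.
    void insert(long long x) {
        auto it = S.insert(x);
        auto nxt = std::next(it);
        bool has_prev = it != S.begin();
        bool has_next = nxt != S.end();
        if (has_prev && has_next) {
            long long d = *nxt - *std::prev(it);
            ans -= d * d;
        }
        if (has_prev) { long long d = x - *std::prev(it); ans += d * d; }
        if (has_next) { long long d = *nxt - x; ans += d * d; }
    }

    // Remove one occurrence of x: the mirror image of insert.
    void erase(long long x) {
        auto it = S.find(x);
        assert(it != S.end());
        auto nxt = std::next(it);
        bool has_prev = it != S.begin();
        bool has_next = nxt != S.end();
        if (has_prev) { long long d = x - *std::prev(it); ans -= d * d; }
        if (has_next) { long long d = *nxt - x; ans -= d * d; }
        if (has_prev && has_next) {
            long long d = *nxt - *std::prev(it);
            ans += d * d;
        }
        S.erase(it);  // erase by iterator, so only one copy is removed
    }
};
```

Note that `erase` takes an iterator, not a value: `S.erase(x)` would delete every copy of x at once and corrupt **ans** when the window still contains duplicates.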

The only remaining question is the time complexity of all these operations. Notice that, since the queries are ordered by their right endpoints in ascending order, we perform at most N insertions on the right side of queries for the whole bin! This is very nice, because it follows that the total cost of all right-side insertions for a single bin is O(N \cdot \log N). What about insertions and deletions on the left side? Well, do you remember when we noticed that all left endpoints within a bin differ by less than B? Using this fact, the left-side insertions and deletions needed to update **ans** from one query to the next take O(B \cdot \log N).

In order to get the total time complexity, we just need to sum up these results over all bins. Right-side insertions over all bins take O(N / B \cdot N \cdot \log N), and left-side insertions/deletions over all queries take O(M \cdot B \cdot \log N). It is easy to see that B = \sqrt N is the best choice here, giving a total time complexity of O(\sqrt N \cdot N \cdot \log N + \sqrt N \cdot M \cdot \log N).

### AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution can be found here.

Tester’s solution can be found here.