PROBLEM LINK:
Author: Sergey Kulik
Tester: Yanpei Liu
Editorialist: Pawel Kacprzak
DIFFICULTY:
Hard
PREREQUISITES:
Segment tree, sets
PROBLEM:
Your task is to maintain a list of sets of integers numbered from 1 to N. All sets are initially empty. You have to handle M operations, each of one of the following two types:
- \texttt{INSERT(L, R, X)}: inserts integer X to all sets numbered from L to R
- \texttt{QUERY(Q)}: outputs the size of the set numbered Q
QUICK EXPLANATION:
For a partial credit, you can pretty easy use the most straightforward approach by just following the process described in the statement. In order to get full credit, for each inserted element e, you can use set data structure to store the ranges of sets having e. In order to get the number of elements in each set, you can use segment / interval tree to maintain these values.
EXPLANATION:
Simple solution
If N and M are quite small, like in the first two subtasks, where N, M \leq 1000, the problem is very easy to solve.
We can store an array A of N sets, where A[i] represents a set numbered i. Since all integers, which can be inserted to sets are within the range [1..N], we can store each set as an array, i.e. A[i][j] = 1 if and only if integer j is in the set numbered i.
In order to handle a single insert operation, we can iterate through sets numbered from L to R, and insert the requested X into all of them. This operation takes O(N) time.
Handling a query operation is obvious, we just count the number of 1's in the array representation of the requested set.
Since there are M operations to perform, we can achieve O(N \cdot M) time complexity to handle all of them.
The total time complexity of this method is O(N^2 + N \cdot M).
Faster solution
However the above method is sufficient for small inputs, we need a better one to get a full credit here.
Rather than storying, for a fixed set, the information what elements are in this set, we can think the other way around.
Let S[e] be the ordered list of disjoint ranges of sets having element e. For example, if a range [L, R] is in S[e], it means that e belongs to all sets numbered from L to R. We say that a range [A, B] is before a range [C, D] in our order if and only if A < C or A = C and B < D. The disjoint property is very important in the definition, we will use it as the invariant. We are going to implement S[e] as a balanced binary tree in order to be able to handle insert queries. If you are using for example \texttt{C++} you can use \texttt{std::set< pair < int, int > >} to implement a single S[e].
The second data structure that we are going to use is a segment tree T. In this tree, we will store the information on how many elements are in sets numbered from some L to R. We will use it to update the number of elements in a range of sets and to get the number of elements in a particular set.
Initially, both structures are empty. This means that S[e] is empty for all valid e and all nodes in T are initialized to have 0 elements.
Now, we are going to describe, how we handle both operations. We will start with insert, because it exposes the nature of the problem in a great manner.
Insert operation
Suppose that we are going to insert element X to all sets numbered from L to R. The idea of the solution is to find in S[X] the set Y of all overlaping ranges with [L, R]. After inserting X to sets numbered from L to R, all sets in a range Y \bigcup [L, R] will have element X. What we are going to do, is to replace all ranges from Y in S[X] with a single range representing the just mentioned sum of ranges.
In order to do that, for which range A \in Y, we decrease the number of elements in a range A \bigcap [L, R] by one, and delete A from S[X]. At the end, we increase the number of elements in [L, R] by one, and we insert the range Y \bigcup [L, R] to S[X]. Notice that we maintain the number of elements of sets in a range using our segment tree T.
Let’s now explain some details on how to implement the logic described above.
In order to get the set Y, first we find in S[X] the leftmost range of sets which overlaps with [L, R]. Do you remember that we implemented S[X] as a balanced binary tree and that all ranges in that tree are disjoint? Based on these two facts, we can quickly find the leftmost range which overlaps with [L, R], in \texttt{C++} you can use \texttt{lower\_bound} method of \texttt{std::set} to achieve that. After that, we can iterate over all ranges overlapping with [L, R] using \texttt{successor} operation on S[X].
All operations of increasing and decreasing the number of elements in sets of a given range are implemented on segment tree in a standard way.
You may wonder what is the complexity of the insert operation. Well, you can notice that each segment is inserted exactly once and erased at most once. Since a single insert / delete / lower_bound / successor operation on S[X] takes O(\log(n)) time and update operation on T also takes O(\log(N)) time, a single insert operation has O(\log(N)) time complexity.
Query operation
Now, the easier operation, we want to know, how many elements are in the set numbered Q. Because we maintain our segment tree T, we can answer each such query in O(\log(N)) time easily by summing up accumulated values at the path form a leaf corresponding to the set Q to the root of T.
Time complexity
The total time complexity of this solution is O(N + M \cdot \log(N)) and this is a very nice speed up over the simple solution.
AUTHOR’S AND TESTER’S SOLUTIONS:
Author’s solution can be found here.
Tester’s solution can be found here.