PROBLEM LINK:
Author: Misha Chorniy
Tester: Kevin Atienza
Translators: Sergey Kulik (Russian), Team VNOI (Vietnamese) and Hu Zecong (Mandarin)
Editorialist: Kevin Atienza
DIFFICULTY:
Hard
PREREQUISITES:
Range trees, segment trees, functional data structures, persistent segment trees
PROBLEM:
You are given an array A[1\ldots N] of positive integers. You have to answer Q queries on it of the following type:
- Given l, r and k, let S denote the sorted list of distinct elements of A[l\ldots r]. What is the k-th number in this list? If it doesn’t exist, output -1.
Additionally, each query partly depends on the answer to the previous query, so you’re forced to answer the queries in the given order (except in subtask 3).
QUICK EXPLANATION:
First, normalize and compress the values of A[1\ldots N], so they are in \{0, 1, \ldots, N-1\}.
The answer to a query is the smallest V such that A[l\ldots r] has at least k distinct values \le V, so we can find it using binary search. We just need to answer the following question:
Given l, r, V, how many distinct elements in A[l\ldots r] are \le V? This can be turned into a geometry query. First, we define N points \{(x_i,y_i,z_i) : 1 \le i \le N\}, where (x_i,y_i,z_i) is defined as follows: Suppose A[i] = v, and let j be the index of the previous occurrence of v, i.e. the largest j < i with A[j] = v. If no such j exists, set j = 0. Then (x_i,y_i,z_i) = (v,j,i).
The answer to the query above is now the number of points (x_i,y_i,z_i) such that x_i \le V, y_i < l and l \le z_i \le r. Thus, we now have standard geometric range queries! Using 3D range trees, we can preprocess the points in O(N \log^3 N) time and answer each query in O(\log^3 N) time, but together with the binary search each original query costs O(\log^4 N), which might be too slow. Instead, we can use fractional cascading or functional/persistent segment trees to reduce this to O(N \log^2 N) preprocessing time and O(\log^3 N) per query, and with a more clever tree traversal we can even reduce the query time to O(\log^2 N).
EXPLANATION:
Subtask 1
In the first subtask, we have Q\times N \le 10^7, so solutions that answer each query in O(N) time, or even O(N \log N) time, are acceptable. In this case, the straightforward approach works:
- Compute the sorted version of A[l\ldots r].
- Remove the duplicates.
- Find the k-th number in the list, or if the list has fewer than k elements, output -1.
This takes O(N \log N) time in the worst case, so doing this Q times takes O(QN\log N) time. With a reasonable implementation, this will pass the time limit of the first subtask.
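For concreteness, here is a minimal sketch of this brute force in Python (the array and query values below are made up for illustration, and 1-based positions are converted to 0-based slicing):

def answer_query(A, l, r, k):
    # l and r are 1-based inclusive bounds, as in the problem statement
    distinct_sorted = sorted(set(A[l-1:r]))      # sort and remove duplicates
    return distinct_sorted[k-1] if k <= len(distinct_sorted) else -1

A = [3, 1, 4, 1, 5, 9, 2, 6]
print(answer_query(A, 2, 5, 2))   # distinct values of A[2..5] are [1, 4, 5], so this prints 4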
Subtask 2
In the second subtask, we no longer have Q\times N \le 10^7, but instead we have the additional restriction k = 1. It means that we are finding the first element in the sorted list of distinct elements of A[l\ldots r]. Now, regardless of whether there are duplicates or not in A[l\ldots r], this is clearly the same as the minimum element in A[l\ldots r]. In other words, the query reduces to a standard range minimum query, which has lots of possible solutions! Two fast approaches to this are the following:
Approach 1: Construct a segment tree on top of the array, so that each node represents some subarray of A and also contains the minimum among all elements in this subarray. Constructing this tree takes O(N) time, and afterwards, each range minimum query can be answered in O(\log N) time.
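A short sketch of Approach 1 in Python, here implemented bottom-up (iteratively) rather than recursively; the class and variable names are mine, not from a reference solution:

class MinSegTree:
    # array-based segment tree for range minimum queries (a sketch)
    def __init__(self, A):
        self.n = len(A)
        self.t = [0] * (2 * self.n)
        self.t[self.n:] = A                       # leaves hold the array itself
        for i in range(self.n - 1, 0, -1):        # internal nodes hold subarray minimums
            self.t[i] = min(self.t[2 * i], self.t[2 * i + 1])

    def query(self, l, r):
        # minimum of A[l..r], 0-based inclusive
        res, l, r = float("inf"), l + self.n, r + self.n + 1
        while l < r:
            if l & 1: res = min(res, self.t[l]); l += 1
            if r & 1: r -= 1; res = min(res, self.t[r])
            l //= 2; r //= 2
        return res

A = [3, 1, 4, 1, 5, 9, 2, 6]
st = MinSegTree(A)
print(st.query(2, 6))   # min of [4, 1, 5, 9, 2] = 1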
Approach 2: Construct a table of minimums, T[1\ldots N][0 \ldots \log_2 N], where T[i][k] is the minimum element in the subarray A[i\ldots i+2^k-1]. This table can be constructed in O(N \log N) time using dynamic programming. Afterwards, each range minimum query can be answered in O(1) time: The minimum value in A[l\ldots r] is \min(T[l][k], T[r-2^k+1][k]), where 2^k is the largest power of two \le r - l + 1.
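And a sketch of Approach 2 (the sparse table), again in Python with names of my own choosing; note the table is indexed T[k][i] here rather than T[i][k]:

def build_sparse_table(A):
    # T[k][i] = minimum of A[i .. i + 2^k - 1], 0-based indices
    n = len(A)
    T = [A[:]]
    k = 1
    while (1 << k) <= n:
        prev, half = T[k-1], 1 << (k-1)
        T.append([min(prev[i], prev[i + half]) for i in range(n - (1 << k) + 1)])
        k += 1
    return T

def range_min(T, l, r):
    # minimum of A[l..r], 0-based inclusive
    k = (r - l + 1).bit_length() - 1          # largest power of two <= length
    return min(T[k][l], T[k][r - (1 << k) + 1])

A = [3, 1, 4, 1, 5, 9, 2, 6]
T = build_sparse_table(A)
print(range_min(T, 2, 6))   # min of [4, 1, 5, 9, 2] = 1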
Asymptotically faster solutions exist, but they are much more complicated, and in any case they don't apply to the other subtasks, so there isn't much use for them here. You can read about faster solutions here.
Subtasks 3 & 4
Now, let’s move on to answering the hardest subtask, which is the original problem without any additional restrictions. Note that subtask 3 is just an offline version of the problem, which may permit some simpler solutions, but we won’t discuss them here.
Binary search
The first step is realizing that a query can be converted into a form that is suitable for binary search. Each query is equivalent to the following:
Find the lowest value V such that the number of distinct elements of A[l\ldots r] that are \le V is at least k.
This is certainly solvable with binary search for the value V, as long as we can answer the following question efficiently:
Given l, r and V, how many distinct elements of A[l\ldots r] are \le V?
So let’s focus on this query, which now looks more like a geometric range query. Unfortunately, the word “distinct” prevents us from converting this problem directly into a geometric query. Without “distinct”, one fast solution for this problem would be:
- For preprocessing: for each 1 \le i \le N, add the point (x_i,y_i) = (A[i],i).
- For each query (l,r,V): count the number of points in the rectangle [0,V]\times [l,r].
Thus we have reduced each query to 2D geometric queries, for which there are some well-known solutions, for example using 2D range trees.
But since we only need to count distinct elements, we need to be more creative with our query. The solution above counts duplicate entries multiple times, so we must find a solution that doesn’t do that. One way is to only count the first occurrence of each distinct value in the range A[l\ldots r].
How do we accomplish this? Well, for a particular value v, suppose A[i] = v for some l \le i \le r. For A[i] to be the first occurrence of v in A[l\ldots r], the following additional condition must hold:
If j is the largest index < i such that A[j] = v, then j < l. (If A[i] is the first occurrence of v in the whole array, we can set j = 0.)
Thus, we can now refine our geometric query above to the following:
- For preprocessing: for each 1 \le i \le N, let j be the largest index < i such that A[j] = A[i], or j = 0 if no such index exists. Then add the point (x_i,y_i,z_i) = (A[i],j,i).
- For each query (l,r,V): count the number of points in the 3D “rectangle” [0,V]\times [0,l-1]\times [l,r].
Now we have properly converted the query into a 3D geometric range query!
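A sketch of this preprocessing step in Python: the previous-occurrence index j of every position can be computed in a single left-to-right pass (the names and the sample array are mine):

def build_points(A):
    # A[1..N] with A[0] as a dummy, so positions match the 1-based text;
    # returns the points (x, y, z) = (value, previous occurrence of the value, position)
    last = {}                      # value -> index of its most recent occurrence
    points = []
    for i in range(1, len(A)):
        v = A[i]
        j = last.get(v, 0)         # 0 means "no previous occurrence"
        points.append((v, j, i))
        last[v] = i
    return points

A = [None, 2, 0, 2, 1, 0]          # A[1..5] = [2, 0, 2, 1, 0]
print(build_points(A))             # [(2, 0, 1), (0, 0, 2), (2, 1, 3), (1, 0, 4), (0, 2, 5)]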
Another thing: due to the binary search, whatever complexity we get for answering this question will be multiplied by a factor of O(\log A_{\max}). We can reduce this factor to O(\log N) by compressing the values of A to the range [0\ldots N-1]. This preprocessing step can be done in O(N \log N) time; when printing an answer we just need to remember to map it back to the original value.
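Compression itself is a short step; a sketch (the distinct values are mapped to 0, 1, \ldots in sorted order, and the values list lets us translate a compressed answer back):

def compress(A):
    values = sorted(set(A))                       # distinct values in increasing order
    rank = {v: i for i, v in enumerate(values)}   # original value -> compressed value
    return [rank[v] for v in A], values

A = [10, 7, 10, 3]
compressed, values = compress(A)
print(compressed)       # [2, 1, 2, 0]
print(values[2])        # 10, i.e. how to decompress when printing an answer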
Geometric queries
Our goal is to answer 3D range queries. Specifically, given N points \{(x_i,y_i,z_i) : 1 \le i \le N \}, answer the following kind of query:
How many points are in the range [a,b]\times [c,d]\times [e,f]?
The usual way of solving k-dimensional range queries is with k-dimensional range trees. You can think of a 1D range tree as a simple segment tree, and a k-dimensional range tree as a balanced tree of (k-1)-dimensional range trees. Such a structure on N points can be constructed in O(N \log^k N) time, and each k-dimensional “rectangle” query can be answered in O(\log^k N) time, by using the usual segment tree recursion at each dimension of the tree.
The following is pseudocode for a 2D range tree:
// 1D tree
// this is just a normal segment tree
class Tree1D:
    constructor(Y, i, j):
        // Y is the sorted list of values
        // Y[i..j] is the relevant range
        this.i = Y[i]
        this.j = Y[j]
        if i == j:
            // this is a leaf
            this.count = 1
            this.l = this.r = NULL
        else:
            k = (i + j) / 2
            this.l = new Tree1D(Y, i, k)
            this.r = new Tree1D(Y, k+1, j)
            this.count = this.l.count + this.r.count

    range_query(a, b):
        // count the number of elements in range [a,b]
        if a <= this.i and this.j <= b:
            return this.count
        else if this.j < a or b < this.i:
            return 0
        else:
            return this.l.range_query(a, b) + this.r.range_query(a, b)

// 2D tree
class Tree2D:
    constructor(P, i, j):
        // P is the list of points, sorted on "x"
        // P[i..j] is the relevant range
        this.i = P[i].x
        this.j = P[j].x
        Y = sorted list of elements of [P[i].y, P[i+1].y, P[i+2].y, ..., P[j].y]
        this.tree = new Tree1D(Y, 0, Y.length-1)
        if i == j:
            this.l = this.r = NULL
        else:
            k = (i + j) / 2
            this.l = new Tree2D(P, i, k)
            this.r = new Tree2D(P, k+1, j)

    range_query(a, b, c, d):
        // count the number of points in the rectangle [a,b] x [c,d]
        if a <= this.i and this.j <= b:
            return this.tree.range_query(c, d)
        else if this.j < a or b < this.i:
            return 0
        else:
            return this.l.range_query(a, b, c, d) + this.r.range_query(a, b, c, d)
Notice that both classes have very similar implementations, but each node for Tree2D contains a Tree1D which handles the “y” values. You can probably figure out how to construct 3D (or higher-dimensional) range trees from this.
Also, this construction takes O(N \log^2 N) time because of the sorting at each node. Afterwards, each range_query call runs in O(\log^2 N) time.
Unfortunately, in our case, we will need a 3D range tree, where each query takes O(\log^3 N) time. And because we're doing a binary search on top of that, each query for the original problem takes O(\log^4 N) time, which is probably too slow to pass the time limit.
Instead, we will describe two techniques to improve the running time of each 3D range query.
Technique 1: Fractional cascading
We can use fractional cascading to reduce a 3D range query from O(\log^3 N) to O(\log^2 N). We can illustrate this idea using 2D range trees.
We can think of a 2D range tree as a segment tree on the x coordinate, where each node also contains a segment tree, this time on the y coordinate. The first step is to replace the segment trees on y with simple sorted arrays, and replace range queries with simple binary searches.
Now, this doesn't reduce the asymptotic query time by itself, but it does simplify the structure. For example, we can now construct a 2D range tree in O(N \log N) time instead of O(N \log^2 N) by using a merge-sort-like algorithm. Suppose we want to construct a 2D range tree over the points \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}, sorted by x and then by y. Then:
class RangeTree2D:
    constructor(i, j, xi, xj, y_list, left, right):
        this.i = i
        this.j = j
        this.xi = xi
        this.xj = xj
        this.y_list = y_list
        this.left = left
        this.right = right

def construct(P, i, j):
    // constructs the tree over the i'th to the j'th point
    if i == j:
        return new RangeTree2D(i, i, P[i].x, P[i].x, [P[i].y], NULL, NULL)
    else:
        k = (i + j) / 2
        left = construct(P, i, k)
        right = construct(P, k+1, j)
        y_list = merge(left.y_list, right.y_list) // merge sort's "merge"
        return new RangeTree2D(i, j, P[i].x, P[j].x, y_list, left, right)
Notice that instead of sorting y_list per node, we just “merge” the y_lists of the two children in linear time, because they’re already sorted.
Now, here comes the “fractional cascading” part. The reason each range query runs in O(\log^2 N) time is that we perform a binary search at every node we visit. But we can speed this up by noticing that we are binary searching for the same value over and over again, and that the nodes we visit all hang off the root-to-leaf search paths of the query range! The idea is to exploit this fact so that after a single binary search at the root, we never have to perform another binary search.
We do this by setting up lots of pointers. For each node, and for each index i from 0 to y_list.length-1, we keep track of two values:
- The largest index j such that left.y_list[j] <= y_list[i].
- The largest index j such that right.y_list[j] <= y_list[i].
Using these pointers, once we binary search for a value at the root, we never need to binary search for it again at a child: we just follow the pointer! This reduces the 2D range query complexity to O(\log N).
Now, I admit that I skimmed over some details, but I hope you at least get the idea behind it.
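To make the idea a little more concrete, here is a small self-contained sketch in Python of a 2D range tree with fractional cascading. It only answers queries of the form "x in [a,b] and y <= q" (a query on a y-range [c,d] is the difference of two such queries), and instead of explicit pointer arrays it stores, for every prefix of a node's merged y-list, how many of those elements came from the left child. All names are mine, not from a reference solution:

from bisect import bisect_right

class FCNode:
    def __init__(self, pts):                # pts is a nonempty list of (x, y), sorted by x
        self.xlo, self.xhi = pts[0][0], pts[-1][0]
        if len(pts) == 1:
            self.left = self.right = None
            self.ys, self.lcnt = [pts[0][1]], None
        else:
            mid = len(pts) // 2
            self.left, self.right = FCNode(pts[:mid]), FCNode(pts[mid:])
            # merge the children's sorted y-lists, remembering for every prefix
            # how many of its elements came from the left child ("the pointers")
            self.ys, self.lcnt = [], [0]
            i = j = 0
            L, R = self.left.ys, self.right.ys
            while i < len(L) or j < len(R):
                if j == len(R) or (i < len(L) and L[i] <= R[j]):
                    self.ys.append(L[i]); i += 1
                else:
                    self.ys.append(R[j]); j += 1
                self.lcnt.append(i)

    def count(self, a, b, p):
        # number of points with x in [a, b], among the p smallest y's stored at this node
        if p == 0 or b < self.xlo or self.xhi < a:
            return 0
        if a <= self.xlo and self.xhi <= b:
            return p
        # no binary search here: lcnt tells us how p splits between the two children
        return self.left.count(a, b, self.lcnt[p]) + self.right.count(a, b, p - self.lcnt[p])

def query(root, a, b, q):
    # count points with x in [a, b] and y <= q; a single binary search at the root
    return root.count(a, b, bisect_right(root.ys, q))

pts = sorted([(1, 5), (2, 1), (3, 4), (4, 2), (5, 3)])
root = FCNode(pts)
print(query(root, 2, 4, 3))   # the points (2, 1) and (4, 2) qualify, so this prints 2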
By applying this technique to every 2D range tree inside our 3D range tree, each 3D range query now runs in O(\log^2 N) time.
Technique 2: Functional data structures
To improve the running time, we will try to use a different approach, via functional data structures. Functional data structures are data structures that are implemented under the functional programming paradigm. One of the key characteristics of functional programming is that functions don’t have “side effects”, and variables don’t change once assigned. (In fact, the term “variable” becomes a misnomer.) Data structures built under this paradigm are consequently persistent, i.e., older versions of the data structure are still fully available.
To write our structures under this paradigm, we need only ensure that we keep our functions from having side effects, like changing some global variables or attributes. For example, to implement a functional 1D segment tree, instead of updating the attributes when we perform an update, we simply fully copy the nodes that have been updated. On an update, this makes us copy O(\log N) nodes, but the upside is that we still have the old segment tree fully available. Note that the running time of an update and query is still O(\log N)!
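Here is a minimal sketch of such a functional (persistent) segment tree in Python, over positions 0..n-1, supporting "insert a position" and "count inserted positions in a range". Each insert allocates only O(\log n) fresh nodes and returns a new root, while every old root remains a fully usable version; names are mine:

class Node:
    __slots__ = ("left", "right", "count")
    def __init__(self, left=None, right=None, count=0):
        self.left, self.right, self.count = left, right, count

EMPTY = Node()
EMPTY.left = EMPTY.right = EMPTY       # one shared node representing any empty subtree

def insert(node, lo, hi, pos):
    # return a NEW version of the tree over [lo, hi] with pos inserted once more;
    # the old version rooted at `node` is left untouched (no side effects)
    if lo == hi:
        return Node(count=node.count + 1)
    mid = (lo + hi) // 2
    if pos <= mid:
        return Node(insert(node.left, lo, mid, pos), node.right, node.count + 1)
    return Node(node.left, insert(node.right, mid + 1, hi, pos), node.count + 1)

def count_range(node, lo, hi, a, b):
    # number of inserted positions in [a, b] in this version
    if node.count == 0 or b < lo or hi < a:
        return 0
    if a <= lo and hi <= b:
        return node.count
    mid = (lo + hi) // 2
    return count_range(node.left, lo, mid, a, b) + count_range(node.right, mid + 1, hi, a, b)

n = 8
v1 = insert(EMPTY, 0, n - 1, 3)
v2 = insert(v1, 0, n - 1, 5)
print(count_range(v1, 0, n - 1, 0, n - 1))   # 1: the old version is still fully available
print(count_range(v2, 0, n - 1, 3, 5))       # 2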
Once again, we’ll illustrate with 2D range trees. Suppose we want to make a 2D range tree over \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}, and assume this is sorted in x, and then in y.
Each 2D range query counts the number of points in some rectangle [x_1,x_2]\times [y_1,y_2]. If all coordinates are nonnegative, this is equal to the number of points in [0,x_2]\times [y_1,y_2] minus the number of points in [0,x_1-1]\times[y_1,y_2], so we can just assume that x_1 = 0.
If the queries were offline, i.e. we knew all the queries beforehand, then we could answer all query rectangles in increasing order of x_2. The way to do this is to maintain a segment tree on the y values, initially empty. First, we sort all queries and points according to x, and then process them in that order:
- For every point (x_i,y_i), add y_i to the segment tree.
- For every query rectangle [0,x_2]\times [y_1,y_2], count the number of values in the segment tree in the range [y_1,y_2].
In other words, we’re effectively turning the x coordinate into a “time” dimension. It’s not hard to see why this works.
Unfortunately, the queries are not offline, because some queries are dependent on the result of previous queries (note that we’re performing binary search). The solution to this is to use a functional/persistent segment tree to construct all possible versions of the segment tree, by inserting the (x_i,y_i) points in order. Then, for a query rectangle [0,x_2]\times [y_1,y_2], we simply find the version of the segment tree corresponding to x_2, and query that segment tree!
Using this approach for our 2D range tree makes each query O(\log N), and for a 3D range tree, each range query O(\log^2 N).
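For example, reusing EMPTY, insert and count_range from the sketch above, a persistent 2D structure for queries of the form [0, x_2]\times [y_1, y_2] could look like this (still a sketch, not a reference implementation):

from bisect import bisect_right

def build_versions(points, n):
    # points is a list of (x, y) with 0 <= y < n; version t is the tree
    # containing the y's of the t points with the smallest x
    points = sorted(points)
    xs, versions = [], [EMPTY]
    for x, y in points:
        xs.append(x)
        versions.append(insert(versions[-1], 0, n - 1, y))
    return xs, versions

def count_2d(xs, versions, n, x2, y1, y2):
    # number of points with x <= x2 and y1 <= y <= y2
    t = bisect_right(xs, x2)        # how many points have x <= x2
    return count_range(versions[t], 0, n - 1, y1, y2)

pts = [(1, 5), (2, 1), (3, 4), (4, 2), (5, 3)]
xs, versions = build_versions(pts, 8)
print(count_2d(xs, versions, 8, 4, 1, 3))   # the points (2, 1) and (4, 2) qualify, so 2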
Refined “binary search”
The techniques above allow us to reduce the running time of a 3D range query from O(\log^3 N) to O(\log^2 N), but with binary search, our running time is still O(\log^3 N), which might still be too slow. Our goal is to reduce this further.
In fact, we can do this by performing a more clever version of our binary search! The key insight is that we are binary searching over the first coordinate, x: Remember that we are finding the lowest value V such that [0,V]\times [0,l-1]\times [l,r] has at least k points.
Our original approach is the following:
// tree3d is our 3d range tree
def find_answer(l, r, k):
    V_low = -1
    V_high = N
    while V_high - V_low > 1:
        V_mid = (V_low + V_high) / 2
        count = tree3d.range_query_3d(0,V_mid, 0,l-1, l,r)
        if count >= k:
            V_high = V_mid
        else:
            V_low = V_mid
    return V_high
which is just a normal binary search. However, we can use the structure itself as our guide for binary search! Here’s an illustration:
// here, ".tree2d" is the underlying 2d range tree of a 3d range tree
def find_answer(l,r,k):
curr = tree3d
while curr.xi != curr.xj:
count = curr.left.tree2d.range_query_2d(0,l-1, l,r)
if count >= k:
curr = curr.left
else:
curr = curr.right
k -= count
return curr.xi
To help you understand this better: this is the same sort of technique that allows us to find the k-th smallest value in a normal binary search tree, as long as we keep track of the size of the subtree at each node.
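For reference, that classical technique looks like this in isolation (a sketch; left, right, key and size are hypothetical node fields, with size counting the keys in the node's subtree):

def kth_smallest(node, k):
    # assumes 1 <= k <= node.size
    left_size = node.left.size if node.left else 0
    if k <= left_size:
        return kth_smallest(node.left, k)
    if k == left_size + 1:
        return node.key
    return kth_smallest(node.right, k - left_size - 1)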
Since each 2D range query runs in O(\log N) time, and we perform O(\log N) of those queries, the overall running time reduces to O(\log^2 N) time, which is good enough!
Time Complexity:
O((N + Q)\log^2 N)