PROBLEM LINK:
Author: Sergey Kulik
Tester: Vlad Mosko
Editorialist: Pushkar Mishra
DIFFICULTY:
Medium-Hard
PREREQUISITES:
DP, Data Structures
PROBLEM:
Given two arrays A and B of lengths N and M (M \leq N) respectively, output the number of subsequences C of A which are of length M such that C[i]+B[i] for i = 1..M forms a non-decreasing subsequence. Note that all arrays are 1-indexed.
EXPLANATION:
Subtask 1
For this case, we are given that N = 50 and M = 5, we can iterate over all possible subsequences of A of length 5 and check which ones form a non-decreasing sequence.
The complexity of this approach is definitely exponential. We can iterate over all the possible subsequences of A of length 5 using recursion. Since, the number of 5 length subsequences can be {N choose M} (counting the number of combinations, the maximum number of operations required would be {50 \choose 5} = 2118760. This will definitely run under the given time constraints.
To avoid recursion, we can simply take an array of length N, fill it up with M 1s and rest 0s, and cycle through its unique permutations using either an inbuilt function like next\_permutation in C++, or a custom function. Wherever 1s appear in a permutation, those elements can be taken as C. We can then check for each permutation whether C[i]+B[i] forms a non-decreasing sequence or not, while keeping a count.
Complexity of this approach is \mathcal{O}(N!).
Subtask 2
For this subtask, we have that N can be as large as 500 and M can be as large as 50. Clearly, the previous approach will not work.
Most of the counting problems can be solved using Dynamic Programming. Below, we formulate the DP.
Let us say that DP[i][j] denotes the number of possible non-decreasing subsequences such that A[i] is taken as the j^{th} number of C. Then we can formulate the recurrence in the following manner:
The base case of the recurrence is clearly DP[0][0] = 1.
DP[i][j] = summation of DP[k][j-1] for all k = 0 to i-1 such that A[k] + B[j-1] \leq A[i] + B[j]. Note that A[0] and B[0] are set to arbitrary negative values so that they are definitely less than anything else in the recurrence.
The final answer is summation of DP[k][M] for k = 1 to N.
The pseudocode for the DP is as follows:
//Setting base conditions.
DP[0][0] = 1;
A[0] = B[0] = -inf;
for i = 1 to N
{
for j = 1 to M
{
DP[i][j] = 0;
for k = 0 to i-1
{
//Checking for non-decreasing criteria
//and accumulating.
if(A[k] + B[j-1] <= A[i] + B[j])
DP[i][j] = DP[i][j] + DP[k][j-1];
}
}
}
final_answer = 0;
for i = 1 to N
{
final_answer = final_answer + DP[i][M];
}
output final_answer;
The complexity of this program is clearly \mathcal{O}(N^2M). For the given constraints, this works well.
Subtask 3
In this subtask, N increases to 2000 and M to 1000. We can’t do the same dynamic programming as before. We need to somehow optimise the \mathcal{O}(N^2M). The step that we can optimise is summing the DP[k][j-1] for all k = 0 to i-1 such that A[k] + B[j-1] \leq A[i] + B[j]. How can we optimise this?
Segment trees! We can keep M segment trees. These segment trees will be used for adding values at particular points in the range 1..2*10^9 and performing range sum queries. The j^{th} segment tree adds the value DP[i][j] to the leaf for the point (A[i] + B[j]). But since (A[i]+B[j]) can be as large as 2*10^9, we can either do a coordinate compression or construct dynamic segment trees. Editorialist’s program uses dynamic segment trees.
Here is a pseudocode for the same:
//Setting base conditions.
A[0] = B[0] = -inf;
DP[0][0] = 1;
//We update the 0th segment tree too.
//The first argument states the tree;
//the second argument gives the point at which
//we need to update. Third argument gives the value
//to be added.
update(0th tree, 0, DP[0][0]);
for i = 1 to N
{
for j = M down to 1
{
//first argument gives the tree,
//second and third specify the range.
DP[i][j] = query((j-1)th tree, 0, A[i] + B[j]);
//Now, we update the jth tree with the value.
update(jth, A[i]+B[j], DP[i][j]);
}
}
final_answer = 0;
for i = 1 to N
{
final_answer = final_answer + DP[i][M];
}
output final_answer;
We have reduced one factor of N down to \log (A_{max}+B_{max}). Therefore, the complexity for this algorithm is \mathcal{O}(NM \log A_{max}).
Editorialist’s program follows the editorial. Please see for implementation details.
Setter uses a different approach to this problem. Setter’s solution uses DP, but a slightly different recurrence. That recurrence allows for the use of Binary Indexed Trees to be employed, thus allowing for a lesser constant of complexity.
OPTIMAL COMPLEXITY:
\mathcal{O}(NM \log A_{max})