PROBLEM LINK:
Author: Kanstantsin Sokal
Tester: Pavel Sheftelevich
Editorialist: Pawel Kacprzak
DIFFICULTY:
Medium
PREREQUISITES:
Dynamic programming, Segment tree, Fenwick tree
PROBLEM:
In this problem, for a given integer sequence S of length N, your task is to count the number of non-empty subsequences of S with exactly A local minimums and exactly B local maximums.
QUICK EXPLANATION:
Use dynamic programming in order to compute the number of non-empty sequences S[1…j] with exactly a local minimums and exactly b local minimums for all 1 \leq j \leq N, 0 \leq a \leq A, 0 \leq b \leq B. Next, speed up the computations using Segment tree or Fenwick tree. Finally, notice that A and B differ by at most 1, because all elements in S are distinct. Use this fact to optimize the solution further in order to pass the last subtask.
EXPLANATION:
SUBTASK 1
In the first subtask N is at most 20, so in order to solve the problem, we can simply generate all possible 2^N - 1 non-empty subsequences of S, and for each one, count the number of local minimums and local maximums in it using a linear scan over its element. This approach leads to O(2^N \cdot N) running time for a single test case, which will pass the first subtask if only it’s implemented efficiently.
SUBTASK 2
In the second subtask N can be at most 200. This is of course way too big for any exponential method used in the first subtask to pass. One thing you should notice is that A and B are not so big here - both are not greater than 10. We can use this fact to come up with the following dynamic programming approach:
Let dp[i][a][b][0] be the number of subsequences of S[1..i] ending in S[i] with exactly a local minimums, exactly b local maximums and the last element smaller than the one before last element. Similarly, let dp[i][a][b][1] be the number of subsequences of S[1..i] ending in S[i] with exactly a local minimums, exactly b local maximums and the last element greater than the one before last element. With these values defined, we can easily show how to compute any dp entry with first index i having all the dp entries for first indexes j < i computed.
Let’s fix index i. We are going to iterate over all indices 1 \leq j < i. For a fixed j, there are two possibilities: either S[j] < S[i] or S[j] > S[i]. Since all elements of S are distinct, equality is not possible here.
If S[j] < S[i], we can create a subsequence of length 2 using just S[j] and S[i], so we add 1 to dp[i][0][0][1]. Similarly, if S[j] > S[i], we add 1 to dp[i][0][0][0].
Next, we are going to extend any previously counted subsequences of length at least 2 to subsequences ending with S[i]. In order to do this, we iterate over all possible values 0 \leq a \leq A and over all possible values 0 \leq b \leq B. Again, there are two cases to consider:
If S[j] < S[i], we add value of dp[j][a][b][1] to dp[i][a][b][1]. Moreover, if a < A, we add dp[j][a][b][0] to dp[i][a + 1][b][1] creating new local minimum at index j.
Analogously, if S[j] > S[i], we add value of dp[j][a][b][0] to dp[i][a][b][0]. Moreover, if b < B, we add dp[j][a][b][1] to dp[i][a][b + 1][0] creating new local maximum at index j.
In order to compute the final result, we sum up all dp[i][A][B][0] and dp[i][A][B][0] values over all 1 \leq i \leq N. At the end, if A = 0 and B = 0, we add N to the result in order to count all subsequences of length 1.
Last but not least, remember to do all necessary computations modulo 10^9 + 9.
The total time complexity of this method is O(N^2 \cdot A \cdot B), because we fill O(N \cdot A \cdot B) dp entries, and we fill a single one in O(N) time.
SUBTASK 3
In the third subtask, A and B are still not greater than 10, but N can be now as big as 5000. It would be nice to get rid of one N factor from the complexity of solution given for the second subtask. It turns out that we can do this using a very natural an common idea of improving dynamic programming solutions using segment trees (or any similar data structure like Fenwick tree).
In the solution given above, we update a single entry in the dp table in O(N) time. Now, we are going to do this in O(\log N) time. Notice, that in the original dp solution for a fixed i, we iterate over all possible 1 \leq j < i in order to compute the dp entries for index i. There were two cases to consider depending on whether S[j] < S[i] or S[j] > S[i].
Here comes a crucial observation: if we accumulate all the values of dp[j][a][b][0] for all j such that S[j] < S[i] and do the same with all values dp[j][a][b][1] for such j, we can easily compute the value of dp[i][a][b][1] and dp[i][a + 1][b][1] using these accumulated sums. Similarly, we can speed up the computation of dp[i][a][b][0] and dp[i][a][b + 1][0] if we accumulate all dp[j][a][b][0] for all j such that S[j] > S[i] and if we do the same with all dp[j][a][b][1] for such j.
Notice that we can maintain these sums using a segment tree with the sum over a contiguous query (or even just prefix and suffix sum queries) and update operation adding a value at the given index. One thing left to do is to compress the input values in order to easily store the segment tree in memory - their absolute values are initially less or equal to 10^9. We can compress them easily to a range 1..N which helps a lot here.
The total time complexity of this approach is O(N \cdot \log N \cdot A \cdot B).
SUBTASK 4
In the last subtask, we still have N \leq 5000, but A and B can now be as big as 200, so the solution used for the second subtask is too slow now and we need a further improvement.
Let’s consider any subsequence P of S and take a closer look at the difference between local minimums in P and local maximums in it. Let’s denote this difference as D. The crucial observation here is that since all elements of S are distinct, D can be only -1, 0 or 1.
In order to prove that, let’s consider any subsequence P with 1 local minimum and 0 local maximums. Let i be the index which is the local minimum. Notice that P is an increasing subsequence from index i, so any other subsequence W which have P as a prefix will be either increasing from index i till the end, or it have two adjacent indexes i_{j} and i_{j+1} such that S[i_j] > S[i_{j+1}], so the value D for W never grows over 1. You can use the same argument to show why the value of D is never less than -1.
We can use this fact in order to speed up the solution given for the third subtask. Notice that now we only need to take care of dp states for which we have 0 \leq a \leq A local minimums and for each such a there are at most 3 possible values of b - the number of local maximums: the only possible ones are a-1, a and a+1. This observation leads to a solution with total time complexity O(N \cdot \log N \cdot A).
AUTHOR’S AND TESTER’S SOLUTIONS:
Author’s solution can be found here.
Tester’s solution can be found here.