Problem Link:
Setter: Misha Chorniy
Tester: Ke Bi
Editorialist: Rashad Mammadov
Difficulty:
MEDIUM
Prerequisites:
Dynamic Programming
Problem:
Given a string - S of length N and K. Find the minimum number of characters to erase from the string in order to make the number of different subsequences "chef" equal to K. The string consists of only characters 'c', 'h', 'e', and 'f'.
Explanation:
Let’s rewrite the statement as " Find the longest subsequence of S, in which there are exactly K different subsequences “chef”. "
The problem can be solved using a dynamic approach. Let dp_{i, chef, che, ch, c} be the length of the longest subsequence of the prefix 1...i of S, in which there are exactly chef number of "chef", che number "che", ch number of "ch", and c number of "c" subsequences.
Now, we need to find the transition between the elements of dp array. Let’s figure out it as follows:
To find the longest subsequence of the prefix 1...(i + 1), we only need to consider the prefix 1...i. So that we may add the (i + 1)^{th} character to its end or not.
1. In case of ignoring (i + 1)^{th} character, we have the same subsequence that we have for the previous prefix. So, we need the following update:
- dp_{i + 1, chef, che, ch, c} = max(dp_{i + 1, chef, che, ch, c}, dp_{i, chef, che, ch, c})
2. In case of adding (i + 1)^{th} character, we have the cases below:
2.1 If s_{i + 1} = 'c', then we get 1 more "c" sequence:
- dp_{i + 1, chef, che, ch, min(c + 1, K + 1)} = max(dp_{i + 1, chef, che, ch, min(c + 1, K + 1)}, dp_{i, chef, che, ch, c})
2.2 If s_{i + 1} = 'h', then we get c more "ch" sequences:
- dp_{i + 1, chef, che, min(ch + c, K + 1), c} = max(dp_{i + 1, chef, che, min(ch + c, K + 1), c}, dp_{i, chef, che, ch, c})
2.3 If s_{i + 1} = 'e', then we get ch more "che" sequences:
- dp_{i + 1, chef, min(che + ch, K + 1), ch, c} = max(dp_{i + 1, chef, min(che + ch, K + 1), ch, c}, dp_{i, chef, che, ch, c})
2.4 If s_{i + 1} = 'f', then we get che more "chef" sequences:
- dp_{i + 1, min(chef + che, K + 1), che, ch, c} = max(dp_{i + 1, min(chef + che, K + 1), che, ch, c}, dp_{i, chef, che, ch, c})
Note: If the numbers of sequences overflow K, then we may consider all of them as K + 1. There is no needed to store their exact count because it has no effect on the answer.
Finally, the anwer to the problem will be:
N - MAX(dp_{i, chef, che, ch, c}, where: chef = K, che = 1...(K + 1), ch = 1...(K + 1), c = 1...(K + 1)) or -1 if it is not possible to have exactly K subsequences "chef". In order to handle this case, we may fill the dp array with -1's or \infty's initially.
To implement this idea we need to take a 5-dimensional array of N\cdot(K + 1)\cdot(K + 1)\cdot(K + 1) \cdot(K + 1). Using 5 inner loops we can fill this array and transitions can be done in O(1) as shown above. It will cost us O(N * K^4) time and memory. However, we can improve memory complexity to O(K^4), since we only need to memorize answers for the last prefix every time.
Time Complexity:
O(N * K^4), per test case.
Space Complexity:
O(N * K^4) or O(K^4)