PROBLEM LINK:
Author: Sunny Aggarwal
Tester: Pushkar Mishra
Editorialist: Florin Chirica
PROBLEM
You’re given a string containing only letters c h e and f. You have to answer queries asking how many words start in a given letter and end in other letter (necessarily different by first one) considering only positions from the string between two given numbers.
QUICK EXPLANATION
Let’s calculate ways[A][i] = in how many ways can we choose a word with beginning letter A, ending letter B and its ending position up to i. Also, let cnt[A][i] = how many letters A appear in the first i characters. This information is enough to answer queries in O(1): for two characters A and B and two positions i and j, the answer is ways[A][j] - ways[A]**[i - 1] - cntA * cntB, where cntA = number of characters equal to A from [1…i - 1] and cntB = number of characters equal to B from [i…j].
EXPLANATION
Number of letters is small
A first thing we should observe is the number of letters available is 4 (c h e and f). This low number of letters will help us to solve the problem with a simple and efficient solution.
The idea with having 4 letters is that we can precalculate something, then we can answer all queries in constant time.
Let’s inspect how queries can look like. We start by looking how beginning and ending of a good substring can look like. There are only 12 possibilities.
- (start) c (end) h
- (start) c (end) e
- (start) c (end) f
- (start) h (end) c
- (start) h (end) e
- (start) h (end) f
- (start) e (end) c
- (start) e (end) h
- (start) e (end) f
- (start) f (end) c
- (start) f (end) h
- (start) f (end) e
Since there are only 12 possibilities, we can iterate over them and store something for each configuration. For a configuration, what we computed should be enough to solve all queries that correspond to that configuration.
For a fixed start and end letter
By now we can assume that our start and end letters are fixed (using the iteration). Let’s denote by A the start letter and by B the end letter.
What we need to do is to answer queries: for a subarray [i…j], how many indices i’, j’ exist, such as array[i’] = A and array[j’] = B, with i <= i’ <= j’ <= j.
The trick here is to solve an easier problem firstly: suppose we only had to count pairs (i’, j’) such as i’ <= j’ <= j (we erased i condition). Now the task can be solved by a simple precalculation.
Let ways[A]**[i] = number of ways to choose i’ <= j’ <= i such as array[i’] = A and array[j’] = B. We have two choices for j’.
Choice #1: Let j’ < i. All pairs (i’, j’) such as j’ < i are already calculated in ways[A]**[i - 1], so we can simply add this number and treat all case.
Choice #2: Let j’ = i. Obviously, for this to happen we’re forced to have array[i] = B. What’s left is to find positions i’ such as array[i’] = A and i’ < i. This is easily done by keeping a partial sum for each letter, something like sum[let][i] = how many times does letter let appear in positions [1…i].
Hence, the answer is in ways[A]**[j].
Now let’s consider full restrictions. In reality we have i <= i’ <= j’ <= j. Let’s see what elements are counted above, but should not be counted.
There are two cases
Case #1: i’ < j’ < i <= j
Case #2: i’ < i < j’ <= j
We have to erase from our result the results for those cases.
Case #1 is simple to calculate, as its answer is simply ways[A]**[i - 1].
Case #2 basically asks us to find a letter A in range [1…i - 1] and a letter B in range [i…j]. Let cntA the number of letters A in range [1…i-1] and cntB the number of letters B in range [i…j]. The answer is simply cntA * cntB. The values of cntA and cntB can be calculated by partial sums.
Time Complexity
The computation of ways takes O(n) time (with constant 12). Also, partial sums also take O(n) time. Then, a query is answered in O(1) as stated above.