help needed for solving string and sub-sequences problem?

You are given a String S of length N.

Now, a good subsequence is one that can be represented in the form aibjck, where
i≥1,
j≥1
and
k≥1.
For example ,if
i=2, j=1,k=3, it represents the string aabccc. In short, a good sub-sequence is a sub-sequence that first consist of i ′a′ characters, followed by j ′b′ characters, followed by k ′c′ characters.
Now, you need to find the number of good sub-sequences of S. As the number of such sub-sequences could be rather large, print the answer Modulo 109+7.

Note1: Two subsequences are considered different if the set of array indexes picked for the 2 subsequences are different.

Input Format:

The first and only line of input contains S.

Output Format :

Print the required answer on a single line.

Constraints:

1≤|S|≤105

Sample Input

abcabc

Sample Output

7

2 Likes

Going from left to right, keep track of the number of sequences until the current position, which are of these three forms:

  1. a^i
  2. a^i b^j
  3. a^i b^j c^k
    Suppose that these are stored in three variables a,b,c respectively. Whenever you see the character ‘a’, it can extend the strings of type 1, and also, be the starting position for a string of type 1, so a+=(a+1), whenever you see a ‘b’, it can extend previous strings of type 1 and 2, so b+=(a+b), for ‘c’, it will extend all strings of type 2 and 3, so c+=(b+c).
1 Like

@hemanth_1
can you explain implementation details.

Take three variables, a,b and c, and then iterate over every character in the string, and for each character, if the character is a, then a+=(a+1), if its b then b+=(a+b) and if its c then c+=(b+c)

For the purpose of the answer let’s define:

  • \textbf{abc-good} string is a string in the form a^i b^j c^k, for i, j, k \geq 1
  • \textbf{ab-good} string is a string in the form a^i b^j, for i, j \geq 1
  • \textbf{a-good} string is a string in the form a^i, for i \geq 1

We are interested in the number of \textbf{abc-good} subsequences of the input string S[1 \ldots N].

Recursive approach

Let f_{abc}[x] be the number of \text{abc-good} subsequences of S[1 \ldots x].

The crucial step towards a solution is to notice that any \text{abc-good} subsequence of S[1\ldots x] is either any \text{abc-good} subsequence of S[1\ldots x-1], or if S[x] = c it is any \text{abc-good} subsequence of S[1 \ldots x-1] extended with S[x], or if S[x] = c it is any \textbf{ab-good} subsequence of S[x-1].

Now, let f_{ab}[x] be the number of \text{ab-good} subsequences of S[1 \ldots x]. The value of f_{ab}[x] can be computed similarly to f_{abc}[x], i.e. any \text{ab-good} subsequence of S[1\ldots x] is either any \text{ab-good} subsequence of S[1\ldots x-1], or if S[x] = b it is any \text{ab-good} subsequence of S[1 \ldots x-1] extended with S[x], or if S[x] = b it is any \textbf{a-good} subsequence of S[x-1].

Lastly, let f_{a}[x] be the number of \text{a-good} subsequences of S[1 \ldots x]. Now, any \text{a-good} subsequence of S[1 \ldots x] is either any \text{a-good} subsequence of S[1 \ldots x-1], or if S[x] = a it is any \text{a-good} subsequence of S[1 \ldots x-1] extended with S[x], or if S[x] = a it is an \textbf{unique empty} subsequence of S[1 \ldots x-1] extended with S[x] forming \text{a-good} subsequence of length 1.

Notice that using the above definitions we count all wanted subsequences and we count each one exactly once. We are ready now to formulate a recursive formula for f_{abc}[N] based on the above observations:

f_{abc}[N] = \begin{cases} f_{abc}[N-1] + f_{abc}[N-1] + f_{ab}[N-1], & \text{if } S[N] = c\\ f_{abc}[N-1] & \text{otherwise} \end{cases}

Similarly you can define recursive relations for f_{ab}[N] and f_{a}[N]. Notice that base cases for them are f_{abc}[0] = f_{ab}[0] = f_{a}[0] = 0, because there is no any good subsequence of the empty string since, by the definition, all these good strings have positive lengths.

Speeding it up with dynamic programming

Solving the problem top-down by simply following the recursive relation will cause computation of the same subproblems many times leading to exponential time complexity, which is highly unwanted. This can be avoided by using memoization (just store already computed values in some auxiliary data structure and get their values from there in constant time), or by using bottom-up dynamic programming approach. Let’s take a look how the values of f_{a} can be computed using dynamic programming.

Let’s declare an array f_{a} with N+1 entries. At the beginning we assign f_{a}[0] = 0. Now, we iterate for x = 1 to N and compute f_{a}[x] following the recursive relation and using just already computed values:

for x = 1 to N:
   f_a[x] = f_a[x-1]
   if S[x] == ‘a’:
      f_a[x] += f_a[x-1]
      f_a[x] += 1

Now, having f_{a} table computed, we can compute f_{ab} table in a similar fashion. Finally, having f_{ab} table computed, we can compute f_{abc} table and return the final result as f_{abc}[N].

6 Likes

@pkacprzak

Very nice explanation, can you explain what are the overlapping sub problems in this case and also the recursive brute force solution.