PROBLEM LINK:
Author: Kamil Debowski
Primary Tester: Marek Sokołowski
Secondary Tester: Praveen Dhinwa
Editorialist: Praveen Dhinwa
DIFFICULTY:
simple
PREREQUISITES:
combinatorics
PROBLEM:
You are provided a string s of length N. Suppose t be the string formed by concatenating the string s K times, i.e. t = s + s + \dots + s (K times). We want to find the number of occurrences of subsequence “ab” in it.
Finding number of subsequences “ab” in a given string s.
Finding number of subsequences “ab” in a given string s is same as finding number of pair of indices (i, j), i < j such that s_i = ‘a’ and s_j = ‘b’. The brute force way of iterating over all such pairs of indices i, j and checking the conditions s_i = ‘a’ and s_j = ‘b’ would be \mathcal{O}({|s|}^2).
However, you can do better. In fact, you can find this in a single pass over the string in \mathcal{O}(|s|) time. Consider an index j such that s_j = ‘b’, suppose we want to find number of i's such that i < j and s_i = ‘a’. It will be same as number of occurrences of character ‘a’ till position j. We can maintain the count of a’s by iterating over the array from left to right. This way, we will be able to find the answer in single iteration over the string s in \mathcal{O}(|s|) time. Pseudo code follows.
cnta = 0;
ans = 0;
for i = 1 to |s|:
if s[i] == a:
cnta++;
if s[i] == b:
ans += cnta;
Can we use this idea directly?
Now we construct the string t and find the number of subsequences “ab” in it in \mathcal{O}(|t|) time, which will be \mathcal{O}(N \cdot K). Constraints of the problem say that N \cdot K could go up to 10^9. So, this won’t work in time.
Towards a counting based solution idea.
Let t = s_1 + s_2 + s_3 + \dots + s_K, where s_i = s. We call s_i the i-th occurrence of string s.
Let us view the occurrences of subsequence “ab” in t as follows. One of the below two cases can happen.
-
“ab” lies strictly inside some occurrence of string s in t, i.e. “ab” lies strictly inside some s_i. We can find the number of occurrences of “ab” in s. Let us denote it by C. The total number of occurrences “ab” in this case will be C \cdot K.
-
In the other case, “a” lies inside some string s_i, whereas “b” lies in some other string s_j such that i < j. Finding the number of occurrences of “ab” in this case will be same as choosing the two strings s_i, s_j (\binom{K}{2} ways), and multiplying it by the number of occurrences of “a” in s_i (denote by cnt_a) and the number of occurrences of “b” in s_j (denote by cnt_b), i.e. \binom{K}{2} \cdot cnt_a \cdot cnt_b. As s_i = s, cnt_a will be number of occurrences of “a” in s and cnt_b will be the number of occurrences of “b” in s.
We can find C, cnt_a, cnt_b in \mathcal{O}(N) time. Thus, overall time complexity of this approach is \mathcal{O}(N) which will easily pass in time for N \leq 10^5. Pseudo code follows.
cnta = 0
cntb = 0;
C = 0;
for i = 1 to N:
if s[i] == 'a':
cnta++;
if s[i] == 'b':
cntb++;
C += cnta;
// First case, i.e. "ab" lies strictly inside some occurences of s, i.e. s_i in t
ans = C * K
// The other case, i.e. "a" lies inside some string s_i, where as "b" lies in other string s_j such that i < j
ans += K * (K - 1) / 2 * cnta * cntb ;