ABREPEAT - Editorial

dpraveen · April 23, 2017, 8:17pm

PROBLEM LINK:

Author: Kamil Debowski
Primary Tester: Marek Sokołowski
Secondary Tester: Praveen Dhinwa
Editorialist: Praveen Dhinwa

DIFFICULTY:

simple

PREREQUISITES:

combinatorics

PROBLEM:

You are provided a string s of length N. Suppose t be the string formed by concatenating the string s K times, i.e. t = s + s + \dots + s (K times). We want to find the number of occurrences of subsequence “ab” in it.

Finding number of subsequences “ab” in a given string s.

Finding number of subsequences “ab” in a given string s is same as finding number of pair of indices (i, j), i < j such that s_i = ‘a’ and s_j = ‘b’. The brute force way of iterating over all such pairs of indices i, j and checking the conditions s_i = ‘a’ and s_j = ‘b’ would be \mathcal{O}({|s|}^2).

However, you can do better. In fact, you can find this in a single pass over the string in \mathcal{O}(|s|) time. Consider an index j such that s_j = ‘b’, suppose we want to find number of i's such that i < j and s_i = ‘a’. It will be same as number of occurrences of character ‘a’ till position j. We can maintain the count of a’s by iterating over the array from left to right. This way, we will be able to find the answer in single iteration over the string s in \mathcal{O}(|s|) time. Pseudo code follows.

cnta = 0;
ans = 0;
for i = 1 to |s|:
    if s[i] == a:
        cnta++;
    if s[i] == b:
        ans += cnta;

Can we use this idea directly?

Now we construct the string t and find the number of subsequences “ab” in it in \mathcal{O}(|t|) time, which will be \mathcal{O}(N \cdot K). Constraints of the problem say that N \cdot K could go up to 10^9. So, this won’t work in time.

Towards a counting based solution idea.

Let t = s_1 + s_2 + s_3 + \dots + s_K, where s_i = s. We call s_i the i-th occurrence of string s.

Let us view the occurrences of subsequence “ab” in t as follows. One of the below two cases can happen.

“ab” lies strictly inside some occurrence of string s in t, i.e. “ab” lies strictly inside some s_i. We can find the number of occurrences of “ab” in s. Let us denote it by C. The total number of occurrences “ab” in this case will be C \cdot K.
In the other case, “a” lies inside some string s_i, whereas “b” lies in some other string s_j such that i < j. Finding the number of occurrences of “ab” in this case will be same as choosing the two strings s_i, s_j (\binom{K}{2} ways), and multiplying it by the number of occurrences of “a” in s_i (denote by cnt_a) and the number of occurrences of “b” in s_j (denote by cnt_b), i.e. \binom{K}{2} \cdot cnt_a \cdot cnt_b. As s_i = s, cnt_a will be number of occurrences of “a” in s and cnt_b will be the number of occurrences of “b” in s.

We can find C, cnt_a, cnt_b in \mathcal{O}(N) time. Thus, overall time complexity of this approach is \mathcal{O}(N) which will easily pass in time for N \leq 10^5. Pseudo code follows.

cnta = 0
cntb = 0;
C = 0;
for i = 1 to N:
    if s[i] == 'a':
        cnta++;
    if s[i] == 'b':
        cntb++;
        C += cnta;
// First case, i.e. "ab" lies strictly inside some occurences of s, i.e. s_i in t 
ans = C * K 
// The other case, i.e. "a" lies inside some string s_i, where as "b" lies in other string s_j such that i < j
ans += K * (K - 1) / 2 * cnta * cntb ;

AUTHOR’S AND TESTER’S SOLUTIONS:

Setter
Tester1
Tester2

akshaycs · April 24, 2017, 12:57am

#include<stdio.h>
int main()
{ long long int t,i,j,k,sum,x;
scanf("%lld",&t);
while(t–)
{
long long int a,b,y;
scanf("%lld%lld",&a,&b);
char ch[a+1];
long long int ab[a];
scanf("%s",ch);
x=0;
for(i=0;i<a;i++)
if(ch[i]==‘a’)
{
ab[x]=0;
for(j=i;j<a;j++)
if(ch[j]==‘b’)
ab[x]++;

         x++;	
	 }
	 
	 
sum=0;	 
for(i=0;i<x;i++)
sum=sum+ab[i];	
 
 
k=b*(b-1)/2;	
 
if(k>0) 	
y=b*sum+x*ab[0]*k;	
else
y=sum;	
 printf("%lld\n",y);	
 }
return 0;
}

plss help me w

akash_321 · April 24, 2017, 1:37am

#include 

int main(void){
	int t ;

	scanf("%d",&t);

	for(int i =0; i<t; i++){

		unsigned long long int n,k;
	
	scanf("%d%d",&n,&k);
	
	char s[n+1];
	
	scanf("%s",s);
	
	int j=0,a= 0,b =0, sum = 0;
	
	while(s[j] != NULL){
	
		if(s[j] == 'a'){
	
			a++;
	
		}
	
		else if(s[j] == 'b'){
	
			sum = sum + a;
	
			b++;
	
		}
	
		j++;
	
	}
	
	unsigned long long int fin;
	
	fin = k*(k-1)/2*a*b+ k*sum;
	
	printf("%llu\n",fin);	
	
}
	
return 0;

}

```


same solution as of editorial but showing wrong answer.

vijju123 · April 24, 2017, 1:46am

Fixed formatting…again.

vijju123 · April 24, 2017, 1:50am

There is a big blunder in your code. Its HERE-

scanf("%d%d",&n,&k);

n and k aren’t int, they are usigned long long int. Due to wrong string/format specifier (i.e. “%d”) they, n and k were being assigned garbage values.

c0d3_k1ra · April 24, 2017, 1:56am

I had a bit of different idea and that is Arithmetic progression. For example, if a string(without repeating) has x number of ‘ab’ sequences, then that is dependent on position of each ‘b’ and number of 'a’s before it. Let us say the string is ‘ababaab’. There are 3 ‘b’ and 4 ‘a’. The first ‘b’ has 1 a, 2nd b has 2 a and 3rd b has 4 a. Which comes to a total of 1+2+4 = 7.

Now if we add another string to it the new string becomes ‘ababaabababaab’.

The point to be noted is the number of a will be added constantly. So for new b added the values become 1+4 = 5, 2+4 = 6 , 4+4 = 8. Which comes out to 5+6+8 = 19.

Similarly after adding the string 3rd, 4th and 5th time the ‘ab’ sequences are 31, 43 and 55.

See the pattern yet?

Every time we are adding the string one more time the total sequences increase by 12. it is no. Of a multiplied by no. of b in original string. (Why?)

So if we add string K times then the problem simplifies to finding sum of k elements of a A.P. whose first term is A(no. Of ab sequences in original string) and difference is D(#a * #b).

akash_321 · April 24, 2017, 1:56am

But still it is showing wrong answer.

vijju123 · April 24, 2017, 1:58am

I used the same. AP gave a very nice solution to me!

vijju123 · April 24, 2017, 2:04am

Give me time to debug.

akash_321 · April 24, 2017, 2:18am

correct answer is 5 for this test case not 3.

akash_321 · April 24, 2017, 2:31am

2 ‘ab’ pairs for first ‘b’ and 3 ‘ab’ pairs for second ‘b’.
You can verify by trying it in setter’s solution.

vijju123 · April 24, 2017, 2:36am

Yes, i got that. My bad. This is what happens when compres keep you sleep deprived haha XD

vijju123 · April 24, 2017, 3:14am

Got AC for your code. You are suffering from overflow.

Changed-

int j=0,a= 0,b =0, sum = 0;

to

long long int j=0,a= 0,b =0, sum = 0;

Link- https://www.codechef.com/viewsolution/13382195

Reason for error- Overflow caused program to behave unpredictable. and gave error when judge ran it.

shbhm_53 · April 24, 2017, 3:51am

A simple approach to the problem. Here’s my code.

int main() 
{
    int t;
    cin>>t;
    while(t--)
    {
        ll n, k;
        cin>>n>>k;
        string s;
        cin>>s;
        int i;
        vector<int> v(n,0);
        for(i=n-1; i>=0; i++)
        {
            if(s[i]=='b')
            v[i]=1;
            if(i<n-1)
            v[i]+=v[i+1];
        }
        ll sum=0, a=0;
        for(i=0; i<n; i++)
        {
            if(s[i]=='a')
            {
                a++;
                sum+=(v[0]-v[i]);
            }
        }
        ll ans=a*(k*(k+1)*v[0])/2 - sum*k;
        cout<<ans<<endl;
    }
    return 0;
}

akash_321 · April 24, 2017, 3:57am

Thanks a lot bro…

vaitan · April 24, 2017, 11:25am

My approach is -:
Given s - abcb and k=2 —> (abcbabcb) —>t=s1 + s2

Considering only s, for a the count of b is 2. This will be repeated k times. So , for ‘a’ in s1 b is k times in s. Similarly for a in s2, k=1 and only 2 b is accounted. Generalizing, it comes down to the formula, (for each a in s) :

sum+=(Cb * k)+(Cb * (k-1)) + (Cb * (k-2)) +…(Cb * 1);

where Cb is the number of ‘b’ ahead of given ‘a’
For given s=abcb and k=2 and ‘a’ at position 1 : sum+= (2 * 2) + (2 * 1)=4+2=6

But i didnt get correct answers for the last test case, where s=abzbabzbazab and k=80123123 :
To summarize my approach, i can break down my formula to this :
Count of number of pair of ‘ab’ in s :-- Cab;

Sum= Cab (k(k+1)/2;=10*(80123123 * 80123124)/2----this does not give me the correct answer.

I cant figure out, what did i miss in here.

vijju123 · April 24, 2017, 11:40am

If we repeat “aba” 2 times, it becomes abaaba. Your formula gives answer as-

Sum = 1 x (2 x 3/2)= 3. (If I get your approach correctly) and hence your code fails. Try to account of trailing 'a’s in the end.

arpit728 · April 24, 2017, 12:04pm

@vijju123 @dpraveen

Why my solution is not working
here is my approach:
what I did is I calculated the no. of occurrences of sub-sequence “ab” in s and say it x.
Then ans will be ans=x*(k*(k+1)/2).

vijju123 · April 24, 2017, 12:51pm

@arpit728 How will you deal with case of-

“baa” repeated twice == “baabaa”

Your x is 0, so your ans reduces to 0, but there are occurances of ab in the final form.

The concept of trailing 'a’s has actually given a headache to many. As a hint I will say that there is a A.P. perspective possible to the problem. The number of “ab” formed by a particular a in FINAL string is an AP with common difference = No. of b in original string.

Summing for all a, we get the answer.

ezio27 · April 24, 2017, 1:13pm

Watch this code

Ezio’s Python

Program

t=int(input())
while t>0:
count=0
t=t-1
n,k=input().split()
n,k=int(n),int(k)
a=list(input())
a=a*k
for i in range(len(a)):
if a[i]==‘a’:
for j in range(i,len(a)):
if a[j]==‘b’:
count=count+1
print(count)