GOODPREF -Editorial

PROBLEM LINK:

Div1
Div2
Practice

Setter- Praveen Dhinwa
Tester- Misha Chorniy
Editorialist- Abhishek Pandey

DIFFICULTY:

EASY

PRE-REQUISITES:

Strings, Basic Math, Simulation.

PROBLEM:

Given a string S , with only characters being 'a' and 'b ', find number of good prefixes in the string T which is made by concatenating N copies of S, i.e.-

T = \underbrace{S + S + \dots + S}_{N\text{ times}}

A prefix is a good prefix if and only if number of occurrences of 'a' is strictly greater than the number of occurrences of 'b'.

QUICK EXPLANATION:

We make cases for this problem. If frequency of 'a' is equal to frequency of 'b' in string, then calculating answer is simple- for we can simply use Ans=G(S)*N where G(S) denotes number of Good Prefixes in S.

If frequency of 'a' is more than to frequency of 'b' , we notice that after at most 1000 appendations, each new appendation will contribute maximum possible value only, which is K (K=Length of string) .

Similarly we can see that if frequency of 'a' is less than to frequency of 'b', after 1000 appendations, each new appendation will contribute 0 to answer.

So if frequency of character 'a' is not equal to that of 'b', we make a string T by appending S to it 1000 times, and then use mathematical formulas to find contribution of every other appendation.

EXPLANATION:

As usual, we will be discussing Editorialist’s and tester’s approach. Editorialist’s approach is based on same lines as that of setter- and hence, will form main part of editorial. We will discuss better optimized solutions (tester’s approach) and common mistakes of contestants at the bonus part of editorial :slight_smile:

Before that, please first get acquainted with the notations I will be using.

  • G(S)= Number of Good Prefix in string S
  • freq(a,S)= Frequency of character ‘a’ in string S.
  • freq(b,S)= Frequency of character ‘b’ in string S.
  • {S}^{p}= String formed by concatenation S p times.

1. Editorialist’s/Setter’s Solution-

If you want, you can scroll down to refer to my solution side by side while reading the editorial.

The first thing to note are the constraints. 1\le N \le {10}^{9} while 1\le |S| \le {10}^{3}.

We start by calculating number of Good Prefixes in S , i.e. G(S). Along with that, we also find freq(a,S) and freq(b,S). Now we make cases-

a. When freq(a,S) == freq(b,S)-

In this case, there is no affect of previous appendations of S on incoming/new appendations of S. In other words, when freq(a,S) \neq freq(b,S), then there is some change in value of (freq(a,S)-freq(b,S)) after each appendation of S to T, which affect the contribution of any future appendation of S towards G(T). In most basic words, it means that since freq(a,S) \neq freq(b,S), then number of 'b' in T will not increase at same rate as number of 'a' at each appendation, which will affect contribution of next appendation to the final answer.

This is not the case when freq(a,S) == freq(b,S). Hence, each appendation of S will contribute G(S) towards the final answer. There are N appendations of S and we already found G(S) by simple looping earlier. Hence, for this case, Answer=G(S)*N

b. When freq(a,S) < freq(b,S)-

In this case, because freq(a,S) < freq(b,S), each new appendation of S to T will decrease (freq(a,S)-freq(b,S)). In other words, the number of 'b' in T will increase more quickly than number of 'a', which will reduce the contribution of every future appendation of S towards final answer.

We see that, after every appendation, the contribution of next appendation must reduce by at least 1 , (\because freq(a,S) < freq(b,S)). Remember that length of S is only upto 1000. This simply means, that after at most 1000 appendations, the contribution of incoming appendation reduces to 0. (In fact, 1000 is a loose upper bound, we can go closer to the real value, but for sake of explaining lets take it 1000 right now :slight_smile: )

So, we form the string T by appending string S min(N,1000) times as-

T = \underbrace{S + S + \dots + S}_{min(N,1000)\text{ times}}

and our answer is equal to G(T) as any future appendation has 0 contribution towards final answer.

Answer=G(T)

b. When freq(a,S) > freq(b,S)-

Clearly, each new appendation of S to T will increase (freq(a,S)-freq(b,S)). Thus, the number of 'a' in T will increase more quickly than number of 'b', which will increase the contribution of every future appendation of S towards final answer.

The maximum possible contribution of an appendation to our answer can be |S|, i.e. the length of string S. Again, as seen above, after at most 1000 appendations, the next appendation will reach its maximum contribution. Hence, we do same as above, we form T by appending string S min(N,1000) times as-

T = \underbrace{S + S + \dots + S}_{min(N,1000)\text{ times}}

Now each future appendation will have maximum contribution, equal to |S|. Hence, our answer will be-

Answer=G(T)+|S|*max(0,N-1000)

Now, what is the time complexity? How many of you would agree if I said-

Time Complexity= O(min(N,1000)*|S|) \equiv O(1000*|S|) \equiv O(|S|)

Is it correct?
.
.
.
(PS: The above time complexity is a lie :slight_smile: )

SOLUTION:

Setter
Tester - O({|S|}^{2})
Tester - O(|S|)
Editorialist - O(???)

CHEF VIJJU’S CORNER:

**1.**Most of the contestants did a good job in framing the simulation part correctly, checking for overflows, and coming at right expressions. But they still got a TLE. And let me reveal the culprit here! It was-

T=S+T

Yes, thats right! This expression should had been T+=S. The + operator takes time comparable to O(|S|+|T|) while the += symbol takes time comparable to O(|S|). Treat it like “Creating a new array with characters of T and S v/s Adding |S| characters to back of char-array T .” For reference, have a look at this solution. Replace the + operator with += and see the time diffference!

2. Tester’s O(|S|log(|S|)) solution is discussed here. When freq(a,S)> freq(b,S), what he does is, keep a track of freq(a,S)- freq(b,S) at various points of the string, and then sort them. Now, we keep a variable curr which stores the count excessive 'a ' we get after each appendation. He uses this to check at most how many prefixes more he can get by the condition have[ptr - 1] + cur > 0. Depending on that, he adds ans += (int)have.size() - ptr; to the answer. In other words, tries to go checks the least value in of freq(a,S)- freq(b,S) which curr, number of excessive 'a' s can afford, and adds to answer accordingly. Same approach used for freq(a,S)< freq(b,S) case, which is left for the readers to work out :).

3. Testers O(|S|) solution is a bit similar. Firstly, Misha eliminated sorting, and replaced it with a cumulative summation’d prefix array. Meaning, he stores the frequency of each value of freq(a,S)- freq(b,S) and does a cumulative sum of that array. Again, curr holds freq(a,S)- freq(b,S). Now, his expression is ans += cnt[2 * SHIFT - 1] - cnt[SHIFT - cur];, meaning “Max possible contribution - Contributions which cannot be afforded due to low curr”. The solution is same in essence to the first one, but he cleverly got rid of sorting, leading to a much simpler O(|S|) approach.

4. Now you have seen two optimized approaches, I will have a question for you. Have a detailed look at the time complexity I mentioned. No doubt, my solution does ,plain and simple, have a complexity of at least O(min(N,1000)*|S|) , but dont you get something fishy? There is a, very common, and intentional error involved here :). Note that, there is no sense in the "1000" I kept in O(min(N,1000)*|S|). Ask yourself, what is this 1000? It is nothing but to denote |S|!! Hence, the real complexity would be O(min(N,|S|)*|S|) \equiv O({|S|}^{2}). Dont be like this editorialist when computing time complexity xD.

5. The string in this question was made up of only 2 characters. Replace 'a' and 'b' with 1 and 0 and you have a binary string. Some questions on similar topics-

4 Likes

I guess we could save space by not actually appending the s to T rather running a loop till (n or 1000) * len(s). and taking a mod (n or 1000) each time we want to know that character.

Perhaps .But I preferred a solution which is easy to explain and for others to understand :slight_smile:

1 Like

Can anyone tell me what is wrong with my solution?

https://www.codechef.com/viewsolution/18299503

On line 29, put \n instead of space. I.e. for your case n==1. Other error are also there. For which I will get back later. :slight_smile:

1
bbbbaaaaa
2

Your code prints 0. Rather it should be 3.

1
aaabbbb
2
Your code prints 13. Rather it should be 8.

So many people got WA on final testcase of subtask 2 during the contest, and even now…And… that TC was suggested by me…xD xD

#Evil_Evil_Editorialist hahahaha

2 Likes

But I escaped. I got WA only for 30 pts. First it brute forced for 30pts so got 30pts+70pts. And then corrected it.

hahahaha

I have another soln. https://www.codechef.com/viewsolution/18064632 Similar to @mgch O(|S|) soln.

It find for cases cumulative summation’d prefix array. And then used floor to find for Max possible contribution.

Nice job @aryanc403 . My TC was based on one of the common mistakes I predicted :stuck_out_tongue: . Looks like you did a good job here :slight_smile:

Solutions are now linked. You can have a look :slight_smile:

Thank you so much @aryanc403

1 Like

I use binary search to find that position after which the numbers of good prefixes becomes constant if possible.I think this may be a unique approach for this particular problem.If anyone already did this approach please share :slight_smile:

link of my solution: https://www.codechef.com/viewsolution/18082824

If any one like my approach please upvote :).

if there is any query related to my approach please comment :slight_smile:

1 Like

@vijju123 I have one doubt Nice job comment was for my test cases or for escaping ?

For you escaping them. Its definitely allowed to help and give failing cases. But not during the contest!! xD

Commented codes are always appreciated. :slight_smile: . Upvote gives 10 karma. Will give 15 more if you properly document it :slight_smile:

Hello, I have been trying to find out the intuition behind “at most 1000 appendations” Could anybody give a proof?

Ask yourself 3 questions.

  1. If freq(‘a’)\neq freq(‘b’) , what is the minimum change in contribution of next appendation w.r.t. first one?
  2. What is the largest possible length of original string S?
  3. How many appendations will it take for all future appendations to reach |S| or 0, provided that the contribution behaves monotonically (eg if increasing, always increase and vice cersa).
1 Like

i also did it with binary search https://www.codechef.com/viewsolution/18140172