XYHUMOQ - Editorial

PROBLEM LINK:

Practice
Contest

Author: Sajal Agrawal
Tester: Alexey Zayakin
Editorialist: Oleksandr Kulkov

DIFFICULTY:

MEDIUM

PREREQUISITES:

Euclidean algorithm

PROBLEM:

You’re given string S which starts with 1 and ends with 0. You have to flip minimum possible number of characters in the string to make it containing exactly X humongous subsequences (i.e. such that they’re of form 1010\dots10).

QUICK EXPLANATION:

If you consider number of (10)^k1 subsequences A and number of (10)^k subsequences B, you can unwind all possible strings having B=X+1 and proper length |S| via Euclidean algorithm.

EXPLANATION:

Consider some 10-string. How many humongous subsequences does it contain? Let’s keep number of (10)^k 1-subsequences A and number of (10)^k subsequences B. Note that empty string is counted in B as (10)^0. We can see that if we add 1 at the end of the string, we will have \begin{cases}A'=A+B,\\B'=B\end{cases} and if we add 0, we will have \begin{cases}A'=A,\\B'=A+B\end{cases}. Now if we fix A and B, it will be possible for us to recover the initial string.

We should note by now that recovering string given A and B is similar to Euclidean algorithm. While A is greater then B we will subtract B from A, we can compress that steps by making \begin{cases}A'=A-\lfloor A / B \rfloor \cdot B\\B'=B\end{cases} and adding \lfloor A / B \rfloor ones at the end of string. Everything is similar for the case B > A. We should do it until we reach empty string for which A=0, B=1. That means that we should use co-prime A and B and since last digit is necessarily zero, we will have A < B at the end so we can consider every possible A which is co-prime with B=X+1. One is added since we consider empty string now. And since basic case is A=0,B=1 you should be careful when you reach A=1 to not fall in A=1,B=0 instead. Here is code which generates a string given A and B:

string gen(int a, int b) {
    if(a == b) {
        return "1";
    } else if(a < b) {
        return gen(a, b % a + (a == 1)) + string(b / a - (a == 1), '0');
    } else {
        return gen(a % b + (b == 1), b) + string(a / b - (b == 1), '1');
    }
}

This one is slow however you should rewrite it in non-recursive way and consider the case when string gets too long. Now you need to brute every single A, there are O(X) of them and generate a string in O(\log X + |S|). This will solve the problem.

AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution can be found here.
Tester’s solution can be found here.

RELATED PROBLEMS:

5 Likes

There is an interesting angle to this problem. The 2 possibilities of (A, B) \rightarrow (A+B, B) and (A, B) \rightarrow (A, A+B) make it analogous to the [Calkin-Wilf tree][1] which contains all positive rational numbers as fractions A / B. The root of 1 / 1 represents the string “1”. So adding 0 to a string is the same as going to the left child on this tree, and adding 1 means going to the right child. If we add d characters, we will be at level d of the tree. Now, we are only interested in the string that ends in 0, which means we only want to look at the even positions at this level.

There is no simple way to only access the interesting positions here, but let’s consider another tree related to the Calkin-Wilf, the [Stern–Brocot tree][2]. The fractions at any level in the Stern-Brocot tree are located at the bit-reversed positions in the Calkin-Wilf tree. Therefore, all even positions in the Calkin-Wilf tree will be present in the left half of the Stern-Brocot tree. And there is an easy way to find a fraction in the Stern-Brocot tree: [binary search][3]. So we can attempt to find all fractions <1 with required denominator at depth d in the Stern-Brocot tree. If we find one, we bit reverse its position and compare with given S to find how many swaps are required to change it. Complexity is \mathcal{O}(X \cdot |S|), and here’s the


[4].

Bonus: It turns out that the string "$1P0$" has the same number of $(10)^k$ subsequences as "$1Q0$" where $Q$ is the reverse of $P$, so the bit-reversal step is not necessary. I can't seem to figure out why that is so, can somebody prove this?

**EDIT:** Got the proof. Represent a string using a 4-tuple $(a, b, c, d)$ where $a, b, c, d$ are the number of $(10)^k1, (10)^k, (01)^k, (01)^k0$ subsequences respectively. Then 3 relevant operations are defined as  

$$
(a, b, c, d) \xrightarrow{reverse} (a, c, b, d)
$$


$$
(a, b, c, d) \xrightarrow{append \ 0} (a, a+b, c, c+d)
$$


$$
(a, b, c, d) \xrightarrow{append \ 1} (a+b, b, c+d, d)
$$


If you have a string $P$ and you perform on it $reverse, append \ 1, reverse, append \ 0$ you get "$1P0$" with the corresponding tuple $(a+c, a+b+c+d, c, c+d)$.  
But if your reverse $P$ and apply the same procedure you get $(a+b, a+b+c+d, b, b+d)$.  
And you can see that the number of $(10)^k$ subsequences is same in both :D

  [1]: https://en.wikipedia.org/wiki/Calkin%E2%80%93Wilf_tree
  [2]: https://en.wikipedia.org/wiki/Stern%E2%80%93Brocot_tree
  [3]: https://en.wikipedia.org/wiki/Stern%E2%80%93Brocot_tree#Mediants_and_binary_search
  [4]: https://www.codechef.com/viewsolution/17043895
6 Likes

It’s really really cool.
Thank you for letting us know these trees.

1 Like

And thanks for this nice problem in the first place!

1 Like

Hey, It feels nice to hear that at least some folks including you appreciated the problem.

Btw, with that Proof thing, your provided proof is flawless. But, I’d like to provide one on the grounds of the editorial.
If you went through the editorial, you can see that we talk about all such numbers ‘a’ which have GCD(a,X+1)=1.
The 10-string is formed from the steps of the euclidean algorithm when applied on the (a,X+1) pair. We can also see that, the strings always start with a ‘1’ and ends strictly with a ‘0’.

Without loss of generality let’s assume our string to be some P = 1(x 0s)(y 1s)0. This will cover every case.
The let Q be the special reverse, Q = 1(x 1s)(y 0s)0.
The answer pair for P comes out to be
(1, 1) -> (1, 1+x) \implies (1+y(1+x), 1+x) \implies (1+y(1+x), 1+x+1+y(1+x))(As an exercise for the reader).

The answer pair for Q comes out to be
(1, 1) \implies (1+x, 1) \implies (1+x, 1+(1+y)(1+x))
One can notice that the second value in both the pairs is same and equal to- 2+x+y+xy.

Now the interesting thing is, Both the first values of the pairs sum up to the second value \implies
(1+x)+(1+y(1+x))=2+x+y+xy

And that’s cool because, now we know we get the same answer for a and (X+1-a), where a is any such number with gcd(a,X+1)=1. Eg: X=12, a=5, P=1(1010)0 and Q=1(0101)0. Pair for P = (5, 13) and for Q = (8, 13).

PS- Please read the editorial if you haven’t already yet, it will open new gates of problem solving, really. And , sorry for the bad formatting, I’m not so good with this here :frowning:

This one is slow however you should rewrite it in non-recursive way and consider the case when string gets too long. 

Can you elaborate this part?

In the recursive solution, the code creates the string for every ‘A’ such that gcd(A,X+1)=1. It even does so for those 'A’s also which do not give a length equal to |S|
The complexity is O(TX|S|*logX)

In a different solution, one should firstly store all such 'A’s which are co-prime to X+1 and giving a string of length |S| without actually creating strings by the help of retracing the steps of Euclidean algorithm as mentioned above.

Then run this again for these stored 'A’s to finally calc. the strings & the min flip.
The complexity of this comes to be O(T*(X*logX + |S|)).

2 Likes

Did somebody request formatting? :3

Haha, No no, I just always see answers that are always beautifully formatted.

Is there any way to solve this problem using Meet in The middle tecnique?

yes there are. Many people had solved it using that but i didn’t understand them. You can refer to @soham1234 code and please let us know the approach if you understand it!

Okay @pk301 , I am busy with my exams right now, but as soon as I get free, I’ll write here whatever I understand :slight_smile:

@krshubham @pk301

Sorry I don’t know what is meet in the middle technique, however i would like to share my approach.

See lets say i have a string, i store two things no. of substrings of form “1(01 k times) 1”,“1(01 k times) 0”
so under first case i can have “1”,“101”,“10101”

similarly for 2nd case “10”,“1010”,…

so (x,y) denotes no. of substrings of ending with 0 and 1 of above form
so consider string “10” have (x,y),so if i append “0”:“100” has state(x+y,y)

Similarly if i append “1”: “101” has state(x,x+y+1).

So what i did was,simple recursion starting with string “1” and went upto some length around 16(dont quite remember the exact number),and then at that point i checked if i can reach my desired state (X,*),if i cant then i stopped.

So the point was how to check that.Lets say i am at a state (x,y),i can go to (x,x+y) and (x+y+1,y)
Or framing it other way i could deal with coeficients of x and y,I mean if i am at coefficients(1,1,0)(x,y,const) which coeficients can i reach,so each of coefficients could be expressed in terms of x,y,const

So coefficients(f(x,y,const),f(x,y,const),f(x,y,const))

,so i precalculated which states can i reach by a simple reccurr with memoization for a given depth(i did it upto a depth of 15 around)

So know i just see if i can reach the destination with this depth(which is (length - 16))

My solution:

https://www.codechef.com/viewsolution/16997246

(If anyone could tell the complexity plz tell(as i too couldnt exactly figure that))

Sorry late reply. I understand the proof but how can we assume P = 1(x0s)(y1s)0?
And I didn’t understand the last part- what is common between the answer strings for (a,X+1) and (X+1-a,X+1)? :confused:

1 Like

First of all, Taking P to be 1(x0s)(y1s)0 covers up all the cases, it is just the general case, you can see it after proving this conjecture.

With your second query, you can see that, with P and Q, second values of both the pairs are same and first values of both the pairs sum up to the second value. Considering a to be 1+x and X to be 1 + x + y + xy, you can see that things go well.

Has anyone figured out the Tester solution for this problem? The code seems to be well-written. Unable to understand what ‘p’ and ‘q’ keep track of.
Maybe @alex_2oo8 can himself help?