Given a lucky string which contains only ‘4’ and/or ‘7’ characters. The following process is repeatedly executed while it is possible to do so:
- Find all the pairs of consecutive ‘4’ and ‘7’ in the current string and add to the result (at the beginning the result is set to zero) the position of the character ‘4’ in each pair (the string is 1-indexed). If there is no such position, then STOP else continue.
- Erase all of those pairs.
Your mission is finding the result.
Let’s take an example first. Suppose our lucky string is ‘44474777’ so after the first iteration of the process it becomes ‘4477’ and the result is added with 3 + 5. After the second iteration the string is ‘47’ and we need add 2 into the result. Finally we erased the entire string and add 1 to the result. Our final result is 3 + 5 + 2 + 1 = 11.
Can we simulate the whole process? The simple solution takes O(N) to find all the pair of ‘4’ and ‘7’ and erase them from the current string. For some simple strings where just a small number of iterations of the process are needed, this solution may work in reasonable time. For example, in a string like “474747474…4747”( it becomes empty after the first iteration). However, it’s not hard to find a test case that make this solution run out of time for instance a string like “4444…477777…7”. It will require another N iterations. to do the same.
So now, the strategy will be that for each character, we need to find at which iteration that character will be erased.
Let T[i] be the index of iteration when the ith character is erased then for the string above we have T[1 … 8] = [3, 2, 1, 1, 1, 1, 2, 3].
We will discuss about finding T later, for now we will see how to use T to calculate the result.
Can you try to answer this question: right before the kth iteration of the process, what is the position of the ith character in the current string (of course we care only for i with T[i] ≥ k)?
At that time, only the characters that have the value of T larger than k remain in the current string. So the position of the ith character will be the number of T[j] such that j ≤ i and T[j] ≥ k. Let G(i, k) be the number of T[j] such that j ≤ i and T[j] ≥ k, then the position of the ith character at the time it is erased is G(i, T[i]). Our needed result will be the sum of all G(i, T[i]) where the ith character in the string is ‘4’ and it will be erased in the end.
Now we face with a classic data-structure problem. Given the array T, we need to answer the query G(u, v) which requires the number of T[i] such that i ≤ u and T[i] ≥ v = T[u]. The idea to solve this problem is using Binary Index Tree or Segment Tree and answering those queries in the increasing order of v. I suggest that you refer to the problem of counting the inverses in array using Binary Index Tree in this link.
Now let us come back to the question of how to find T. We call a pair of ‘4’ and ‘7’ as matched if in a specific iteration of the process, both of them will be erased. It is obvious that a matched pair will be erased after all the characters between them in the original string are already erased. More formally given a match pair of ‘4’ and ‘7’ where the position of ‘4’ and ‘7’ are u and v respectively then T[u] = T[v] = max(T[i], u < i < v). From this observation we can derive an algorithm to find the matched pairs as well as T. Let’s look at the pseudo code given below.
ST is an empty stack. S is our lucky string FOR i: 1 -> length of S IF (S[i] == '4') ST.push(i) ELSEIF (S[i] == '7' and ST is not empty): U = st->top; St.pop V = I; //U and V is a matched pair T[U] = T[V] = max(T[i], U < i < V) (*) ENDIF ENDFOR
Finally we still miss two things:
- Some characters will not be erased at the end: So just let the corresponding T value be large enough.
- In the line marked with the ‘*’ of the pseudo code above we need to calculate the maximum value in a segment of the T array. This should be done with data-structure like BIT(Binary Index Tree) or IT. As an exercise you can also think creatively to find a way using the stack itself to come up with the O(N) algorithm for this part?
AUTHOR’S AND TESTER’S SOLUTIONS:
It’s interesting that the tester had a totally different approach. Firstly, he stored the string (actually the indices) in a set (C++ STL). So erasing a character of the string can be done in O(Log(N)). The real position of each index can be managed by the Binary Index Tree. The pair of ‘4’ and ‘7’ that needed to be erased can be located based on the position of the pair that is erased in the previous iteration.
For example, let’s say in the previous iteration, we erased a ‘4’ at the position i. Then, in the current position,
let l be the biggest index that is smaller than i and still un-erased and r be the smallest index that is bigger than i and also un-erased (we can find them by using lower_bound or upper_bound method of Set).
If there is a ‘4’ at the position l and a ‘7’ at the position r then we need to erase those indices in the current iteration.
The run time complexity for each iteration is O(number of character need to be erased × Log(N). So the overall run time complexity will still be O(Nlog(N)).
Finding the matched characters is similar to the algorithm of checking the balanced parentheses.