PROBLEM LINK:
Setter- Stepan Filippov
Tester- Misha Chorniy
Editorialist- Abhishek Pandey
DIFFICULTY:
MEDIUM
PRE-REQUISITES:
Square Root Decomposition, or Segment Trees, or Data Structures such as Deque
PROBLEM:
Given an array A[] representing height of plants, and another array B[] representing the required height of plants, find minimum time taken to convert array A[] to array B[] using the specified operation. The operation is, choose valid indexes L and R and a target height H_i and cut all plants in the range to height H_i. Their initial height must be \ge H_i for this to be valid.
QUICK EXPLANATION:
We can clearly see that the only way of saving time is to make a valid operation in which maximum number of plants need to be cut to same height B_i. The problem hence boils down to, making the optimal query by finding such L and R and checking if its valid or not. The case where answer is not possible is trivial to check.
EXPLANATION:
Click to view
"Oh no! Its that editorialist again. Guess what Misterrr Editorialist. I solved the question in O(N\sqrt{N}) so I dont need to read this editorial "
Well, my solution is quite better than O(N\sqrt{N})
“Really Mister Editorialist? Guess what? My solution runs in O(NLogN) Ha! Beat that X)”
Ok. We will discuss an O(N) solution then
So, as some of you might have perceived, we will be discussing three approaches here. First we will discuss my approach which is O(N), and then we will see Misha’s (tester) O(N\sqrt{N}) approach and finally discuss how we can convert/optimize it to become O(NlogN) approach of the setter.
1. Editorialist’s Solution-
The very first thing we see is if the answer is possible or not. Clearly, no answer is possible if for some plant i, its current height A_i is less than the required height B_i. If its possible, we proceed further.
First I will try to give the intuition. See what the problem demands. How can we save time? We can save time for every valid query which sets multiple plants to same required height B_i. Hence the problem boils down to proper identification of L and R for the query, and checking if such a query can be made or not.
The first problem which we face in brute force is that, to verify the query is possible, we might have to check all plants and their required heights in range of L to R. Not to mention getting exact L and R for query might seem problematic to some. Perhaps we can make some observations first to make our life simpler?
- Optimal query will have to set at least one plant to its required height. Hence, its value must be between one of the values B_i such that L\le i\le R.
- We further refine our first observation, claiming that the query height must be max(B_i) for L\le i\le R. Because if its not so, one of the plants will be cut well below its required height!
Also, imagine if we "have got the required L and R" where we got multiple plants which are to be set to same height B_H and are checking if the next element can be included in the range. When will we discard this element at index i from being inside the valid query? We will do so if-
- If the required height of B_i is more than B_H.
- If the plant height A_i is less than B_H.
With these two principles in mind, lets move towards the solution. I will first describe my algorithm, and then we will discuss what it does, and why is it correct.
- Make a deque/list/appropriate data structure which supports insertion and deletion at both ends.
- For all plants in range 1 to N, do following-
- If my deque is not empty, and the current plant (at index i) being considered has a required height B_i which is more than the height at the back of deque, keep popping from back of deque till height at back (B_j) satisfies B_j \ge B_i .
- Now check if the height of plant at i, (i.e. A_i) is less than the required height B_L at the front of deque. If yes, keep popping from front till this condition becomes false.
- After performing step 3 and 4, we can assure that we can make a valid query for all plants currently in the deque, because steps 3. and 4. assure that B_i are stored in descending order, and that the height of final plant (A_j) is sufficient enough to execute query from start to end of deque.
- Now check if required height of the plant (B_j) at back of deque is equal to required height of plant in consideration (B_i). If yes, then we can successfully make a query from the last element of deque to current element, saving an operation. If we find that B_i \neq B_j, then we cannot save an operation and must increment time needed by 1.
Why is this correct?
Notice that, we maintain the condition that, element is in the deque only if its possible to make a query from plant B_L at start of deque to plant at end of deque/plant being considered right now (B_R or B_i). This is possible because B_i gets stored in descending order in deque. Then, we also check before adding a plant if its possible to execute a query from start of deque to it by making sure that the current height of plant being added (A_i) is enough to support query for plant at index L (i.e. A_L). L and R are, obviously, starting and ending range of query.
Why descending order?
We will discuss this in Setter’s approach as well. I will mention my intuition here. Lets take a simple example, we have this configuration of elements in array B[] such that B[]=\{B_j,B_i,B_j\} with B_j < B_i > B_j. Can we make a query from B_j at left of B_i to B_j at right of B_i to save time? No, we cannot as it would cut plant A_i to a height less than B_i. Hence, once we encounter a B_i, we can safely eliminate all previous B_j which are smaller than B_i as we cannot make a query from them across B_i. Also, any queries which could had been done before B_i had already been considered when we added the element B_j and every other element thereafter.
Hence, the algorithm is correct.
Time Complexity- O(N)
Tester’s Aprroach-
The tester follows Square root decomposition.
If you got the intuition (even if only a little) of my solution, then this one is even easier. What Misha did after taking input is, he made two buckets. First bucket sqa[] stores minimum height of plant A_i in the range of bucket, and second bucket sqb[] stores the maximum of the required heights height B_i in range of bucket.
Then, he checks if the configuration is valid or not. While doing so, he also makes a vector of pairs, which stores < B[i], i> . He sorts this vector in descending order. Now, he does the following-
- If required height and current height of plants being considered are equal, do nothing. Else, goto 2.
- If this is the first element being considered, or if the required height B_i of this element is not equal to the required height of previous plants, B_z which we were considering, add 1 to answer and set B_z=B_i and set the index of plant being considered (index z) to i. Go back to 1. and consider next plant.
- If required height of this plant B_z is equal to required height of previous plant being considered, get the index y to that previous plant in the original array (He had made pairs of < B[i], i> for this purpose). Now check if we can make a valid query from the previous plant’s index z to this index y. This checking happens in O(\sqrt{N}) on basis of the observations about invalid queries I mentioned earlier (i.e. if there is a B_k in between which is more than B_z preventing valid queries across it, or if current height A_k of some plant is less than the required height B_z). If the query is possible, do nothing, else add 1 to answer and set current plant of consideration to plant at index y. (i.e. set B_z=B_y and z=y.)
Time Complexity- O(N\sqrt{N})
Setter’s Solution-
Basically, the change is in his’ and tester’s solution is in the step where we check the possibility of queries. He optimizes the step where we check if the query is valid or not.
What he observed was, that, consider two queries which set height of plants (in some range) to H_i and H_j. Now, suppose H_i happens after H_j, and H_i > H_j.
He observed that, we can swap the order of these two queries, i.e. execute query i before query j without any effect on the final answer. This is because, any plants getting affected by both query i and query j can be ultimately reduced to height H_j because H_i > H_j. He uses this to argue that, there always exist a solution where queries are executed in non-increasing order of heights H_q. That is, he will execute query which with greatest height H_q first, &etc. This also justifies tester’s step of sorting the vector on basis of required heights B_i, as each optimal query will make height (A_i) of at least one plant equal to its corresponding required height B_i.
With this clear, the only difference comes in checking if the query is valid or not. If we closely see what tester did, and what my definitions say about invalid queries, we can reduce it to finding maximum required height B_i in the range of [L,R] and minimum current height A_i in the range [L,R] where [L,R] are the range of query. This can be easily done via appropriate data structure like Segment Tree & similar.
Time Complexity- O(NLogN)
SOLUTION:
Commented Solution of Editorialist
CHEF VIJJU’S CORNER:
1. This is a very fit data structure problem, and is really beautiful in this way. There are very little problems where you can solve the question elegantly using the apt data structure. The correct code of algorithm with deque was hardly 6-7 lines for me. This question is really appreciable in this regard, and we must appreciate the setter for such an interesting question.
2. As you can see, the setter and tester have O(N\sqrt{N}) and O(NLogN) solutions respectively. The best part of being an editorialist is you can claim the best solution for your share
3. I will like to leave out some of the apt problems on data structures which I found elegant-
- Alternating Current - A very nice data structure problem.
- Soldier and Cards - One word, Deque.
- Square Root Decomposition - List of famous problems of this topic