MINSUBAR - Editorial

Problem Link

Practice

Contest

Setter: Trung Nguyen

Tester: Hasan Jaddouh

Editorialist: Bhuvnesh Jain

Difficulty

EASY-MEDIUM

Prerequisites

Binary Search, Prefix Sums, Divide and Conquer

Problem

Find the length of the minimum subarray such that the sum of its elements is greater than D.

Explanation

Let us first maintain a prefix sum of the array. Now the problem reduces to the following:

For every index j, find the closest index i less than j, such that \text{Prefix[j]} - \text{Prefix[i]} ≥ \text{d}.

The equation can also be written as \text{Prefix[j]} - \text{d} ≥ \text{Prefix[i]}. To understand the process, let us consider an example. Let array \text{A = } \{2, 4, 3, -2, -5, 1\}.

The prefix array would be \{2, 6, 9, 7, 2, 3\}. Say, we want to find the answer for index j = 6 i.e. we want to find the nearest i such that (3 - d) ≥ prefix[i]. We see that if index 3 satisfies the inequality, so does all the indices with a value less than 9 and ahead of it, i.e. indices 4 and 5 too. Thus, in general, we say that if for index j, say index i satisfy the inequality, all such indices ahead of i and having a value less than or equal to it will also satisfy it. Since we would like to find the closest index i for every j (if it exists), the previous indices having the value greater than current prefix value, although may satisfy the inequality but will never lead to an optimal answer, (meaning smallest subarray size). We thus keep an increasing temporary array, (along with indices) based on the prefix sums. Since, this array is increasing (i.e. monotonic), we can apply binary search on it to find the optimal index. Let us complete this through the example array above.

The temporary index will look as follows:

  • T = {$(2, 1)$}. This first part is the prefix value and the second part is the index.
  • T = {$(2, 1), (6, 2)$}
  • T = {$(2, 1), (6, 2), (9, 3)$}
  • T = {$(2, 1), (6, 2), (7, 4)$}. This is because for every inequality which 9 satisfy, 7 will also satisfy and it being closer will lead to optimal answer. So, value 9 was discarded.
  • T = {$(2, 5)$}
  • T = {$(2, 5), (3, 6)$}

To understand, more about this approach see author’s or tester’s solution below.

Editorialist Approach

Generally, in problems where we need to calculate some function over all subarrays and queries are not involved, divide and conquer seems a good idea assuming merging can be done efficiently. Assume, in the recursive step, we calculate the answer for the left and right halves. To understand the complete solution, we just need to understand the merging part.

We compute all the prefix sums on the right-hand side and store them in a map along with the indices. Now we iterate through all the suffix sums on the left-hand side and try to find the first index on the right side which satisfy this inequality. This can be easily done by applying a “lower_bound” on the map build. The time complexity of this approach will be as follows:

T(N) = 2*T(\frac{N}{2}) + O(N*\log{N})

meaning T(N) = O(N * {\log}^{2}{N}).

For more details, refer to the editorialist solution. The idea is more or less similar to the author’s solution (although less efficient in this case) but can be used in other problems asking us to compute some function over all subarrays. Thus I included this part in the editorial. (Though it is an overkill for the problem).

Time Complexity

O(N * \log{N}), per test case

or O(N * {\log}^{2}{N}), per test case.

Space Complexity

O(N)

or O(N * \log{N})

Solution Links

Setter’s solution

Tester’s solution

Editorialist’s solution

3 Likes

Can you please elaborate on the temporary array in authors approach? I am not able to get it even after checking the author’s and tester’s solution solution. Like, what exactly are they doing? A pair of prefix and index of elements?

Also-

This is because for every inequality which 9 satisfy, 7 will also satisfy and it being closer will lead to optimal answer. So, value 9 was discarded.

What was your d, and how are you arguing about this? Why can I not apply this argument to case of “(6,2) (9,3)”? Perhaps the doubts might seem trivial to you, apologies for that, but I am geneuinely having difficulty understanding.

Thanks! :slight_smile:

1 Like

@vijju123, the value of “d” doesn’t matter. Let LHS = 3 - d. we want to want LHS >= something. So, if something = 9 satisfies it, so does 7. After this try to see the construction of temp array for example in the editorial. Then things would be clear.

1 Like

Oh…I was taking 9 in LHS instead of taking it in “something”. Thanks. I will re-read,re-check and re-analyse tomorrow morning and get back to you :slight_smile:

1 Like

I tried an algorithm where I move left and right pointers in search of the optimal solution, combining negative numbers with the numbers on the left until positive (or skipping if overall negative). I wrote the solution in Python (https://github.com/bilalakil/challenges/blob/master/codechef/cook89/minsubar.py).

Unfortunately this ended up wrong, but I can’t figure out why. Could anybody help provide a test case where my algorithm produces an incorrect answer so I can try to learn from my mistakes?

Very much enjoyed this question though! Thanks Trung Nguyen.

@bilalakil check this input :

2

5 5

1 2 3 1 -5

6 6

-1 -2 -3 3 3 -5

your output:
2
-1

correct output:
2
2

@beginner_1111
Can you tell why my solution gives WA.It is giving same ans for above input

Solution Link :
https://www.codechef.com/viewsolution/16654650

I tried using Algorithm(geeksforgeeks) with two pointers and solve it for d>0 and for d<=0 I simply iterated over all elements and checked if any elements was>=d, but I got WA. Anyone can find test cases for which it fails or error in my solution?
Solution Link

Can’t this question be solved using the sliding window algorithm, i.e. maintaining left and right pointers and maintaining the sum between the two.
Consider my solution to be a variation of this.

Here is my solution. Where is my code failing ?

@anushi , @rj25 , @jagreetdg , here is a test case:

1
7 10
2 1 6 -8 9 -2 4

Your code gives 5, Correct output is 3 (the last 3 elements).

I did this sum using segment tree and coordinate compression.

We calculate the prefix sum array,now lets say for a particular index i, i want to calculate the minimum valid subarray ending at i,so if there exists such subarray (such that the sum is >=d),then for 0<j<i,(pre[i]-pre[j])>=d,or pre[j]<=pre[i]-d,so indirectly we have to find the largest j that satisfy this.We can do this easily using segment tree,lets say at index i,i have a segment tree of prefix sum elements upto i,which store the maximum index.Like if pre[]={3,2,1,3} the segment[3:3]=4,segment[2:2]=2,segment[1:1]=3.
so in segment tree i could simply search for range min_val to (pre[i]-d).However the max value could be 10^9.
So just map all elements possible to an index by sorting them.

My solution:-https://www.codechef.com/viewsolution/16653216

3 Likes

I tried the 1st approach given in the editorial but still getting wrong answer on submission!
@likecs @vijju123 @scaly_raptor I have tried all above test cases given in this thread also and getting correct…
Link to my solution. Any test case giving wrong ans or fault in my approach is welcome!!!

I used SQRT decomposition.

First, I calculated the prefix sums and stored the largest one for each block. Then, for each starting subarray index, I checked if there is a good ending index in the same block (pref[j]-pref[i]>=d), if not, I looped through blocks until I found one that had max_bl_pref[bl_j]-pref[i]>=d. If I found one, I had to loop through the individual elements of that block to find the best answer.

1 Like

Your solution gives “-1” for

1

5 4

-1 1 1 1 1

but the right answer is 4.

1 Like

I tried with 2 pointers. not sure why this gives WA. It passes all cases mentioned above.
https://www.codechef.com/viewsolution/16665379

I have a solution that is like the moving window methods that others have been trying to use. We have to be careful though. The moving-window algorithms on geeksforgeeks are for arrays with positive values. That is extremely important because it means that the sum of all previous values is a monotone increasing sequence. That is not the case here.

To make such a method work we need to create an auxiliary sequence (the same that is used in the editorial). The trick is that the auxiliary sequence needs to be non decreasing. We can do that by removing terms. Each time we would add a term to this auxiliary list we first remove terms from the end until the last term is smaller than the new term. In that way we ensure that the sequence is always monotone increasing. Since we are deleting terms we will also need to keep track of the index of each value in this list.

In total we will add N terms to this auxiliary list. This should give us both time and space complexities of O(N). I have really just eliminated the need for the bisection that was used in the editorial.

Python: https://www.codechef.com/viewsolution/16665495

C++: https://www.codechef.com/viewsolution/16670816

3 Likes

can you please explain the “editorialist approach” a little more. I didn’t get it. Why does we are storing prefix of right and suffix of left?

1 Like

thanks for test case

Basicslly … the zest of the question is clear

task one:->you have to maintain a prefix array.

task two:->iterate over prefix array,for each item you have to find an element (>=prefix[i]-d if looking forward) or (<=prefix[i]-d if looking backward) of the item being processed (and closest also)…this can be done using expensive data structures for time optimality like segment trees , priority queues,sqrt decomposition technique,or maintaining maps etc…

task three:->find the optimal answer

for more clarity refer : http://codeforces.com/blog/entry/10219

i have a doubt in the first approach given above…

in the test case-

1

6 0

-1 -1 -1 0 -1 -1

the final array of pairs on which binary search has to be applied ends up being:

({-5,5})(as per your approach)

this automatically diminishes the chances of getting any sub string ending with 1, 2, 3 or 4 as its ending
index while in the given test case answer should be 3. please help.