MEANARRAY - Editorial

vijju123 · July 4, 2018, 1:25am

PROBLEM LINK:

Contest
Practice

Setter- Adhyyan Sekhsaria
Tester- Adhyyan Sekhsaria
Editorialist- Abhishek Pandey

DIFFICULTY:

Easy

PRE-REQUISITES:

Intuition/Logic, Two-Pointers, Binary Search on Array

PROBLEM:

Given a non-decreasing array of length N, we need to tell number of sub-arrays whose arithmetic mean is less than K

QUICK EXPLANATION:

We first observe that the array is non-decreasing. Hence, we find upto which position, all numbers are < K. As all numbers are < K \implies their mean is also < K. \therefore ALL subarrays upto that length are to be counted. We can simply use the formula of "Number of sub-arrays in an array of length N= N*(N+1)/2. for this.

After that, we notice this- if we keep more elements < K we keep in the sub-array, we can keep a higher number of elements \ge K without having mean become \ge K. In other words, say we reached a position/index pos upto which we were able to formed a valid sub-array after including all elements < K. If we remove one of the elements < K, we might have to go backwards (i.e. remove an element \ge K) and can definitely not go forward (i.e. include another element \ge K).

This simple observation allows us to use two-pointer approach to count remaining sub-arrays.

EXPLANATION:

The editorial will be divided into 2 sections. One will focus on part where "All elements are < K" and the second section will use that and give further solution using two-pointers. It is expected that you know the two-pointer algorithm. Please go to pre-requisites if you dont know and check links there

1. If all elements are < K-

This is the first step towards solution. What will we do if all elements are < K? Obviously, include ALL possible sub-arrays! Remember-

\because All elements are < K, their mean can never be \ge K.

Hence, we would include all the sub-arrays possible. In an array of length N, the number of sub-arrays is given by Num=N*(N+1)/2 where Num is the number of sub-arrays (obviously!).

But, how can we apply this to a general array? Give it a thought first, before proceeding for discussion in next section.

Using results above & Two Pointers-

Recall that our array is sorted! This means, we can easily find a position, say pos such that all elements up to pos are < K. We can use the above formula to directly count those many sub-arrays at once!

Now comes the tricky part.

Recall the explanation at quick section about two-pointer approach. The more elements < K we have, the more elements \ge K we can add, and still not have mean \ge K.

Check the scenario below.

Say we are currently trying to count those valid sub arrays, which start from index 1 (we will generalize later). All elements till position/index pos are < K. All sub-arrays ending till pos are included in answer by formula above. We need to find how many more valid sub-arrays exist, i.e. how many valid sub-arrays exist with at least one element \ge K.

Lets say, we kept travelling ahead of pos, and arrived at a position pos2 after which we cannot add any more element \ge K (because it will make mean >K). Recall we are currently talking about sub-arrays starting only from index 1. We have also, already included sub-arrays ending upto pos. How many new sub-arrays did we got? Obviously, pos2-pos.

Click to view

Not convinced? Lets count how many sub-arrays are there. The sub-arrays are, pos+1, pos+2, pos+3,…,pos+k such that pos+k=pos2.

Now, lets say I want to calculate similarly for index 2. If I do everything again, then finding new position of pos2 for every index will take lot of time, and make complexity O({N}^{2}), which time outs! Can we somehow use the data of index 1 to determine data of index 2?.

Yes! We can! Recall what I have been saying till now since the quick explanation section! If we reduce number of elements < K, we can definitely NOT include any more elements \ge K. This means that pos2 for index 2 is not more than pos2 for index 1!

Let me denote pos2_i to represent "pos2 calculated for index i". Now, pos2_1 >pos2_2 >pos2_3.... So, what if, instead of starting from i, going to pos and from there finding pos2_i, why dont we start at pos2_{i-1} (i.e. pos2 of previous index) and move backward until we find pos2_i!

This, is nothing but the basic concept of two-pointer which brings the complexity down from O({N}^{2}) to O(N). In O({N}^{2}) approach, we start from i, go to pos and from there find pos2_i and repeat all this again for next index, while in two-pointer approach, we only start at pos2_{i-1} and find pos2_i, from there we find pos2_{i+1} and so on. We can see that no element is visited twice in two-pointer approach, while in O({N}^{2}) approach, all elements are visited multiple times ,again and again which wastes time.

Thus, we can loop from i=1 to i=pos (1-based indexing!) and determine appropriate pos2_i and directly add the number of sub-arrays to the answer (using the formula I gave above ).

A formal implementation is given below, but I request you to first to first yourself draft atleast an informal implementation of above idea, and compare yours with ours :).

Click to view

      ans+=(1LL*less*(less+1))/2;//All subarrays before index less are counted. 
        //For an array of length n, subarrays= n(n+1)/2 .
        r=less;
        for(int i=0;i<less;++i)currsum+=arr[i];
        while(l<r and r>=less)//We use two-pointer algorithm
        {
            while(r<n and currsum+arr[r]<k*1LL*(r-l+1))//Can we add r?
            {
                currsum+=arr[r];
                ++r;
            }
            ans+=r-less;
            currsum-=arr[l++];//Remove front element OR Shift l to next position
            while(r>=less and currsum>=(r-l)*1LL*k)//Adjust r
            {
                currsum-=arr[--r];
            }
        }
        cout<<ans<<endl;

SOLUTION:

In case the links dont work, the solutions are also pasted in tabs below for reference. This is, so nobody has to wait while @admin references the links

Setter

Click to view

#include<bits/stdc++.h>
 
using namespace std;
typedef long long ll;
#define NMAX 1000000
ll a[NMAX];
ll n, k;
ll solve(){
	ll ans = 0;
	int pointer = n-1; // initialsize our pointer to last index. the pointer will store the last index which so that the mean < k
	ll sum = 0;
	for(int i = 0; i < n; i++) sum += a[i]; // currently sum holds the sum of all elements
	// as we iterate, sum will change so that it will hold the sum from [i, pointer]
	for(int i = 0; i < n; i++){
		while(pointer >= i && sum >= k*(pointer-i+1)*1LL){ // if pointer is not greater than i, then that means no subarray exists starting at i
			// if it is inside this loop then the mean of [i, pointer] >= k and we have to reduce the number of elements
			sum -= a[pointer];     // since we are going to push the pointer to the left, we have to remove the contribution of that element to the sum
			pointer--;
		}
		if(pointer >= i) ans += (pointer - i + 1)*1LL; // calculate the contribution of the subarrays starting at index i;
		sum -= a[i]; // since all subarrays will now start after the ith index, we must remove a[i] from sum
	}
	return ans;
}
 
int main(){
	//taking input
	cin>>n>>k;
	for(int i = 0; i < n; i++){
		cin>>a[i];
	}
	cout<<solve()<<endl;
	return 0;
}

Editorialist

Click to view

/*
 *
 ********************************************************************************************
 * AUTHOR : Vijju123                                                                        *
 * Language: C++14                                                                          *
 * Purpose: -                                                                               *
 * IDE used: Codechef IDE.                                                                  *
 ********************************************************************************************
 *
 Comments will be included in practice problems if it helps ^^
 */
 
 
 
#include <iostream>
#include<bits/stdc++.h>
using namespace std;
 
int main() {
	// your code goes here
	#ifdef JUDGE
    freopen("input.txt", "rt", stdin);
    freopen("output.txt", "wt", stdout);
    #endif
	ios_base::sync_with_stdio(0);
	cin.tie(NULL);
	cout.tie(NULL);
	int n,k;
	cin>>n>>k;
	int arr[n],i,j;
	for(i=0;i<n;i++)cin>>arr[i];//Taking input
	int l=0,r=0;
	long long ans=0,currsum=0;
    int less=lower_bound(arr,arr+n,k)-arr;//Less= Position of first element >=K. Read about lower_bound function
    //from C++ STL.
    if(less==0)
        cout<<0<<endl;
    else
    {
        ans+=(1LL*less*(less+1))/2;//All subarrays before index less are counted. 
        //For an array of length n, subarrays= n(n+1)/2 .
        r=less;
        for(int i=0;i<less;++i)currsum+=arr[i];
        while(l<r and r>=less)//We use two-pointer algorithm
        {
            while(r<n and currsum+arr[r]<k*1LL*(r-l+1))//Can we add r?
            {
                currsum+=arr[r];
                ++r;
            }
            ans+=r-less;
            currsum-=arr[l++];//Remove front element OR Shift l to next position
            while(r>=less and currsum>=(r-l)*1LL*k)//Adjust r
            {
                currsum-=arr[--r];
            }
        }
        cout<<ans<<endl;
    }
	
	return 0;
}

Time Complexity= O(N)

CHEF VIJJU’S CORNER

1. No one asked me why we iterate only upto i=pos when I said -

“Thus, we can loop from i=1 to i=pos (1-based indexing!) and determine appropriate pos2_i and directly add the number of sub-arrays to the answer (using the formula I gave above ).”

Click to view

Because after pos the smallest element will be \ge K and no valid sub-array is possible!

Click to view

**Also, I couldnt hear you even if you would had yelled and asked. So…

2. Some practice problems-

B. Wrath
Blocked Websites - Although editorial mentions tries, a solution using two-pointers is easier

aryanc403 · July 21, 2018, 12:54am

We first observe that the array is non-decreasing. _/\_
Observations are key to success.

vijju123 · July 21, 2018, 12:56am

HAHAHAHAHHAHA. Happened to me as well xD. I spent 15 minutes at this question and then I saw non-decreasing. And I immediately sent the setter a message that he is way too cruel to not put non-decreasing in bold xD

aryanc403 · July 21, 2018, 1:10am

Upd - Since it is already bumped again. So sharing one more approach.
Generalised Soln

Soln starts here

I have subtracted k from all elements. And then made a prefix array.
Now subarray l,r has average less than k. then prefix[r]-prefix[l-1] < 0 . Because this value is equivalent to sum from l to r - k*(no of elements in this range)
Which can be done in (n*logn) (Suggestions required.).

Then By iterating over l=0 to n. With logn time to find no of elements less than prefix[i] are is answer for no of subarray with starting index l.

Now the problem reduces to counting no of inversions in an array.
Ref here.

Soln ends here

//Ignore this.

Click to view

My AC approach - https://www.codechef.com/viewsolution/19292310 .
Edit - Generalised Soln - https://www.codechef.com/viewsolution/19295262
This approach gives AC even if no are in non increasing order - https://www.codechef.com/viewsolution/19295293

Time Complexity - O(n*logn).

Previous Text with approach explained.
If someone can provide me A data structure with
Data structure is required which support for insert, delete, and query for no of elements less than x in log n time.

One data structre which seems possible for this. But I had not tried it.
I will use array word to mean prefix array calculated above.
Create BST with node values of array. Make sure its height is logn.

Then preprocess array. With information that should be stored in each node -> No of times present value is present. No elements in right and left subtree.

Insertion -> add 1 to corresponding node value.

Delete -> subtract 1 to corresponding node value.

During pre processing -> Add one for each array value. n*logn

During iteration - n*logn

No of elements less than a[i] tree - log n

Delete a[i] from tree - log n

General Case - n*logn.

And this passes TL.

the_extractor · July 21, 2018, 1:30am

I subtracted K from each element in the array and took the mean as 0 to make things a bit simpler:
https://www.codechef.com/viewsolution/19285196

This works because:

\frac{\sum_{i = 1}^n a_i}{n} = k \implies \frac{\sum_{i = 1}^n a_i - n \times k}{n} = 0 \implies \frac{\sum_{i = 1}^n (a_i - k)}{n} = 0

vijju123 · July 21, 2018, 1:33am

Yes, thats also correct

vijju123 · July 21, 2018, 9:09pm

Good job dear. Well done

hackerwizard · July 21, 2018, 9:40pm

@aryanc403 we can also use BIT with coordinate compression

aryanc403 · July 21, 2018, 9:57pm

Relevant tutorial ??

taran_1407 · July 21, 2018, 10:24pm

I too solved for general array. I didn’t use prefix array, but compressed values to make segment tree.

By the way, I think this DS is called Order Statistic Tree.

hackerwizard · July 21, 2018, 11:38pm

See my code https://www.codechef.com/viewsolution/19289787 for implementation.