MAXSEGM - Editorial

pkacprzak · June 24, 2017, 4:08am

PROBLEM LINK:

Practice
Contest

Author: Ramazan Rakhmatullin
Testers: Lewin Gan, Marek Sommer
Editorialist: Pawel Kacprzak

DIFFICULTY:

Easy

PREREQUISITES:

Two pointers

PROBLEM:

You are given 2 arrays W = [W_1, W_2, \ldots, W_N] and C = [C_1, C_2, \ldots, C_N] with N elements each. We say that a range [L, R] is unique if all the elements C_L, C_{L+1}, \ldots, C_R are unique. The goal is to find the maximum possible sum W_L + W_{L+1} + \ldots + W_R for a unique range [L, R].

EXPLANATION:

First of all, notice that in both subtasks, values of W are up to 10^9, which means that the final sum can exceed 32-bit integer range, so you should use 64-bit integers to represent the sums.

Subtask 1

In the first subtask, there are up to 100 test cases to solve, but we know that the sum of N over all these test cases doesn’t exceed 10^4, so we can apply the following quadratic time approach. For all L = 1, 2, \ldots, N, let’s iterate over all R \geq L such that a range [L, R] is unique and update the result during this iteration. The key observation here is that if for some A, the range [L, A] is not unique, i.e. there exists two indices L \leq i < j \leq A, such that C_i = C_j, then any range [L, B] for B > A is also not unique. In order to track uniqueness of elements in a range during iterations, we can use a hash-table, or even better, after noticing that values in C are non-negative integers less than N, we can use a simple boolean array for this purpose.

The following pseudocode illustrates the above appraoch:

result = 0
for L = 1 to N:
    for i = 0 to N-1:
        seen[i] = False # initially all are unseen
    currentSum = 0
    for R = L to N:
        if seen[C[R]]:
            break
        currentSum += W[R]
        result = max(result, currentSum)
        seen[C[R]] = True
print result

The total time complexity of this method is clearly O(N^2) for a single test case, which is sufficient in this subtask.

Subtask 2

In the second subtask, there are also up to 100 test cases to solve, but now the sum over N over these test cases doesn’t exceed 10^6. Thus we need a faster approach, ideally a linear one.

The key observation is to notice that values in the array W are non-negative, which means that if a range [L, R] is unique, and we can extend it in any direction, then we can always do that. It follows, that if we have a unique range [L, R], we want to extend in both directions as much as it stays unique. This leads to the following approach:

Let’s start with L = 1. We are going to use a boolean array for tracking already seen elements in the current range - similarly as in the solution for the first subtask. Now, let’s find maximum R such that the range [L, R] is unique. It means that either R = N or value C[R+1] appears in range [L, R] in array C. We do this by simply iterating from R = L until we find this maximum R. Now we know that [L, R] is a unique range and we can update the result with its sum, which can be computed while iterating. Now comes the second key observation. If R = N we are done, because there are no remaining elements to the right. Otherwise, in order to extend the current range to the right, we have to find the minimum i \geq L, such that C[i] = C[R+1], update L to i+1 and start extending the right endpoint of the range from R while it remains unique. We repeat this process until R becomes N. The following pseudocode illustrates this approach:

result = 0
currentSum = 0
L = 1
R = 1
for i = 0 to N-1:
    seen[i] = False # initially all are unseen
while True:
    while R <= N and not seen[C[R]]:
        currentSum += W[R]
        seen[C[R]] = True
        R += 1
    result = max(result, currentSum)
    if R == N+1:
        break
    while seen[C[R]]:
        seen[C[L]] = False
        currentSum -= W[L]
        L += 1
print result

The time complexity of this approach is O(N) for a single test case because both L and R can be updated at most N times in the loops and the total complexity is dominated by the loops where these values are updated in every iteration. This method is often called “Two pointers” approach and it’s useful in approaching many problems, even very advanced ones.

AUTHOR’S AND TESTER’S SOLUTIONS:

Setter’s solution can be found here.
First tester’s solution can be found here.
Second tester’s solution can be found here.
Editorialist’s solution can be found here.

hrshp · June 24, 2017, 11:25pm

Could you please tell me where I went wrong?
This is the link to my code https://www.codechef.com/viewsolution/14336038

prajneya · June 24, 2017, 11:26pm

I think my approach was the same. Just instead of a boolean array, I kept checking elements with a set, which is less memory-consuming I think. Still, the code couldn’t pass the second subtask because of TLE. Can anybody point out the mistake?

https://www.codechef.com/viewsolution/14331818

vehemos · June 24, 2017, 11:41pm

If someone has some test cases cant they please help me out? My code (https://www.codechef.com/viewsolution/14336258) got WA I’m not able to understand why.

Thank You

prakhar_26 · June 25, 2017, 12:24am

@prajneya
Problem is where you do

right=left

Now your code is like O(n^2)

Instead, you should do

right=left=last occurrence of duplicate element encountered+1

siddharthp538 · June 25, 2017, 1:07am

1
6
4 4 5 5 6 6
1 2 3 4 5 6
Ans should be 9 and not 5

hrshp · June 25, 2017, 1:28am

I updated the link to my problem. Pl check

bobra · June 25, 2017, 7:03am

it seems to me that my solution is more ergonomic, and works for O(N)

here is the link : https://www.codechef.com/viewsolution/14333217

the main idea was :

in W[i] keep partial summs from W[1] to W[i], so you can easy find LR segment sum by W[R] - W[L - 1]
due to 1 ≤ N ≤ 1000000 , 0 ≤ Ci < N, we are able to have a mask array with N-ranged size and keep in the i-th element of the mask the last index of C[j] that is equal to i
the last movement was :

const int max_size = 1000001;
int mask_[max_size];
int last_duplicate;
long long max_sum;
int n;
long long W[max_size];
int C[max_size];

for each test we do:

memset(mask_, 0, sizeof(int) * max_size);
max_sum = 0; last_duplicate = 0;

// reading input

// calculating partial summs

for (int i = 1; i <= n; ++i) {
	if (mask_[C[i]] == 0)
		mask_[C[i]] = i;
	else {
	    max_sum = MAX(max_sum, W[i - 1] - W[last_duplicate]);
		last_duplicate = MAX(mask_[C[i]], last_duplicate);
		mask_[C[i]] = i;    		
	}
}
max_sum = MAX(max_sum, W[n] - W[last_duplicate]);

printf("%lld\n", max_sum);

shahidul_brur · June 25, 2017, 11:05am

I think my solution is easier to understand and the code is also simple.

This is my solution:

vishnu1146 · June 25, 2017, 11:33am

I think i have also used a O(n) approach but i am getting a wrong answer.Could anyone help me out what is wrong in my code or for what test cases it is giving wrong output? Here is the link to my code:https://www.codechef.com/viewsolution/14336093

hemant_dhanuka · June 25, 2017, 12:22pm

is really second approach is O(n)… i can see that, its O(n^2)…is i m right? there exist last while loop as well which is as follows–

 while seen[C[R]]:
    seen[C[L]] = False
    currentSum -= W[L]
    L += 1

here it is somewhat taking extra time than o(n)… can anyOne explain what exactly time Complexity of second approach… because here loop inside loop exists which suggests its O(N^2)

ksmukta · June 25, 2017, 1:45pm

@hemant_dhanuka Regarding time complexity in second approach

Loop 1 can execute maximum of N steps as in each step R increases by one so at (N+1)th loop iteration R<=N condition will be violated and the outer while loop will also break.

Loop 2 can execute maximum of N-1 times. At one loop instance, maximum L can go is upto R-1 and L is never decreased, so sum of all the steps in this loop will be less than R and R can go upto N.

Hope this clears the doubt. I don’t know how much clear was that.

mayankkvpy214 · June 25, 2017, 1:47pm

can somone please tell where i went wrong?

here’s my code-https://www.codechef.com/viewsolution/14340260

ayushagg31 · June 25, 2017, 2:00pm

If you want to know more about two pointers technique, then read it here.

For practice, different level of difficulty problems, refer here, hope it helps.

vehemos · June 25, 2017, 2:27pm

Thanks a lot siddharth bro

raks8877 · June 25, 2017, 6:32pm

Can anyone help why one is giving runtime error and other AC

runtime error : https://www.codechef.com/viewsolution/14342187

AC : https://www.codechef.com/viewsolution/14342108

what am i doing wrong for runtime error solution?

sonu_628 · June 26, 2017, 7:04am

No set uses BST and each insertion or deletion in BST uses /logn time

mangalnimish · June 27, 2017, 11:21am

I don’t understand why this is showing Wrong answer…

#include <bits/stdc++.h>
using namespace std;


int occupied[1000001];
long long int wrr[1000001];

int main()
{

ios::sync_with_stdio(false);
int t,n,crr[1000001],pos;
long long int presum[1000001];
long long int sum=0,maxi=0;
cin>>t;
while(t--)
{
    cin>>n;
    sum=0;
    memset(presum,0,sizeof(presum));
    memset(occupied,0,sizeof(occupied));
    maxi=0;
    pos=1;
    for(int i=1;i<=n;i++)
    {
        cin>>crr[i];
    }
    for(int i=1;i<=n;i++)
    {
        cin>>wrr[i];
        presum[i]=presum[i-1]+wrr[i];
    }
    for(int i=1;i<=n;i++)
    {
        if(occupied[crr[i]]==0)
        {
            sum+=wrr[i];
            occupied[crr[i]]=i;
        }
        else
        {

            int temp=presum[occupied[crr[i]]]-presum[pos]+wrr[pos];

            sum=sum-temp+wrr[i];
            pos=occupied[crr[i]]+1;
            occupied[crr[i]]=i;

        }
        if(sum>maxi)
        {
            maxi=sum;
        }
    }
      cout<<maxi<<endl;
   }
   return 0;
   }

chunky_2808 · June 27, 2017, 7:24pm

In subtask 1 How is O(n^2) Approach possible???
Time Complexity is O(n^2 * T)

karthik2910 · July 3, 2017, 11:25pm

Can someone help me out with what’s wrong in my code? It’s in python. I’m getting WA.
Here is the link: 14403325