CLONEME Editorial (Unofficial)

anushi · June 12, 2017, 5:27pm

Problem Link

Prerequisites

Binary Search

Quick Overview

Given an array and two ranges you need to find whether the subarray lying in these two ranges have mismatch of less than 1 or not when elements are compared correspondingly in a sorted order

Solution

For 10 Points

It’s quite easy you can just copy the two subarrays in the new array and sort it in O(N log N) and check the corresponding values for mismatch.if mismatch is more than 1 output NO else YES.

My Solution using this click here

For 30 Points

I used a simple approach in O(n) with some optimization.

What I did was made a frequency array for all values.Since A[i]<=100000 it’s feasible.

When u will create this frequency array we can easily maintain that values inserted in the vector are in sorted order only.

Now you can simply iterate over all values of given array from a to b and check the frequency of that value in two range [a b] and [c d] using Binary search.Two cases arise then:

IF diff in freq==1

we store the index of that value which is more in either range and increment our counter.

IF diff in freq>1

it means that 2 or more elements are different & hence no need for further search we can break the operation and print NO.

Now if till end diff in freq ==0

Answer is obvious YES

but if diff in freq ==1

You need to check that if there is some element which lies between these two different elements or correspondingly you can found the rank of those different element in both arrays.

If rank is same:

They will be at same position & hence output YES

else

They will lie at different position & hence output NO

Optimization

I did was avoid checking same values by maintaining a bitset. It compensates for the time taken in the binary search to count frequency.
I changed a,b,c,d for overlapping ranges to avoid searching for the same element.

Here is the solution to my code implemented that way.

fallandrise · June 12, 2017, 5:49pm

I tried running your solution on extreme case:

t = 1

n = 10^5

q = 10^5

A_{i} = 1 \forall i

Then for each test; a, b, c, d are choosen randomly such that b - a = d - c.

Now your solution takes nearly 36 seconds to run.

Test link: input.txt (nearly 2.5 MB)

Generator link: https://ideone.com/sHSOBj

UPD1 - Log:

$ /usr/bin/time -f "Elapsed time: %E seconds\n" ./a.out < in > o
Elapsed time: 0:36.60 seconds

lohit_97 · June 12, 2017, 6:00pm

Yes. Each query can take O(nlogn) time in worst case. So it is more of QN*logN, bad to see this pass.

anushi · June 12, 2017, 6:04pm

Something might be wrong in your calculation bcoz bitset allows to make only one binary search in this case O(10^5) and it then directly goes to print answer.
I think it might be 0.36 sec

fallandrise · June 12, 2017, 6:05pm

@lohit_97 Yep, just wondering if there is a case similar to this in official test cases!

fallandrise · June 12, 2017, 6:08pm

@anushi Nope. It’s 36 seconds. Observe that in the worst case, you are using O(N*logN) per query. So It’s O(Q*N*logN). The only advantage in your solution is that you are breaking the loop if you find mismatches which are greater than 2. So that gives some optimization. But in the given test, both the sub arrays are same. So no mismatches so your code runs till the length of given subarrays. So O(N) per query.

You can check your code with given test to see your self.

anushi · June 12, 2017, 6:10pm

No, it can’t take O(N log N) bcoz if all arrays elements are same then Binary Search range is large which can be at max O(log n) but then the search is only for 1 element.But in worst case, if all array elements are different then binary search is across one element and it takes at most O(N) thus worst case is O(QN/2)=O(QN)
Divide by 2 bcoz of modified search range

anushi · June 12, 2017, 6:13pm

See the optimization is done using bitset and in this test case it for loop runs only once & it then prints the answer.
If you have any doubt in how many times the loop has run you can put a counter & check its value!

fallandrise · June 12, 2017, 6:17pm

@anushi

Observe this:

.....
while(q--){
   cin>>l>>r>>x>>y;
   .....
 
   for(i=l;i<=r && ans<2;i++){
      .....
   }
}

Here ans is number of mismatches you are calculating in the loop. You are breaking the loop if it’s greater than or equal to 2. But in the given test, this never happens. Infact, answer if always 0. i.e “YES” always. So you will run from l to r for each query. which is O(Q.N) in total. And this solution(\approx 10^{10} operations) can’t run in 2 seconds.

anushi · June 12, 2017, 6:30pm

Yes the loop runs for all l to r but see that just after loop there is one if condition and u can say that loop is effective only for i=l for other times it is just not doing anything that is why it don’t add to complexity.Specially it can’t be O(nlogn) bcoz for that each time loop needs to do a binary search which is not the case.

for(i=l;i<=r && ans<2;i++){
if(bit[a[i]]){
…
bit[a[i]]=false;
…
}
}

fallandrise · June 12, 2017, 6:34pm

Well, leave about logN part, atleast you accepted that it’s O(N) per query. What I am saying is, O(Q*N) itself is very high to pass in 2 seconds. And it’s taking nearly 36 seconds to run.

vijju123 · June 12, 2017, 6:38pm

Hey dear! Can you please tell how you generated the test case?? Thanks!

fallandrise · June 12, 2017, 6:46pm

@vijju123 With the attached generator.

abx_2109 · June 12, 2017, 6:50pm

A very intuitive and nice solution is as follow:

First let us make a prefix sum , sum of squares, sum of 3/2 powers and denote them by sum,sum2,sum3/2 respectively.
Now when these three are equal for a given l1,r1,l2,r2 i.e (sum[r1]-sum[l1-1]==sum[r2]-sum[l2-1] )&&(sum2[r1]-sum2[l1-1]==sum2[r2]-sum2[l2-1] )&&(sum3/2[r1]-sum3/2[l1-1]==sum3/2[r2]-sum3/2[l2-1] ) then ans is YES.

Else,
Let us assume EXACTLY one mismatch.Now find the values of the mismatched elements(let them be a in [l1,r1] and b in [l2,r2]).These can be calculated using the sum and sum of square equations.Now if our assumption is corrct(of exactly one mismatched element) then a and b must come out to be integers and also they must satisfy the sum3/2 equation.Also they must share atleast one common index in both the ranges(Think why!).This can be checked using any suitable method.If these conditions are ture ans is YES.Else ans is NO.

anushi · June 12, 2017, 7:30pm

Your input is not even worst case it is the best case in fact in which complexity is only O(Q*log(N)) which can easily be obtained in 2sec but u r getting it in 36 sec.
Make sure that the time for giving input is not counted.

Also, my submission passed in 1.03 sec which is not just around the constraints & I don’t think that code chef test cases don’t have q=10^5 & n=10^5 so for any case with this q and n yours input was the best! so if were the case I would have got TLE

sonu_628 · June 12, 2017, 8:07pm

@anushi @fallandrise this indicates poor test cases. @fallandrise is correct about the complexity. @anushi you are correct Codechef doesn’t have n = 10^5 and q = 10^5, but that doesn’t it shouldn’t.
you had nice luck anyway

ania7 · June 12, 2017, 8:20pm

I believe that swapping ranges for overlapping segments is incorrect. If the array is 1 2 3 and we ask about ranges [1,2] and [2,3] then the answer should be NO but shrinking it to [1,1] and [3,3] yields YES.

saurabheights · June 12, 2017, 8:42pm

@anushi: Why does this range trimming fails? https://www.diffchecker.com/mTNjMBYA
The left diff passes for first subcases, while the right diff fails.

pkacprzak · June 12, 2017, 8:52pm

I solved it with sqrt-decomposition in O((n+q) \cdot \sqrt n). The idea is to clone (yeah, why not) the input array B = \sqrt n number of times In particular, the i-th copy contains only numbers from range [i \cdot B, (i+1) \cdot B)

Next, for each such copy of the array, we compute its prefix hashes, where h(i, j) is the hash of the j-th prefix of the i-th array, where integer v occurring c times in such a prefix adds value c \cdot P^v to this hash, and P is a prime greater than 10^5. This allows us to compute hashes of any range of every copy of the array.

Next, for each query, we compare hashes corresponding to ranges in the query one by one and there are few cases to consider. For implementation details refer to my solution here: https://www.codechef.com/viewsolution/14105123

anushi · June 12, 2017, 11:53pm

Range trimming is only for checking values till that we running loop from l to r but while calculating rank you have to consider full range