 # CHFCH Editorial

Tester: Teja Vardhan Reddy
Editorialist: Hussain Kara Fallah

Easy-Medium

Greedy,sorting

### PROBLEM:

Given an array A of N integers A_1,A_2,...,A_N. An array is called beautiful if it has a chain of K consecutive identical elements (K is given).
In one operation you can swap 2 adjacent elements. What’s the minimum number of operations to make this array beautiful.

K \leq N \leq 300000

Numbers are less than or equal to 300000.

### Explanation:

It’s obvious that we should solve this problem for each different number separately.

Let’s maintain a sorted list of each different number’s occurrences, and calculate minimum number of operations we need to gather K of them consequently.

First of all, let’s suppose the current number has X occurrences. If X < K then we can do nothing.

For each consecutive K entries in this occurrences list, let’s find the cost to gather them, and pick the best option among the X-K+1 possible windows (segments of K consecutive entries).

It’s always better to gather consecutive ones. (Think why it’s better than if they were discrete, it’s easy).

So now, we have K elements positioned at different places in our array. How should we gather them?

It’s always better to gather them around the middle element (the middle one among these K). We will prove this one.

Assume that we have different items positioned at integral points on an axis line, and we need to gather them at one point. It’s clearly our problem but reformulated.

Let’s assume we decided to gather them at the point X with A items to its left and B items to its right (A>B). Assume that the cost of doing this is C. You can notice that if we move the destination point to X-1 our cost will change by B-A<0, so it’s a better solution.

Also, if B>A we should move it to the right in the same logic. So the target point must have the same number of elements to its left and to its right. Otherwise, we can decrease our total cost by moving it in direction with more elements than the other.

Let’s get back.

For each consecutive K entries in this occurrences list, let’s find the cost to gather them in the middle entry (the middle one of K entries, not the whole list).

In my implementation, each time I process a number’s occurrences list L. I create a prefix sum array such that sum_i contains the sum of positions of first i entries in the list.

Each time I maintain a window (l=i \,, r=i+K-1 \,, mid = \frac{l+r}{2}). I calculate the cost of gathering the elements (L_l,L_{l+1},...,L_{r}) at L_{mid} through my prefix sum array.

You can find the details of my calculations inside the implementation. It’s easy to come up with it though.

It’s recommended to check the Editorialist solution for a better understanding of this implementation.

Complexity: O(N)

### AUTHOR’S AND TESTER’S SOLUTIONS:

TESTER’s solution

Editorialist’s solution

1 Like

Just wanted to know why don’t you/they just submit it on problem in practice and provide solution link instead ?
PS: I did that when I was setter on CodeChef.

I was not able to understand how the left cost and right cost have been calculated ,please explain in detail the implementation of the formula (line 40 to line 42) ,rest 0of the thing i understood

okay so let’s say we have a k sequence s[1, 2, 4, 5, 8] and therefore about the mid i.e. element 4 we’ve to bring other elements, then new position will be like n[2, 3, 4, 5, 6], that means we’ve to make position s[i] to position n[i], i.e. elements on left need to be moved towards right and elements on right needs to be moved toward left to get closer to middle element.

leftcost = abs(sum of element of n[] left to middle - sum of element of s[] left to middle).
Similarly
rightcost = abs(sum of element of n[] right to middle - sum of element of s[] right to middle).
Hope this helps…

2 Likes

Hi, I understand that the cost would be least, when me merge at the middle element from both the sides.

But, in the proof it’s written, that moving X to the left to X-1 would change the cost by B - A, where A were the elements on the left and B were the elements on the right, which I believe is wrong, because the cost won’t change by B - A.

It depends on the value of elements (to the right of X and to the left of X) and not just the total number of elements on the left and right side. The condition still holds but I don’t think that the proof is valid.

For Example:
Consider the following array: [1 4 8 10 12]
Let X (index) = 3, which has element 10.
Cost of merging at X = 3, is 12 and A = 3, B = 1.

Now, lets move X to left such that X = 2
Cost of merging at X = 2, is 11.

It changes the cost by -1 and not B - A = -2

1 Like

/* package codechef; // don’t place package name! */

import java.util.;
import java.lang.
;
import java.io.*;

/* Name of the class has to be “Main” only if the class is public. */
class Codechef
{
public static void main (String[] args) throws java.lang.Exception
{
Scanner sc = new Scanner(System.in);
long t = sc.nextLong();
for (long e = 0; e < t; e++) {
long n = sc.nextLong();
long k = sc.nextLong();
long[] a = new long[(int) n];
for (long i = 0; i < n; i++)
a[(int) i] = sc.nextLong();
HashMap<Long, ArrayList> map = new HashMap<>();
for (long i = 0; i < n; i++) {
long ch = a[(int) i];
if (map.containsKey(ch)) {
ArrayList val = map.get(ch);
map.put(ch,val);
} else {
ArrayList val=new ArrayList<>();
map.put(ch,val);
}
}
boolean flag=false;
long min=Integer.MAX_VALUE;
//System.out.println(map);
for(long val:map.keySet()) {
long sum=0;
ArrayList idx=map.get(val);
if(idx.size()>=k) {
flag=true;
long count=0;
long mid=0;

				   mid=(idx.size())/2;

long l=mid-1;
long r=mid+1;
while(count<k-1) {
if(l>=0) {
count++;
sum+=idx.get((int)mid)-idx.get((int)l)-1;
l--;
}
if(r<idx.size()) {
count++;
sum+=idx.get((int)r)-idx.get((int)mid)-1;
r++;
}
}

min=Math.min(min, sum);
}else {
continue;
}
}

if(flag) {
System.out.println(min);
}else {
System.out.println(-1);
}

}

}


}

Well that’s true, the reason is that there’s one element standing at this place. This is a special case which doesn’t contradict the proof.I also tried to mention the line axis to make it general. It’s really frustrating to write every case in a proof. I mentioned the general logic of the proof, you would eventually find that some cases like this exist. I am not obligated to write proofs for simple stuff like this in editorials. But I tend to write the logic always.

If you read the code you will be able to understand. Translating code into words is a waste, especially if it’s simple to come up with it yourself. you just need to know how a prefix sum array works and then sit for few minutes and figure out the formulas.

maybe I will do that next time Ok let’s see an easy approach. Note the condition N<=300000 and A_i<=300000.So find max of A_i and build a vector v[max(A_i)] then v[A_i] contains all occurrences(positions) in original array. Now for all v[A_i].size()>=k take continuous groups of “k” elements.There would be n-k+1 groups ({1,2,3…,k} , {2,3,4,…,k+1} ,…{n+1-k,…n}) , where n = v[A_i].size(). For each A_i we need to consider these groups only(Prove this!) . Now for each one of n+1-k groups , u need to find distance from the middle element only(Prove this also).Find minimum of all such distances for each group for fixed A_i and then for variable i.

Check my solution here:- https://www.codechef.com/viewsolution/23223458

Whats the difference between your solution and editorial’s solution?

How is time complexity not n^2 ??? Can anyone tell me??

anyone can explain this solution please

Explanation of leftCost and RightCost from editorial code

Leftcost = (mid - l) * v[mid] - ((mid - l) * (mid - l + 1))/2 - get(l , mid - 1);

1. (mid-l) * v[mid] : This represents the cost of taking all the elements which are on the left of mid from 0 to mid.

2. (mid-l)*(mid-l+1)/2 : But that`s not we want right, hence we subtract 1 from element to be placed at mid-1, 2 from the element to be placed at mid-2, thats how we end up with this formula as their are mid-l elements.

3. get(l,mid-1) : Now we have distances for 0 to mid-1,0 to mid-2,third term now gives us the cumulative sum of the indices from where we are starting, which makes our distances accurate, i.e from where the element is present to mid-1, mid-2 etc.

Same funda applies to the Right cost also.

So I applied similar logic but I’m facing a TLE for bigger sub-problem. Can you help me understand how did you get the complexity as O(N)? Also, I am not fluent in C++ co the solution code doesn’t help unless you have a Python or Java version of it.

//