PROBLEM LINK:
Setter-
Tester- Multiple
Editorialist- Abhishek Pandey
DIFFICULTY:
EASY-MEDIUM
PRE-REQUISITES:
Dynamic Programming , Sorting
PROBLEM:
You are given an array A, and you ahve to partition it into exactly K disjoint sub sets, such that-
- Size of each subset is at least 2.
- For all subsets, define D_i=Difference between maximum and minimum elements. The sum of D_i over all of the K disjoint subsets must be minimized.
QUICK EXPLANATION:
It is nothing but a simple/intuitive dp problem. We make a dp table dp[n][k], where dp[i][j] represents “the minimum sum of D_i obtained when all i elements are divided into j disjoint subsets”.
With this, we easily get recurrance formula of-
curr=min(curr,dp[j-2][i-1]-arr[j-1]);
dp[j][i]=curr+arr[j];//We already subtracted arr[j-1], hence the complete term arr[j]-arr[j-1]
//is taken into account.
OR
for(i=3,i< n+1, i++) {
for(j=1,j< k+1, j++) {
dp[i][j] = min(dp[i-1][j], dp[i-2][j-1]) + arr[i] - arr[i-1];
}
}
EXPLANATION:
I will first describe the intuitive approach. That is enough to solve the question. After that, I will try to explain the setter’s memory optimal approach as an extra in my corner as well. The intuitive way will first talk of reducing it to finding K good sub arrays , and then the dp part.
Intuitive way-
a. Reducing it to a sub array problem-
First notice the constraints on N and K. They are enough to allow an O(NK) solution to pass. The problem itself smells of dp, so figuring out what to do shouldnt be a problem. But wait a minute, it talks about picking subsets. Involving subsets, can, always complicate the question. Hence, before we begin our dp solution, lets focus on how to deal with this sub set problem.
When wanting to minimize difference, it always makes sense to put the “immediate next greatest element” in the subset. Anything else will surely raise the absolute difference, thats already known. If I put an element greater than the immediate next greatest element
then the absolute difference will be greater (remember, we replaced a small element of this set with a larger element of another set! The absolute difference between minimum and maximum element in both sets can only increase (or stay constant).)
This allows us to use sorting- we can sort the array and pick the next greatest element quickly. We reduced the problem from “Partitioning into K good subsets” to “Partitioning into K good sub-arrays”.
The essence of this section is that, the optimum choice is that, starting from minimum possible element and including next greatest element will only lead to minimum sum of D_i. If, at any place, we put in any element which is not the next greatest element (and obviously isnt originally included in the set by our algorithm above), then the sum will either be constant, or increase.
b. The dp part-
After the first part clarifies which elements are best to include in the set, we need to decide on only one more crucial step. Is it better to include that element in the set OR start a new set, starting from that element?
This, is easily handled by dp.
The first focus is on base cases.
dp[i][0]=0 or INF
-depending on your approach- what your paradigm defines when asked to make 0 subsets.
dp[i][1]= arr[i]-arr[1];//1-based indexing
If we have to partition entire array into 1 subset, the answer is known.
Now, consider this. Suppose we are asked to partition i elements into j subsets (i.e. calculate dp[i][j] ), and we already know dp[m-2][l-1] for 1\le m\le N and 1\le l \le K, i.e. and we know the optimum ways of dividing m-2 elements into l-1 subsets for all 1\le m\le N and 1\le l \le K.
Now, what will be my answer like?
Will it not be of form-
Optimal Division of last subset + Optimal division of (l-1) subsets for remaining elements
In other words, let last subset consist of X out of i elements. Then, will my answer not be of form-
Optimally Divide (X) elements into 1 set + Optimally Divide (i-X) elements into (l-1) sets
We already know the answer for Optimally Divide (X) elements into 1 set
as “Element at end of sub array- element at start of sub array” (Remember, we sorted the array! So min and max elements are at start and end respectively). And the answer for Optimally Divide (i-X) elements into (l-1) sets
is already stored in dp[i-x][l-1].
We are done!
We defined the base case, from that base case we can use recurrence relation to go 1 step ahead, and from there, use the recurrance relation again to go another step ahead, and another, and another… until we calculate the required values of dp[i][j]. At last, when the table is computed, dp[n][k] will have the answer.
You can refer to Editorialist’s solution for commented implementation.
SOLUTION:
TimeComplexity=O(N*K)
SpaceComplexity=O(N*K)
CHEF VIJJU’S CORNER:
1.Its worthwhile discussing setter’s solution which uses just O(K) space. Its based on the observation that, we only need previous column (j-1) to be able to calculate the answer. Before seeing his solution, check out this one- https://www.codechef.com/viewsolution/16520099 . Notice how he related dp[i][j] with dp[i-1][j]. Then, look at this-
sort(arr + 1, arr + 1 + n);
for (int i = 0; i <= k; i++) {
dp[0][i][0] = dp[0][i][1] = INF;
dp[1][i][0] = dp[1][i][1] = INF;
}
dp[1][1][0] = 0;
dp[1][1][1] = INF;
for (int a = 2; a <= n; a++) {
for (int j = 1; j <= k; j++) {
int i = (a & 1), pi = (i ^ 1);
dp[i][j][0] = min(INF, dp[pi][j - 1][1]);
dp[i][j][1] = min(INF, min(dp[pi][j][0], dp[pi][j][1])) + (arr[a] - arr[a - 1]);
}
}
In his solution, dp[i][j][k] represents dp[Parity of a (odd/even)][K][whether the subset was closed at i-1 or not] The variable pi represents previous i, and is nothing but parity of previous i.