PROBLEM LINK:
Author: Alexey Zayakin
Testers: Hasan Jaddouh
Editorialist: Alexey Zayakin
DIFFICULTY:
Easy
PREREQUISITES:
Ad-hoc, math, median
PROBLEM:
Your are given two n \times m matrices A and B and you are allowed to perform the following actions:
- Choose one of the matrices
- Choose either one row or one column of the selected matrix
- Increment all the numbers in the selected row or column by 1
What is the minimal number of actions required in order to make the two given matrices equal?
QUICK EXPLANATION:
We can reduce the problem to: given matrix C = A - B and allowed operations are increase or decrease one row or column of the matrix C by 1, the goal it to make matrix C contain only zeroes.
The order of operations doesn’t matter, so we can just think of “balance” on every row and column. If we fix the “balance” of the first row, all the other balances can be uniquely determined. To find the optimal value for the first row, either use binary search or take a median of some values
EXPLANATION:
First of all, let’s define a matrix C = A - B, i.e. C_{ij} = A_{ij} - B_{ij}. Every operation with matrix A will increment an appropriate row or column in matrix C, and every operation with matrix B will decrement an appropriate row or column in matrix C. Our goal of making matrices A and B equal can be reformulated to make matrix C contain only zeroes. From now on we will work only with matrix C assuming allowed operations are both increment and decrement.
It is easy to notice that the order of operations doesn’t matter, thus for every row we can just calculate its balance, i.e. (how many times we will increment it) minus (how many times we will decrement it). Let’s denote the balance of row i with row_i and similarly ― the balance of column j with col_j.
After applying all the operations matrix C should contain only zeroes, which can be reformulated as:
For every pair (i, j): C_{ij} + row_i + col_j = 0 \;\;\;\;\; (*).
If we fix the value of row_1 = x, then using the above equations with i = 1, we can get:
C_{1j} + x + col_j = 0 \Leftrightarrow col_j = -C_{1j} - x
For j = 1 we have col_1 = -C_{11} - x and plugging j = 1 into (*) we get:
C_{i1} + row_i - C_{11} - x = 0 \Leftrightarrow row_i = C_{11} + x - C_{i1}
So far we have used only the first row and the first column of the matrix C. To answer the question whether there exists a solution at all, we should check that the above balances make the whole matrix C to contain zeroes only, i.e. rewriting (*):
For every pair (i, j): C_{ij} + (C_{11} + x - C_{i1}) + (-C_{1j} - x) = C_{ij} - C_{i1} - C_{1j} + C_{11} = 0
If at least one of the above equalities fails, there is no solution at all, otherwise we should find the optimal values of row_1 = x.
The values we want to minimize (i.e. the total number of actions) is just:
|row_1| + |row_2| + \dots + |row_n| + |col_1| + |col_2| + \dots + |col_m|
Or rewriting it using the fixed value of row_1 = x:
\displaystyle \sum_{i=1}^{n} |C_{11} - C_{i1} + x| + \sum_{j=1}^{m} |-C_{1j} - x| = \sum_{i=1}^{n} |C_{i1} - C_{11} - x| + \sum_{j=1}^{m} |-C_{1j} - x|
Now the problem reduces to “Given (n + m) points on line, find a point x that minimizes the total distance from x to all the given points”. The (n + m) points are n points for rows: (C_{i1} - C_{11}) and m points for columns: -C_{1j}. This problem has a well-known solution which is just the median of the given points.
The logic behind the median idea is: if we move the point x to the left, then distance to all the points that were located before x decreases, and distance to all the points that were located after x increases. For the optimal position of x moving it to the left will not reduce the total distance, which means that (count of points before x) \le (count of points after x). By a similar argument, moving x to the right will not reduce the total distance as well, which means that (count of points before x) \ge (count of points after x). The only solution to the two inequalities above is (count of points before x) = (count of points after x), which in turn means that x is the median.
Time Complexity:
\mathcal{O}(n \cdot m + (n + m) \cdot \log{(n + m)}) per test case.