PROBLEM LINK:
Author: Archit Karandikar
Testers: Animesh Fatehpuria, Istvan Nagy
Editorialist: Vaibhav Tulsyan
PROBLEM
Given K arrays A_1, A_2, A_3, ... , A_K, report f(P, Q) for M queries.
f(P, Q) = \sum\limits_{i=1}^{s_P} \sum\limits_{i=1}^{s_Q} |P_i - Q_j|, where s_A represents length of array A.
Constraints:
1 \le K, M \le 10^5
1 \le \sum\limits_{i=1}^{K} s_i \le 10^5
-10^9 \le A_{ij} \le 10^9
EXPLANATION
Observations
- The arrays can be sorted as a pre-processing step as ordering of the arrays doesn’t matter to calculate the query.
- The function |P_i - Q_j| can be reduced to 2 sub-problems:
a. (P_i - Q_j) where P_i \gt Q_j.
b. (Q_j - P_i) where Q_j \gt P_i. - We can pre-compute prefix sums of all arrays after sorting them to compute sub-array sums in O(1).
Approach
- For each query, we can iterate over i and for each i, we can find the largest Q_j such that Q_j < P_i.
- Note that Q is monotonically increasing every time. Hence, we can binary search the value of Q_j.
- For each query, we should keep P as the smaller of the two arrays and Q the larger array. This would result in a better time complexity overall without affecting the answer.
Processing every query
- Contribution to total sum by P_i, Q_{1..j}:
- j * P_i - sum(Q_{1..j})
- Contribution to total sum by P[i], Q[j+1…sQ]:
- sum(Q_{j+1..s_Q}) - (s_Q - j) * P[i]
Time Complexity per query: O(s_P * log(s_Q))
Will this approach TLE?
Let N denote \sum\limits_{i=1}^{K} s_i.
Let’s call arrays of size \ge \sqrt{N} heavy arrays and of size \lt \sqrt{N} light arrays.
- Observe that there can be at most O(\sqrt{N}) heavy arrays.
Each query can be of one of these 3 types:
- Light-Light
- Light-Heavy
- Heavy-Heavy
Light-Light
- Max possible size of P: O(\sqrt{N})
- Max possible size of Q: O(\sqrt{N})
- Worst-case complexity: O(\sqrt{N} * log(sqrt(N)))
Light-Heavy
- Max possible size of P: O(\sqrt{N})
- Max possible size of Q: O(N)
- Worst-case complexity: O(\sqrt{N} * log(N))
Heavy-Heavy
- Max possible size of P: O(N)
- Max possible size of Q: O(N)
The worst-case complexity of Heavy-Heavy queries may seem to be O(N * log(N)), but it is not so.
In the worst case, array P is a heavy array at most O(\sqrt{N}) times.
Hence, across all heavy arrays P, the no. of iterations on average is O(\sqrt{N}).
The complexity for Heavy-Heavy queries is: O(sqrt{N} * log(N)).
Memoization of queries
Consider a case in which K=2 and s_1 = N/2, s_2 = N/2. All queries are of type f(A_1, A_2).
In this case, the approach discussed so far would exceed time limit as the overall time complexity would be O(Q * N).
We need to memoize answers to each query to make sure answers to previously calculated queries are not re-computed.
CHALLENGE PROBLEM
Can you solve this problem in O(Q\sqrt{N}) time complexity?
AUTHOR’S AND TESTER’S SOLUTIONS:
Author’s solution can be found here.
Tester 1’s solution can be found here.
Tester 2’s solution can be found here.