SADQ - Editorial

PROBLEM LINK:

Practice
Contest

Author: Archit Karandikar
Testers: Animesh Fatehpuria, Istvan Nagy
Editorialist: Vaibhav Tulsyan

PROBLEM

Given K arrays A_1, A_2, A_3, ... , A_K, report f(P, Q) for M queries.

f(P, Q) = \sum\limits_{i=1}^{s_P} \sum\limits_{i=1}^{s_Q} |P_i - Q_j|, where s_A represents length of array A.

Constraints:

1 \le K, M \le 10^5

1 \le \sum\limits_{i=1}^{K} s_i \le 10^5

-10^9 \le A_{ij} \le 10^9

EXPLANATION

Observations

  1. The arrays can be sorted as a pre-processing step as ordering of the arrays doesn’t matter to calculate the query.
  2. The function |P_i - Q_j| can be reduced to 2 sub-problems:
    a. (P_i - Q_j) where P_i \gt Q_j.
    b. (Q_j - P_i) where Q_j \gt P_i.
  3. We can pre-compute prefix sums of all arrays after sorting them to compute sub-array sums in O(1).

Approach

  • For each query, we can iterate over i and for each i, we can find the largest Q_j such that Q_j < P_i.
  • Note that Q is monotonically increasing every time. Hence, we can binary search the value of Q_j.
  • For each query, we should keep P as the smaller of the two arrays and Q the larger array. This would result in a better time complexity overall without affecting the answer.

Processing every query

  • Contribution to total sum by P_i, Q_{1..j}:
    • j * P_i - sum(Q_{1..j})
  • Contribution to total sum by P[i], Q[j+1…sQ]:
    • sum(Q_{j+1..s_Q}) - (s_Q - j) * P[i]

Time Complexity per query: O(s_P * log(s_Q))

Will this approach TLE?

Let N denote \sum\limits_{i=1}^{K} s_i.

Let’s call arrays of size \ge \sqrt{N} heavy arrays and of size \lt \sqrt{N} light arrays.

  • Observe that there can be at most O(\sqrt{N}) heavy arrays.

Each query can be of one of these 3 types:

  1. Light-Light
  2. Light-Heavy
  3. Heavy-Heavy

Light-Light

  • Max possible size of P: O(\sqrt{N})
  • Max possible size of Q: O(\sqrt{N})
  • Worst-case complexity: O(\sqrt{N} * log(sqrt(N)))

Light-Heavy

  • Max possible size of P: O(\sqrt{N})
  • Max possible size of Q: O(N)
  • Worst-case complexity: O(\sqrt{N} * log(N))

Heavy-Heavy

  • Max possible size of P: O(N)
  • Max possible size of Q: O(N)

The worst-case complexity of Heavy-Heavy queries may seem to be O(N * log(N)), but it is not so.

In the worst case, array P is a heavy array at most O(\sqrt{N}) times.
Hence, across all heavy arrays P, the no. of iterations on average is O(\sqrt{N}).

The complexity for Heavy-Heavy queries is: O(sqrt{N} * log(N)).

Memoization of queries

Consider a case in which K=2 and s_1 = N/2, s_2 = N/2. All queries are of type f(A_1, A_2).

In this case, the approach discussed so far would exceed time limit as the overall time complexity would be O(Q * N).

We need to memoize answers to each query to make sure answers to previously calculated queries are not re-computed.

CHALLENGE PROBLEM

Can you solve this problem in O(Q\sqrt{N}) time complexity?

AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution can be found here.
Tester 1’s solution can be found here.
Tester 2’s solution can be found here.