PROBLEM LINK:
Author: Sunny Aggarwal
Tester: Roman Furko
Translators: Vasya Antoniuk (Russian), Team VNOI (Vietnamese) and Hu Zecong (Mandarin)
Editorialist: Kevin Atienza
DIFFICULTY:
Easy
PREREQUISITES:
Dynamic programming, binary search
PROBLEM:
Given an N\times M grid of numbers \{H_{i,j} \}, answer Q queries of the following kind:
- Given G, find the size of the largest set of “triangles” each of whose sum is positive and the sum of their sums is \le G.
There are eight kinds of triangles, depending on orientation.
QUICK EXPLANATION:
- Compute all sums of all triangles in O(N^2 M) time using dynamic programming. (Handle each of the eight kinds separately.)
- Sort these sums in increasing order. (Counting sort is useful because the sums are at most H_{\max} NM.)
- Compute the cumulative array of sums.
- To answer each query, perform binary search in the cumulative array.
EXPLANATION:
How many triangles?
The problem suggests that we need to find the list of sums of all possible triangles.
However, that looks infeasible because there might be too many triangles to begin with, which seems to suggest we shouldn’t enumerate them all. But with a little counting, we can actually determine the number of triangles.
Let’s consider the following type of triangle first:
. . . . . . . . . . .
. . . . . # . . . . .
. . . . # # # . . . .
. . . # # # # # . . .
. . # # # # # # # . .
. . . . . . . . . . .
The way to count triangles like this is to count bounding boxes instead. (We define the bounding box as the smallest rectangle that contains the figure.) This works because any given bounding box (of the right height-width ratio) determines exactly one triangle of this type. In the example above, the bounding box has dimensions 3\times 6. In general, bounding boxes of this type of triangle all have a height-width ratio of 1:2. Thus, the number of triangles of this type is the number of 1:2 rectangles contained in a rectangle of dimensions (N-1)\times (M-1). (Remember that N\times M is the size of the number grid, but the rectangle itself has dimensions (N-1)\times (M-1).) This can be computed as the following sum:
We can find a very simple upper bound for this:
Thus, we have shown that there are at most \frac{1}{2} N^2 M triangles of this type. For the remaining seven types, we can perform a similar calculation and find out that there are also at most \frac{1}{2} N^2 M triangles of every other type. Thus, the total number of triangles overall is at most 4N^2 M, which in our current problem (based on the constraints) is small enough to be manageable. Thus, we should feel no guilt in trying to enumerate all triangles!
Computing the sum of all triangles
We have just established that the number of triangles isn’t too large. However, we can’t enumerate their sums naïvely because it takes a while to get the sum of each triangle. In other words, while the number of triangles is small, the sum of their areas is not. Thus, what we need is an efficient way of computing the sum of all triangles.
The key here is to notice that many triangles share a lot of entries, and some even differ only in very specific ways. For example,
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . # . . . . . . . . . . . . . . . . . . . . . # . . . . .
. . . . # # # . . . . . . . . . . . . . . . . . . . # # # . . . .
. . . # # # # # . . . + . . . . . . . . . . . = . . . # # # # # . . .
. . # # # # # # # . . . . . . . . . . . . . . . # # # # # # # . .
. . . . . . . . . . . . # # # # # # # # # . . # # # # # # # # # .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
The left triangle and the right triangle differ in only a single horizontal line segment. In other words, to get the sum of the right triangle, we only need to get the sum of the left triangle and add to it the sum of the horizontal line segment!
Similar relationships can be found in other triangle types:
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . # # # # # # # # # . . # # # # # # # # # .
. . # # # # # # # . . . . . . . . . . . . . . . # # # # # # # . .
. . . # # # # # . . . + . . . . . . . . . . . = . . . # # # # # . . .
. . . . # # # . . . . . . . . . . . . . . . . . . . # # # . . . .
. . . . . # . . . . . . . . . . . . . . . . . . . . . # . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . # . . . . . . . . . . . . . # .
. . . . # # . . . . . . . . . . . . # # .
. . . # # # . + . . . . . . . = . . . # # # .
. . # # # # . . . . . . . . . . # # # # .
. . . . . . . . # # # # # . . # # # # # .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . # # # # # . . # # # # # .
. . # # # # . . . . . . . . . . # # # # .
. . . # # # . + . . . . . . . = . . . # # # .
. . . . # # . . . . . . . . . . . . # # .
. . . . . # . . . . . . . . . . . . . # .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. # . . . . . . . . . . . . . # . . . . .
. # # . . . . . . . . . . . . # # . . . .
. # # # . . . + . . . . . . . = . # # # . . .
. # # # # . . . . . . . . . . # # # # . .
. . . . . . . . # # # # # . . # # # # # .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . # # # # # . . # # # # # .
. # # # # . . . . . . . . . . # # # # . .
. # # # . . . + . . . . . . . = . # # # . . .
. # # . . . . . . . . . . . . # # . . . .
. # . . . . . . . . . . . . . # . . . . .
. . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . .
. . . . . . . . . . # . . . . . # .
. . . # . . . . . . # . . . . # # .
. . # # . . . . . . # . . . # # # .
. # # # . . + . . . . # . = . # # # # .
. . # # . . . . . . # . . . # # # .
. . . # . . . . . . # . . . . # # .
. . . . . . . . . . # . . . . . # .
. . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . .
. . . . . . . # . . . . . # . . . .
. . # . . . . # . . . . . # # . . .
. . # # . . . # . . . . . # # # . .
. . # # # . + . # . . . . = . # # # # .
. . # # . . . # . . . . . # # # . .
. . # . . . . # . . . . . # # . . .
. . . . . . . # . . . . . # . . . .
. . . . . . . . . . . . . . . . . .
Thus, to enumerate the sums of all triangles efficiently, we only need to compute the sums of horizontal and vertical line segments quickly! Since the horizontal and vertical cases are very similar, we’ll only focus on the horizontal case. Specifically, given i and j_1 \le j_2, we want to find out the sum H_{i,j_1} + H_{i,j_1+1} + \ldots + H_{i,j_2}.
To do it quickly, we can simply compute the cumulative sums of each row! In more detail, for 1 \le i \le N and 0 \le j \le M, let R_{i,j} be the sum of the first j entries in the i th column, i.e. the sum H_{i,1} + H_{i,2} + \ldots + H_{i,j}. The R_{i,j} values can be efficiently computed with dynamic programming using the recurrence R_{i,j} = H_{i,j} + R_{i,j-1} and base case R_{i,0} = 0. Most importantly, the R_{i,j} values can be used to compute the sum H_{i,j_1} + H_{i,j_1+1} + \ldots + H_{i,j_2} efficiently as simply R_{i,j_2} - R_{i,j_1-1}!
(Similarly, we compute cumulative sums of each column to compute vertical line sums quickly.)
With these tools and methods, we now have an efficient way of enumerating all triangle sums!
The following pseudocode illustrates one way to do it:
// compute cumulative sums
for i = 0..n-1:
for j = 0..m-1:
D[i+1][j] = H[i][j] + D[i][j]
R[i][j+1] = H[i][j] + R[i][j]
// compute all triangle sums
for i = 0..n-1:
for j = 0..m-1:
for (k = 1, s = H[i][j]; i-k >= 0 and j+k < m; k++)
yield (s += R[i-k][j+k+1] - R[i-k][j])
for (k = 1, s = H[i][j]; i+k < n and j+k < m; k++)
yield (s += R[i+k][j+k+1] - R[i+k][j])
for (k = 1, s = H[i][j]; i-k >= 0 and j-k >= 0; k++)
yield (s += R[i-k][j+1] - R[i-k][j-k])
for (k = 1, s = H[i][j]; i+k < n and j-k >= 0; k++)
yield (s += R[i+k][j+1] - R[i+k][j-k])
for (k = 1, s = H[i][j]; i-k >= 0 and j-k >= 0 and j+k < m; k++)
yield (s += R[i-k][j+k+1] - R[i-k][j-k])
for (k = 1, s = H[i][j]; i+k < n and j-k >= 0 and j+k < m; k++)
yield (s += R[i+k][j+k+1] - R[i+k][j-k])
for (k = 1, s = H[i][j]; j-k >= 0 and i-k >= 0 and i+k < n; k++)
yield (s += D[i+k+1][j-k] - D[i-k][j-k])
for (k = 1, s = H[i][j]; j+k < m and i-k >= 0 and i+k < n; k++)
yield (s += D[i+k+1][j+k] - D[i-k][j+k])
(Note that this pseudocode is zero-indexed, but the idea is the same.)
This code yields every sum of some triangle exactly once.
Preparing the triangle sums for the queries
Now that we have a list of triangle sums, it’s time to answer queries. For a given query G, we need to find the size of the largest set of triangles each of whose sum is positive and the sum of their sums is \le G. Since we only care about positive triangle sums, we can simply ignore all triangle sums equal to zero.
Let’s make this more specific. Let’s say the nonzero triangle sums are [s_1, s_2, s_3, \ldots, s_k]. What is the size of the largest subset of these sums whose sum is \le G? Clearly, a greedy approach will work here, and we need to get the triangles with the smallest sums first. (Why?) Thus, our first step is to sort the list [s_1, \ldots s_k].
Any fast sorting algorithm will do, because the list is somewhat manageable. However, counting sort is particularly useful here, because 0 \le H_{i,j} \le 50, which means that the largest sum of any triangle is 50NM, which is less than 6 million.
Next, now that [s_1, \ldots, s_k] is sorted, the smallest sums are now in front of the array. Thus, in order to maximize the number of sums we get, we will need to take the leftmost element one by one until the sum exceeds G. In other words, the answer for the query G is the length of the longest prefix of [s_1, \ldots, s_k] whose sum is \le G.
It looks like we’ll be needing prefix sums again, so let’s try calculating all prefix sums beforehand. Let p_i be the sum of the first i elements of s, i.e. p_i = s_1 + s_2 + \ldots + s_i. As before, the $p_i$s can be computed with dynamic programming with the recurrence p_i = s_i + p_{i-1} and base case p_0 = 0. Thus, we can now rephrase the answer to a query: the answer to the query G is the largest i such that p_i \le G. But since p_i is increasing, this index can be computed using binary search!
The following pseudocode illustrates it: (Here k
is the length of [s_1, \ldots, s_k])
def answer_query(G):
L = 0
R = k+1
while R - L > 1:
// we will maintain the invariant p[L] <= G < p[R]
M = (L + R) / 2
if p[M] <= G:
L = M
else:
R = M
return L
Binary search is very fast (O(\log k)), so this gives us a very fast way of answering queries. In our case, k \le 4N^2 M, so processing all Q queries takes O(Q \log (N^2 M)). Overall, the time complexity is O(N^2 M + H_{\max} NM + Q \log (N^2 M)):
- O(N^2 M) for computing triangle sums,
- O(H_{\max} NM) for sorting the sums,
- O(N^2 M) for computing prefix sums, and
- O(Q \log (N^2 M)) to answer queries with binary search.
Time Complexity:
O(N^2 M + H_{\max} NM + Q \log (N^2 M))