KOL16A - Editorial



Editorialist: Soumik Sarkar




Dynamic programming


You are given a list of N battles and a total cost metric PEB. Each battle b_i has an effect on the PEB which is determined by P_i, and a gain value F_i. If the battle is won, F_i is gained but PEB also increases. If the battle is lost no gain is made but PEB decreases. The goal is to calculate the maximum gain possible such that the PEB remains below a threshold value S. PEB is calculated as

total = 0
for each b_i
	if win
		total += 1 - P_i
		total -= P_i
PEB = max(total / no. of battles, 0)


It may not be immediately obvious due to the floating point values of P_i, but dynamic programming is to be applied to solve this problem. The key observation is that although all P_i are floating point values between 0 and 1, they have at most 2 digits after the decimal point. So there can be only 101 distinct values of P_i.

The following is the top-down approach, which is quite straightforward. Let the function be f. A single state will have two parameters, the index i and the “total” value till now t. If the battle can be won, the new total will be t+1-P_i. If the PEB with this total does not exceed S, winning is considered. The total gain from this option will be F_i + f(i+1, t+1-P_i). A battle can always be lost without restriction, and the total gain from losing will be f(i+1, t-P_i).

function f(i, t):
    if i == N + 1:
        return 0    // base case, end of all battles
    win = 0
    if t + 1 - P_i < S * i:
        win = F_i + f(i + 1, t + 1 - P_i)    // gain from winning
    lose = f(i + 1, t - P_i)                 // gain from losing
    return max(win, lose)

Just adding memoization to the above function gives a theoretically correct dp solution. However a direct implementation of the above is not a good idea, since using P_i as given will lead to floating point errors and incorrect answers. The workaround is to multiply all P_i and S by 100 so that each is now an integer, and then adjust the implementation of f accordingly.

Total number of unique values attainable by i is N and by t is 201 \cdot N, hence if using a 2D dp array care should be taken to initialize it with sufficient size. Also the value t can be negative, so a sufficient offset should be used to obtain the appropriate index.
Note that all the hassle can be avoided by using a map in place of an array :smiley:

The complexity of this approach is \mathcal{O}(N \cdot 201 \cdot N) \approx \mathcal{O}(201 \cdot N^2)


Editorialist’s solution can be found here.

1 Like