COFFEE - Editorial

pkacprzak · September 22, 2015, 1:13am

PROBLEM LINK:

Practice
Contest

Author: Sergey Kulik
Tester: Yanpei Liu
Editorialist: Pawel Kacprzak

DIFFICULTY:

Medium

PREREQUISITES:

dp

PROBLEM:

Your working day is divided into N periods. In each one, you can either write some code or drink a cup of coffee. If you decide to write code in the i^{th} period, you will write a_i lines in that period. On the other hand, if you decide to get a cup of coffee in the i^{th} period, then you do not write any lines of code in that period, but for each period j > i, such that i + D \geq j, the number of lines that you write in that period gets multiplied by M. The only restriction is that you have to drink exactly K cups of coffee during the whole day. Given N, K, D, M and the array a of size N, your task is to compute the maximum number of lines of code which you can write during all N periods.

QUICK EXPLANATION:

Please, skip this section if you want to write about how to solve subtasks or how to derive the desired solution.
A quick explanations is that, you can use dynamic programming to count the maximum number of lines of code which you can write during the first i periods, drinking exactly j cups of coffee, and drinking the last one at time t.

EXPLANATION:

If the problem looks complicated at first sight, it is always good to simplify it a bit.

Simplified problem

Let’s consider a version of the problem, when K equals 1, so you are have to drink exactly one cup of coffee during the whole day. Solving this simplifier problem is enough to complete the first subtask here. Let’s assume that we know what period is the best for drinking coffee. Let denote this period by P. How to compute the number of lines of which we write, when we drink coffee during period P? Well, this is pretty simple:

In period P, we do not write any code, because we drink coffee then.
In each period j > P, such that P + D \geq j, we write M \cdot a_j lines
In any other period i, we write a_i lines.

So for a fixed period P, we can compute the number of lines we write in O(N). If we try any period as P, we can solve this simplified problem in O(N^2) time, which is perfectly fine for the first subtask, because N \leq 18 there. Since we have T test cases to handle, the total complexity of that solution is O(T \cdot N^2).

General problem for small N

Ok, so the case when K = 1 for not big N is quite simple, let’s try now to solve the problem for any K < N, but without worrying about the number total number of periods. Do we know how to solve the problem, if we fix all K periods during which we drink coffee? Of course we know, and this is very similar to a method presented in in the above paragraph:

Let S be a set of K periods during which we drink coffee:

In any period from S, we do not write any code, because we drink coffee during these periods.
For any other period j, let P be a period with the greatest index less than j, such that we drank coffee in that period. If P exists and P + d \geq j, we write M \cdot a_j lines of code in period j, otherwise, we write just a_j lines in that period.

If we scan periods from left to right, updating the last period when we drank coffee, we can compute the number of lines of code we write for that fixed set S in O(N) time. In order to compute the best set S, we can iterate over all \binom{N}{ K} such sets, and compute the best one. In the second subtask, N is also not greater than 18, so that method will work there, but since there are at most 200 test cases to handle, you can precompute a list of all K-element subsets of the set of N periods for any N, K, where 1 \leq K \leq N \leq 18. Doing that, you can achieve the total time complexity of O(2^N + T \cdot \binom{N}{K} \cdot N).

Speeding it up

However, in order to get a full credit for that problem, you have a much faster solution Original constraints are 1 \leq K < N \leq 200. Let’s try to derive it.

In any period i, if you have drunk less than K cups of coffee in previous periods, you can either drink a cup and do not write any lines of code, or do not drink a cup and write a_i lines, or write M \cdot a_i lines, if you have drunk at least one cup in any of last D periods. This reasoning leads to a nice and clear dynamic programming approach.

Let g[i][j][t] be the maximum lines of code that you can write during the first i periods, drinking exactly j cups of coffee, and drinking the last one in period t. Then you can use the following rules to compute the whole g table. Notice that the result here is \max \{g[N][K][t] : K \leq t \leq N\}:

//initially, set all entries in g table to -1, because some entries are invalid, like when the number of drunk cups is greater than the number of periods

//computing g[i + 1][j][t]
if i + 1 > t && t + D >= i + 1:
    lines = m * a[i + 1]
else:
    lines = a[i + 1]
g[i + 1][j][t] = max(g[i + 1][j][t], g[i][j][t] + lines)
g[i + 1][j + 1][t] = max(g[i + 1][j + 1][t], g[i][j][t])

The correctness of this method is pretty straightforward to prove, because we consider here all possibilities here. The idea behind this kind of dynamic programming approaches is that you do not have to know in what periods you decided to drink coffee, you just need to know how many cups you have drunk so far and when you drunk the last one. Using this idea, we can speed up exponential solutions for many problems.

The time complexity of this approach for a single test case is O(N^3) and you may wonder if it is enough to pass the third subtask, since there are up to 200 test cases to handle. Well, just notice that the sum of N in all of them is at most 1000, so we are all good with the third subtask. However it is still too slow to get a full credit.

Final solution

How to get even faster solution? If you look closer at the above dynamic programming structure, you will see that for many entries, the result is the best entry computed before plus the sum of lines written in periods between these two entries with or without a multiplier M applied to them. Based on this idea, we can derive even faster dynamic programming solution:

Let a[N+1] = a[N+2] = ... = a[N+N] = 0].

In addition, let p[i] := \sum\limits_{j=1}^i a[i], in other words, p[i] is the sum of the number of lines of code written in periods from 1 to i without any multiplier.

With these definitions, we can start working on a new dynamic programming approach.

Let f[i][j] be the maximum number of lines of code produced during the first i + D periods, such that in exactly j of these periods we decided to drink a cup of coffee and we have drunk the last one in the i^{th} period.

If we are able to compute the f table, we can easily compute the result as the maximum value of f[i][K] for 1 \leq i \leq N.

So how can we compute the f table? First, notice that the value of f[i][1], for any 1 \leq i \leq N, can be easily computed using the p table in constant time.

Now, we want to know how to compute the value of f[i][j] knowing the values of f[i'][j - 1] for i' < i.

There are two cases to consider:

1. i' + D < i

Then :

f[i][j] = \max_{i'} \{f[i'][j - 1] + (a[i' + D + 1] + a[i' + D + 2] + \ldots + a[i - 1]) + (a[i + 1] + a[i + 2] + \ldots + a[i + D]) \cdot M\} \\ = \max_{i'} \{f[i'][j - 1] + (p[i - 1] - p[i' + D]) + (p[i + D] - p[i]) \cdot M\}

Notice that for a fixed i, the above value varies only on f[i'][j - 1] + p[i' + D], so in order to compute f[i][j] in this case, we only need to know what is the value of i' which maximizes this term, and it is easy to compute the best values of it in previous steps.

2. i' + D \geq i

Then

f[i][j] = \max_{i'} \{f[i'][j - 1] - (a[i] + a[i + 1] + \ldots + a[i' + D]) \cdot M + (a[i + 1] + a[i + 2] + \ldots + a[i + D]) \cdot M\} \\ = \max_{i'} \{f[i'][j - 1] - (p[i' + D] - p[i - 1]) \cdot M + (p[i + D] - p[i]) \cdot M\}

The same similar observation as in the previous case applies here. For a fixed i, the only varying term is f[i'][j - 1] - p[i' + D] \cdot M, and in order to pick the best i', you can use values precomputed in previous steps. The tester used queue to deal with it, check his solution for more details.

Time complexity

The total complexity of this method is $O(N \cdot D)$, because this is the size of $f$ table, and we fill each of its entries in a contant time.

AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution can be found here.
Tester’s solution can be found here.

RELATED PROBLEMS:

s24w · September 27, 2015, 2:11pm

It should be M=aj line of code rather than M<=aj line of code.
Please correct it.

pkacprzak · September 27, 2015, 3:03pm

You probably mean M * aj lines of code, right?

shuklas664 · September 27, 2015, 4:07pm

Shouldn’t it be P + d >= j instead of P + d <= j ?

kri1311 · September 27, 2015, 6:39pm

For the first subtask, where K =1 , won’t he have the coffee at the smallest value out of N numbers ? Because that would give the maximum amount of code he can write. Where i am wrong ?
Here is my solution for the first subtask. http://ideone.com/9SaW7S

achilles1993 · September 27, 2015, 6:51pm

@kri1311
It seems you have misunderstood the question…
Your code gives wrong ans for following test case

1

5 1 2 10

1 2 3 4 5

Output = 93

He will have coffee in 3rd period so ans=1+2+0+(4+5)*10

aka_007 · September 27, 2015, 7:09pm

I have 2 codes, which have a difference in position of recurrence call. The first one gets 15 points(https://www.codechef.com/viewsolution/8266369) and the second one gets 70 points (https://www.codechef.com/viewsolution/8266399).
The only difference between both of these can be found here: http://www.diff-online.com/view/5607f0be93e27.
I am unable to understand the reason for the same.

pkacprzak · September 27, 2015, 7:45pm

Of course, thanks for pointing it out.

prateekjjw001 · September 28, 2015, 2:06pm

In final solution, for finding f[i,j], we need to iterate over i’<i and find max value. But won’t that make the complexity O(NNK)? And could you explain how and why are we using a queue here?

ajay_bhammar · September 28, 2015, 11:07pm

plz help with this solution
i get wrong answer
https://www.codechef.com/viewsolution/8273065

rushilpaul · September 29, 2015, 8:53am

shouldn’t the complexity be O(N*K) ?

anupam_codecs · September 29, 2015, 10:22am

can you someone please explain how the queue is used ?

pkacprzak · September 30, 2015, 12:14pm

No, that’s not true. Consider the following example. We have 2 periods of time, in the first one we write 2 lines, in the second 1 line. Moreover, D = 1 and M = 10, K = 1 as you stated. If you decide to drink a cup of coffee in the second day, then you will write 2 lines of code. However, if you decide to do it in the first period, you will write 10 lines total.

sharad07 · September 30, 2015, 8:04pm

It would be really helpful if anyone could tell what f[i][j] represents…I have read the editorial but am still unable to get it…!!

mach_hack123 · September 30, 2015, 9:43pm

subtask 3’s solution will obviously give segmentation fault

sagarpatel · October 1, 2015, 10:25pm

f[i][j] -> represents the max Lines of code that the coder can write in the period 1(First) to i+D, given that the last coffee he drank was at ith period.