KCOMPRES - Editorial

acmonster · August 16, 2018, 1:44pm

I think that the solution is incorrect, since the cost of the K-compressed sequence is not monotonic in K. Here is a simple counter-example.

Consider sequence A = (2, 1, 2, 3, 4). One can verify that one of its 1-compressed sequences is (2, 1, 2, 3, 4) itself, which has a cost of 12. However, one of its 2-compressed sequence is (3, 1, 2, 2, 3), which, in fact, has a lower cost of 11.

acmonster · August 16, 2018, 1:50pm

Also note that in the K = 2 case, though we have A[1] = A[3] in the original sequence, we can achieve a smaller sum only if we choose to assign different values to B[1] and B[3]. This shows that the greedy algorithm mentioned in the solution is not optimal.

vijju123 · August 16, 2018, 6:13pm

Hey @acmonster , if we assign B_1 and B_3 different values, then doesn’t point 2 of the question get violated? It says that if there are X elements smaller than A_i in that range, then this must be same for B_i as well.

For your sequence, there is 1 element smaller than A_1 in range [1,3], but 2 elements smaller than B_1 in range [1,3] which I feel make your sequence invalid. Can you please have a look?

acmonster · August 16, 2018, 6:40pm

Hi @vijju123, the problem statement was updated and it now counts the number of numbers “smaller than or equal to” A[i] and B[i].

acmonster · August 16, 2018, 6:44pm

@vijju123
Moreover, under the “smaller than” version of the problem that you referred to, we can still construct a counter-example of A = (3, 4, 3, 2, 1), whose 1-compressed sequence is (1, 4, 3, 2, 1) (with sum 11), and whose 2-compressed sequence is (1, 3, 2, 2, 1) (with a smaller sum of 9).

vijju123 · August 16, 2018, 7:10pm

Oh! I missed the statement update. Sorry, and thanks for telling that :). I will forward your concern to @mgch to have a look.

byomkeshbakshy · August 16, 2018, 8:18pm

@acmonster I didn’t get what you are trying to say…we want to find the k compressed sequence with lowest sum so 1-compressed sequence with minimum sum for {2 1 2 3 4} is {1 1 1 1 1} with minimum sum of 5 while 2 compressed sequence with minimum sum is {3 1 2 2 3} with sum 11. Why are u comparing any random k compressed sequence?

likecs · August 16, 2018, 10:14pm

Thanks for letting me know this.

acmonster · August 17, 2018, 6:28am

Hi @byomkeshbakshy, you might want to check that the 1-compressed sequence with minimum sum for (2, 1, 2, 3, 4) is (2, 1, 2, 3, 4) itself, not (1, 1, 1, 1, 1). This shows that the minimum sum of K-compressed sequence is not necessarily non-decreasing in K.

byomkeshbakshy · August 17, 2018, 7:20am

@acmonster ohh sorry!.. {1 1 1 1 1} is 0 compressed sequence…yes you are right, then we cannot use binary search then.

ankush_953 · August 17, 2018, 11:27am

@likecs Please help me with this solution… What can be test cases It may be failing upon?
https://www.codechef.com/viewsolution/19759987

aryanc403 · August 17, 2018, 1:37pm

Why ?? I statement is update. But there is no announcement. Rule for problem setter should be made. I he wants to add a “.” also. Then he will be allowed to do so only after making an announcement on contest page.

Btw when this update was made ??

aryanc403 · August 17, 2018, 1:41pm

My comment on Magic Set July’18 which nobody bothered to reply -

“Test Case 1 - Subsequences of [1,2] are [1],[2],[1,2] sym of all these is 6. (Divisible by 3). So answer should be 1? Is answer 0 correct?”

“When was “each” added in problem statement? And why is no announcement made after this change?” .

I remember reading statement multiple times because I got intution that this is what statement is asking. But each was not present there.

buda · August 17, 2018, 10:21pm

FWIW, I also thought binary search would be useful here upon seeing the problem, but then wrote a brute-force solver and found counterexamples. One’s intuition can be misleading.

roll_no_1 · August 18, 2018, 12:09pm

In the editorial, it is written to check whether the ranges of the identical elements coincide with each other, but it does not give the criteria for coinciding ranges. Example: For a = [2, 3, 4, 3] and k = 1, the range of the 3 at index 2(assuming 1 based indexing) is [1, 3] and that for the 3 at index 4 is [3, 4]. But both of these can be mapped to different values. So, to check whether the ranges for 2 indices i and j (i < j) for a given value of k overlap or not, we need to check j - i <= k, and not i + k <= j - k.
In the example above, the optimal compression for k = 2 is [1, 2, 3, 1], while using the wrong condition it comes out to be [1, 2, 3, 2]. My solution with the wrong condition passed all but the last test file.

buda · August 18, 2018, 1:43pm

I doesn’t look to me like the described greedy solution works. I also had the intuition that something like that would work (e.g. “all equal values in A that are within each other’s range will be the same in B”) but I managed to disprove all of these intuitions that I had.

For example, for A = [1, 1, 2, 2] and K = 1, the setter’s solution computes B = [1, 1, 2, 2], but in fact, B = [2, 1, 1, 1] with a smaller sum of 5.

So I wonder if someone can explain a correct algorithm to compute the K-compressed sequence in O(nlgn) time.

ardamose123 · August 19, 2018, 6:50am

I came up with this kind of solution, too, but got some WAs for some unknown reason. Since I was using naive data structures in Haskell, I also got TLEs.

koyaaniqatsi · August 19, 2018, 6:55pm

@admin, @likecs, is anything going to happen here?

I do not see how this:

I guess that the problem (and test
data) actually requires that sequence
B should preserve all the relative
sizes for each index pair

can be derived from problem statement… Moreover, I bet, during AUG18 most people who submitted greedy algorithm and got AC - they didn’t question their own solution(they got AC, you know) and just moved on to next problems. It can’t be justification of incomplete problem statement.

I just picked and tested some of AC solutions from top 30 aug18 finishers, all they look wrong and fail on these ‘counterexamples’ - you can easily check it yourselves using following input:

2
4 6
3 2 1 1
6 12
4 3 1 2 1 2

and if you get following response from any solution:

1
2

it is wrong answer. I don’t think how this whole situation is ok.

Besides to @acmonster and @buda have already pointed to these very flaws in editorial solution. I as well developed greedy algorithm at first during the contest but on first WA(bug in implementation), after more thorough thinking, I discovered similar ‘counterexamples’ myself. So that essentially made problem harder obviously and I ‘postponed’ it(and it turned out I never returned back to it during the rest of contest).

IMHO, either this problem should be removed from every contestant’s score or entire aug18-long should be unrated. Because It feels like bad precedent.

Am I missing any specific rules for such cases?

PS: how many people doe usually verify problem statements and solutions? Looks like this was missed even by ‘tester’.

likecs · August 19, 2018, 7:37pm

@acmonster you correct that the current statement doesn’t have the proof for binary search. Even the method for finding k-compressed array will be wrong. The framing of the english statement based on my idea for the question couldn’t be framed correctly by me. As per your comments (which are posted in the editorial), the correct statement should have been:

“B should preserve all the relative sizes for each index pair (i, j) such that |i - j| <= K. In other words, if A[i] < A[j], we should have B[i] < B[j]; if A[i] = A[j], we should have B[i] = B[j], etc”

If this, the editorial is exactly in lines with what you said. I apologise if someone faced the similar situation during the contest.

likecs · August 19, 2018, 7:42pm

@koyaaniqatsi, you are correct. I have already mentioned in the editorial that the test case for small dataset was weak and the english statement was horrible. About 4 different people tested their solution during the preparation, but everyone got the same logic and everything looked ok then. Regarding question being removed or contest being unrated, please contact @admin or @mgch.