ANUMFS - Editorial

PROBLEM LINK:

Practice
Contest

Author: Anudeep Nekkanti
Tester: Sergey Kulik
Editorialist: Lalit Kundu

DIFFICULTY:

CHALLENGE

PROBLEM:

It is an interactive judge problem. You have a search area defined by n coordinate points that are present on the search area boundary. There is a point which you have to search for (that may or may not lie in the search area). You have m air-planes, which you can use. Each plane is defined by 3 variables (I, R, C). I is the unique id of the plane. R is the range it can transmit the special signal. C is the cost. You can use a plane to reach (x, y) which will cost 2*(x+y)*C, the judge will reply with “yes” or “no”, depending on if the search point is within the R manhattan distance of (x, y).

EXPLANATION:

All the coordinates on the boundary are given in the input. And it is said that the whole search area can be enclosed in a 1024*1024 square. So we can actually shift the positions and always have x,y <1024. Now take the grid [1024][1024] and mark all boundary points. Now if you do a dfs from exterior point and stop on reaching marked points. you will end up marking all exterior points. So the unmarked points are interior points of search area. We can try all interior points, if we get the plane, we can report. if not we can report that it is out side.

To get better scores we can divide the 1024*1024 grid into smaller grids depending on available plane radius R, then query each of them, if we get an yes, then we can restrict our search to that area.

In contest, @mugurelionut and @aawisong, were able to pin point the exact positions by making assertions, hence they had the perfect score. All other scores are negligible considered to the perfect score, so every else had a score of 0.000.

This is what @aawisong had to say about this:
I was very interested in the problem ANUMFS.
I think this problem was so funny and combined with real circumstance.
In solving it, first of all I had tried to reduce the number of sending operation.
So, I sent the planes in the following way.

I find the planes which could investigate about half number of the points of search area.
Among those planes I choose the plane with the minimum operation cost and send it.
As the result, the number of points in search area to be investigated decreased half.

In this way, I could find the position of the missing flight by sending about log (number of points in search area) planes.
I sent planes with the probability theory more efficiently.
But one day, I noticed an interesting fact in the result of my submissions.
It was that the test data were static and the position of the missed flight was constant for each test case.
Because the problem was to find the position of the missed flight, this fact was very important.
I thought I could find the position of the missed flight without sending planes.
However, after several submissions, I could know the position of missing flight for each case.

AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution
Tester’s solution

@darkshadows : Anything that can be solved with a perfect score is NOT A CHALLENGE PROBLEM . Either the approach of @mugurelionut and @aawisong violates the spirit of solving the challenge problem , by learning the data from previous submissions or problem is UNFIT . I think codechef admin’s should take a note of this and publish a clarification in this regard .

5 Likes

While doing dfs starting from an exterior point, consider this case:
(marked points are represented by commas and unmarked points by dots)

…,…

…,…,…

…,.,.,…

…,.,…,.,…

…,.,.,…

…,…,…,…

…,…

…

…

The boundary highlighted in black (same case a above):
alt text


If you start from an exterior point outside, you can never reach the points in the middle (which are also exterior points), and vice-versa.

How do we overcome cases like this. (where all the exterior points are not connected to each other by points)

Yes, this had been duly noted and a lot of discussion has already taken place among the whole moderator group of MAY14. Surely, there will be some major changes in future that’ll prevent such cases in challenge problems.

Why does the author’s solution gives wrong answer?

Link:-http://www.codechef.com/viewsolution/3906717

What me and @aawisong did is just a more extreme version of what has been done in most past Codechef contests - optimizing the solution for the official test cases. Ideally, a challenge problem shouldn’t allow the contestants to get the perfect answer, but this one did. So, although the problem was nice (and I also have solutions with nice and good search strategies), I would say that the problem in its current form was not fit as a challenge problem.

I like interactive problems and I would like to see more of them as challenge problems in the future. So here are just a few suggestions for avoiding what happened in this contest: 1) generate (part of) each test case randomly at each run (this has been done in the past) ; 2) make the output of each test case sufficiently large so that it cannot be found entirely with 500 submissions ; 3) make the judge answer interactive queries adaptively (e.g. in order to try to generate the worst case situation for each contestant on the fly)

@mugurelionut , @darkshadows :
I feel not running the submission on test cases for final score during the contest is the only real solution .
Otherwise as mugurelionut said : " What me and @aawisong did is just a more extreme version of what has been done in most past Codechef contests - optimizing the solution for the official test cases " .
This will remain a problem even if doesn’t lead contestants to an optimal/perfect solution .
I feel this NOT IN THE SPIRIT of the contest and am surprised that it was not considered UNFAIR MEANS .
I sincerely don’t wish to criticize any participant but I believe that the codechef community as a whole should evolve into the RIGHT DIRECTION , which is that the test cases when your solution gets live cannot be predicted . Only a certain amount of constraints can be assumed about them . This is the original suggestion I made several months back and it was diluted to accomodate cases where a correct solution during contest finally becomes TLE or Wrong Answer after the contest .
The workarounds suggested by @mugurelionut are interesting but it may not be possible to adapt every problem to them and may still leave room for such things .
The provisional test cases that run on the submission during the contest should be boundary cases , corner cases , difficult cases ( which could lead to TLE ) and final cases should be random cases not designed to break the solution in a certain way .

Once upon a while it may happen that someone’s correct solution originally finally becomes Wrong / TLE or runtime error , but given that there are 500 allowed submissions , the possibility of this happening to all the solutions of a contestant for the problem is low . And even if it happens it should be considered part of the game .

1 Like

i totally agree with your point of view. nothing more to say about this.

In principle, I also agree. But with the condition that the test case generation strategy is fully specified for each problem. There have been many situations in the past where very little information about how test cases were generated was provided. This is not acceptable when your solution is run on a totally different set of test cases. The most recent positive example is the problem LMATRIX2 from the FEB’14 long contest.

@mugurelionut & @aawisong : could you please provide the general method you used ?

As i understand from reading your successive submissions :

  1. you find first a “regular” way to solve the problem, to get AC, as a “reference code”
  2. you then guess the number of test cases (if not given in the statement)
  3. you hash the test case configuration to provide a one-on-one relationship
  4. for each test case, you use fake number of operations or fake memory allocations to leak the answer
  5. you build an array with the hard-coded answers
  6. given the test case hash (not necessarily in order), you provide the related answer

Am I right about this ?

It’s not a judgment here, just i’m genuinely curious about how you achieved that, and

noting the fact this process could probably be applied to all problems in a competition.

Not that i want to use it myself, but i’m still impressed by the efficiency of it.

Edit:

just thinking, with a little bit more work, you would even have been able to provide a TXT file with the answers.

you just had to retrieve the test cases order. not the hardest part when you know everything else.

If you read the test data generation plan. It is not possible to have such space.

Sorry for the mistake, I gave the new program to them, it will be updated soon.

I apologize about this. Though we did think about this situation during testing, We did not expect users to extract complete data. After we noticed this with 2 days left in the contest, we had many discussions regarding this. We could have changed/added test cases to fail such solutions but majority of the testing/setting panel voted against change in test data. So we had to let go this problem. However then we made efforts to make sure that this situation does not occur again. Please find the details here - http://discuss.codechef.com/questions/42912/issue-with-challenge-problems

I am writing this answer to respond to @cyberax’s post, because the length of a comment to any post is limited (and I wanted to write more than that).

In principle you are correct - that’s the general method. Before going into more details, let me say that the method could also be used for other problems in the contest (but with more difficulty) as long as the output per test case is small (e.g. a single number). Ideally you use this method once you already have a solution which gives AC, but you want to optimize it, as in the case of the challenge problem. For the other problems, once you have an AC solution, you’re done :slight_smile:

So, let’s assume for now that you have a solution which gives AC and is not too slow (i.e. not too close to the time limit) - if it’s very fast it’s even better. In order to find out the number of test case you just make your solution run for a fixed amount of time per test case (if it runs faster, just make it “wait” without doing anything until the fixed run-time). Then you see the sum of running times, you divide by your fixed run-time and you get the number of test cases. There were 20 test cases for this problem and I even mentioned that in one of my comments during the contest.

Next you need to come up with some conditions which uniquely identify each test case. I prefer to compute a 64-bit hash of the input, but anything works as long as you find something which is unique to each test case. Let’s talk about hashes, though. Then you need to find the hash for each test case - finding the exact value is difficult, because of the large range of values, but it is enough to find a range of values which uniquely identify each test case. What I did is to find 20 values: h(0), …, h(19), such that: there is 1 hash of some test case in [0,h(0)], one in [h(0)+1,h(1)], …, one in [h(18)+1,h(19)]. Obviously, you can choose h(19)=the max. possible value of the hash, so you actually need to find only 19 values. These values kind of have to be guessed. For instance, I choose a value h(0) and then I verify it. If there’s more than 1 hash value <= h(0), then I choose a smaller value for h(0), otherwise a larger one. How to verify how many test cases have a hash value <= a given value ? Well, similarly to how we found out the total number of test cases.

After this, we have conditions which uniquely identify each test case - note that the order of the hash values does not correspond to the actual order of the test cases on which the solution is run, but that doesn’t matter.

From now on we will consider each test case and try to optimize our solution independently for each. In the case of this problem the best “optimization” was to find the actual answer for each test case: two numbers between 0 and 1000. Let me first suggest something that I haven’t tried myself, but which might just work and might need only 2 submissions per test case. You first see how much memory your solution uses (the one reported by Codechef). Then, for the test case you are currently considering, if the answer is (-1,-1) then just generate a Runtime error. Otherwise allocate an extra x MBytes of memory. If no runtime error was generated then the memory that you see as being used minus the “base” memory should be the exact value of x :slight_smile: Then do the same for y.

What I did, after finding out that the answer is not (-1,-1), was to use 8 submissions per test case : 4 for finding x and 4 for finding y. I found out the base 6 representation of both numbers. Let me explain how to find (x%6) and you can figure out for yourself how to find the other numbers in the base 6 representation. My AC solution was very fast (about 0.1-0.2 sec per test case). So I made it “wait” for an extra (x%6) * 0.5 sec (the time limit was 3 sec). Then, by taking the difference between the running time and the “base” running time, I could easily find the value of (x%6). Note that I could have used a larger value for the base, as my solution was fast enough. It would have been very feasible to use base 12 (and 0.25 sec as the extra time factor instead of 0.5 sec) and this would have required only 6 submissions per test case instead of 8.

That’s basically what I did. However, it was only the 7-th day of the contest and I couldn’t just submit a solution which scored 0, because then everyone would have figured out what happened and there would be many more contestants doing the same thing. So what comes next is simply wait until close to the end of the contest to make the final submission with the minimum possible score, when there is no more time for anyone else to do the same thing (unless they already did it, like @aawisong).

P.S.: I liked the problem a lot and I spent quite some time into coming up with smart strategies for it (in case the test cases were going to be changed - I was prepared for this). It’s a bit sad that instead of discussing which strategy was the best to solve the problem, we are discussing methods of extracting the answers.

8 Likes

great explaination, and i mean it ! thanks a lot !

PS: i’ll probably add some thoughts about how i approached this problem.

feel free to do the same, because after all, you first solved this challenge just like us :slight_smile:

“P.S.: I liked the problem a lot and I spent quite some time into coming up with smart strategies for it (in case the test cases were going to be changed - I was prepared for this). It’s a bit sad that instead of discussing which strategy was the best to solve the problem, we are discussing methods of extracting the answers.”

Thank you. I will add another problem to the practice section in which the points are always different for each submission.