[JAN'15] Top contestants (msm1993 and kutengine) cheat again

mugurelionut · January 13, 2015, 5:40am

After the obvious cheating case from DEC’14 (you can read here about it), ~~which however was not handled in any way by Codechef,~~ the users @msm1993 and @kutengine cheated again - this time in a very elaborate manner. At first sight it seems that their solutions are very different - and on the surface they are. But the core idea (and even the data encoded in their source code) is exactly the same.

This is @msm1993’s top submission: http://www.codechef.com/viewsolution/5874709

This is @kutengine’s top submission: http://www.codechef.com/viewsolution/5875762

If one just looks at them they seem very different (and indeed I thought the same thing, too, at first). However, they are in fact almost the same solution at the core - but implemented in two different ways.

@msm1993 has a function called void init_permutation(), which does exactly the same thing as @kutengine’s void init_swap_positions() function from the Permutation namespace. Note that even the number of generated permutations is the same (8000000 in both cases) and they both use the value 6666 when deciding which permutations to store (in @msm1993’s solution it is #define MOD 6666, while in @kutengine’s solution it is const int MDN = 1111 * 6; - as if it were hard to do the math ).

The differences do not stop here. Both @msm1993 and @kutengine have a bunch of data encoded in their solutions, which is data tuned to the official test cases. Although encoded differently, this data is the same.

I will start with the data in @kutengine’s solution. Let’s take the array char pps[6][10005]. The first string in this array starts with “echzdfzdhhzejz”. If we look at the function void init() { from the namespace ZipInt, which decodes this data, we see that ‘z’ is a delimiter and that the other letters are digits (‘a’=0, ‘b’=1, ‘c’=2, …). Thus, this array contains in fact numbers. If we decode the numbers from the prefix I extracted we get: “ech”=427, “df”=35, “dhh”=377, “ej”=49.

@msm1993 encodes the same data in a much more elaborate way, in the arrays int p[][ML], int f[][ML] and int e[][ML]. These arrays are decoded within the class called FastRandomGenerator. By looking at the class we see first that it selects prime numbers in the int gen[ML] array, by using a sieve-based algorithm. It gets gen[0]=2, gen[1]=3, gen[2]=5, gen[3]=7, gen[4]=11, gen[5]=13, gen[6]=17, etc. The gen array is used in the next() function of the FastRandomGenerator class, which decodes the p, f and e arrays. By looking at the code we see that p says the number of elements of f to consider. f contains indices in the gen array (i.e. indices of prime numbers) and e contains the exponent of these prime numbers.

Here’s a prefix of the p, f and e arrays from @msm1993’s solution:

int p[][ML] = { {3, 1, 1, 2
int f[][ML] = { {1, 4, 5, 11, 74, 1, 6
int e[][ML] = { {1, 1, 1, 1, 1, 1, 1

The first number returned by the first call of the next() function considers the first 3 numbers of the array f: gen[1] x gen[4] x gen[5] = 3 x 11 x 13 = 429. Note that the function subtracts 2 from this multiplication, so the first returned number is 427 (the same as in @kutengine’s solution).

The second number returned by next() is simply gen[11]-2. You can check here that gen[11]=37 and, thus, the function returns 35 (the same as @kutengine’s 2nd number).

The 3rd number returned by next() is gen[74]-2. Use the same link to see that gen[74]=379 and, thus, the returned number is 377 (the same as @kutengine’s 3rd number).

The 4th number returned by next() considers the following 2 values from the array f (because the corresponding value from the p array is 2). Thus, it returns gen[1] x gen[6] - 2 = 3x17-2 = 49 (the same as @kutengine’s 4th number).

I will not continue with the other numbers because there are too many of them, but I hope that the pattern is clear.

@kutengine and @msm1993 implemented basically the same core solution which contains the same data tuned on the official test cases. However, in order to deceive anyone who would look at both their source codes, they implemented things differently and encoded the data in different formats, in order to seem that they have unrelated solutions.

For me the annoying part here is that what they did is essentially equivalent to using 2 accounts in order to be able to make twice as many submissions to the challenge problem (and the more submissions one makes, the more data one gets and, thus, one’s solution is tuned much more to the official test cases). Then they used the data obtained by making submissions to the challenge problem from 2 accounts. I can say that if I had been able to make 1000 submissions to the challenge problem (instead of only 500), I could have tuned my solution much more and I could have obtained a significantly better score than the one I did with only 500 submissions (this assuming I would have had the time to make 1000 submissions).

I will also send an email to [email protected] to report this case, but I am very skeptical that they will do anything. Since they did nothing to punish @msm1993 and @kutengine in DEC’14 long contest, where their source codes were almost identical, I don’t think they will do anything now, when @msm1993 and @kutengine tried very hard to hide the fact that they submitted essentially identical solutions.

I wrote an email about this to [email protected] and I hope they will take action against @msm1993 and @kutengine, the same way they did for DEC’14.

Edit: I looked at @msm1993’s and @kutengine’s submissions for the other 9 problems of JAN’15 long contest and they seem to follow the same pattern, i.e. they are the same idea implemented in slightly different ways. The most obvious to me are their submissions for the XRQRS problem - the submissions have similar running times and the code is obviously modified in order to seem different, despite being essentially the same.

This is @msm1993’s submission for XRQRS: http://www.codechef.com/viewsolution/5856986
And this is @kutengine’s submission for XRQRS: http://www.codechef.com/viewsolution/5827335

The fast I/O function was renamed and slightly changed and the query functions were slightly modified (in one case they are numbered from 0 to 4 and in the other case they are numbered 1 to 5), but if one pays attention, it is obvious that it’s the exact same solution modified in order to “seem” different.

Edit 2: The Codechef team told me (upon asking them) that @msm1993 and @kutengine replied to them and, upon examining their case better, they decided that this was not cheating. Thus, no action was taken against them. Although I believe the conclusion to be wrong, there is nothing more that I can do.

ketanhwr · January 13, 2015, 10:33am

I think that banning them for some months might solve the thing. These guys have been cheating since 2-3 months and winning contests by getting an unfair advantage. rns4 was removed from December '14 Long Challenge rankings but msm1993 was ignored still. This time, I really think that they should be banned.

dev8546 · January 13, 2015, 11:29am

I also agree that codechef Admin should be very strict this time as it is not very encouraging for top coders like mugurelionut who play very fare and then see somebody wins with not appropriate approach. It is great work by mugurelionut who spends so much time to find the people who are cheating . Kudos to mugurelionut

betlista · January 13, 2015, 3:08pm

That’s not a bad idea, in theory I can ban the user, but I’m not sure if such user is banned only in forum or also for a contest, someone wants to try (I can ban for a limited amount of time)?

betlista · January 13, 2015, 3:12pm

I really appreciate your effort @mugurelionut!

I reported during the contest, that user yowa (I’m not telling it is the only one) is suspicious, based on your previous post about the cheating - it seems to me that this is I-need-more-submissions problem. I saw it when I was retrieving statistics…

If someone wants to investigate, feel free and raise a hand, my tool can retrieve all users with only challenge problem submission quite easily

mugurelionut · January 13, 2015, 9:12pm

@betlista: What did you find suspicious about @yowa’s submissions?

sidgupta234 · January 13, 2015, 9:35pm

He had submissions only for the Challenge problem, so it might be used by people who required more submissions to optimize their solution.

betlista · January 13, 2015, 9:54pm

When I saw it, it was during the contest, so I was not able to see the code. After the contest I didn’t investigate more, but it is as @sidgupta234 described above, I’m not telling definitely he is guilty, maybe just a big fan of challenge problems, he/she always solve only challenge problem (as I can see from his profile)…

subhendu_sethi · January 13, 2015, 10:54pm

@msm1993 and @kutengine are top notch programmers. People like them should not indulge in such activities.
Correction: Need not indulge in such activities. Banning then for few contests seems the right choice if they have cheated.

neo1tech9_7 · January 14, 2015, 5:05am

Some people do only the hard problems or the challenge problem in the long challenge.

mugurelionut · January 14, 2015, 5:19am

@subhendu_sethi: I agree with you that @msm1993 and @kutengine are good programmers, but they seem to lack any kind of ethics. They are willing to do anything in order to get high ranks in the contest - and that includes cheating (repeatedly). It’s sad that Codechef admins allow such behavior to take place on their platform without doing anything about it.

mogers · January 14, 2015, 5:27am

Right, one should be careful. For example, http://www.codechef.com/users/gennady.korotkevich only played with the challenge problem in DEC14.

kuruma · January 14, 2015, 6:10am

Hi,

I have forwarded this to admins… Hopefully they will see it and take appropriate measures this time…

It’s unfair that you, @mugurelionut which seem to enjoy working hard for this problem specifically and from which submissions I can learn a lot, sees himself being undermined by teams or by people which lack ethics enough to prefer cheating over an honest contest…

I hope they will do anything this time

Bruno

kuruma · January 14, 2015, 6:17am

Im not sure if I can do it myself… not sure if this is about karma or about those stars in front of your name, but, if I could I would!! this is unfair for @mugurelionut which I actually greatly admire for his awesome submissions and way of explaining things… There are contests possibly better suited for teams or for someone with dubious ethical standards…

ketanhwr · January 14, 2015, 9:10am

I think the admin should take a strict action this time.

drathi5 · January 14, 2015, 2:00pm

Hello,
I also found some suspicious accounts.
http://www.codechef.com/users/random11111,
http://www.codechef.com/users/yowa,
http://www.codechef.com/users/mrcoder.
These guys only solve tough questions of long challenge.

admin · January 14, 2015, 4:04pm

Let us apologise for this delayed response. The reason we could not act upon this earlier was mainly due to the ICPC onsite regional and the entire team here being busy with that. But that is no excuse. Also we had written to the participants in question and were waiting for them to respond.

We have verified and agree to every single point that mugurelionut has mentioned. And we thank mugurelionut for all the pains taken to do this. As an immediate step we are disqualifying their solutions form these contests and reducing their rating points (a standard procedure that we apply in any case of violation of our Code Of Conduct). We will also send them an email warning them about the same.

We are also considering reducing the submission limit to 200 per user. We will need a broader discussion on the same. Creating multiple accounts is against our Code of Conduct and if someone is doing so to make multiple submissions, we will want to know. It is not always possible to catch this in MOSS results, but we would act on every such reported case.

dzlla · January 14, 2015, 7:41pm

mugurelionut’s words is unfair for @msm1993 and @kutengine. perhaps they have studied together with the same teachers since high school (both of them took part in IMO 2011). You shouldn’t accuse them of cheating, just because their ideas are similar to each other.

neo1tech9_7 · January 14, 2015, 9:51pm

For starters it is quite a known fact that Koreans compete in teams. Secondly in the previous competition they had basically the same solution for the challenge problem. No matter how closely you think you can’t have solution with that kind of similarity unless you are cheating. Thirdly if we assume that they had similar ideas in previous challenge then how did they have wildly different codes for same ideas in this contest unless they are trying to hide something ?

code_streak · January 14, 2015, 10:38pm

I am a fan of @mugurelionut… just this… hats off to your efforts…a true coder…!!