PROBLEM LINK:
Author: David Stolp
Tester: Hiroto Sekido
Editorialist: Anton Lunyov
DIFFICULTY:
CHALLENGE
PREREQUISITES:
PROBLEM:
The best described in the problem statement.
EXPLANATION:
The simplest solution is to output all servers. It will bring you only 13.249427 points, but you can even don’t read distribution of chunks by servers
The second simplest solution that still allows you to not read distribution of chunks by servers is to output first S − N + 1 servers. Thus, we left only N − 1 servers. It is clear that it is impossible to recover all N chunks having only N − 1 their combinations. Such solution brings you exactly 10 points and actually scoring function provides a hint to such solution
The third simplest solution is to iterate over all chunks, find the one that belongs to the least number of servers and output servers of this chunk. This will bring a better score 6.378516. This works because after deleting these servers, the chosen chunk is not contained at any other server, so we can’t recover it.
Now we consider the idea that allows to reach the high score.
The input can be represented as S × N system of linear equations over the field Z2. The variables are chunks and the right hand side is composed of servers. If A is the matrix of this system, then A[i][j] = 1 if and only if chunk j is contained on server i (1 ≤ i ≤ S, 1 ≤ j ≤ N).
Claim. The chunks could be always uniquely recovered if and only if all N columns of this matrix are linearly independent over Z2.
Indeed, if columns are linearly independent, then using facts from basic linear algebra we can left only N rows of this matrix such that obtained N × N matrix is non-degenerate. Hence we can recover chunks by servers using inverse matrix. Now assume that columns are linearly dependent and their rank is K < N. Then we can left exactly K linearly independent rows in the system. It is well-known that such system has not-unique solution for every possible vector in right-hand-side.
Thus, we need to delete as few rows as possible such that some column becomes a linear combination of other columns modulo 2. This, in turn, equivalent to performing series of elementary column operations such that some column becomes zero (in the matrix with deleted rows). Elementary column operation is simply a bitwise xor of columns if we consider them as binary numbers.
Instead of deleting some rows and then trying to reach zero column, we can do at first some elementary column operations and then for every column the set of rows we should delete to make this column zero is simply all rows having 1 in this column.
To achieve better performance we could represent columns as collection of 32bit unsigned variables and do bitwise xor 32 times faster (see bitset data structure in author’s solution).
Now the simplest we can do is to perform random bitwise xors while time permits, check whether the column we change have less ones than the answer and update the answer if necessary. To count the number of ones at GNU C++ we can use fast built-in function __builtin_popcount
. Otherwise we can precalculate popcount for numbers up to 216 and calculate popcount of 32bit number as a sum of popcount for its lower 16 bits and popcount for its higher 16 bits. Such solution could bring about 4.89 points.
Another method to use this idea is the following.
At each step we select random 1 in the matrix. Let it be in the intersection of the i-th row and the j-th column. Then we do the step like in Gaussian elimination. For each column k other than j, if the i-th element in the k-th column is 1 we replace k-th column by its bitwise xor with the j-th column. Also we check whether the updated k-th column has less ones than the answer and update the answer if necessary. We do these steps while time permits. Such solution could bring you about 3.75 points (note that the winner score is about 3.60).
AUTHOR’S AND TESTER’S SOLUTIONS:
Author’s solution can be found here but it get WA
The corrected version is here (the score of this submissions is 3.745888).
Tester’s solution can be found here.