PROBLEM LINK:
DIFFICULTY
CHALLENGE
PREREQUISITES
Gaussian Elimination, Brute Force, Ad Hoc
PROBLEM
You are given a list of numbers. The bitwise-xor of all the numbers is 0.
You wish to split this list into as many lists as possible (suppose K), such that the bitwise-xor of the lists themselves is also 0.
Since this is a challenge problem, you want to maximize K as much as possible. K could also be 1, meaning you did not split the list at all.
EXPLANATION
The problem at its heart is a brute force search problem.
- Search for a subset of numbers whose bitwise-xor is 0
- Since the bitwise-xor of all the numbers is zero, this means that the remaining set that you did not select also has a 0 bitwise-xor
- You wish to select sets which are as small as possible
- You have to optimise
N/K
The question comes down to, is there any efficient way you can select a subset whose bitwise-xor is 0?
There indeed is. The meat of how to find subsets lies in doing a Gaussian Elimination in F2!
Suppose we have the binary representation of the N numbers as follows
A1 = b1,1 b1,2 … b1,M
A2 = b2,1 b2,2 … b2,M
…
AN = bN,1 bN,2 … bN,M
where each number has at most M bits.
Let there be N, binary variables
X = { x1, x2 …, xN}
Here, for some subset S of the N numbers, the value of xi is 1 iff Ai is in S.
Now, for the bitwise-xor of a subset of numbers has to be zero, the following must be true
x1b1,1 ⊕ x2b2,1 … ⊕ xNbN,1 = 0
x1b1,2 ⊕ x2b2,2 … ⊕ xNbN,2 = 0
…
x1b1,M ⊕ x2b2,M … ⊕ xNbN,M = 0
for the subset encoded via X.
We already know that for X = { 1: repeated N times}, the above is true.
There are N unknowns, but only M equations! (Sometimes this N might be as much as M or even smaller, so be careful!)
Thus there are several solutions of X possible.
- A solution can be called small, if there are few 1s in X
- You want to find several mutually exclusive small solutions of the above
- You can find a small solution
- then ignore the xi’s that are in the solution
- then repeat the above two steps again
This way, you will be able to find a split of the N strings.
The problem now is, how do you search for a small solution?
The answer to that lies in
X = set of x[i]'s that are not marked
proc: find_answer
1: Choose some i, for which x[i] is not yet marked (using X)
2: we assume i HAS to be selected in the subset and keep the bit values of i in our solution matrix for the system of equations
3: randomly consider O(M) variables from the list of un-marked variables (using X)
check if they give you a solution of the system of M equations with M variables in each, with the above solution matrix
if they do
then mark the subset and remove them from X
else
repeat from step 3
We can optimize the above by making our selections smarter.
The step that finds a solution of the system of equations, uses Gaussian Elimination in F2. This means that
- Add and Subtract operators are simply xor operators
- Multiply remains multiply
- Since all values are either 0 or 1, we do not need to do any division
- The equations are degenerate if we find an intermediate equation
- none of whose coefficients are 1
- the solution is supposed to be 1
- The equations may be linearly dependent, but that does not matter, since we could still have a solution (by ignoring the eliminated variable)
SETTERS SOLUTION
Can be found here
TESTERS SOLUTION
Can be found here