In my approach step 1-3 are exactly the same as described in the editorial. For the last step, I was trying to solve ILP using LP and branch and bound, but my implementation was too slow and probably buggy. I finally used a rule-based method, which seems good, but I don’t have any formal proof for its correctness.
Let us say that P = 10, and after all reductions we ended up with a bunch of 1s, 2s, 3s and 4s which we want to partition into maximum number of subsets such that each subset has a zero sum modulo P. There are at most 30 “prime” subsets with zero sum (a prime subset is the one which has no proper subset of zero sum). The idea is to partition the 1s, 2s, 3s and 4s into these subsets randomly, and then optimize the partition using a set of rules. A typical rule could be like this:
(1, 3, 4, 4, 4, 4) + (1, 2, 2, 2, 3) --> (2, 4, 4) + (1, 2, 3, 4) + (1, 2, 3, 4)
This means that if in our partition, if we are using both (1, 3, 4, 4, 4, 4) and (1, 2, 2, 2, 3), then we can replace them by the three subsets: (2, 4, 4), and 2 copies of (1, 2, 3, 4), which will gain us an extra subset in partition. A rule is a “prime” rule, if it cannot be split into two rules. For example, the following is not a prime rule
(1, 3, 4, 4, 4, 4) + (1, 2, 2, 2, 3) + (2, 3, 3, 3, 3, 3, 3) + (2, 2, 2, 2, 2) --> (2, 4, 4) + 2 (1, 2, 3, 4) + 3 (2, 2, 3, 3)
as it can be split into
(1, 3, 4, 4, 4, 4) + (1, 2, 2, 2, 3) --> (2, 4, 4) + (1, 2, 3, 4) + (1, 2, 3, 4), and
(2, 3, 3, 3, 3, 3, 3) + (2, 2, 2, 2, 2) --> 3 (2, 2, 3, 3)
We find all “prime” rules (there are approximately 150 of them for P = 10 and 1, 2, 3, 4), and apply them on the partition whenever applicable. The reduction system is confluent, but not persistent, so we cannot guarantee the existence of a unique normal form. However, since the reduction system contains all prime rules, we can prove that all normal forms will have the same size, i.e., optimal with respect to partition size.
So now, the question is how to find the set of prime rules. First we compute optimal partition of sets where number of 1s, 2s, 3s and 4s are bounded by some constant (say M). This is basically bounded knapsack problem can can be solved using dynamic programming in O (kM4) time, where k is the number of “prime” subsets. We consider each of these optimal partitions, and try to extend them by adding an extra prime subset into it, if the new partition is not an optimal one, then we get a new rule. For example, the following is an optimal partition of (1, 1, 1, 3, 4, 4, 4, 4, 4, 4).
(4, 4, 4, 4, 4) + (1, 1, 1, 3, 4)
Now, if add a prime subset (1, 1, 1, 3, 4) into it, it longer remains an optimal partition, as the resulting set (1, 1, 1, 1, 1, 1, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4) can be partitioned into 4 subsets unlike the 3 subsets achieved by above extension. This gives us the following rule:
(4, 4, 4, 4, 4) + 2 (1, 1, 1, 3, 4) --> 3(1, 1, 4, 4) + (3, 3, 4)
Ideally, one would expect that if we increase the value of M, we will get more and more rules. However, surprisingly
the set of rules does not grow if we increase M beyond 12. This is the part for which I have NO explanation. I tried increasing M up to 50 and it did not find anything new, so I was pretty much convinced that my set of rules is complete. If anybody can prove a decent bound on the value of M, I would be very happy to discuss, as it seems like a nice way to handle semi-bounded knapsack problems (capacity of knapsacks is bounded, but number of allowed copies is not).
I first tried to pre-compute all rules, and put them in the file, but the 50000 bytes limit did not allow me to do that. Unfortunately, the computation of all optimal partitions of bounded sets was very time-consuming. In the end, I found that even if I consider only the lexicographically smallest optimal partition of sets to generate the set of rules, we can get almost all the rules (except a few, which I included in my source file).