LECOINS - Editorial

gamabunta · March 13, 2013, 5:15am

PROBLEM LINKS

DIFFICULTY

HARD

PREREQUISITES

Simple Math, Dynamic Programming, Shortest Path Problem

PROBLEM

You are given N types of coins. The i^th type of coin has the value V_i and the color C_i. You have an infinite number of each type of coin. You have to answer several queries of the following type

Among all the ways of choosing coins whose values sum up to exactly S
How would you choose coins such that the number of distinct colors among the chosen coins is maximum

You just have to report the number of distinct colors in such a selection.

QUICK EXPLANATION

Let us look at a much easier problem and work out a solution. The solution to that problem would give several insights into how to solve this problem.

Ignore the requirement to optimize the number of colors used.

Now, the problem reduces to finding whether it is possible to choose coins whose values sum up to exactly S, or not. This can be done through use of some clever math, as below

Let V = { V₁, V₂ …, V_n } be the values of the n coins.
Without loss of generality, we can assume that V₁ is the smallest value in V.

Now, consider a table

DP( 0 … V₁-1 )
DP(i) = smallest sum that can be built from values V - { V₁ } such that
- DP(i) % V₁ = i

The benefit of such a table is that to test a target value S, we can

Given d = DP(S % V₁)
if S < d
- the smallest value that could be constructed from all other values is larger than the given value
- hence, S cannot be constructed
if S = d
- S can be constructed using all the other values
if S > d
- S can be constructed using all the other values
- plus, V₁ as well (to meet the remaining deficit)

Note that such a selection only tests for possibility, but not optimality.

The quesiton arises, how do you construct the table DP?

EXPLANATION

Let us iterate over all the values v in V - { V₁ }

For each x, which is a residue modulo V₁, we can update DP(x + v) by DP(x) + v, if it is smaller. This looks eerily similar to the Shortest Path problem.

We can think of the residues as graph vertices
and |V - { V₁ }| * V₁ edges, that depict using a value v to jump from x to x+v.

Since we need the shortest distances to all vertices from 0, one way we can do this is by using Dijkstra’s shortest path algorithm. But, we will look at the Bellman Ford shortest path algorithm.

Let M = V₁

repeat M times
    for all v in V - { M }
        for x = 0 to M-1
            relax( DP( (x+v) % M ), DP(x) + v )

But this is very inefficient! That is because, we assume no guarantees of the order in which we are encountering edges. If we could access all the edges in simple paths / cycles, we would have been able to trigger the relaxations from the right vertices and along the paths.

For this problem we are going to essentially do what the Bellman Ford algorithm does in the manner of relaxing over the edges, but we will do this in a very optimal order of edge selections; due to which, we are able to accomplish the same result the naive algorithm would in much lower complexity.

Let us look at the interesting trick to target the best complexity class for our solution. We know that we need to call

relax( DP( (x+v) % V₁ ), DP(x) + v )

for all x between 0 and V₁-1 and for all v in V - { M }

Firstly, we can consider all the edges for each v separately and never consider it again. This is due to the property of this graph that

if there is a path from 0 to x { v₁, v₂ …, v_k }
then there is also a path from 0 to k { v_P(1), v_P(2) …, v_P(k) }
where P is a permutation of { 1, 2, …, k }.

Thus, we consider v in some order and process all edges for each v in bulk.

There is an interesting property in the relaxations we want to perform for some v. The edges are arranged in several disjoint cycles. In this case, we can consider each cycle independently and start with the vertex that has the shortest distance till now. From this vertex, we update the distances to all the vertices, onwards in the cycle. This way, we never have to visit those edges again!

Let us elaborate

From some x, we have to consider the edges
- (x+v, x)
from x+v, we have to consider the edge
- (x+2v, x+v)
from x+2v, we have to consider the edge
- (x+3v, x+v)
and so on till we get (x+M-v, x)

From elementary number theory we know that (x + kv) % M goes through as many as M / gcd(M,v) residues. Thus, we can establish that we need to consider only those gcd(M,v) cycles that contain the first gcd(M,v) items respectively. Thus,

Let h = gcd(M,v)
for i = 0 to h-1
    Let S = {}
    for j = 0 to M/h
        S union ( (i + v*j) % M, (i + v*(j+1)) % M )
        /* notation: ( tail, head ) */
    Find the vertex y with shortest distance from 0
        from the set of vertices in the cycle S
    consider edges e in cycle S, starting from vertex y
        relax( DP(e.head), DP(e.tail) + v )

The above step is of complexity O(M). Thus the algorithm is of the complexity O(M*N).

Accounting for colors

First, realize that no matter what range for C, we only ever worry about N unique colors. Now, we can maintain N DP arrays similar to how we describe above.

DP(k,i) = the smallest value x, such that x % V₁ = i and the maximal number of colors used in the representation of x is k.

We can relax DP( k+1, (i+v)%M ) with DP(k,i) when using v, which is of a new color. To take care of not under-counting the colors we can construct the new DP in decreasing order of k.

for k = maxK-1 to 0
    for each v in V
        for x = 0 to M-1
            relax( DP(k+1, (x+v)%M ), DP(k,x) + v )

call procedure described previously to
    relax all the values within the same k using V

Now the answer for a query S is the largest K such that DP(K, S % M) ≤ S.

The overall complexity of this approach is O(N*N*M + Q*N).

SETTER’S SOLUTION

Can be found here.

TESTER’S SOLUTION

Can be found here.

mukel · March 13, 2013, 6:41pm

I think that test cases were too weak, muy solution first gave TLE, and after a small doze of randomization it becomes the fastest, how is this possible???

anton_lunyov · March 13, 2013, 6:59pm

We were in hurry when prepare the problem.
So test data could be weak.
But your solution is fastest only because sum of execution times is displayed.
Your randomization and heuristics seems work well on some of test data.
While maximal running time is 1.46.

If you know test where your solution works slow I would like to see it.
Since I have no idea how to create strong test data here.

mukel · March 14, 2013, 4:24am

My algorithm is at least O(N * N * min(V1, … Vn)), but may have a big constant factor. All comes from the fact that I use a BFS instead of simple ‘relaxation’ approach as in the official solution, the closest reference I know is http://wcipeg.com/wiki/Shortest_Path_Faster_Algorithm
A really worst case for my program is 30 coins all with different colors, and different values, all close to 200000, queries are not so important. In my machine such a case takes 5-7 seconds (Core 2 Duo 2.4 Ghz). My last AC submission is exactly the same as the previous (which gives TLE) but shuffling the order of coins (groups of coins with the same color).
The complexity of BFS for finding shortest paths has always be a mystery for me, I only can proove a O(V * E) upper bound, but if we shuffle the order of the edges, the BFS becomes really fast. My implementation of min cost max flow, contains a BFS, instead a Bellman-Ford for finding shortest augmentig path, and faster is by far. Does anyone know a more tight upper bound for BFS for shortest paths (when some randomness is added)???

anton_lunyov · March 14, 2013, 4:58am

@mukel

We have 3 tests where N=30 and all colors are different:

Vi are 30 largest values
Vi are 30 largest primes
Tricky test where gcd of any four Vi is > 1, but gcd of all Vi is 1

I see now that we should pay more attention to such tests by adding at least several random tests.