PROBLEM LINK:
Author: Praveen Dhinwa
Tester: Mugurel Ionut Andreica
Editorialist: Mugurel Ionut Andreica
DIFFICULTY:
CHALLENGE
PREREQUISITES:
Greedy, Randomized local search
PROBLEM:
You are given a computer network consisting of N nodes, the shortest path distances between them and a cost budget C. Then you are given M batches of B jobs each. Each job has its own processing power and you need to distribute the jobs as evenly as possible among the N nodes. Each job has an initial node, but you can move it to any other node along a shortest path (as long as the total cost budget C is not exceeded). After each batch, the maximum (MAX) and minimum (MIN) total processing power assigned to any node in the network are computed, and the score for this batch is the difference MAX - MIN. The total score for a test file is the sum of the scores of all batches. Your goal is to minimize the total score.
EXPLANATION:
First, we should get something out of the way: you can pretty much ignore the cost budget C. Although the individual costs between pairs of nodes can be quite large, the shortest paths end up using mostly small-cost edges, which allows you to essentially assign any job to any node without worrying about exceeding the cost budget C (this was an oversight on our side, but by the time we realized it, it was too late to change anything).
A good approach here is to try to minimize the score after each batch (although this does not necessarily lead to the minimum total score). A simple and efficient choice is the well-known “Longest Processing Time First” greedy heuristic: consider the jobs in decreasing order of their processing powers and, when considering job i, assign it to the node with the minimum total processing power used so far. This approach can be implemented very easily in O(B log(B) + N log(N)) per batch and leads to a score of approximately 0.994 of the top score.
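Here is a minimal sketch of this greedy step for one batch (this is not the setter's or tester's actual code; the names `jobs`, `load`, `B`, `N` are illustrative). The least-loaded node is taken from a min-heap, which gives the complexity stated above.

```cpp
#include <bits/stdc++.h>
using namespace std;

// jobs[i] = processing power of job i in the current batch,
// load[v] = total processing power already assigned to node v.
// Returns, for every job, the node it was assigned to.
vector<int> lptGreedy(const vector<long long>& jobs, vector<long long>& load) {
    int B = jobs.size(), N = load.size();

    // Consider the jobs in decreasing order of processing power.
    vector<int> order(B);
    iota(order.begin(), order.end(), 0);
    sort(order.begin(), order.end(),
         [&](int a, int b) { return jobs[a] > jobs[b]; });

    // Min-heap of (current load, node): always pick the least loaded node.
    priority_queue<pair<long long,int>,
                   vector<pair<long long,int>>,
                   greater<>> pq;
    for (int v = 0; v < N; v++) pq.push({load[v], v});

    vector<int> assignedNode(B);
    for (int idx : order) {
        auto [l, v] = pq.top(); pq.pop();
        assignedNode[idx] = v;
        load[v] = l + jobs[idx];
        pq.push({load[v], v});
    }
    return assignedNode;
}
```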
Let’s now try to improve the solution generated by the initial Greedy algorithm, keeping in mind that we’d like to optimize the assignment of jobs not necessarily to improve the score of the current batch, but to improve the scores of future batches. First, subtract the minimum processing power of any node before the current batch from the processing power of every node (we essentially rescale the range of values to start from 0 instead of from the current minimum). Let the rescaled processing powers of the nodes be p(1), …, p(N) (before assigning jobs from the current batch to them); thus, at least one of the p(i)'s is zero. We now assign the jobs from the current batch to the nodes as determined by the Greedy algorithm presented above. Let MIN and MAX be the minimum and maximum values among the p(i)'s after this assignment (i.e. the values defining the score for the current batch). We will now try to perform a series of changes to the assignment in such a way that the range [MIN,MAX] is not increased (it may potentially be decreased, though we will not explicitly aim for this). Instead, our goal will be to find an assignment which maximizes the value S = p(1)^q + p(2)^q + … + p(N)^q, where q is an exponent we choose (you can start with q=2, but I obtained the best results for q=1.25).
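A small sketch of the rescaling step and of the metric S, under the same illustrative naming as above (`load` for the node processing powers, `q` for the exponent):

```cpp
#include <bits/stdc++.h>
using namespace std;

const double q = 1.25;  // exponent mentioned above; q = 2 also works

// Rescale the node loads so that the minimum becomes 0.
void rescale(vector<long long>& load) {
    long long mn = *min_element(load.begin(), load.end());
    for (long long &l : load) l -= mn;
}

// S for a given (rescaled) assignment. During the local search it is cheaper
// to update S incrementally: when the load of a node changes from oldP to
// newP, do S += pow(newP, q) - pow(oldP, q).
double metricS(const vector<long long>& load) {
    double S = 0;
    for (long long p : load) S += pow((double)p, q);
    return S;
}
```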
In order to reach such an assignment we perform a number of iterations (either a fixed number or as long as time permits). At each iteration we randomly pick two jobs x and y and check whether we can swap them in the assignment. Let assigned(x) and assigned(y) be the nodes to which jobs x and y are assigned; swapping x and y simply means swapping these nodes. We are allowed to do this only if the new processing powers of the two nodes are still in the range [MIN,MAX]. If they are, we perform the swap and recompute the value S for the new assignment (S can be updated in O(1) time). If the new S is larger than any value of S seen so far, we keep the new assignment as the best assignment for the current batch.
We can also try another kind of move: randomly pick a job x and a node y and try to move x from its current node to y (only if, after the operation, both the old node and the new node have a total processing power in the range [MIN,MAX]). A sketch of both kinds of moves is given below.
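The following is a sketch of the randomized local search combining the two move types from the previous paragraphs (job-job swap and job-to-node relocation). The struct layout and names are illustrative, not taken from the tester's solution; the caller keeps a copy of the assignment with the largest S seen so far.

```cpp
#include <bits/stdc++.h>
using namespace std;

struct LocalSearch {
    int N, B;
    double q = 1.25;
    vector<long long> jobs;       // processing powers of the current batch
    vector<int> assignedNode;     // assignedNode[x] = node of job x
    vector<long long> load;       // rescaled load of every node
    long long MIN, MAX;           // range fixed by the greedy assignment
    double S;                     // current value of the metric
    mt19937 rng{12345};

    double powq(long long p) { return pow((double)p, q); }

    // Swap the nodes of two random jobs, keeping all loads in [MIN, MAX].
    void trySwap() {
        int x = rng() % B, y = rng() % B;
        int a = assignedNode[x], b = assignedNode[y];
        if (a == b) return;
        long long na = load[a] - jobs[x] + jobs[y];
        long long nb = load[b] - jobs[y] + jobs[x];
        if (na < MIN || na > MAX || nb < MIN || nb > MAX) return;
        S += powq(na) - powq(load[a]) + powq(nb) - powq(load[b]);  // O(1) update
        load[a] = na; load[b] = nb;
        swap(assignedNode[x], assignedNode[y]);
    }

    // Move one random job to a random node, with the same range check.
    void tryMove() {
        int x = rng() % B, b = rng() % N;
        int a = assignedNode[x];
        if (a == b) return;
        long long na = load[a] - jobs[x];
        long long nb = load[b] + jobs[x];
        if (na < MIN || na > MAX || nb < MIN || nb > MAX) return;
        S += powq(na) - powq(load[a]) + powq(nb) - powq(load[b]);
        load[a] = na; load[b] = nb;
        assignedNode[x] = b;
    }
};
```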
This approach can lead to an absolute score of 2996128, which is slightly better than the 2nd best score obtained during the contest and not that far from the top absolute score obtained during the contest, 2995637.
So, to summarize, a very good and easy to implement approach is to start with the “Longest Processing Time First” greedy heuristic and then perform a random walk in the space of assignments which keep the same [MIN,MAX] range (or maybe reduce it), in order to optimize the metric S = p(1)^q + … + p(N)^q, where q is greater than 1 (e.g. 1.25).
Why does this metric make sense? An exponent q larger than 1 prefers assignments with more values close to MAX. Since the total sum of processing powers is constant, the remaining values will generally be smaller and easier to fill in the next batch (while the larger values may remain untouched, or mostly receive jobs with small processing power). For example, if MIN=1 and MAX=10, then with q=2 the node loads (10, 5, 1) give S=126 while (8, 5, 3) give S=98, so the metric prefers the first configuration, which keeps one node almost empty and ready to absorb a large job in the next batch. Thus, this metric tries to “prepare for the future” instead of only “thinking about the present” (as the initial Greedy heuristic does). Beyond this empirical observation, I have no proof that this is a good approach, other than the score it actually obtains.
In order to make this approach (or any other approach) obtain a slightly better score you can try over-fitting it to the test data (see, for instance, @ceilks’ winning solution).
Please discuss other approaches that you used in the answers/comments to this editorial!
SETTER'S AND TESTER'S SOLUTIONS:
Setter’s solution can be found here
Tester’s solution can be found here (it implements the approach described in this editorial).