TOURBUS - Editorial

shangjingbo · January 13, 2014, 3:15pm

PROBLEM LINK:

Practice
Contest

Author: David Stolp
Tester: Gerald Agapov
Editorialist: Jingbo Shang

DIFFICULTY:

Challenge

PREREQUISITES:

Greedy, Dynamic Programming, Search, Random

PROBLEM:

Given a undirected graph with N nodes and M edges, while each node is associated with a 2D coordinate, find minimum number of simple paths or cycles, which are not self-intersected and covered each edge exactly once.

EXPLANATION:

This is a challenge problem. The most important thing in solving challenge problem is to find some approximation/partial methods using the combination of search, greedy, dynamic programming, etc…, because this challenge problem is NP-hard. Also, it is important to write a local test system to help you find the best strategy (almost all top ranked guys wrote their own test codes).

To solve this problem, first of all, we need to know why there is a restriction of the length of answer – should be smaller than or equal to (N+M)/2. That is because we can connect two edges share a common vertex as a path, which is definitely not self-intersected. Therefore, we can achieve the restriction of output.

And then, considering the special cases with smaller size. For example, if M is smaller than 15, we can use dynamic programming to get the perfect answer. First of all, generate all valid paths and cycles. They are distinguished by its set of edges (represented in a binary mask, 0…2^M). And then, for a given subset S of edges, we enumerate all its subset check whether it is a possible path/cycle and update the optimal answer accordingly. More specifically, denote the minimum valid paths/cycles needed to exactly cover the edge set S as f[S]. The DP procedure is O(3^M) after we obtain all valid paths/cycles as following:

f[0] = 0; // it is an empty set
for S = 1 to 2^M - 1 {
    f[S] = infinite
    for (sub = S; sub > 0; sub = S & (sub - 1)) {
        if (sub is a valid path/cycle) {
            f[S] = min{ f[sub] + 1, f[S] }
        }
    }
}

The trick is to enumerate the sub via binary operations instead of 0…2^M.

For the general cases, i.e. large N and M, we can find some greed ways. For example, intuitively, we may need to find out the longest valid path/cycle, remove them, and go on. Most of the top solutions during contest apply this idea or similar ideas. But, even finding the longest valid path/cycle is NP-hard in undirected graph. Here, we introduce two ways to find the longest path. The framework is as following:

while there is any edge remained in the graph {
    longest = the longest valid path/cycle in the graph
    remove longest from the graph
}

For both methods, we need to find some heuristic functions to rank the next edge to add to the current path. The most intuitive function is based on the intersections of other edges. It reflects how an edge conflicts with others, which means those edges will be invalid after we select this edge. Therefore, we can rank the edges in order. For the ties of edges, we can break them randomly.

The first method is Search. It takes a starting edge, and use dfs to expand the path as long as possible. The second method is Greedy, choose the best edges in each step. For Search, we also need to add constrains to the number of expanded solutions or the time it costs. Search may give better solution while Greedy is much faster. This is a trade-off.

In the end, I would like to give you 3 more tips:

Due to the randomness of the order of edges, it will be better to run the algorithm multiple times to get a better score.
Use different heuristic functions for different type of data or for different runs of the algorithm, may also benefit to your score.
Implement your codes as fast as possible, such that it could be run more times in the time limit.

Any more awesome ideas are welcome to share in the replies.

AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution can be found here.
Tester’s solution can be found here.

ballon_ziq · January 13, 2014, 5:00pm

The Editorial gives perfect solution for tests with M<=15. Do you really belive such tests are present in official test cases? I mean N>20 and P>0,3. So the expected number of edges is at least about 60.

f03nix · January 13, 2014, 6:31pm

Would anyone be kind enough to share their test case generator / verifier now that the contest is over ? My code was returning WA with some parameters and I wanted to investigate them further.

It’d save me some time having to write my own … sigh.

sikander_nsit · January 13, 2014, 6:39pm

This is a very basic generator. It does not check for collinearity though it is unprobable. Also value of n has to be changed manually.

shangjingbo · January 13, 2014, 7:10pm

This perfect algorithm can be applied when some longest paths/cycles have been removed from the original graph.

shangjingbo · January 13, 2014, 7:12pm

I think you can look at some top solutions, they all have generator and verifier that you need. Also, it is not hard to implement it. I think implementing them now will be helpful for the next month’s contest.

freeman92 · January 13, 2014, 9:05pm

Can any of the top scorers or others explain how to use some of the other methods like using intersection between the segments or dfs method etc…

P.S.:

I have done it using the following method:

find degree of each vertex.
Sort the vertices in decreasing order of degrees.
choose the first vertex.
If it has degree >=2, then make a 2 length path with this vertex as center and then extend it both ways(choosing the first positive degree vertex to add to the end and the repeat the same with the newly added end).
If it has degree 1, then make a 1 length tour
decreasing the corresponding degrees and also deleting the corresponding edges from the graph at each step of adding an edge.
Go to step 2 and repeat the same until there are more edges.

This method gave me a score of 0.447
(suggestions to the method i used are also welcomed)

f03nix · January 13, 2014, 10:06pm

I used a similar approach to yours but with these diff.

Instead of cities, I used roads (represented a road between i, j as i*N + j where i < j) and determined connecting roads.
When choosing a road to extend both ways for a tour, i try it with ‘kLimit’ topmost nodes (like yours, in the order of their max degrees).
After extending, I get kLimit different tours at this level - I try the 2 longest and attempt the rest of the solution with them (all the rest tours)
The best of these two is chosen - however to save time, this check for 2 is only done with top ‘kThreshold’ level

f03nix · January 13, 2014, 10:08pm

I tweaked values for kLimit and kThreshold to get more optimal results … but unfortunately I didn’t do much good either .607.

f03nix · January 13, 2014, 10:20pm

Thank you, i’ll look into it.

A random test case generator shouldn’t be hard, but one which ensures points are collinear, etc. And the verifier which ensures all the roads have been visited, and no road intersects would take a bit of an effort. I’m prepared to write one myself, but would’ve saved me a lot of effort if someone who already had done it for the contest, shared it with me.

freeman92 · January 13, 2014, 10:27pm

whats this “kThreshold” ?

f03nix · January 13, 2014, 10:38pm

Basically after you get the first tour (the set of connecting roads), you’ll have to run the same algorithm again on the remaining roads till you get tours with all of them covered.

I am only evaluating the best of 2 tours for first ‘kThreshold’ tours. For the remaining tours, the best solution is automatically selected.

My reasoning was, that at that point, you’ll have more discontinous sets …

shangjingbo · January 13, 2014, 10:38pm

We can check these conditions and re-generate if some are violated.

f03nix · January 13, 2014, 10:45pm

Example psuedocode

FindTours (tour_no) :

  repeat for kLimit times
    select a road
    expand it to both sides
    save the tour
  end

  if (tour_no > kThreshold)
    select best tour
    FindTour(tour_no + 1)
    return solution
  end

  repeat for 2 best tours
    select tour
    FindTour(tour_no + 1)
    save finalSolution
  end

  return with best out of two finalSolution
end

amntri · January 13, 2014, 10:59pm

You can use my generator: http://ideone.com/89oalv

freeman92 · January 14, 2014, 8:57am

a good variant of my approach. the advantage you have with this method is there are more variables to play with.

james_007 · January 14, 2014, 9:13pm

i hv used dfs aproach
but i dont know y its giving wa
pls can any tell me

1 first select vertex with min degree
2 from it start dfs till any vertex is left (excluding intersecting n repeated ones) n deleting these edges frm grph
3 repaeting 1 n 2 till min degree=0

link to my ans http://www.codechef.com/viewsolution/3249354

james_007 · January 14, 2014, 9:13pm

i hv used dfs aproach
but i dont know y its giving wa
pls can any tell me

1 first select vertex with min degree
2 from it start dfs till any vertex is left (excluding intersecting n repeated ones) n deleting these edges frm grph
3 repaeting 1 n 2 till min degree=0

link to my ans http://www.codechef.com/viewsolution/3249354

anton_lunyov · January 15, 2014, 10:46am

My solution mostly corresponds to what explained in the second part of the editorial.
But instead of greedy or recursive search I use beam search with width equal to 1000 / N (dogscience). For cost of the edge I used euclidean and Manhattan distances - the greater the distance the better the edge. Don’t ask me why - this is dog science as well

As mentioned in editorial I run the tours search multiple times while time permits. To speed up the process I stop the particular run if (the number of paths in it) + (some basic estimate of the number of paths in the remaining graph) is >= (the current best solution). For basic estimate I take (max_degree + 1) / 2. The better one would be the sum of such values over all connected components. But it’s time-expensive and give worse score.

Another small difference is that instead of taking some fixed edge I take some vertex and build the longest path starting from it. Also once the path is not extendable I try to extend it in opposite direction. My tours search has two modes. Random one - where at each step we take any vertex of positive degree and greedy one - where at each step we take the vertex of maximum degree.

Since number of vertexes is below 64 I use bit masks technique to store the graph.

See details in my solution. I cleaned up all testing stuff for readability. It scores 0.741 points.

Regarding the testing stuff I want to mention that test generator here could take long time to generate tests with N close to 50, due to collinearity condition. So instead of generating them in each run I generated random 50 tests once and save them into file.

freeman92 · January 15, 2014, 1:20pm

first time listening “beam search” word.