Fast Fourier Transform, Graph Theory
You are given a tree with N nodes and exactly N-1 edges.
Find the number of pairs of nodes in the tree for which the length of the path (number of arcs between the nodes) between them is prime.
The problem actually asks for the probability that selecting two such nodes leads to a prime length path. But it is obvious that what is needed is the number of such pairs. The answer will just be
number-of-pairs / NC2
The explanation of the solution derives ideas largely from the Tester’s Solution.
The intended solution for this problem was based primarily on using FFT for convolution. I find this idea extremely elegant (and usually, much simpler, because its very generic).
I have experience with setting / testing very few problems using this central idea in various contests (3 of them on CodeChef) and always, the test data seems to have been cracked by a different - no, I don’t mean simpler, just different - approaches.
I say cracked, because these approaches might fail particular corner cases, which were not thought off during testing. I believe that generating the test data that would fail every cracky approach is very hard. Sometimes, an approach might not be possible to fail within the constraints.
This is probably the case in August Challenge. It would be interesting to know whether it was possible to fail the alternate approaches, or not!
ignore the prime numbers
The problem asks you to find the number of pairs at a distance of prime numbers. But, there are no elegant properties that you can use to be able to simplify the problem. Our objective will remain simpler.
- For each node, find the number of nodes at all distances
In the end, we will only add up the number of nodes at prime distances
divide and conquer
- Root the tree at some vertex v. This orients all the edges and defines a clear parent / child relationship between nodes
- For each child of v, run the algorithm recursively on the subtree rooted at it
- The sub-problems (sub-trees) will be of recursively smaller and smaller sizes
Now, for each child c of v, we know how many decendents are at each distance. We can update the values to be stored at v by adding 1 to the values from child c. Of course, values at v will be composed using all the values from all the children of v.
This takes care of all the paths that start at a node and always move downwards, towards the decendents.
For all the other paths, we can do a convolution of the arrays of values stored at each child of v. Of course, the fastest way to convolve is using Fast Fourier Transforms for polynomial multiplication. This idea has been explored in the July Challenge problem FARASA as well.
The given tree is not necessarily a binary tree. If we try to find a split of nodes in such a tree, we may end up with a split that selects only O(N / D) vertices in one set; here D is the largest degree of any node.
This would mean that our divide and conquer may perform too many FFTs in the worst case. This is highly undesirable.
If a node has more than 2 chilren
- Give the node a new right child
- Connect the node to its right child with a 0 cost edge
- Move all but 1 child to the children of the new right child
- The remaining child will be the left child of the node
In the final tree, the meaning of distance will be replaced by the sum of the costs of edges, instead of the number of edges.
Now, we can always find a split that selects at least O(N / 3) nodes in one set.
The complexity of the solution is O(N log2 N).
But, the constants for this solution are large. The Tester’s and Setter’s solutions based on the described approach are still slower than several solutions submitted during the contest.
Can be found here.
Can be found here.