PROBLEM LINK:
Author: Pawel Kacprzak
Tester: Misha Chorniy
Editorialist: Pawel Kacprzak
DIFFICULTY:
MEDIUM-HARD
PREREQUISITES:
Trees, Sqrt-decomposition
PROBLEM:
For a given tree with N nodes and weight assigned to them, the goal is to handle Q queries, each of one of the following 3 types: either for given node v return the sum of values in $v$’s subtree, or for given node v and integer x, add x to values of all nodes in $v$’s subtrees, or for a given two nodes v, u swap their subtrees if they are disjoint or print -1 otherwise.
QUICK EXPLANATION:
Use sqrt-decomposition to handle the queries. There are at least a few approaches to that, but the idea is to maintain a data structure of size at most O(\sqrt Q) rebuilding it in O(N) time when it exceeds the size limit. Handle each query in the linear time in terms of the size of this data structure. This leads to O(\sqrt Q \cdot N + Q \cdot \sqrt Q) time solution
EXPLANATION:
Subtask 1
In the first subtask we have both N and Q at most 1000, so a very direct approach is possible. For each query adding values to all nodes in a subtree and for each query asking for the sum of values in some subtree we can just traverse the given subtree in O(N) time. For a query representing a swap of subtree of nodes v and u, we can first in O(N) time check if one of them is ancestor of the other, which happens if and only if these subtrees are not disjoint. In this case we return -1, otherwise, we can explicitly remove these subtree from their parents and attach them to their new parents in the tree. Since each query can be handled as described in O(N) time, the total time complexity is O(Q \cdot N).
Subtask 2
In the second subtask we have N \leq 10^5 and Q \leq 10^5, but no swap queries to handle. In this case we can linearize the input tree using dfs. The idea is that each time dfs discovers a node we append it to the linear representation and each time dfs exists from a node we also append it there. Thus each node v appears in this order exactly twice and the range of entries between these two occurrences of v is a linear representation of $v$’s subtree. Having this linear representation of the whole tree, we can build a segment tree from this representation and answer both required queries in this subtask in O(\log(N)) time per single query. This results in O(Q \cdot \log(N)) total time complexity.
Subtask 3
In the last subtask constraints for N and Q are the same as in the second subtask, however a query swapping two subtrees is allowed here. This makes the problem significantly harder.
One can try different approaches here including: link-cut trees, some variations of splay-trees, treaps and similar, which may be successful - you can take a look at successful submissions from the contest where contestants used treap and link-cut trees. One more easy approach is to use technique called sqrt-decomposition. There are also various approaches using this concept, like for example very interesting tree compressions (not covered here). One possibility application of sqrt-decomposition is used in my solution and described below. The solution is documented, so you refer to it for implementation details. The general idea is the following:
We are going to maintain a data structure S, which is a list of ranges corresponding to some subtrees of the input tree. These ranges will be build using linear tree representation. Let enter[v] be the index where v occurs for the first time in the linear representation and exit[v] the index where it occurs last there. Initially, we put only one range corresponding to the whole tree to S. This range can be represented as two endpoints: [enter[1], exit[1]]. Whenever any query involving a subtree of v has to be handled, what we do first is to “isolate" $v$’s ranges from ranges in S.
The process of isolating has to result in S having a range with left endpoint equal to enter[v] and a range (the same or not) with the right endpoint corresponding to exit[v]. This can be done in O(S) time by iterating over ranges in S.
Also, with each range we are going to keep the number of vertices entered within in, sum of values of vertices entered within it and the accumulated value added to this range from queries adding values to subtrees. All these informations can be updated dynamically in O(1) time when we slice some range into parts in order to isolate a range. Moreover, these information are sufficient to answer all queries. Notice that in order to find the sum of values in $v$’s subtree it is sufficient to first isolate $v$’s ranges and then summing up values of all ranges between range containing enter[v] and ending with the range containing exit[v]. The query adding value of x to $v$’s subtree can be performed analogously.
The last remaining query type is the one swapping subtrees of v and u. First we isolate their ranges. Then we check if the subtrees intersect, which can be done by checking if a range containing any endpoint of one of them lies between segments corresponding to endpoints of the other. If this is the case we print -1, otherwise perform a swap of these two blocks of ranges, i.e. we take the first block of ranges as all ranges between the range containing enter[v] and the range containing exit[v]. The second block consists of all ranges between the range containing enter[u] and the range containing exit[u]. We swap these two blocks in S.
Each of the queries can be handled in O(S) time when performed as described above. However, with each query the size of S grows by at most 4, so the queries becomes more costly after some time. The trick to prevent this is to rebuild S from scratch after O(\sqrt Q) queries. The process of rebuilding results in a new tree with new linear representation and S containing again just one range. This process can be done in O(N) time - refer to [my solution] for implementation details. This assures that S is always of size O(\sqrt Q), so the total time complexity of the solution is O(\sqrt Q \cdot N + Q \cdot \sqrt Q).
AUTHOR’S AND TESTER’S SOLUTIONS:
Setter’s solution can be found here.
Tester’s solution can be found here.