Heavy-light Decomposition, Persistent data structures
You are given a tree of N vertices. Each vertex is initialized to value 0. Further you are given 3 kinds of operations:
c X Y A B - the “Change” operation: Along the path from X to Y increment the values by A, A+B, A+2B, etc.
q X Y - the “Query” operation: Find the sum of values along the path from X to Y
l X - the “Rollback” operation: Return the tree to the state it was after the X’th Change query.
The input is given in a manner that we require an online solution for the problem (for exact details, refer to original problem statement).
Note: The solution describes using segment trees, heavy light decomposition, and persistence over a tree. To avoid confusion in terminology, “node” refers to the data structure component (segment tree or persistence), whereas “vertex” refers to the vertex of the Tree given in the Problem.
We develop the solution by analyzing the problem under various special cases, as follows:
- The Tree is a chain, there are no rollback queries
- You have a chain, but there are rollback queries
- You have a tree, there are no rollback queries
- Overall solution.
Tree is a chain, there are no rollbacks
Here, we have a segment - lets use a segment tree. Now, we wish to update the values of vertices from X to Y. Suppose we have two such operations on the same vertices: X Y A B followed by X Y C D. Then the values will be (going along the path from X to Y)
after the first operation, and
A + C
A+B + C+D
A+2B + C+2D
after the second operation.
Clearly, this is equivalent to the single operation X Y A+C B+D.
Thus, let us store in our segment tree the pairs (A,B) associated with the operation. Now, when we get to a segtree-node which is completely contained in our X-Y path, we just update the A, B value of that node.
Further, while querying, we would like to return an answer when our query-node is completely contained in our required segment. Thus, we also need to store a field “sum” in our segtree which basically stores the sum of the left subtree + right subtree.
Pseudocode for this is as follows:
update(node cur, //current node in segtree we wish to update int X, //X-val of update-segment wrt current node int Y, //Y-val of update-segment wrt current node int A, //A-val of operation int B) //B-val of operation if(cur.left == X && cur.right == Y) cur.A += A, cur.B += B; return; mid = (cur.left + cur.right)/2; if(X <= min(Y, mid)) update(ret.left, X, min(Y, mid), A, B) if(max(X, mid+1) <= Y) update(ret.right, max(X, mid+1), Y, A + max(0, mid-X+1) * B, B); cur.sum = findSum(cur.left) + findSum(cur.right); findSum(node cur) n = cur.right - cur.left + 1; //#elements in node a = cur.A; b = cur.B; return cur.sum + calc(a, b, n); calc(a, b, n) //returns sum of a, a+b, a+2b, a+(n-1)b return (n * (2 * a + (n-1) * b))/2; query(node cur, int X, int Y, int accumA, int accumB) // similar to update: here accumA and accumB store the sum of A-values and B-values along the path from root to the node, as this is required in finding the overall value of a node at a depth. if(cur.left == X && cur.right == Y) return cur.sum + calc(accumA, accumB, (Y-X)+1); ret = 0 mid = (X+Y)/2; if (X <= min(mid, Y)) ret += query(cur.left, X, min(Y, mid), accumA+cur.left.A, accumB + cur.left.B) if (max(mid+1, X) <= Y) ret += query(cur.right, max(mid+1, X), Y, accumA + cur.right.A + accumB * (size-of-left-subtree), accumB + cur.right.B) return ret;
This can be found in Setter’s Solution (lines 24 - 54), if you ignore the persistence that is introduced in the modify() function.
You have a chain, but there are rollback queries
Here, we need to keep past information as well. The term used for this is Persistence. If we were to generalize the segment tree approach, we need to implement a segment tree that is capable of going back in time as well as forward in time.
To illustrate persistence, let us first take the example of a singly linked list. Imagine we have the list
A -> B -> C -> E -> F
And we wish to insert the element D between C and E. Also, we need to keep track of both “versions” of the list. That is, we require our list to look like
A -> B -> C -> D -> E -> F
while keeping track of the original. We could make an entirely new copy of the entire list, but this will take O(N) memory and time. The thing to notice is, from E on till the end of the list, it remains completely the same. Hence, we “clone” the nodes A, B, and C, and insert D after the clones. We finally associate the cloned root A with the new version.
Thus, our data structure looks somewhat like this now:
Version0: A -> B -> C -> E -> F Version1: A' -> B' -> C' -> D -> E -> F
Note that if we had a doubly linked list, then we would need to clone the entire list in order to implement Persistence.
Now, how does this apply to our generalization? We can imagine our segment tree as a binary tree with links to left and right children. In this setting, what would a modification to the tree look like?
It would just be a path starting from the root upto a particular node! What we need to do here, is clone just the path from the root to the node, and associate the new root with the new version. Here, when the tree has N nodes, a path of length atmost log(N) is being cloned. If you compare with the linked list example, over there, in the worst case we could clone a path of size O(N). Thus, there is a huge improvement in the case of the segment tree.
So, if we were to have a global array of Version-roots, and perform our updation of the tree with cloning, we get exactly what we desire. Refer to Setter’s code : lines 42 - 62 for the description of this.
You have a tree, and no rollback queries
Heavy-light Decomposition of Tree (HLDoT)
The heavy-light decomposition is used to break up a Tree into s set of disjoint paths, with certain useful properties. First, root the tree. Then, for each vertex x, we find the child of x having the largest subtree. Lets call that vertex y. Then the edge x-y is a heavy edge, and all other x-child_vertex edges are light edges.
The most important property of this is, from any vertex x, the path from x to the root goes through at most logN different light-edges. This is because, on this path, whenever you take a light edge, you are atleast doubling the size of the subtree rooted at the new vertex. Hence you cannot perform this “doubling” effect more than logN times.
A number of problems that require you to find some property of the path between two arbitrary nodes of a tree have heavy-light decomposition used. For example, DGCD is one such problem, which asks you to find the GCD of the numbers along the path from X to Y.
If you can solve the problem for a chain using a segment tree, then there is a very good chance that you can solve the problem for a tree using HLDoT. Indeed, if you make segment trees over the heavy edges, then the answer for your path X-Y can be broken up into two paths from X to LCA(X, Y) and from Y to LCA(X, Y). Then, using that you make only logN shifts from one heavy-chain to another, you are actually making only log(N) segment-tree queries.
Applying HLDoT here
Let us perform Heavy Light Decomposition of the tree here. We make chains consisting only of heavy edges. We also need to find LCA efficiently. This can be done in O(N log N) time by storing information up[x][i] = the ancestor of x which is at a height of 2^i above x. Clearly, up[x][i] = up[up[x][i-1]][i-1] (take a 2^(i-1) upward jump from the 2^(i-1)'th ancestor of x). Then,
LCA(x, y): if(x is ancestor of y) return x; for(i = logN; i >= 0; i--) if (up[x][i] is not an ancestor of y) x = up[x][i]; return up[x];
Now, given an update query, from X to Y, first find L = LCA(X, Y). Then, update path from X to L and from L to Y.
This is accomplished in the Setter’s code lines 195-220. Pseudocode follows. Let chain[vertex] = an identifier for the particular segment tree’s root that we require.
change(L, x, y, a, b): // Perform the operation X Y A B, where L = LCA(X, Y) dist = depth[x] + depth[y] - 2depth[L] + 1 lift(x, L, a, b); // update path x to L with parameters (a, b) if(y is not L) find pL = child of L that is ancestor of y //update path from y to pL with parameters (a + b * (dist-1), -b) lift(y, pL, a + b * (dist-1), -b) lift(low, high, a, b): if(chain[low] = chain[high]) Modify the chain[low]'th segtree along the path from low to high as required else Let H = Head of the chain[low]'th segtree (i.e. the one nearest to the root) Let n = number of nodes on path from low to H Modify the chain[low]'th segtree along the path from low to its Head as required lift (parent of Head, high, a + (n-1) * b, b)
The overall solution merges persistence with Heavy Light Decomposition. At the high level, it goes as follows:
- Perform heavy-light decomposition to give you information regarding ancestry-relation between nodes, LCA, depth, and mapping vertices to chain-numbers
- For a change operation between X and Y, for every chain along the path from X to Y, perform a persistent-change to the corresponding segment trees. For each segment tree, you have an array of the root-nodes that map versions to roots.
- For every query operation between X and Y, do the same as step 2, except you need to accumulate queries over various segment trees, and don’t perform any modifications
- For every rollback operation, set a global variable (that denotes your version number) to the required version.
Memory: Each update operation on a segment tree takes atmost O(logN) memory. Each update operation on the tree affects atmost O(logN) chains. Hence, memory complexity is O(N log^2 N).
Time: Updates on each segment tree take O(logN) time. There are atmost O(logN) segment trees to update on an Update Operation. Hence, O(log^2N) per update operation.
Queries behave similarly: O(log^2N) for queries as well.
Rollback operation: O(1) time to update global version-number.
Appendix: Solution for offline problem
The requirement of lastans in the input format ensured that offline solutions would not work here. There is a offline solution to the problem which does not use persistence.
Imagine versions as nodes of a graph. Now, when you have an update operation, that amounts to creating a new version-child of the current node. If you had no rollbacks, then your graph would be a chain. Now that there are rollbacks, you may branch out from an earlier version. In this case you have a tree.
Note that, if you went from a version to its child using the operation X Y A B, then you can go back from the child to its parent using the operation X Y -A -B. Thus, first you build this version-tree, and then you traverse it using a DFS. When you are in a version-node n, then solve all queries that pertain to that node before moving on.
Can be found here
Can be found here