WEASELTX - Editorial

r_64 · August 17, 2017, 6:06pm

PROBLEM LINK:

Practice

Contest

Author: Bogdan Ciobanu

Tester: Jingbo Shang

Editorialist: Hanlin Ren

DIFFICULTY:

Medium-hard

PREREQUISITES:

binomial coefficients, Lucas’s theorem, multidimensional prefix sum

PROBLEM:

You are given a rooted tree with root number 0. Every node u has a weight X_{0,u}. When d>0, define X_{d,v} is the bitwise-xor sum of all X_{d-1,u} where u is in v's subtree. You are also given Q queries, each query is a number \Delta, and you need to output X_{\Delta,0}.

QUICK EXPLANATION:

Let Y_d be the xor-sum of weights of all nodes whose depth is exactly d, then the answer to a query \Delta is the xor-sum of all Y_d's, where (\Delta-1)\text{ and }d=0. By using a dp technique similar to multidimensional prefix sum, one can compute Z_d, which means the xor-sum of all Y_k's where k\text{ and }d=0, in O(N\log N) time, and the query takes only constant time.

EXPLANATION:

subtask 1

Since N,\Delta\le 500, we can calculate X_{i,u} for all 0\le i\le 500, 0\le u < N. Given X_{i,0},X_{i,1},\dots,X_{i,N-1}, we can compute X_{i+1,0},X_{i+1,1},\dots,X_{i+1,N-1} by doing one dfs on tree. This gives an O(N\cdot\max\Delta) algorithm.

other subtasks

Let’s first consider, what if “xor” is changed to addition? i.e., what if X_{i+1,u} is defined as the sum, rather than xor, of all X_{i,v}'s where v is in u's subtree? Obviously the answer is a linear combination of all weights, i.e., X_{\Delta,0}=\sum_{u=0}^{N-1}f(\Delta,u)X_{0,u}, where f(\Delta,u) is something only dependent with \Delta and u.

What is f(\Delta,u)?

TL;DR: f(\Delta,u)=\binom{dep_u+\Delta-1}{\Delta-1} where dep_u is u's depth and dep_0=0.

Now let’s consider how to solve f(\Delta,u). Take the sample input as an example:

Sample

Then we have:

\begin{aligned} X_{1,0}=&X_{0,0}+X_{0,1}+X_{0,2}+X_{0,3};\\ X_{2,0}=&X_{1,0}+X_{1,1}+X_{1,2}+X_{1,3}\\ =&(X_{0,0}+X_{0,1}+X_{0,2}+X_{0,3})+(X_{0,1}+X_{0,2})+X_{0,2}+X_{0,3}\\ =&X_{0,0}+2X_{0,1}+3X_{0,2}+2X_{0,3};\\ X_{3,0}=&X_{1,0}+2X_{1,1}+3X_{1,2}+2X_{1,3}\\ =&(X_{0,0}+X_{0,1}+X_{0,2}+X_{0,3})+2(X_{0,1}+X_{0,2})+3X_{0,2}+2X_{0,3}\\ =&X_{0,0}+3X_{0,1}+6X_{0,2}+3X_{0,3};\\ &\dots \end{aligned}

For example, f(2,3)=3 and f(2,2)=6 here.

A recurrence equation of f is: f(\Delta,u)=\sum_{v\text{ is }u\text{'s ancestor}}f(\Delta-1,v)(note that v could be u). Why? Note that when we calculate X_{\Delta,0}, we write

\begin{aligned} X_{\Delta,0}=&\sum_vf(\Delta-1,v)X_{1_v}\\ =&\sum_vf(\Delta-1,v)\sum_{u\text{ is }v\text{'s offspring}}X_{0,u}\\ =&\sum_uX_{0,u}\sum_{v\text{ is }u\text{'s ancestor}}f(\Delta-1,v);\\ \text{also, }X_{\Delta,0}=&\sum_uX_{0,u}f(\Delta,u). \end{aligned}

This explains the above equation.

Let’s do more on the equation:

\begin{aligned} f(\Delta,u)=&\sum_{v_1\uparrow u}f(\Delta-1,v_1)&\text{we use }a\uparrow b\text{ to represent that }a\text{ is }b\text{'s ancestor(possibly }a=b\text{)}\\ =&\sum_{v_1\uparrow u}\sum_{v_2\uparrow v_1}f(\Delta-2,v_2)\\ =&\dots\\ =&\sum_{v_1\uparrow u}\sum_{v_2\uparrow v_1}\dots\sum_{v_{\Delta}\uparrow v_{\Delta-1}}f(0,v_{\Delta}). \end{aligned}

Thus, f(\Delta,u) is the number of sequences (v_0,v_1,v_2,\dots,v_{\Delta}) such that:

v_0=u;
v_{\Delta}=0(note that f(0,v)=[v=0]);
For all 1\le i\le\Delta, v_i\uparrow v_{i-1}.

Obviously all v_i's appear on the path from u to 0. Let dep_x be the depth of node x(dep_0=0) and d_i=dep_{v_{i-1}}-dep_{v_i}, then (d_1,d_2,\dots,d_{\Delta}) is an array satisfying the following condition:

d_i's are nonnegative integers;
\sum_{i=1}^{\Delta}d_i=dep_u.

We find that every array d satisfying the above condition gives us a unique valid sequence v! So f(\Delta,u) is just the number of such array d's. Next is a classical lemma stating this number is just \binom{dep_u+\Delta-1}{\Delta-1}(refer to Wikipedia, the last line of “Definition and interpretations”). We omit the proof here.

Lemma 1: given n,k, the number of nonnegative solutions of x_1+x_2+\dots+x_n=k is \binom{n+k-1}{n-1}.

Coming back to XOR

Now let’s consider the xor case. Note that xoring the same number for even times does nothing, so for a query \Delta, we pick all nodes u such that f(\Delta,u) is odd, and xor them up. Next we’ll use a lemma called Lucas’s Theorem:

Lemma 2: given n,m,p, and p is a prime. Let’s write n,m in base p:

\begin{aligned} n=&\overline{n_kn_{k-1}\dots n_1n_0};\\ m=&\overline{m_km_{k-1}\dots m_1m_0}, \end{aligned}

Then,

\binom{n}{m}\equiv\prod_{i=0}^k\binom{n_i}{m_i}\pmod p.

Let me demonstrate the lemma by an example. Let p=5, n=116, m=55. Then n=(\overline{431})_5,m=(\overline{210})_5, so \binom{n}{m}\equiv \binom{4}{2}\cdot\binom{3}{1}\cdot\binom{2}{0}\equiv 3\pmod 5. Actually, you can check that \binom{n}{m}=5265169722428127562717416598795968\equiv 3\pmod 5.

How does the theorem help us? Note that we only need to know the reminder \binom{a}{b}\bmod 2 for some huge a,b. When p=2, Lucas’s theorem becomes

Lemma 3: \binom{n}{m}\equiv 1\pmod 2 if and only if n\text{ and }m=m, where \text{and} is the bitwise-and operation.

Thus, f(\Delta,u)\equiv 1\pmod 2\iff \binom{\Delta-1+dep_u}{dep_u}\equiv 1\pmod 2\iff (\Delta-1+dep_u)\text{ and }dep_u=dep_u.

subtask 2

To solve subtask 2, we preprocess Y_d as the bitwise-xor sum of all nodes at depth exactly d, and for a query \Delta we enumerate all d from 0 to N, if (\Delta-1+d)\text{ and }d=d, we xor the answer with Y_d.

Time complexity: O(NQ).

subtask 3

If you print the values of X_{i,0} and try to find patterns, you’ll find that X_{i,0} has a period of length L\le 2N. The solution for this subtask is: first find the length of that period L, then prepare all X_{i,0}'s for i\le L; for any query \Delta, we just print X_{\Delta\bmod L,0}.

In the solution of subtask 4 I’ll show that, the answer only depends on the last \lceil\log_2 N\rceil bits of \Delta-1, and that’s why we have an O(N) period.

subtask 4

Can the condition “(\Delta-1+d)\text{ and }d=d” be further simplified? Yes! Note that x\text{ and }y=0 is a sufficient condition for (x+y)\text{ and }y=y, since when adding x and y in binary, no carries would happen. Is it necessary? The answer turns out to be yes! This can be proved by contradiction: Suppose i is the lowest bit that both x and y has 1 on this bit. Then (x+y)'s i-th bit is 0 and that violates (x+y)\text{ and }y=y. Thus “(\Delta-1+d)\text{ and }d=d” is equivalent to “(\Delta-1)\text{ and }d=0”.

Let Z_d be the bitwise-xor sum of Y_f's such that f\text{ and }d=0. For a query \Delta we directly output Z_{(\Delta-1)\bmod 2^{18}}, since 2^{18}>N and ((\Delta-1)\text{ and }d) is only related to (\Delta-1)'s last 18 bits.

How to compute Z_d? We can do it by a dp that’s similar to multidimensional prefix sum: Let dp_{i,j} denote the bitwise-xor sum of all Y_d's, such that:

For 0\le k < i, d and j's k-th bit can’t be both 1;
For i\le k < 18, d and j's k-th bit are the same.

Then dp_{i+1,j}=\begin{cases} dp_{i,j\text{ xor }2^i}&i\text{-th bit of }j\text{ is }1\\ dp_{i,j}\text{ xor }dp_{i,j\text{ xor }2^i}&i\text{-th bit of }j\text{ is }0\\ \end{cases}. Note that dp_{0,j}=Y_j, and what we want is Z_j=dp_{18,j}.

The overall complexity is O(N\log N+M).

ALTERNATIVE SOLUTION:

If your solution is different with ours, feel free to leave a comment.

AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution can be found here.
Tester’s solution can be found here.
Editorialist’s solution can be found here.

hemanth_1 · September 11, 2017, 4:00pm

This is exactly what I did nice problem !

meooow · September 11, 2017, 4:04pm

I just used 2 observations to solve this problem.

All X_0 values at a certain depth appear xor-ed together at their ancestors X value or not at all. So all X_0 values at the same depth can be xor-ed together to convert the tree to a chain. I see your approach also uses this.
If the length of the resulting chain is d, then the number of operations after which it reverts to its initial form, i.e its period, is the nearest power of 2 \ge d. Also the pattern of inclusion of values at the root xor-sum is recursive… this is what I mean. If split into two halves, each term either repeats the pattern of the left half as the right half, or leaves the right half empty.

So we can make up the chain to the nearest power of 2 by padding with zeroes for convenience, and then the pattern can be generated by splitting the chain into two halves, solving recursively, and combining (similar to mergesort). Here is my solution, although I have used an iterative version instead of recursion. The complexity is \mathcal{O}(N \log N).

EDIT (The merging and solving procedure in greater detail):
Suppose A is the chain derived from the tree with length d, which is a power of 2. Take a look at the pseudocode below, the terms should be self-explanatory.

function solve(A, L, R):
    N = R - L + 1
    if N equals 1:
        return
    solve(A, L, L+N/2-1)
    solve(A, L+N/2, R)
    A_L = slice of A from L to L+N/2-1
    A_R = slice of A from L+N/2 to R
    for i in [0..N/2-1]:
        A[L+i] = A_L[i] xor A_R[i]
        A[L+N/2+i] = A_L[i]

Calling solve(A, 0, d-1) will compute all d possible answers in A. The algorithms works by recursively computing the same sequence of the pattern of inclusion in both the halves of size N/2 each. Then it generates the current pattern sequence of size N by either incorporating both the left and right value or just the left value, in the proper order of course.

For example, if N = 8, the pattern and the half pattern is
11111111 1111 10101010 1010 11001100 1100 10001000 1000 11110000 10100000 11000000 10000000

So A_L and A_R will be
-- A_L --|-- A_R -- A_L[0] = A[L] ^ A[L+1] ^ A[L+2] ^ A[L+3] | A_R[0] = A[L+4] ^ A[L+5] ^ A[L+6] ^ A[L+7] A_L[1] = A[L] ^ A[L+2] | A_R[1] = A[L+4] ^ A[L+6] A_L[2] = A[L] ^ A[L+1] | A_R[2] = A[L+4] ^ A[L+5] A_L[3] = A[L] | A_R[3] = A[L+4]

And then they will be combined into

A[L]   =  A_L[0] ^ A_R[0] =  A[L] ^ A[L+1] ^ A[L+2] ^ A[L+3] ^ A[L+4] ^ A[L+5] ^ A[L+6] ^ A[L+7]
A[L+1] =  A_L[1] ^ A_R[1] =  A[L]          ^ A[L+2]          ^ A[L+4]          ^ A[L+6]
A[L+2] =  A_L[2] ^ A_R[2] =  A[L] ^ A[L+1]                   ^ A[L+4] ^ A[L+5]
A[L+3] =  A_L[3] ^ A_R[3] =  A[L]                            ^ A[L+4]
A[L+4] =  A_L[0]          =  A[L] ^ A[L+1] ^ A[L+2] ^ A[L+3]
A[L+5] =  A_L[1]          =  A[L]          ^ A[L+2]
A[L+6] =  A_L[2]          =  A[L] ^ A[L+1]
A[L+7] =  A_L[3]          =  A[L]

That took some typing… hope it’s clearer now
Also I hadn’t noticed this before, but I now recognize the pattern as the Sierpinski triangle. Awesome!

adecemberguy · September 11, 2017, 4:11pm

i also saw same pattern but got TLE in last cases(C++)…

vijju123 · September 11, 2017, 4:15pm

That xoring of nodes at same depth…really mind blowing. That was a pretty neat observation honestly!!!

taran_1407 · September 11, 2017, 4:25pm

This is exactly what i tried. I got the chain idea as well as the cycle length, but not the pattern of inculsion and thus, got what i deserved, 0 points.

meooow · September 11, 2017, 5:22pm

@adecemberguy I don’t think using C++ was the cause of TLE, \mathcal{O}(N \log N) should comfortably clear the limit. Perhaps there is some part of your code taking greater time than you expect?

vijju123 · September 11, 2017, 5:28pm

The pattern can have periodicity upto N (roughly) , so if one tries to derive pattern by brute force (i.e. keep on doing recursion, calculating d_1,d_2,d_3... until a repetition/original configuration is obtained) then the last 2 cases give TLE.

dushsingh1995 · September 11, 2017, 5:33pm

@tarun_1407 - Are you sure you took required values as long in JAVA? I was getting WA in C++ because of int. Changed it to long long, got 40 points.

taran_1407 · September 11, 2017, 6:19pm

i used long, got WA, even tried with BigInteger which i knew was much more than required, but still WA.

vasja · September 11, 2017, 6:34pm

For 40 points i simulated the process until i reach the same value on the root as the initial value.
Then i returned rootValues[delta % rootValues.length ]

It’s wrong obviously, as i got WA on the third subtask but hey, 40 points is not bad :).

shubham_raj199 · September 11, 2017, 6:55pm

How I approached it----
it can be simply observed that whenever nodes come in answer all the nodes with same level also comes ,So we can
calculate XOR depth-wise, Let consider super worst tree ever ,this tree is 1->2->3->4->…->n

1)on day 0 ,ans=1

2)on day 1 ,ans=1^2^3^4^5^6…^n

3)on day 2 ,ans=1^(2^2)^(3^3^3)^(4^4^4^4)^(5^5^5^5^5)^…

4)on day 3 ,ans=1^(2^2^2)^(3^3^3^3^3^3^3)^…

u can observe:on day 3 , number of 1s,2s,3s,4s,5s,… are 1,3,6,10,15…

again u can see that this looks like binomial coefficients and on further days this pattern also emerges

so on day k+1 ans=1(kCk time) ^2 ((k+1)Ck times) ^3 ((k+2)Ck times) ^…

You know XOR of a number even times is 0 and otherwise the number itself, so question decreases to find whether nCk is odd or even and now read last para of editorial.

Suggestions: Never cancel out XOR values in initial stages, I stuck on this problem for 7 days just coz I cancelled out XORs and was finding pattern in that
-I couldn’t comeup with dp solution given at last in editorial because calculating for each depth XOR must take O(n) time for each query ,I was unable to think of pre-processing the dp table

pk301 · September 11, 2017, 7:12pm

i have solved it using two observations : first that if any element of any level is present in any node(after some operations) then all the element at that depth is also present in that set…and a tree with depth x will have (2^x) different values of the root. And further after d days the value at root is the Xor of all possible submasks of mask(2^x - d). I calculate all the submasks and store it and then answer the query in constant time :)…

solution : https://www.codechef.com/viewsolution/15353659

first i think that it will give TLE as generating submasks of all the masks is (3^x) operation which will in worst case 3.2 seconds…

but i dont know how it passed…if anyone can explain me why it will be helpful

liouzhou_101 · September 11, 2017, 7:31pm

@pk301
I also have the same approach as you, which is an O(3^k) solution where k = O(log n).
It passed because of compiler optimization and high-speed of bitwise operations.
My solution runs in 0.20s the worst case of the test data.

I was thinking of how to optimize it, however, I didn’t try that as it directly passed.

Feel better if the constraints were set more strict in order to let O(3^k) solution not pass.

de_vika · September 11, 2017, 8:22pm

@meooow, Thanks for the insights
Although, can you please explain the recursive call part ? I understood the pattern that you meant, but I’m not able to relate it to a recursive call.
It would be great if anyone could help.
Thanks in advance

pk301 · September 11, 2017, 8:23pm

@liouzhou_101 you said that you were thinking of optimizing it! can you please tell what kind of optimizations are possible here? i am a beginner and for the first time solved 6th question ! it will be helpful if u share your ideas

liouzhou_101 · September 11, 2017, 9:02pm

@pk301
The way to optimize is just shown in the editorial I think. However, I didn’t catch it at that time (since I passed it with unexpected solution, and I would spend time solving next several problems).

shivam_07 · September 11, 2017, 10:12pm

Please help me with my solution i am getting NZEC but it is working fine in my IDE. Even if you could give me a testcase for which my solution would get NZEC. https://stackoverflow.com/questions/46154263/codechef-sept-2017-challenge-weasel-does-xor-on-tree-getting-nzec-for-this-submi Please help.

eight_bitguy · September 12, 2017, 1:18am

Well, after making some random test cases, even I observed a recursive pattern, those who observed but unable to code can check my


[1] for a simple implementation of that recursive pattern.


Only one test case took 0.48 sec, remaining test cases took less than 0.09 sec. I used 8 for loops and 8 if condition. What else I did was, I divided the depth in a group of 4 because if they are going to contribute in the answer, they will appear in a group of 4.


  [1]: https://www.codechef.com/viewsolution/15336870

khalid545 · September 12, 2017, 4:11am

I made these 2 observations:

Nodes with depth 2^{n} are included in the result iff : (\Delta-1)\ \%\ 2^{n+1} < 2^{n}
Nodes with depth x = \sum 2^{a_{i}} are included in the result iff nodes with depths 2^{a_{i}} are included for all i

So, to get the answer for \Delta, first I find all powers of 2 that are included, then I compute the xor sum of all nodes with depths that are combinations of these powers.

For example: if 1, 4, and 8 are included, 1, 5, 9, 12 and 13 are also included.
If we take the binary representations of these numbers, then we can see that
1={(0001)}, 4={(0100)}, 5={(0101)}, 8={(1000)}, 9={(1001)}, 12={(1100)} are subsets of 1 + 4 + 8 = 13 = (1101)

To compute the xor sum of all subsets, we can use the approach described here:
http://codeforces.com/blog/entry/45223

Complexity: (N + Q) \cdot \log(N)

My solution