DEVHAND - Editorial

djdolls · May 18, 2015, 5:47am

PROBLEM LINK:

Author: admin
Tester: Kevin Atienza
Editorialist: Ajay K. Verma
Russian Translator: Sergey Kulik
Mandarian Translator: Gedi Zheng

DIFFICULTY:

Medium

PREREQUISITES:

Combinatorics, Modular Arithmetic

PROBLEM:

Given two integers N and K, find how many triplets of strings over K alphabets exist, such that length of all strings is bounded by N, and no string in a triplet is a prefix of another string of the triplet.

EXPLANATION:

We compute the number of triplets of strings of bounded length, and then subtract the ones in which one string is a prefix of another. For the sake of brevity, we define a new notation \prec to represent the “prefix” relationship, i.e., "X is a prefix of Y" can be represented by X \prec Y.

Total Number of Triplets:

There are exactly K^i strings of length i. Hence, the number of strings with length not exceeding N will be:
A = K + K^2 + K^3 + \cdots + K^N
A = K (K^N - 1 / (K - 1))

If the modular inverse of (K - 1) with respect to the given prime p = 10^9 + 7 is x, then the value of A \bmod p would be (K (K^N - 1) x) \bmod p.

Both the modular inverse, and power of a number can be compute in O (\log N) time. Hence, we can compute the value of A in O (\log N) time.

There are A strings of length not exceeding N. Therefore, the number of triplets would be A^3. Next, we count how many of these triplets are bad triplets, i.e., the ones in which one string is a prefix of another.

Triplets of the Form X \prec Y \prec Z:

In this section we count the number of triplets which have strings forming a chain under the relationship \prec, i.e., if the triplet is (P, Q, R), then one of the following is true:
P \prec Q \prec R
P \prec R \prec Q
Q \prec P \prec R
Q \prec R \prec P
R \prec P \prec Q
R \prec Q \prec P

Note that for some triplets more than one of these conditions might be true, e.g., (X, X, X) will satisfy all 6 conditions. Hence, we consider four categories, so that each triplet is counted exactly once.

1) X = Y = Z:
There will be A triplets of this form, as we can choose the string X in A ways, and the other two will be equal to it. These triplets will satisfy all 6 conditions, i.e., each triplet will be repeated 6 times.

2) X \prec Y = Z and X \neq Y:
For a given string Y of length r, there are exactly (r - 1) possible values of string X, the ones which are proper substring of Y. Hence, the number of such triplets would be:
K^2 + 2 K^3 + 3 K^4 + \cdots + (N - 1) K^N.

This is the sum of an arithmetic-geometric series, and can be represented explicitly as shown here. The computation of this sum also requires the computation of power and modular inverse, and hence can be done in O (\log N) time.

These triplets will satisfy two of the above conditions (X \prec Y \prec Z, and X \prec Z \prec Y).

3) X = Y \prec Z, and Y \neq Z:
Number of these triplets will be the same as the one computed in (2). This is because once we pick a string Z of length r, then Y has exactly (r - 1) choices.

These triplets also satisfy two of the above conditions (X \prec Y \prec Z, and Y \prec X \prec Z).

4) X \prec Y \prec Z, X \neq Y, and Y \neq Z:
If we pick a string Z of length r, then we only need to choose the lengths of X and Y. The lengths of X and Y must be distinct and lie between 1 and (r - 1). Hence, the number of (X, Y) pairs would be the number of ways of choosing two distinct numbers between 1 and (r - 1), which is given by (r - 1) \choose 2. Therefore the number of such triplets would be:
K^3 + 3 K^4 + 6 K^5 + \cdots + {{N - 1} \choose 2} K^N

Again this is the sum of arithmetic-geometric sequence of second order, and can be computed using the similar methods. The computation will take O (\log N) time.

These triplets will satisfy exactly one of the above conditions.

Hence, the number of chain triplets would be 6 ((number of category 1 triplets)/6 + (number of category 2 triplets)/2 + (number of category 3 triplets)/2 + (number of category 4 triplets)/1).

Triplets of the Form "X \prec Y, Z does not share prefix relationship with Y":

In this section we count the number of triplets in which exactly two strings satisfy the “prefix” relationship, i.e., if the triplet is (P, Q, R), then one of the following is true:
P \prec Q, R has no prefix relation with Q,
Q \prec P, R has no prefix relation with P,
P \prec R, Q has no prefix relation with R,
R \prec P, Q has no prefix relation with P,
Q \prec R, P has no prefix relation with R,
R \prec Q, P has no prefix relation with Q,

Once again it is possible that one triplet satisfy more than one condition, however, we should count it only once.

1) X = Y, Z has no prefix relation with Y:
If we have a string Y of length r, then there are exactly r prefix strings of Y. If a string W has Y as a proper prefix, then W can be written as W = Y + U, where U is a non-empty string of length at most (N - r). Hence, the number of such string W will be K + K^2 + \cdots + K^{N - r}. Since Z is neither a prefix of Y, nor has Y as its prefix, the number of possible values of Z would be:
(K + K^2 + \cdots + K^N) - (K + K^2 + \cdots + K^{N - r}) - r
= (K^{N - r + 1} + K^{N - r + 2} + \cdots + K^N - r).

Also number of ways of choosing a r length string Y is K^r. Hence, the number of such triplets (X, Y, Z), with Y being a r-length string will be:
K^r (K^{N - r + 1} + K^{N - r + 2} + \cdots + K^N - r)
= K^{N + 1} * (1 + K + K^2 + \cdots + K^{r - 1}) - r * K^r

If we take the sum of this expression over all possible values of r, we get the number of triplets (X, Y, Z) satisfying the above criteria, and it will be:
K^{N + 1} (N + (N - 1) K + (N - 2) K^2 + \cdots + K^{N - 1}) - (K + 2 K^2 + 3 K^3 + \cdots + N K^N)

This is also a sum of arithmetic-geometric sequence, and can be computed in logarithmic time. These triplets satisfy two of the above conditions (X \prec Y, with Z having no relationship, and Y \prec X, with Z having no relationship).

2) X \prec Y, X \neq Y, and Z has no prefix relation with Y:
Once again if we pick a string Y of length r, then X will have (r - 1) choices, and Z will have (K^{N - r + 1} + K^{N - r + 2} + \cdots + K^N - r) choices. Hence, the number of such triplets with Y being of length r would be:

r K^r (K^{N - r + 1} + K^{N - r + 2} + \cdots + K^N - r)
= r K^{N + 1} (1 + K + K^2 + \cdots + K^{r - 1}) - r^2 K^r

If we sum it over all possible values of r, we would get the number of triplets (X, Y, Z) satisfying the above condition, which would be (after some simplifications):

{(N + 1) \choose 2} K^{N + 1} (1 + K + K^2 + \cdots + K^N) - K^{N + 1} ({2 \choose 2} K + {3 \choose 2} K^2 + \cdots + {N \choose 2} K^{N - 1})
- (1^2 K + 2^2 K^2 + 3^2 K^3 + \cdots + N^2 K^N).

The number of triplets where exactly two strings satisfy the “prefix” relationship will be 6 ( (number of triplets of first category)/2 + (number of triplets of second category)/1).

Now that we have computed the number of bad triplets, we can subtract them from the total number of triplets to get the number of good ones.

Time Complexity:

O (\log N)

AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution
Tester’s solution

gauravdewan007 · May 18, 2015, 3:35pm

If we are able to reduce the 4 cases to nested for loops, we can use wolfram engine to compute summation of series.
Sum function of wolfram does not allow summation of 3 sigmas (over 3 loop variables). However we can nest Summation like this:

link text
For denominator (expressed in terms of k) we can use Fermats theorem to calculate modulo of inverse.
There will be lots of terms and to avoid mistyping special care would need to be taken

anuj95 · May 18, 2015, 3:43pm

I also used this approach. But a friend of mine used matrix exponentiation to solve this problem which was much more cleaner . It would be appreciated if you could post an editorial about that approach too .

anshuman14021 · May 18, 2015, 3:43pm

The approach I used (Which I feel is simple,if not more complex) :

Let the 3 strings be of length x,y,z arranged in increasing order.
Then for the first string,we can have pow(k,x) possibilities, as the string with the minumum length cannot have any other string as its prefix.
For the secod string, the first x bit-string should not be the same as the stirng ‘x’, sof pow(k,x)-1 possibilities for the first x bits,and pow(k,y-x) for the remaining ones.
Similarly for the string ‘z’,we can havepow(k,z)-pow(k,z-x)-pow(k,z-y) possibilities.
Now,if we multiply these three and sum them up over varying x,y and z,we have our answer.

To further simplify the problem,we can divide it into 4 types of cases :

All strings are of equal length (x1)
‘x’ and ‘y’ strings are of equal length (x3)
‘y’ and ‘z’ stings are of equal length (x3)
No string is of the same length. (x6)

Note : ‘x’ and ‘z’ case is the same as all the strings being equal,as x,y,z are in increasing order.

Now,for each of these cases,we can apply summation with appropriate limits.

For the first one,let x=y=z=t and sum t over [1,n].
For the second one,let x=y=t and sum t over [1,z-1] and z over [2,n] respectively.
For the third one,let y=z=t and sum x over[1,t-1] and t over [2,n] respectively.
For the last one,sum x,y and z over [1,y-1] , [2,z-1] and [3,n] respectively.

If you calculate these summations and add them up,you have your answer.
(In my opinion, this is a longer method,but much easier to comprehend).

Hope this helps

ambarpal · May 18, 2015, 4:02pm

There is another approach that does not use any combinatorics at all. We consider the problem as being equivalent to selecting 3 nodes in a prefix tree of height n with a branching factor of k. Every node in the prefix tree corresponds to a word(taken from the root). Now if we can find 3 nodes such that none of the nodes lies in the subtree of the others, we have solved the problem. According to this, we set up 3 DPs, each corresponding to selecting 1, 2 and 3 nodes from the subtree rooted at node r. This gives a linear time solution. To get 100 points I solved the recurrences to get a constant time solution. Refer link, link for the recurrences and complete solution, respectively.

EDIT:
Many people asked me to explain my solution, so here goes:

First of all the prefix tree, for n = 2, k = 2 looks like this:

			.					Level0
	#a				b			Level1
a		+b		*a		^b		Level2

Selecting nodes #, *, and ^ gives a, ba, and bb which is a valid selection.

Selecting nodes #, +, and * gives a, ab, ba which is an invalid selection since a is a prefix of ab. Note that this happened because + was in the subtree of #.

So to solve the problem, what we need to do is find the number of ways to select 3 nodes such that none of them is in a subtree of another. Since the wristwatches are unique, multiply the number of selections by 3 factorial to get the answer.

Now, let f1(i) be the number of ways of selecting one node in the subtree of a node at level i. Note that since the tree is symmetric, the value of f1 is same for all nodes at a given level.

f1(i) = either select the root node or recursively call the same function on one of its children

  = 1 + K * f(i + 1)

Let f2(i) be the number of ways of selecting two nodes in the subtree of a node at level i, such that one is not the parent of another.

f2(i) = select any 1 node in any two of the subtrees rooted at nodes in level (i + 1) or recursively call f2 on a child of i

  = kC2 * f1(i + 1) * f1 (i + 1) + k * f2(i + 1)

Finally, let f3(i) be the number of ways of selecting three nodes in the subtree of a node at level i, such that none is the parent of another

f3(i) = select any three nodes and then select one node in their subtrees or, select two nodes and pick 2 in one and 1 in the other or, recursively call the function on a child of i

  = kC3*f1(i + 1)*f1(i + 1)*f1(i + 1) + kC2*2*f1(i + 1)*f2(i + 1) + k*f3(i + 1)

The answer is f3(0). This is an O(n) solution.

To get a constant time solution, we can first solve f1, then substitute it in the expression for f2, then solve f2. Now with the expressions for f1 and f2, we plug them in f3 and solve it.

kqr45j · May 18, 2015, 5:20pm

@anhuman14021 I went through your solution but got scared after looking at it. Would you mind explaining your solution please.??
Thanks

anshuman14021 · May 18, 2015, 6:13pm

@kqr45j I know,it’s not very neat.

But is simply what I mentioned in my previous answer : breaking down the answer into 4 cases and solving the resultant summations. The resultant formula take upto 2 A4 pages, so yeah…a code this messed up is justified :3 (I think).

anshuman14021 · May 18, 2015, 6:27pm

*But it is

akulsareen · May 18, 2015, 9:40pm

To solve this question I used matrix exponentiation, which in my opinion required far less thinking than the approach listed above.

Lets say I have placed the first x characters in all the 3 strings. The following cases can be true:

String 1 is still going on or it is finished (1 or 0)
Same for string 2
Same for string 3
One of string 1 or string 2 is a prefix of the other or not(1 or 0)
Same for 1 and 3
Same for 2 and 3

So using 2^6 states I can represent what may have happened at the end of first x characters. Now from x I need to go to x+1. If a string has already ended I can only place a ‘-’ there (- signifies a finished string). If a string has not ended I can either place one of the k characters there or decide to finish it there by placing a ‘0’(0 represents string ending). Effectively you can think that we can place (k+2)(k+2)(k+2) 3-tuples after xth character. Evaluate from every state where you would end up due to each tuple and increment the correct position in the transition matrix accordingly.

This gets us this solution which fetches us 3 points which can be optimized to get 53 points as shown here.

Currently our approach fails because 2^6 states means matrix multiplication requires more time than we have. So we need to get rid of redundant states.

For example, if string 1 has ended and string 2 has not and one of string 1 or 2 is a prefix of the other then it means that string 1 is a prefix of string 2 and no matter what happens now it is going to stay that way since string 1 has ended. So we will never reach a state where 1 is not a prefix of 2. Ignoring these states we are left with just 18 states which is enough for getting a full 100 points as shown here.

gvaibhav21 · May 18, 2015, 11:30pm

i had almost the same solution… much easier to solve this way

prateek1996 · May 19, 2015, 12:15am

@anshuman14021 ,Can you please explain in brief how did you find the sum of the 4 cases?? Thanx

kushan2020 · May 19, 2015, 12:51am

After summation and everything you end up having the polynomial which is the answer to every n,k

(k^(n+1) ((-6n^2+6n+5) + (3n^2-6n-7) k + (k^2-3k+3)(3n^2-1)) + k^(2n+2) (6(n+1) + - 6kn) +k^(3n+3) -2k^2 -2k) /(k-1)^3

wolfram - http://goo.gl/a2BPdk

You can get the answer in one step using this f(n,k) and substituting n and k to get answer.

anshuman14021 · May 19, 2015, 9:00am

@prateek1996 It’s pretty simple

For the first one, substitute x=y=z as any variable (say ‘t’). Now,as x,y and z are in increasing order and are all equal, the variable t would very from 1 to n.So, the inequality reduces to 1<=x=y=z<=n .Applying this summation, you get an equation.
For the second part, substitute x=y as any variable (say ‘t’).Now,as x,y and z are in increasing order,the inequality for this case becomes : 1<=x=y<z<=n (because only x and y are equal).It is obvious that t would vary from [1,z-1] and z would vary from [2,n].So,first summing the expression with respect to ‘t’ and then with respect to ‘z’,we get our second equation.
For the third sub-case, we have y=z as any variable (say ‘t’). Again x,y,z are in increasing order, so the order we get is 1<=x<y=z<=n. Clearly, x varies from [1,t-1] and t varies from [2,n]. Summing first w.r.t x and then w.r.t t,we get out third equation.
For the final equation,none of x,y,z are equal,so the inequality reduces to 1<=x<y<z<=n. As it is clearly visible, x varies in [1,y-1], y varies in [2,z-1] and z varies in [3,n].So, summing first w.r.t ‘x’,then w.r.t. ‘y’ and finally w.r.t ‘z’, we get out fourth and final equation.

Multiplying the equations above with the appropriate constants (refer to explanation above) and adding them up,we have our answer.

Sabari · May 21, 2015, 8:48pm

I don’t know whether it will work but currently I am trying this

string trees of n = 3, k = 3:
eg:

a,aa,ab,ac,aaa,aab,aac,aba,abb,abc,aca,acb,acc
b,ba,bb,bc,baa,bab,bac,bba,bbb,bbc,bca,bcb,bcc
c,ca,cb,cc,caa,cab,cac,cba,cbb,cbc,cca,ccb,ccc

There are three type of sets

1.All string are from different string tree

2.Two stings are from same string tree and one from different

3.Three string are from the same string tree

I got formula for Type 1 and Type 2 so far, type 3 is little difficult.

NS = total no of strings in single tree

type 1: T1 = {k\choose 3}*NS^3

type 2:

We have to pick two string from the same tree so root node cannot be picked. Let say I pick a string X for the second one I have to choose from

Y = (NS-ancestors of X - no of strings in sub-tree that has X as root)

X = \sum_{i=1}^{n-1} (Y*k^i-\frac{k^i*(k^i-1)}{2})

T2 = X*(NS*k-1)*k

type 3: in progress

ans = (T1+T2+T3)*6

Will this approach work?

foraml · May 22, 2015, 12:05pm

Can somebody please explain this approach

lakshmi8 · May 27, 2015, 9:58pm

Where are the author’s and tester’s solutions ?