CHEFWORD - Editorial

elfus · November 17, 2014, 6:56pm

PROBLEM LINK

Author: Dmytro Berezin

Tester: Sergey Kulik

Editorialist: Florin Chirica

DIFFICULTY:

medium

PREREQUISITES:

matrix multiplication; basic probability theory

PROBLEM

You’re given an initial word S. You do K claps: each clap a character from S turns into another character with some given probability. You’re also given a list of final words. What’s the probability the word S will be one from the final list after those K claps?

QUICK EXPLANATION

A word S can transform in other word s[i] if and only if their length are equal. Probability the word S to transform in s[i] is prob(S[1], s[1]) * prob(S[2], s[2]) * …, where prob(a, b) is probability the character a will transform in the character b after K claps.

We can calculate dp[k][a][b] = probability character a will turn into character b after k claps. We find the recurrence dp[k][a][b] = sum of dp[1][a][a’] * dp[k - 1][a’][b]. This recurrence can be improved by using matrix multiplication, since it is a linear recurrence.

Explanation

We need to compute probability that word S will become one of the words from s list. When is it possible the initial word S to become one of words s[i]? If the words S and s[i] have the same length, there’s some probability (possibly 0) that word S will become word s[i]. If word S and word s[i] have different lengths, then we know for sure it’s not possible, since we can’t replace a letter by SPACE character.

Let’s consider all words s[i] from s list. We’ll see what’s the probability word S will transform in word s[i] using exactly K claps. The answer of the problem is the sum of those probabilities.

So suppose we want to transform word A in word B using exactly K claps. This means A[1] needs to become B[1], A[2] needs to become B[2] and so on. Denote by P(char1, char2) = probability that character char1 will become char2 after exactly K claps. All the events P(A[1], B[1]) P(A[2], B[2]) and so on need to happen. We know from probability theory that the probability for some independent events to all happen (each of event needs to happen) is the product of each probability. So, in this case, probability word A will become word B is P(A[1], B[1]) * P(A[2], B[2]) * … * P(A[n], B[n]) where by n I noted length of A array.

The problem is reduced to: calculate P(a, b) = probability that character a will transform in character b after K claps. Calculating this in the current form seems difficult, so let’s try to add one more parameter.

Let P(a, b, x) = probability that character a will transform to character b after exactly x steps. For x = 1, the answers are given in the input.

We’ll next try to do dynamic programming. Suppose we calculated P(something1, something2, x - 1) for each something1 and something2, let’s try to calculate P(a, b, x) for each a and b. We focus now to calculate it for two fixed values of a and b. Then, we can iterate each two values a and b and whole P will be calculated for given parameter x.

Suppose we do one step. Let’s imagine character a becomes a’ after 1 clap. We need to do x - 1 claps which must transform character a’ into b. Suppose we fixed a’ now. What’s the probability a will turn into b, given that after one clap it will become a’? It is P(a, a’, 1) * P(a’, b, x - 1). Why? First, character a transforms in a’ using a clap with probabilty P(a, a’, 1). Then, a’ transfors in b after x - 1 steps with probability P(a’, b, x-1). The probability that both independent events to happen is P(a, a’, 1) * P(a’, b, x - 1).

Let’s solve for general case calculation of P(a, b, x). After one step, a character a will turn into some character a’. If we iterate all characters a’ possible, then all cases are covered. We need to calculate the sum of P(1, a, a’) * P(x-1, a’, b). Why do we need the sum? All events are independent, and exactly one event from the list needs to happen (since the character after 1st clap is always different between any two events, they’re independent). Probability that one event will happen is sum of probabilities of all events.

So we get that P(x, a, b) = sum of all P(1, a, a’) * P(x-1, a’, b).

This is how we get an O(K * 26^3) solution. This should time out. However, using some more math theory we can reduce the complexity.

Let’s recall how matrix multiplication is done: suppose you’re given two matrixes A and B and want to calculate C = A * B. We know that C[a][b] = sum of all A[a][a’] * B[a’][b]. This recurrence matches with the one from above to calculate P(x, a, b).

If we note C[a][b] = P(x, a, b), A[a][b] = P(1, a, b) and B[a][b] = P(x-1, a, b), we’re reduced our task to matrix multiplication.

The final observation to solve the task is: suppose A[a][b] is probability character a will turn to character b after exactly 1 clap. Let’s determine probability that character a will turn into character b after x claps. It’s enough to raise matrix A to power x-1, then print element from (A^(x-1))[a][b].

Since matrix multiplication is associative, it can be done by exponentiation by squaring. Hence, our total runtime is reduced to O(26^3 * (log(K)).

AUTHOR’S AND TESTER’S SOLUTIONS:

To be updated soon
To be updated soon

betlista · November 17, 2014, 7:10pm

That problem almost made me mad - I knew that it is matrix multiplication, but I was getting WA again and again without an idea what’s wrong. And then suddenly, I had a question - “are all strings s distinct?”.

Applying one of the golden rules: “Do not assume something not written in problem statement” and modifying my solution got AC finally…

So the answer for same input as in problem statement with string

daa
bbb
ccc
ccc
zz

is still 0.125…

pranjalranjan · November 17, 2014, 7:12pm

Same has happened to me in many problems before. And nowadays after getting WA i do the same Remind myself to never assume anything not given in the problem…

himanshujaju · November 17, 2014, 7:18pm

I used the fact that the matrix becomes almost constant after being raised to the power >=40. Since we are allowed error <= 10^-6, I computed using dp for k = min(given_k,40). This ensures O(k * 26^2 * strlen(string)) to pass in 0.71s.

betlista · November 17, 2014, 7:22pm

Question to problem setter: Why |S| <= 3 ? Why not 100 for example?

shredder · November 17, 2014, 7:24pm

Nooooooooooooo :’( :’( Didn’t realize it during the contest. Oh well, another lesson learned

sudeepdino008 · November 17, 2014, 7:34pm

I was getting a lot of wrong answers in this. I just saw @betlista’s answer. It is my personal opinion that such there shouldn’t be such trickery in the problem statement. It was an excellent question, but if I were a problem setter, I personally wouldn’t want people to have a headache figuring out twists in problem statement.

I really hope I don’t spend 10-15 hours for this sort of trickery in future.

kunal361 · November 17, 2014, 8:12pm

Same with me…after 2 or 3 submissions of WA…i reread the problem statement…saw the constraints…where the number of query strings was greater than the total possible combinations…and then it struck “DUPLICATES”…good problem btw!!!

kunal361 · November 17, 2014, 8:18pm

Same was the algo used by me…the only variation being instead of an upper limit on ‘k’…i let the loop compute till the difference in probabilities of the current iteration and the previous iteration were less than an ‘eps’ value!!!

betlista · November 17, 2014, 9:40pm

good catch @kunal361, I checked that |S| is at most 3 and I was not using it later, because it was not a limiting factor for me…

betlista · November 17, 2014, 9:48pm

Weak test cases, try for K = 41 and K = 40 for matrix

0.99 0.01
0.01 0.99

those are no so similar…

Hm, it’s not the same even for

0.9 0.1
0.1 0.9

lucky you

himanshujaju · November 18, 2014, 12:10am

ok I guess I didnt explain my approach completely. We are multiplying all the probabilities. A difference occurs only if one of them is pretty high, but then it will cause other probabilities to be low and their multiplication would result in an error <= 10^-6. Individually the probabilities maybe incorrect, but they correct themselves overall. There might be weak test cases, but I feel that my algo is right for <=10^-6 relative error.

kunal361 · November 18, 2014, 1:33am

I think the multiplication of probs of so many letters would lead to most answers being 0…suppose the average probability of a char in the string of length 100 is 0.5…mult value will be approx 10^(-31)…now even if u provide 10^6 such queries…addition of all would still be 0!!!

cyberax · November 18, 2014, 3:37am

same for me here. after 2 WA, i thought to myself “noooooo ! omg i was so blind !” haha ! then i submitted, and AC.

betlista · November 18, 2014, 6:39am

If my matrix is the one above, and I ask you to print the probability of changing aaa to aaa in 41 klaps, your answer will be wrong (with respect to eps).

betlista · November 18, 2014, 6:41am

Of course for testing they can use “good” data = such that are no 0 at the end

himanshujaju · November 18, 2014, 8:09am

Hmm. There must have been some bug by me in calculating the exact “k” value. But for k = 1000, it works! http://ideone.com/AV9g0U .

Eu1er · November 18, 2014, 11:52am

Another way to think about the problem can be using graph theory. Let us say the given probability matrix represents adjancy matrix for a weighted directed graph, where a character represent a node. Now Our task is to find how many different paths exist from a given node to another node.

Now this is a well know theorem that in a graph the adjancy matrix raised to power K gives all possible paths from one node to another. For case of probabilities it will directly give probability(why ?).

Hence the answer is directly the given matrix raised to power K and [i][j] element will give probability of getting to character j from i in exactly K claps.

As stated the matrix exponentiation can be done easily using linear recurrence in 26^3*logK.

xpertcoder · November 18, 2014, 8:20pm

I was also getting WA and knew it was Matrix Multiplication, what I did is before printing answer I checked if it’s >1.0(don’t know why but came to mind), if it is I ran an infinite loop.I was given TLE. This way I knew I am counting some probabilities more than once. Then after thinking some time I got that strings can be same(what a realization!! ). Got AC afterwards:)

betlista · November 18, 2014, 8:32pm

Amazing tip again