CARDS777 - Editorial

PROBLEM LINK:

Practice
Contest

Author: Lewin Gan
Testers: Kamil Dębowski
Editorialist: Lewin Gan

DIFFICULTY:

Medium

PREREQUISITES:

Linearity of expectation

PROBLEM:

Find the expected value of an optimal strategy for a game involving cards and tokens.

QUICK EXPLANATION:

The main observation is that all strategies are optimal strategies; all strategies will lead to the same expected value.

EXPLANATION:

Using the observation above, we get a really simple solution. Let’s play all red tokens, then all blue tokens. By linearity of expectation, we can get the total expectation to be

p * (r / (r+b)) + (r+b-p) * (b / (r+b))

.

Intuitively, you can guess this by noticing that the cards are played randomly. We can prove this more formally with induction.

Consider a O(rbp) dp solution as follows. Let f(r,b,p) denote the expected value given we have r red cards, b blue cards, p red tokens, and r+b-p red tokens.

We claim that

f(r,b,p) = p * (r / (r+b)) + (r+b-p) * (b / (r+b)) = (p*r + (r+b-p)*b) / (r+b)

. We prove this by inducting on r+b.

We can check that the base cases f(1,0,1), f(0,1,1), f(1,0,0) and f(1,0,1) all make sense (i.e. f(1,0,1) = 1, since we have a red token and red card, so we get 1 point).

Now, for the inductive case, consider two cases: we either play a red token or blue token.

Play a red token:

f(r,b,p) = (f(r-1,b,p-1) + 1) * (r / (r + b)) + (f(r,b-1,p-1)) * (b / (r + b))
= (((((p-1)(r-1) + (r+b-p)*b) / (r+b-1) + 1) * r + ((p-1)*r + (r+b-p)*(b-1)) / (r+b-1)) * b) / (r+b)

This is a bit hard to simplify, but we can use wolfram alpha here.

Thus, we get

= (p*r + (r+b-p)*b) / (r + b)

Play a blue token:

f(r,b,p) = (f(r-1,b,p)) * (r / (r + b)) + (f(r,b-1,p) + 1) * (b / (r + b))
= ((((p(r-1) + (r+b-p-1)*b) / (r+b-1)) * r + (((p*r + (r+b-p-1)*(b-1)) / (r+b-1)) + 1) * b) / (r+b)

This can be simplified with wolfram alpha again [here][4].

Thus, we get

= (p*r + (r+b-p)*b) / (r + b)

.

Thus, in both cases, we get the same expected value, so this completes the induction step. Thus, we’ve proven this formula works.

AUTHOR’S AND TESTER’S SOLUTIONS:

Setter
Tester

[4]: http://www.wolframalpha.com/input/?i=((((p(r-1)+%2B+(r%2Bb-p-1)b)+%2F+(r%2Bb-1))++r+%2B+(((pr+%2B+(r%2Bb-p-1)(b-1))+%2F+(r%2Bb-1))+%2B+1)+*+b)+%2F+(r%2Bb)

6 Likes

alternately this can be solved in O(R + B) by keeping the expected number of Red and Blue Cards and at each step choosing the one which is higher in value and subtracting the probability of a card being chosen from the expected value.


[1]


  [1]: https://www.codechef.com/viewsolution/13127766
3 Likes

Well, we already showed that every strategy is an optimal strategy, so that’s why this greedy solution works :slight_smile: In fact, it doesn’t matter what card you play, you can choose anything you have left available.

I used this approach.

@rajat1603 I had the same thought,but couldn’t prove it mathematicaly why this strategy would be the best.Can you elaborate?

1 Like

This strategy is the best because all strategies are equally good. The proof is in the statement above (by induction). For more intuition, you can see that your moves are all fixed but cards are randomized, meaning there isn’t a way for you to “beat” the randomness in this case.

1 Like

I have a more natural explanation for this problem. It is exactly dependent on the randomness of the card. Basically, we can number the cards and tokens from 1,2,…,n. So there are n! order of drawing card (the number of permutation) and their probability is equal (1/n!). Considering any two strategies of choosing tokens. There are always two permutations that make the same number of point corresponding to each strategies. For example r+b=3. 2 strategies choosing tokens 1,2,3 and 3,2,1. The permutation (of drawing card) 1,2,3 and 3,2,1 make the same number of points in both cases. Therefore, we proved that all strategy is equally good. What is left is calculate the E[X].
Denote X_i = 1 if the ith tokens matched the ith drawing cards. We can see X=X_1+X_2+…+X_n. By the linearity of expectation, X=E[X_1]+…+E[X_n]. if X_i is red, E[X_i]=r/(r+b) else E[X-i]=b/(r+b). Then it concluded my proof

1 Like

This link would be helpful in understanding the Linearity of Expectation : https://brilliant.org/wiki/linearity-of-expectation/

Hello Yakumo. I am not sure I your explanation works (or maybe I don’t understand it properly). It is said that you can adjust your next token choice based on previous cards. So a strategy is more than a list of tokens. The same strategy can lead to tokens being played in different order, for 2 different card permutations.

@rajat1603 i had attempted the question by keeping the values of r and b and not their expected value ,can you explain why should we keep the expected value of the variables?