PROBLEM LINK:
Setter: David Stolp
Tester: Lewin Gan and Istvan Nagy
Editorialist: Taranpreet Singh
DIFFICULTY:
Hard
PREREQUISITES:
Probability, Lagrange multiplier and Partial Differentiation.
PROBLEM:
Given value of N boxes in the array V, There are K chefs including head chef, all of them having a raffle ticket, each chef can put their ticket in exactly one of the box. We need to find the expected value of the surprise box that a chef wins. it is assumed that all the chefs follow the same strategy i.e. all of them follow the same probability tuple (p_1, p_2, \ldots p_n) which denote the probability to choose the ith box.
QUICK EXPLANATION
- Since all chefs follow the same strategy, all of them have the same expected value. We also know that a box contributes to the expected value of at least one of the chef chooses that particular box. The probability of at least one chef choosing ith box is (1-(1-pi)^k).
- So, The expected gain for head chef is (1/k)*\sum_{i=0}^{N-1} vi*(1-(1-p)^k). The problem reduces to minimizing \sum vi*(1-pi)^k with constraint \sum pi = 1.
- Using Lagrange multiplier and partial differentiation, we can work out that we need qi = (1-p) to be proportional to vi^{-1/(k-1)}.
- Special adjustment is required to be made with qi which end up being > 1 due to the ratios, so it makes sense to set pi= 0 for lower valued prizes.
EXPLANATION
We know expected value is a summation over all items multiplied by the probability of winning corresponding boxes, we’ll be evaluating the summation step by step. Let’s call expected value as E.
E = \sum v_i*p_i* \sum_{j=0}^{K-1} C^{K-1}_{j}*(p_i)^j*(1-p_i)^{K-j-1}/(1+j)
Let us understand what happens above.
For expectation, we multiply the value of the prize box with the probability of head chef choosing that particular prize box. Now, we are left with the remaining K-1 chefs. Let us assume j of these K-1 chefs choose this prize box. This can be done in exactly C^{K-1}_{j} ways, and the probability of this is (p_i)^j*(1-p_i)^{K-j-1}. Now, we have j out of these K-1 chefs, plus the head chef, so the winning probability of chef is the total probability divided by (1+j), the number of tickets in the ith box.
E = \sum v_i/K* \sum_{j = 0}^{K-1} K*C^{K-1}_{j}*(p_i)^{j+1}*(1-p_i)^{K-j-1}/(1+j) (Multiplied inner summation by K, divided outer part by K and pushed the p_i outside summation inside.)
We can rewrite it as E = \sum vi/K* \sum_{j = 0}^{K-1} C^{K}_{j+1}*(p_i)^{j+1}*(1-p_i)^{K-j-1}.
Replacing j with j+1, we have E = \sum vi/K* \sum_{j = 1}^{K} C^{K}_{j}*(p_i)^j*(1-p_i)^{K-j}. The part \sum_{j = 1}^{K} C^{K}_{j}*(p_i)^j*(1-p_i)^{K-j} is just the binomial expansion of 1 = (p_i + (1-p_i))^K - C^{K}_0*(1-p_i)^K = (p_i + (1-p_i))^K - (1-p_i)^K. So, we have our summation as,
E = \sum vi/K* (1-(1-p)^K)
We need to choose p_i so as to maximize this expression and find the final value of this expression.
E = \sum vi* (1-(1-p)^K) = \sum v_i/K - \sum v_i*(1-p_i)^K/K.
First term is constant, so we need to minimize \sum v_i*(1-p_i)^K. Lets replace q_i = 1-p_i. We have to minimize \sum v_i*q_i^K with constraint \sum p_i = 1 = \sum (1-q_i) = N-\sum q_i. So we have constraint \sum q_i = N-1.
Let us use partial differentiation on probability tuple p_i and equate it to the partial differential of the constraint wrt the same variable, we’ll have the stationary point in N-dimensional space. We can use the second derivative, (which shall be negative) to know that the stationary point we figured is a point of minima.
Click to view
Differentiate the expression partially wrt probability tuple and equate it with the partial differential of sum constraint wrt probability tuple, to obtain above proportion
It comes out to the fact that q_i is the product of Lagrange multiplier multiplied by v_i^{-1/(K-1)}. So, we know Lagrange multiplier is constant for all terms, we need all q_i to be in a proportion of v_i^{-1/(K-1)}. We also have \sum q_i = N-1. We have figured out ideal probabilities. But there is still a glitch.
While setting q_i in proportion of v_i^{-1/(K-1)}, we may end up with q_i > 1 which implies negative probability. This happens because we are worse off by choosing this box at all. So, we can just set p_i for this box. It can be observed, that if prizes are sorted by values, q_i is inversely related to v_i^{-1/(K-1)} which implies that if a box with value x have q_i > 1, all boxes with value < x shall have q_i > 1. So, we can just run a binary search, to find the suffix of prize boxes, all of which have q_i \leq 1 which implies p_i \geq 0 and find the expected value. Important point to note is that suppose we choose x prize boxes, we need \sum q_i = x-1 for these x boxes, because for remaining boxes, we have set q_i = 1.
There also exists a linear solution with basic math application which you can refer in solutions below (author’s solution).
Time Complexity
Time complexity is O(N*log(N)) (O(N) also possible with simple insight).
AUTHOR’S AND TESTER’S SOLUTIONS:
Setter’s solution
Tester’s solution
Editorialist’s solution
Feel free to Share your approach, If it differs. Suggestions are always welcomed.