Problem Link:
Difficulty:
Medium
Pre-requisites:
Probability
Problem:
Given integers N and M, and an array C having N elements in the range [0, M], you replace every 0 in the array with an integer in the range [1, M] uniformly at random. Then you ask, what is the expected length of the maximum consecutive sequence of the array having the same value?
Explanation:
Upon seeing the constraints, one realizes that an O(N^4) solution would pass. The problem smells of Dynamic Programming, but what is the state space that can capture the required information?
Assume that you are filling in the values of C from left to right. You first consider the first position, then the next and so on. In fact, let us consider the subproblems where we truncate the array C to the first few elements (at each step).
Further, you need to keep track of the maximum length of consecutive elements. You can update this maximum only when you fill in some value that causes the maximum length to increase. In order for that to happen, you need to know what value to fill in, and what the length is when filling in.
This brings us to the question, “If I were to have filled in the first ‘x’ positions of the array, and the position I have at ‘x’ is ‘i’ and also the length of the sequence having values equal to ‘i’ is ‘j’, and the maximum length of a sequence so far is ‘k’, what is the probability of doing this?” Lets call this probability P(x, i, j, k).
In considering P(x, i, j, k): we are looking at only C[0…x]. We are saying that C[x] = i. We are also saying that C[x-1]…C[x-j+1] = i. We are saying that the maximum set of consecutive equal elements in C[0…x] = k. This trivially implies that P(x, i, j, k) = 0 for k < j, or for C[x] != 0 and C[x] != i.
In fact, we make two cases.
Case 1) C[x] != 0. Lets say C[x] = y.
>Further, either we have the previous position as y, or not.
>If not, then P(x, y, 1, k) gets incremented by sum(i != y) P(x-1, i, j, k).
>If previous position was y, then P(x, y, j, k) gets incremented by P(x-1, y, j-1, k).
Pseudocode for the above is as follows:
int y = C[x];
//first for cases where the previous value was also y
for(int j = 2; j <= N; j++)
P[x][y][j][j] += P[x-1][y][j-1][j-1]; //increment the case where the max length increases by having y here
for(int k = j; k <= N; k++)
P[x][y][j][k] += P[x-1][y][j-1][k];
for(int i = 1; i <= M; i++)
if(i == y) continue; // i==y case considered above
for(int j = 1; j <= N; j++)
for(int k = 1; k <= N; k++)
P[x][y][1][k] += P[x-1][i][j][k];
This handles the case for C[x] != 0.
Now, if C[x] = 0, we just need to try out all cases of C[x] being 1…M, and make the required increments:
P(x-1, i, j-1, j-1) contributes to P(x, i, j, j).
P(x-1, i, j-1, k) contributes to P(x, i, j, k).
Indeed, it is the same as the above case, except for having a for-loop for y. However, note that the case of (i != y) would lead to four nested for-loops (and hence O(N^4) for this step itself). This can be avoided by memoizing for a fixed i, k, and finding the sum across j.
Pseudocode follows:
//memoize sums
for(int k = 1; k <= N; k++)
for(int i = 1; i <= M; i++)
for(int j = 1; j <= N; j++)
sum[i][k] += P[x-1][i][j][k];
for(int y = 1; y <= M; y++)
for(int j = 2; j <= N; j++)
P[x][y][j][j] += P[x-1][y][j-1][j-1]/M;
for(int k = j; k <= N; k++)
P[x][y][j][k] += P[x-1][y][j-1][k]/M;
for(int i = 1; i <= M; i++)
for(int k = 1; k <= N; k++)
if(i != y) P[x][y][1][k] += sum[i][k]/M;
Finally, we take care of the base cases.
If C[0] == 0,
P[0][i][1][1] = 1/M for all i
else
P[0][C[0]][1][1] = 1.
The answer is then, considering x=N-1, taking expected value.
Pseudocode:
ans = 0;
for(int i = 1; i <= M; i++)
for(int j = 1; j <= N; j++)
for(int k = j; k <= N; k++)
ans += k*P[N-1][i][j][k];
Output ans.
Setter’s Solution:
Can be found here
Tester’s Solution:
Can be found here