PROBLEM LINK:
Author: Vaibhav Gupta
Tester: Misha Chorniy
Editorialist: Vaibhav Gupta
DIFFICULTY:
MEDIUM-HARD
PREREQUISITES:
Dynamic Programming, Combinatorics
PROBLEM:
The problem can be reduced to the following grid world scenario.
Chef is at coordinate (0,0) in a 2-D grid and has to reach point (p,q). From any point (x,y) he can move only in increasing direction: either UP(to the point (x, y+1)) or RIGHT(to the point (x+1, y)). Here, UP & RIGHT correspond to a bad & good deed respectively. But he can move UP if he is strictly below the line x = y + c. M points are blocked in the grid(he can’t visit a blocked point), whose coordinates are given.
You have to count number of such paths possible. Two ways are same if and only if path taken is exactly same in both ways. Print the required answer modulo 10^9 + 7.
QUICK EXPLANATION:
The problem can be broken down in 2 parts.
- Make a function that finds the number of ways to reach point (x2,y2) from point (x1,y1) with subject to the constraint that we can’t cross line x = y + c.
- The number of ways of reaching any blocked point (i,j) is independent of any blocked cell having larger x or y coordinates. If we sort the blocked cells on basis of x,y in increasing order - the number of ways to reach a cell in ith index of the array is independent of any cell after it in the array. So we can find the number of ways reaching any blocked cell by using m^2 DP approach.
The overall complexity is O(M^2 + p +q).
GENERAL SUGGESTION:
This editorial will have some hand exercises. It’s highly suggested that you yourself try to figure out the solution to those before proceeding further.
EXPLANATION:
Part 1: Find number of ways to reach point Q(x2, y2) from point P(x1, y1) if x2 >= x1 and y2 >= y1 without crossing the line x = y + c and ignoring the blocked cells:
Consider the following figure.
Here we want to go from P to Q without crossing the line(black line). We will call it L1.
Consider all possible paths from P to Q. There can be 2 types of paths:
- Paths from P to Q without crossing the line L1. We call such a path a good path. NOTE A path is good as long as it doesn’t cross L1. Although, it can it touch it any number of times.
- Paths that cross the line L1. We call such a path a bad path.
In the figure the blue path is a good path whereas the orange path is a bad path.
Now the sum of all good paths and bad paths gives us the total number of paths from P to Q.
Ex1: Find total number of paths from point P(x1, y1) to point Q(x2, y2) where x2 >= x1 and y2 >= y1.
Let x2 - x1 = x and y2 - y1 = y. The total steps needed to be taken is x + y. Also, it’s fixed that each path will have exactly x RIGHT steps and exactly y UP steps. Different paths can be generated by permuting the UP and RIGHT steps amongst them. The total number of ways to permute x + y items such that x and y of them are identical respectively is given by: (x+y)!/(x!y!) where n! denotes n factorial.
Now, we have found the total number of paths from P to Q. But our aim was to find the total number of good paths from P to Q. We can first find the total number of bad paths and then subtract is from the total number of paths. This will give us the total number of good paths from P to Q.
For a path B to be bad, it has to cross L1 at atleast 1 point. Now, if it touches the line L1 at point (a,b) after crossing L1, it will reach point (a,b+1).
Any general point on L1 is x = y + c , so any general point that a bad path reaches on crossing L1 is given by x = y + c - 1 . Let this line be L2(black dotted line). So we can say that any path B that touches L2 atleast once is a bad path.
Let the first point where B touches L2 is X’. Now take reflection of the segment of B from P till X’ about the line L2 as shown in the figure in green color. Consider a new path B’ : formed by the reflected segment of B and the rest of the unreflected segment of B from X’ to Q i.e. the path from P’ to X’ + the path from X’ to Q.
Now there are 2 important observations about the path B’
- Starting point of B’ is always point P’ i.e. reflection of P in L2.
- B’ is a path from P’ to Q that always touches the line L2 atleast at one point.
Now, we easily prove by construction that for every bad path B, that touches the line L2 for the first time at X’, we can construct a corresponding path B’ from P’ to Q by : concatenating the reflected segment of B from P to X’ and the segment of B from X’ to Q.
Also, by similar argument, for every path B’ from P’ to Q we can construct a bad path B by again concatenating the reflected segment of B’ from P’ to X’ and the segment of B’ from X’ to Q. So there is a bijection between bad paths B and the paths B from P’ to Q.
Now, to find the number of bad paths we just have to calculate the total number of paths from P’ to Q. So all we are left to do is to find the reflection of P in L2.
Ex2: Find the reflection of P in L2.
The above formula can be looked through in any high school book. Using it, reflection of P(x1,y1) in L2: x = y + c - 1 is (y1 + (c - 1), x1 - (c - 1)). Now number of paths from (y1 + (c - 1), x1 - (c - 1)) to Q(x2,y2) can be found using the formula above.
Now, by subtracting it from the total number of paths from P to Q we can get the total number of good paths from P to Q as was required. Let F(x1, y1, x2, y2) gives us this number.
So the final expression for number of paths from P(x1,y1) to Q(x2,y2) without crossing L1: x = y + c is given by
Ways = (x1-x2 + y1-y2)C(x1-x2) - (x1-x2 + y1-y2)C(x1-x2 + (c-1)) .
Now, we have the solution to the first part. But wait! there is one more small exercise for you.
Ex3: How to calculate C(N,K) for N, K <= 4 * 10^6?
Sol: We can pre-compute and store all the factorials and inverse-factorials for 1 <= i <= 4e6. Now to compute C(N,K) all we have to do is 2 elementary operations:
long long findnCk(N, K) { ll nCk = fact[N]; nCk = (nCk * inv[K]) % MOD; //where inv[K] is modular inverse of K nCk = (nCk * inv[N-K]) % MOD; return nCk; }
But how to compute factorials and inverse-factorials of very large numbers?
Finding factorial for very N can be done in O(N) by simply using the factorial of previous index
fact[i] = (i * fact[i-1]) % MOD; //where fact[0] = 1
Now, we have to efficiently find the inverse-factorial. We can this too, using the inverse-factorial of next index as follows.
inv[i] = (inv[i+1]*(i+1))%MOD; //where inv[4e6] is modular-inverse of fact[4e6]
So the complexity of this pre-computation is O(N) or O(p + q). These are some relevant links: Modular multiplicative inverse
,Fast Exponentiation ,A small intution of the approach.
Now let’s, put P(x1,y1) = (0,0), Q(x2,y2) = (n,n) and making L1: x = y i.e. c = 0 in the above formula, we get 2nCn - 2nC(n-1) which is the expression for nth catalan number(sequence of natural numbers occurring in various counting problems).
Awesome! This just shows how the catalan numbers can be derived. Now, you are good to go with all catalan related problems.
Ex4: What is the number of full binary trees with n internal vertices?
Part 2: Find number of ways to reach point Q(x2, y2) from point P(x1, y1) without going to blocked cells.
The naive way to incorporate the blocked cells is using Inclusion-Exclusion Principle. Let (x,y) be the starting point. The number of ways to get from (x,y) to (m,n) while avoiding the one restricted point at (a,b) is given by the number of ways to get to (m,n) with no restrictions, minus the number of ways to get to (m,n) that go through (a,b).
F(x,y,m,n) - F(x,y,a,b) * F(a,b,m,n)
Generalising, it we can get the total number of ways to reach (m,n) without going to the blocked point.
Ex5: How? This exercise is left for the user.
But, such a solution will take 2^m time where m = 10^3. Hence, not feasible. We need to improve.
It is subtle observation that the number of ways of reaching any blocked point (i,j) is independent of any blocked point having larger x or y coordinates. So we can sort the blocked cells on basis of the pair (x,y) in increasing order so that the number of ways to reach a cell in ith index of the array is independent of any cell after it in the array. So we have broken the given bigger problem into smaller subproblems.
Nice enough? Uhhh! wait we also have an optimal substructure in the problem. Let the point P_i at ith index is (x_i, y_i). Actual number of ways to reach
P_i = (ways To Reach P_i Ignoring The Blocked Cells) - (ÎŁ(ways To Reach From P_i To P_j) * (ways To Reach P_j Ignoring The Blocked Cells)).
Hurrah! Now we have a straightforward M^2 dynamic programming solution.
Let’s say our set S = {all Blocked Cells + cell(p,q)}. Sort S on increasing basis of x coordinate and then increasing on y. Also let’s assume dp[i] = F(x,y,x_i,y_i) where (x,y) is the starting point of the path. Pseudo code for it is given below:
for(i = 0; i < S.size(); ++i)
for(j = i-1; j >= 0; --j)
if(S[i].x >= S[j].x && S[i].y >= S[j].y)
dp[i] -= (dp[j] * F(S[j].x, S[j].y, S[i].x, S[i].y));
The overall complexity is O(M^2 + p + q)
AUTHOR’S AND TESTER’S SOLUTIONS:
Author’s /Editorialist’s
Tester
RELATED PROBLEMS:
Part 1
UVA - 932 Safe Salutations
Hackerrank - devu and cool graphs
Codechef - Hand Shake
Codechef - Mock Turtle
Part 2
Codeforces - Count Ways
Codeforces - Research Rover