Linear recurrence using Cayley-Hamilton theorem

djdolls · August 14, 2014, 4:03pm

If you are not interested in linear recurrences, or are already familiar with the Cayley-Hamilton theorem, you can probably stop reading now.

Suppose we want to solve the following recurrence:
(T[0], T[1], T[2]) = (1, 2, 3), and
T[n] = T[n - 1] + 3 T[n - 2] + 8 T[n - 3], for n >= 3.

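The usual way to solve such a recurrence is via matrix exponentiation. We create a matrix M and a vector v of initial values:

    | 1  3  8 |        | T[2] |   | 3 |
M = | 1  0  0 |    v = | T[1] | = | 2 |
    | 0  1  0 |        | T[0] |   | 1 |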

To find T[n] we compute M^{n - 2} and take the dot product of the first row of M^{n - 2} with the vector v. This gives us an O (k^3 lg n) algorithm, assuming that the matrix is (k x k) and that we use the most elementary algorithm for matrix multiplication.
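For reference, here is a minimal Python sketch of this baseline. It assumes M = [[1, 3, 8], [1, 0, 0], [0, 1, 0]] (consistent with the determinant shown further down) and v = (T[2], T[1], T[0]) = (3, 2, 1). The names mat_mul, mat_pow and T_matrix are made up for this illustration.

```python
def mat_mul(A, B):
    """Schoolbook product of two square matrices, O(k^3)."""
    k = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(k)]
            for i in range(k)]


def mat_pow(M, e):
    """M**e by binary exponentiation, O(k^3 lg e)."""
    k = len(M)
    result = [[int(i == j) for j in range(k)] for i in range(k)]  # identity
    while e > 0:
        if e & 1:
            result = mat_mul(result, M)
        M = mat_mul(M, M)
        e >>= 1
    return result


def T_matrix(n):
    """T[n] for T[n] = T[n-1] + 3 T[n-2] + 8 T[n-3], (T[0], T[1], T[2]) = (1, 2, 3)."""
    init = [1, 2, 3]
    if n < 3:
        return init[n]
    M = [[1, 3, 8],   # assumed matrix for this recurrence
         [1, 0, 0],
         [0, 1, 0]]
    v = [3, 2, 1]     # assumed vector (T[2], T[1], T[0])
    P = mat_pow(M, n - 2)
    # Dot product of the first row of M^(n-2) with v.
    return sum(P[0][j] * v[j] for j in range(3))


print(T_matrix(6))  # prints 379
```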

Cayley-Hamilton Theorem: Every square matrix satisfies its own characteristic equation. In our example, the characteristic polynomial det (xI - M) is:

| (x - 1)   -3   -8 |
|   -1       x    0 |
|    0      -1    x |

= x^3 - x^2 - 3x - 8

Note that the characteristic polynomial looks very similar to the recurrence we want to solve. This is no coincidence: for the matrix M corresponding to a linear recurrence T[n] = c1 T[n - 1] + c2 T[n - 2] + … + ck T[n - k], the characteristic polynomial is x^k - c1 x^{k - 1} - c2 x^{k - 2} - … - ck. So we do not need to calculate the symbolic determinant of the matrix; we can read the polynomial off the recurrence.

Now, according to the Cayley-Hamilton theorem:
M³ - M² - 3M - 8I = 0, i.e.,
M³ = M² + 3M + 8I
Here I is the identity matrix.
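A quick numerical sanity check of this identity, as a small Python sketch. It assumes M = [[1, 3, 8], [1, 0, 0], [0, 1, 0]], which is the matrix whose determinant of (xI - M) is shown above; mat_mul is a helper written only for this check.

```python
def mat_mul(A, B):
    """Schoolbook product of two square matrices of the same size."""
    k = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(k)]
            for i in range(k)]


M = [[1, 3, 8],   # assumed matrix for the example recurrence
     [1, 0, 0],
     [0, 1, 0]]
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

M2 = mat_mul(M, M)
M3 = mat_mul(M2, M)

# Right-hand side of the identity, M^2 + 3M + 8I, entry by entry.
rhs = [[M2[i][j] + 3 * M[i][j] + 8 * I[i][j] for j in range(3)]
       for i in range(3)]

print(M3 == rhs)  # prints True
```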

Let’s calculate some higher powers of M.
M⁴ = M (M² + 3M + 8I) = M³ + 3M² + 8M
= (M² + 3M + 8I) + 3M² + 8M = 4M² + 11M + 8I

Similarly,
M⁵ = 15M² + 20M + 32I

One can observe that for any n, the matrix Mⁿ can be represented as a linear combination of M², M and I. Also, if we know this representation for Mⁿ, we can easily calculate the representation for Mⁿ⁺¹.

Mⁿ = aM² + bM + cI
==> Mⁿ⁺¹ = M (aM² + bM + cI) = aM³ + bM² + cM
= a (M² + 3M + 8I) + bM² + cM = (a + b) M² + (3a + c) M + 8aI

For a (k x k) matrix, the characteristic polynomial will be of degree k, and the above step can be performed in O (k) time, i.e., we can compute the representation of Mⁿ⁺¹ from Mⁿ in O (k) time.
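The step from Mⁿ to Mⁿ⁺¹ can be sketched in Python as follows. The conventions here are assumptions made for the illustration: a representation is a list rep with rep[i] the coefficient of M^i (so 4M² + 11M + 8I is [8, 11, 4]), and coeffs holds the recurrence coefficients, [1, 3, 8] in our example. The name times_M is hypothetical.

```python
def times_M(rep, coeffs):
    """Given rep for M^n, return rep for M^(n+1) in O(k) time.

    rep[i] is the coefficient of M^i, for i = 0 .. k-1.
    coeffs encodes M^k = coeffs[0] M^(k-1) + ... + coeffs[k-1] I.
    """
    k = len(coeffs)
    top = rep[k - 1]          # coefficient that would land on M^k
    out = [0] * k
    for i in range(k):
        shifted = rep[i - 1] if i > 0 else 0   # multiplying by M shifts up
        out[i] = shifted + top * coeffs[k - 1 - i]
    return out


# M^4 = 4M^2 + 11M + 8I  ->  M^5 = 15M^2 + 20M + 32I
print(times_M([8, 11, 4], [1, 3, 8]))  # prints [32, 20, 15]
```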

Now, let us see if we can calculate the representation of M²ⁿ from the representation of Mⁿ.

M²ⁿ = (aM² + bM + cI ) (aM² + bM + cI )
= a² M⁴ + 2ab M³ + (b² + 2ac) M² + 2bc M + c² I

We can replace M⁴ and M³ by the representations that we have already calculated. In general, to compute M²ⁿ from Mⁿ we multiply two polynomials of degree k - 1, and then replace M^{2k - 2}, M^{2k - 3}, …, M^k by their representations (these k - 1 representations can be computed once, each from the previous one in O (k) time). This can be done in O (k^2) time using elementary methods.
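A possible Python sketch of the squaring step. Instead of tabulating the representations of M^k, …, M^{2k - 2} in advance, this variant folds the highest power down first using M^k = coeffs[0] M^{k - 1} + … + coeffs[k - 1] I, which has the same O (k^2) cost. As an assumed convention, rep[i] is the coefficient of M^i and coeffs = [1, 3, 8] for our example; the name square is hypothetical.

```python
def square(rep, coeffs):
    """Given rep for M^n, return rep for M^(2n) in O(k^2) time."""
    k = len(coeffs)
    # Plain polynomial product, degree up to 2k - 2.
    prod = [0] * (2 * k - 1)
    for i, a in enumerate(rep):
        for j, b in enumerate(rep):
            prod[i + j] += a * b
    # Eliminate M^d for d = 2k-2 down to k:
    # M^d = coeffs[0] M^(d-1) + ... + coeffs[k-1] M^(d-k).
    for d in range(2 * k - 2, k - 1, -1):
        top = prod[d]
        prod[d] = 0
        for j in range(k):
            prod[d - 1 - j] += top * coeffs[j]
    return prod[:k]


# (M^2)^2 = M^4 = 4M^2 + 11M + 8I
print(square([0, 0, 1], [1, 3, 8]))  # prints [8, 11, 4]
```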

Now, we know how to compute the representation of:

Mⁿ⁺¹ from Mⁿ, and
M²ⁿ from Mⁿ

Both steps take at most O (k^2) time (the first one needs only O (k)). Since these are the only steps needed for the binary exponentiation algorithm, we can compute the representation of Mⁿ in O (k^2 lg n) time.
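Putting the two steps together, a minimal Python sketch of the binary exponentiation could look like this. It scans the bits of n from the most significant one, squaring at every bit and multiplying by M when the bit is 1. The conventions are assumptions for the illustration: rep[i] is the coefficient of M^i, coeffs = [1, 3, 8] for our example, and the function names are hypothetical.

```python
def times_M(rep, coeffs):
    """rep of M^n -> rep of M^(n+1), O(k)."""
    k = len(coeffs)
    top = rep[k - 1]
    return [(rep[i - 1] if i > 0 else 0) + top * coeffs[k - 1 - i]
            for i in range(k)]


def square(rep, coeffs):
    """rep of M^n -> rep of M^(2n), O(k^2)."""
    k = len(coeffs)
    prod = [0] * (2 * k - 1)
    for i, a in enumerate(rep):
        for j, b in enumerate(rep):
            prod[i + j] += a * b
    for d in range(2 * k - 2, k - 1, -1):   # fold M^d down, highest first
        top, prod[d] = prod[d], 0
        for j in range(k):
            prod[d - 1 - j] += top * coeffs[j]
    return prod[:k]


def power_rep(n, coeffs):
    """Representation of M^n as a combination of I, M, ..., M^(k-1)."""
    k = len(coeffs)
    rep = [1] + [0] * (k - 1)        # M^0 = I
    for bit in bin(n)[2:]:           # most significant bit first
        rep = square(rep, coeffs)
        if bit == "1":
            rep = times_M(rep, coeffs)
    return rep


# M^5 = 15M^2 + 20M + 32I
print(power_rep(5, [1, 3, 8]))  # prints [32, 20, 15]
```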

Now, to compute the actual matrix Mⁿ, we would still need to form the linear combination of the k matrices I, M, …, M^{k - 1} that appear in its representation. However, we do not need the whole matrix Mⁿ, only its first row, and given the first rows of these k matrices it can be computed in O (k^2) time.

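Computing the first rows of M², M³, …, M^{k - 1} is not difficult either, and can be done in O (k^2) time: each row is the previous one multiplied by M, which takes O (k) time because every column of M has at most two nonzero entries.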
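A short Python sketch of this step, assuming M has the recurrence coefficients in its first row and ones on the subdiagonal (for our example, first row (1, 3, 8)). The name first_rows is hypothetical.

```python
def first_rows(coeffs):
    """First rows of I, M, ..., M^(k-1), computed in O(k^2) total.

    Column j of M has at most two nonzero entries: coeffs[j] in row 0
    and a 1 in row j + 1, so (row * M)[j] = row[0] * coeffs[j] + row[j + 1].
    """
    k = len(coeffs)
    row = [1] + [0] * (k - 1)            # first row of I
    rows = [row]
    for _ in range(k - 1):
        head = row[0]
        row = [head * coeffs[j] + (row[j + 1] if j + 1 < k else 0)
               for j in range(k)]
        rows.append(row)
    return rows


print(first_rows([1, 3, 8]))  # prints [[1, 0, 0], [1, 3, 8], [4, 11, 8]]
```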

This gives us an O (k^2 lg n) algorithm to solve the linear recurrence, which is faster by a factor of k than the matrix exponentiation method. The gain will probably not matter much if k is small. However, for large k it is significant, e.g., Project Euler 258.
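To tie everything together, here is an end-to-end Python sketch of the whole method. It is only one way to arrange the steps, under these assumptions: the recurrence is T[n] = coeffs[0] T[n - 1] + … + coeffs[k - 1] T[n - k], init holds (T[0], …, T[k - 1]), and M has coeffs as its first row and ones on the subdiagonal. The names mul_mod and solve are hypothetical. Arithmetic is done with plain Python integers; no modulus is applied.

```python
def mul_mod(p, q, coeffs):
    """Product of two polynomials in M, reduced to degree < k using
    M^k = coeffs[0] M^(k-1) + ... + coeffs[k-1] I."""
    k = len(coeffs)
    prod = [0] * max(len(p) + len(q) - 1, k)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            prod[i + j] += a * b
    for d in range(len(prod) - 1, k - 1, -1):   # fold down, highest first
        top, prod[d] = prod[d], 0
        for j in range(k):
            prod[d - 1 - j] += top * coeffs[j]
    return prod[:k]


def solve(coeffs, init, n):
    """Return T[n] in O(k^2 lg n) arithmetic operations."""
    k = len(coeffs)
    if n < k:
        return init[n]
    # Representation of M^(n-k+1); rep[i] is the coefficient of M^i.
    rep = [1]
    for bit in bin(n - k + 1)[2:]:
        rep = mul_mod(rep, rep, coeffs)          # M^e -> M^(2e)
        if bit == "1":
            rep = mul_mod(rep, [0, 1], coeffs)   # M^e -> M^(e+1)
    v = init[::-1]                               # (T[k-1], ..., T[0])
    row = [1] + [0] * (k - 1)                    # first row of I
    total = 0
    for i in range(k):
        # Contribution of rep[i] * (first row of M^i) . v
        total += rep[i] * sum(r * x for r, x in zip(row, v))
        head = row[0]
        row = [head * coeffs[j] + (row[j + 1] if j + 1 < k else 0)
               for j in range(k)]                # first row of M^(i+1)
    return total


print(solve([1, 3, 8], [1, 2, 3], 6))  # prints 379
```

For the example recurrence, solve([1, 3, 8], [1, 2, 3], n) agrees with direct iteration of the recurrence for small n, which is an easy way to test such code before using it with large k.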

shivam217 · August 14, 2014, 4:46pm

Nice explanation !!!

kuruma · August 14, 2014, 5:14pm

What an amazing explanation indeed

Thank you so, so much @djdolls

This has given me motivation to re-try project euler 258

Thanks!

Also, if I have any troubles, would you mind if I asked you for some pointers?

Best,

Bruno

djdolls · August 14, 2014, 6:05pm

Thanks!

Feel free to ask if something is not clear, I will be happy to help.

gkcs · August 14, 2014, 8:17pm

Wow! I always wanted to know a way to calculate recurrence relations fast.

Thanks so much.

mightymercado · October 21, 2016, 10:40pm

Thanks for this well-explained editorial!

epsilon_0 · November 2, 2016, 9:37pm

Can you please explain how to get the first rows of M^2, …, M^{k - 1} in O(k^2)?

Thanks a lot.

epsilon_0 · November 5, 2016, 8:14am

I got it. We can just multiply the top row of M^i with M in O(k) time instead of O(k^2), as each column of M has at most 2 nonzero entries.

mugurelionut · June 26, 2017, 1:38am

Thanks for this very good explanation! I just used this a couple of days ago to solve the hard problem in Hackerearth’s June’17 Circuits contest.

prakhariitd · June 26, 2017, 4:05am

How did you find this?