a favourite proof of mine

Rachel Lambda Samuelsson 2023-03-06 18:33:28 +01:00
parent 4aa9cd1849
commit f151e9404e

@@ -5,7 +5,7 @@ title: "A favorite proof of mine"
{% katexmm %}
There are a lot of proofs in mathematics. Many of them serve to verify what we know intuitively know to be true, some of them shed light on new methods, and some reveal new ways to view old ideas. There are proofs which leave us with a headache, some which leave us bored, and some which leave us with wonder and awe. In this blog post I will share a beautiful proof leading to a closed formula for the $n$-th Fibonacci number, taking us on a detour into functional programming and linear algebra.
There are a lot of proofs in mathematics. Many of them serve to verify what we intuitively know to be true, some of them shed light on new methods, and some reveal new ways to view old ideas. Some proofs leave us with a headache, some leave us bored, and some leave us with wonder. In this blog post I will share a beautiful proof leading to a closed formula for the $n$-th Fibonacci number, taking us on a detour into functional programming and linear algebra.
{% endkatexmm %}
@@ -15,7 +15,7 @@ There are a lot of proofs in mathematics. Many of them serve to verify what we k
# The Fibonacci Numbers
The Fibonacci numbers are a sequence of numbers starting with two ones where each number is the sum of the last two. That is $0, 1, 1, 2, 3, 5, 8, 13 \dots$. If we wanted to be more precise we could define a sequence $\{f\}_{n=0}^{\infty}$ by the following recursive definition:
The Fibonacci numbers are a sequence of numbers starting with a zero followed by a one, where each subsequent number is the sum of the previous two: $0, 1, 1, 2, 3, 5, 8, 13, \dots$. If we wanted to be more precise we could define a mathematical sequence $\{f_n\}_{n=0}^{\infty}$ by the following recursive definition:
$$
f_n = \begin{cases}
@@ -31,15 +31,15 @@ Indeed, if we open [Chapter 5: Recursion](https://learnyouahaskell.github.io/rec
> Definitions in mathematics are often given recursively. For instance, the Fibonacci sequence is defined recursively.
Likewise, in [Chapter 1.2.2 Tree Recursion](https://mitp-content-server.mit.edu/books/content/sectbyfn/books_pres_0/6515/sicp.zip/full-text/book/book-Z-H-4.html#%_toc_%_sec_1.2.2) of [SICP](https://mitp-content-server.mit.edu/books/content/sectbyfn/books_pres_0/6515/sicp.zip/index.html) we are yet again greeted by an old friend
Likewise, in [Chapter 1.2.2 Tree Recursion](https://mitp-content-server.mit.edu/books/content/sectbyfn/books_pres_0/6515/sicp.zip/full-text/book/book-Z-H-4.html#%_toc_%_sec_1.2.2) of [SICP](https://mitp-content-server.mit.edu/books/content/sectbyfn/books_pres_0/6515/sicp.zip/index.html) we are yet again greeted by a familiar face.
> Another common pattern of computation is called tree recursion. As an example, consider computing the sequence of Fibonacci numbers
With this in mind, it might come as a surprise that there is a closed, non-recursive, formula for the $n$-th Fibonacci number. Perhaps more surprising is that we will discover this formula by using the ideas presented in the above chapter of SICP.
# Programatically calculating the $n$-th Fibonacci number
# Programmatically calculating the $n$-th Fibonacci number
A naive way of calculating the $n$-th Fibonacci number is to use the definition above. Check if $n = 0$, if $n = 1$, and otherwise calculating $f_{n-2}$ and $f_{n-1}$. However, unless $n$ is $0$ or $1$. Programatically this corresponds to the following Haskell code:
A naive way of calculating the $n$-th Fibonacci number is to use the definition above: check if $n = 0$ or $n = 1$, and otherwise calculate $f_{n-2}$ and $f_{n-1}$ and add them. This corresponds to the following Haskell code:
```
fib :: Integer -> Integer
fib 0 = 0
@@ -49,7 +49,7 @@ fib n = fib (n-2) + fib (n-1)
However, there is an issue with this method: many Fibonacci numbers will be calculated numerous times, since each evaluation splits into two paths, one for the previous Fibonacci number and one for the number before that. A reader who prefers visuals might appreciate Figure 1.5 from the SICP chapter.
How might we fix this then? A human calculating the $n$-th Fibonacci number might construct a list of Fibonacci numbers, calculating each Fibonacci number only once. While it is possible to do this on the computer it is actually superfluous to carry all previous numbers, as we only need the previous two in order to calculate the next one. We might think of this as a window, moving along the Fibonacci numbers, taking $n$ steps to arrive at $f_n$. In code we could represent this as follows:
How might we fix this then? A human calculating the $n$-th Fibonacci number might construct a list of Fibonacci numbers, calculating each Fibonacci number only once. While it is possible to do this on the computer it is superfluous to carry around all previous numbers, as we only need the previous two to calculate the next one. We might think of this as a 2-slot window, moving along the Fibonacci numbers, taking $n$ steps to arrive at $f_n$. In code we could represent this as follows:
```
-- steps -> f n-2 -> f n-1 -> f n
window :: Integer -> Integer -> Integer -> Integer
@@ -60,11 +60,11 @@ fib :: Integer -> Integer
fib n = window n 0 1
```
In each step we move the 2-slot window by replacing the first slot in the window by what was previously in the second, and filling the new second slot of the window by the sum of the previous two slots. This is then repeated $n$ times, and then the first slot of the window is returned.
In each step we move the window by replacing the first slot in the window by what was previously in the second slot and filling the new second slot of the window with the sum of the previous two slots. This is then repeated $n$ times, and then the first slot of the window is returned.
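The body of `window` is not visible in this diff, so the following is only a sketch of how the description above could be written out in full. The two equations for `window` are an assumption based on that description, and the sketch assumes $n \geq 0$ (a negative argument would never reach the base case).

```haskell
-- Hypothetical completion: the actual body of `window` is elided in this diff.
-- window steps a b slides the 2-slot window (a, b) forward `steps` times
-- and returns the first slot.
window :: Integer -> Integer -> Integer -> Integer
window 0 a _ = a                        -- no steps left: return the first slot
window n a b = window (n - 1) b (a + b) -- shift: second slot moves to first, new second slot is the sum

fib :: Integer -> Integer
fib n = window n 0 1
```

With these definitions, `map fib [0..7]` evaluates to `[0,1,1,2,3,5,8,13]`, matching the sequence listed at the start of the post.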
# Mathematically calculating the $n$-th Fibonacci number
What does this have to do with mathematics, and this beautiful proof which I have promised? We shall begin to translate this moving window into the language of mathematics, our window is a pair of numbers, so why not represent is as a vector. Furthermore, we may view sliding our window one step as a function $S$ from vectors to vectors. This poses an interesting question, is this function a linear transformation?
What does this have to do with mathematics, and with the beautiful proof that I have promised? We shall begin by translating this moving window into the language of mathematics. Our window is a pair of numbers, so why not represent it as a vector? Furthermore, we may view sliding our window one step as a function $S$ from vectors to vectors. This poses an interesting question: is this function a linear transformation?
$$ S\left(\begin{bmatrix} a \\ b \end{bmatrix}\right) + S\left(\begin{bmatrix} c \\ d \end{bmatrix}\right) = \begin{bmatrix} b \\ a + b \end{bmatrix} + \begin{bmatrix} d \\ c + d \end{bmatrix} $$
@@ -78,11 +78,11 @@ It is! This is great news as it means we can represent our step function by a ma
$$ S = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} $$
Then to calculate the $n$-th Fibonacci number we take the starting window $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$, multiply it by the $S^n$, and then look at the first entry of the matrix. We now have an analogue of our sliding window expressed entirely in the language of linear algebra, which will let us apply the tools of linear algebra.
Then to calculate the $n$-th Fibonacci number we take the starting window $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ and multiply it by $S^n$. Now that the sliding window has been expressed purely in the language of linear algebra we may apply the tools of linear algebra.
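As a small sanity check, here are the first few powers of $S$ applied to the starting window, computed by hand from the matrix above. The first entry of each result is $f_n$ and the second is $f_{n+1}$:

$$ S\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix},\; S^2\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix},\; S^3\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \end{bmatrix},\; S^4\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 5 \end{bmatrix} $$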
# Applying the tools of linear algebra
If you're familiar with linear algebra there might be a part of your brain yelling "diagonalization" right now. We've translated our problem into linear algebra, but even for a small matrix calculating $S^n$ can become costly for high $n$, diagonalization is a technique in which we express a matrix in a base where all base vectors are eigenvectors of the original matrix. The benefit of doing this is that it turns exponentiation of matrices, which is hard to calculate into exponentiation of scalars, which is much easier to calculate.
If you're familiar with linear algebra there might be a part of your brain yelling "diagonalization" right now. We've translated our problem into linear algebra, but even for a small matrix calculating $S^n$ can become costly for large $n$. Diagonalization is a technique in which we express a matrix in a basis where all basis vectors are eigenvectors of the original matrix. The benefit of doing this is that it turns exponentiation of matrices, which is hard to calculate, into exponentiation of scalars, which is much easier to calculate.
An eigenvector for our matrix $S$ is a vector $\hat v$ for which $S \hat v = \lambda \hat v$ for some scalar $\lambda$, which we call an eigenvalue. If there are any such vectors we can find them using their definition.
@@ -102,9 +102,9 @@ Solving for $\lambda$ yields two eigenvalues:
$$ \lambda_0 = \frac{1 - \sqrt 5}{2} ,\; \lambda_1 = \frac{1 + \sqrt 5}{2}$$
Would you look at that, the golden ratio! Some of you might already know that the golden ratio is connected to the Fibonacci numbers, in fact, as you get further and further into the sequence of the Fibonacci numbers the ratio $\frac{f_{n+1}}{f_n}$ approaches $\frac{1 + \sqrt 5}{2}$.
Would you look at that, $\frac{1 + \sqrt 5}{2}$, the golden ratio! Some of you might already know that the golden ratio is connected to the Fibonacci numbers. In fact, as you get further and further into the sequence, the ratio $\frac{f_{n+1}}{f_n}$ approaches $\frac{1 + \sqrt 5}{2}$.
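To get a feel for this convergence, here are a few consecutive ratios worked out by hand and rounded to three decimals, next to the golden ratio itself:

$$ \frac{f_7}{f_6} = \frac{13}{8} = 1.625,\quad \frac{f_8}{f_7} = \frac{21}{13} \approx 1.615,\quad \frac{f_9}{f_8} = \frac{34}{21} \approx 1.619,\quad \frac{1 + \sqrt 5}{2} \approx 1.618 $$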
Now we can solve $(S-\lambda I_2) \hat v = 0$ for $\lambda_0$ and $\lambda_1$, and if the two resulting vectors are linearly independent we may use them as the basis of our diagonalization matrix. Gauss elimination then yields:
Now we can solve $(S-\lambda I_2) \hat v = 0$ for $\lambda_0$ and $\lambda_1$, and if the two resulting vectors are linearly independent we may use them as the basis of our diagonalization matrix. Gaussian elimination yields:
$$ \hat v_0 = \begin{bmatrix} -2 \\ \sqrt 5 - 1 \end{bmatrix},\; \hat v_1 = \begin{bmatrix} 2 \\ \sqrt 5 + 1 \end{bmatrix} $$
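We can check one of these against the definition $S \hat v = \lambda \hat v$ by multiplying out both sides. For $\hat v_1$ and $\lambda_1$, using $(1 + \sqrt 5)^2 = 6 + 2\sqrt 5$:

$$ S \hat v_1 = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ \sqrt 5 + 1 \end{bmatrix} = \begin{bmatrix} \sqrt 5 + 1 \\ \sqrt 5 + 3 \end{bmatrix},\quad \lambda_1 \hat v_1 = \frac{1 + \sqrt 5}{2} \begin{bmatrix} 2 \\ \sqrt 5 + 1 \end{bmatrix} = \begin{bmatrix} \sqrt 5 + 1 \\ \sqrt 5 + 3 \end{bmatrix} $$

The same calculation for $\hat v_0$ and $\lambda_0$ gives $\begin{bmatrix} \sqrt 5 - 1 \\ \sqrt 5 - 3 \end{bmatrix}$ on both sides.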
@@ -121,12 +121,12 @@ Which is very nice since
$$D^n = \begin{bmatrix} \frac{1 - \sqrt 5}{2} & 0 \\ 0 & \frac{1 + \sqrt 5}{2} \end{bmatrix}^n = \begin{bmatrix} \left(\frac{1 - \sqrt 5}{2}\right)^n & 0 \\ 0 & \left(\frac{1 + \sqrt 5}{2}\right)^n \end{bmatrix}$$
After calculating $B^{-1}$ we can solve $\begin{bmatrix} f_{n+1} \\ f_n \end{bmatrix} = BD^nB^{-1}$ for $f_n$ in order to get our closed expression.
After calculating $B^{-1}$ we can solve $\begin{bmatrix} f_n \\ f_{n+1} \end{bmatrix} = BD^nB^{-1}\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ for $f_n$ to get our closed expression.
$$ f_n = \frac{1}{\sqrt 5} \left(\left(\frac{1 + \sqrt 5}{2}\right)^n - \left(\frac{1 - \sqrt 5}{2}\right)^n\right) $$
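For the sceptical reader, here is a rough numerical check of the formula in Haskell. It is a sketch and not part of the proof: the names `fibClosed`, `fibExact` and `agreeUpTo` are made up for this illustration, and the formula is evaluated with `Double`, so rounding is needed and the result can only be trusted for moderately small $n$.

```haskell
-- Illustrative check only: evaluates the closed formula in floating point.
fibClosed :: Int -> Integer
fibClosed n = round ((phi ^ n - psi ^ n) / sqrt 5)
  where
    phi = (1 + sqrt 5) / 2 :: Double -- the eigenvalue called lambda_1 above
    psi = (1 - sqrt 5) / 2 :: Double -- the eigenvalue called lambda_0 above

-- Exact reference values, computed by sliding a pair (f_n, f_{n+1}) n times.
fibExact :: Int -> Integer
fibExact n = fst (iterate (\(a, b) -> (b, a + b)) (0, 1) !! n)

-- True when both functions agree on every index from 0 up to the limit.
agreeUpTo :: Int -> Bool
agreeUpTo limit = all (\n -> fibClosed n == fibExact n) [0 .. limit]
```

Evaluating `agreeUpTo 30` in GHCi compares the two on the first 31 Fibonacci numbers. For much larger $n$ the limited precision of `Double` will eventually make `fibClosed` drift away from the exact value, which is a limitation of this sketch and not of the formula.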
# Final thoughts
Whenever I first happened upon the closed formula for the $n$-th Fibonacci number it seemed so shockingly random, a formula with bunch of square roots always giving me an recursively specified integer. After I learned this proof it doesn't feel as random anymore, instead, I feel it would be more surprising if we carried out the whole diagonalization process and ended up with no roots. Perhaps more importantly, it opened my eyes to the usage of linear algebra as a powerful mathematical tool, and not just an application for geometry, flow balancing or computer graphics.
When I first happened upon the closed formula for the $n$-th Fibonacci number it seemed so shockingly random: a formula with a bunch of square roots that always gives a recursively specified integer. Now that I have learned this proof it doesn't feel as random anymore. Instead, I feel it would be more surprising if we carried out the whole diagonalization process and ended up with no roots. Perhaps more importantly, it opened my eyes to the usage of linear algebra as a powerful mathematical tool, and not just something to be applied in geometry, flow balancing or computer graphics.
{% endkatexmm %}