got most of the content done for new post, proof reading remains

Rachel Lambda Samuelsson 2023-03-05 17:12:55 +01:00
parent 356736df29
commit 7a393484ce


@ -1,16 +1,21 @@
---
layout: post
title: "A favorite proof of mine"
katex: True
---
{% katexmm %}
There are a lot of proofs in mathematics. Many of them serve to verify what we intuitively know to be true, some of them shed light on new methods, and some reveal new ways to view old ideas. There are proofs which leave us with a headache, some which leave us bored, and some which leave us with wonder and awe. In this blog post I will share a beautiful proof leading to a closed formula for the $n$-th Fibonacci number, taking us on a detour into functional programming and linear algebra.
{% endkatexmm %}
<!--more-->
{% katexmm %}
# The Fibonacci Numbers
The Fibonacci numbers are a sequence of numbers starting with $0$ and $1$, where each subsequent number is the sum of the previous two. That is $0, 1, 1, 2, 3, 5, 8, 13, \dots$. If we wanted to be more precise we could define a sequence $\{f_n\}_{n=0}^{\infty}$ by the following recursive definition:
$$
f_n = \begin{cases}
@ -24,17 +29,17 @@ The Fibonacci numbers have become a bit of a poster child for recursive definiti
Indeed, if we open [Chapter 5: Recursion](https://learnyouahaskell.github.io/recursion.html) of [LYAH](https://learnyouahaskell.github.io/) we are greeted with the following.
> Definitions in mathematics are often given recursively. For instance, the Fibonacci sequence is defined recursively.
Likewise, in [Chapter 1.2.2 Tree Recursion](https://mitp-content-server.mit.edu/books/content/sectbyfn/books_pres_0/6515/sicp.zip/full-text/book/book-Z-H-4.html#%_toc_%_sec_1.2.2) of [SICP](https://mitp-content-server.mit.edu/books/content/sectbyfn/books_pres_0/6515/sicp.zip/index.html) we are yet again greeted by an old friend.
> Another common pattern of computation is called tree recursion. As an example, consider computing the sequence of Fibonacci numbers
With this in mind, it might come as a surprise that there is a closed, non-recursive formula for the $n$-th Fibonacci number. Perhaps more surprising is that we will discover this formula by using the ideas presented in the above chapter of SICP.
# Programmatically calculating the $n$-th Fibonacci number
A naive way of calculating the $n$-th Fibonacci number is to use the definition above. We check whether $n = 0$ or $n = 1$, and otherwise calculate $f_{n-2}$ and $f_{n-1}$ and add them together. Programmatically this corresponds to the following Haskell code:
```
fib :: Integer -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n-2) + fib (n-1)
```
However, there is an issue with this method: many Fibonacci numbers will be calculated numerous times, since each call with $n > 1$ splits into two paths, evaluating both the previous and the twice-previous Fibonacci number. Readers who prefer visuals might appreciate Figure 1.5 from the SICP chapter.
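To get a feeling for how much work is repeated, here is a small sketch that counts how many times `fib` would be invoked for a given $n$. The name `calls` is not part of the original code; it is a hypothetical helper introduced only for illustration, and it assumes $n \geq 0$.

```haskell
-- Hypothetical helper: the number of invocations of the naive `fib`
-- needed to evaluate `fib n` (assumes n >= 0).
calls :: Integer -> Integer
calls 0 = 1
calls 1 = 1
calls n = 1 + calls (n-2) + calls (n-1)
```

For instance, `calls 10` evaluates to `177`, even though only eleven distinct arguments ($0$ through $10$) are ever involved.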
How might we fix this then? A human calculating the $n$-th Fibonacci number might construct a list of Fibonacci numbers, calculating each Fibonacci number only once. While it is possible to do this on a computer, it is actually superfluous to carry all previous numbers, as we only need the previous two in order to calculate the next one. We might think of this as a window, moving along the Fibonacci numbers, taking $n$ steps to arrive at $f_n$. In code we could represent this as follows:
```
-- steps -> f n-2 -> f n-1 -> f n
window :: Integer -> Integer -> Integer -> Integer
@ -61,11 +66,67 @@ In each step we move the 2-slot window by replacing the first slot in the window
What does this have to do with mathematics, and with this beautiful proof which I have promised? Let us begin to translate this moving window into the language of mathematics. Our window is a pair of numbers, so why not represent it as a vector? Furthermore, we may view sliding our window one step as a function $S$ from vectors to vectors. This poses an interesting question: is this function a linear transformation?
$$ S\left(\begin{bmatrix} a \\ b \end{bmatrix}\right) + S\left(\begin{bmatrix} c \\ d \end{bmatrix}\right) = \begin{bmatrix} b \\ a + b \end{bmatrix} + \begin{bmatrix} d \\ c + d \end{bmatrix} $$
$$ = \begin{bmatrix} b + d \\ a + b + c + d \end{bmatrix} = $$
$$ S\left(\begin{bmatrix} a + c \\ b + d \end{bmatrix}\right) = S\left(\begin{bmatrix} a \\ b \end{bmatrix} + \begin{bmatrix} c \\ d \end{bmatrix}\right)$$
$$ S\left(\alpha \begin{bmatrix} a \\ b \end{bmatrix}\right) = \begin{bmatrix} \alpha b \\ \alpha (a + b) \end{bmatrix} = \alpha \begin{bmatrix} b \\ a + b \end{bmatrix} = \alpha S\left(\begin{bmatrix} a \\ b \end{bmatrix}\right) $$
It is! This is great news as it means we can represent our step function by a matrix. With some basic linear algebra one can deduce that
$$ S = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} $$
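One way to carry out this deduction, sketched here, is to note that the columns of the matrix are the images of the standard basis vectors under $S$:
$$ S\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) = \begin{bmatrix} 0 \\ 1 \end{bmatrix},\; S\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} 1 \\ 1 \end{bmatrix} $$
As a check, multiplying the matrix by a general window gives back exactly one step of the sliding window:
$$ \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} b \\ a + b \end{bmatrix} $$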
Then to calculate the $n$-th Fibonacci number we take the starting window $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$, multiply it by $S^n$, and then look at the first entry of the resulting vector. We now have an analogue of our sliding window expressed entirely in the language of linear algebra, which will let us apply the tools of linear algebra.
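The matrix version can be sketched in Haskell as well. This is only an illustration under some assumptions: the type `M2` and the functions `mul`, `power`, `apply` and `fibMatrix` are names made up here, and `power` uses plain repeated multiplication for $n \geq 0$ rather than anything clever.

```haskell
-- A 2x2 matrix stored row by row: M2 a b c d stands for [[a, b], [c, d]].
data M2 = M2 Integer Integer Integer Integer
  deriving (Eq, Show)

-- Ordinary matrix multiplication for 2x2 matrices.
mul :: M2 -> M2 -> M2
mul (M2 a b c d) (M2 e f g h) =
  M2 (a*e + b*g) (a*f + b*h)
     (c*e + d*g) (c*f + d*h)

identity :: M2
identity = M2 1 0 0 1

-- The step matrix S from the text.
stepMatrix :: M2
stepMatrix = M2 0 1 1 1

-- m^n by n-fold multiplication (intended for n >= 0).
power :: M2 -> Integer -> M2
power m n = foldr mul identity (replicate (fromInteger n) m)

-- Apply a matrix to the column vector (x, y).
apply :: M2 -> (Integer, Integer) -> (Integer, Integer)
apply (M2 a b c d) (x, y) = (a*x + b*y, c*x + d*y)

-- Multiply the starting window (0, 1) by S^n and read off the first entry.
fibMatrix :: Integer -> Integer
fibMatrix n = fst (apply (power stepMatrix n) (0, 1))
```

With these definitions, `map fibMatrix [0..10]` gives `[0,1,1,2,3,5,8,13,21,34,55]`.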
# Applying the tools of linear algebra
If you're familiar with linear algebra there might be a part of your brain yelling "diagonalization" right now. We've translated our problem into linear algebra, but even for a small matrix, calculating $S^n$ can become costly for large $n$. Diagonalization is a technique in which we express a matrix in a basis where all basis vectors are eigenvectors of the original matrix. The benefit of doing this is that it turns exponentiation of matrices, which is hard to calculate, into exponentiation of scalars, which is much easier to calculate.
An eigenvector for our matrix $S$ is a non-zero vector $\hat v$ for which $S \hat v = \lambda \hat v$ for some scalar $\lambda$, which we call an eigenvalue. If there are any such vectors we can find them using their definition.
$$ S \hat v = \lambda \hat v = \lambda I_2 \hat v $$
Subtracting $\lambda I_2 \hat v$ from both sides yields:
$$ 0 = S \hat v - \lambda I_2 \hat v = (S-\lambda I_2) \hat v $$
An equation of the form $0 = A \hat u$ will only have non-trivial solutions if the column vectors of $A$ are linearly dependent, that is if $\textrm{det}(A) = 0$. Thus we can find all scalars $\lambda$ for which there are non-trivial vector solutions by solving $\textrm{det}(S-\lambda I_2) = 0$. Because of this property the polynomial $\textrm{det}(A-\lambda I)$ is called the characteristic polynomial of $A$.
In our case we have the following:
$$ \textrm{det}(S-\lambda I_2) = \begin{vmatrix} - \lambda & 1 \\ 1 & 1 - \lambda \end{vmatrix} = \lambda^2 - \lambda - 1 = 0$$
Solving for $\lambda$ yields two eigenvalues:
$$ \lambda_0 = \frac{1 - \sqrt 5}{2} ,\; \lambda_1 = \frac{1 + \sqrt 5}{2}$$
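For completeness, these values can be read off from the quadratic formula applied to $\lambda^2 - \lambda - 1 = 0$:
$$ \lambda = \frac{1 \pm \sqrt{(-1)^2 - 4 \cdot 1 \cdot (-1)}}{2} = \frac{1 \pm \sqrt 5}{2} $$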
Would you look at that, $\lambda_1$ is the golden ratio! Some of you might already know that the golden ratio is connected to the Fibonacci numbers. In fact, as you get further and further into the sequence of the Fibonacci numbers, the ratio $\frac{f_{n+1}}{f_n}$ approaches $\frac{1 + \sqrt 5}{2}$.
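This convergence is easy to observe numerically. The following is a rough sketch using floating point numbers; the names `fibs` and `ratios` are chosen here for illustration and do not appear elsewhere in the post.

```haskell
-- The Fibonacci numbers as an infinite lazy list: 0, 1, 1, 2, 3, ...
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (drop 1 fibs)

-- The ratios f_{n+1} / f_n for n = 1, 2, 3, ...
-- (n = 0 is skipped to avoid dividing by zero).
ratios :: [Double]
ratios = zipWith ratio (drop 1 fibs) (drop 2 fibs)
  where
    ratio a b = fromInteger b / fromInteger a
```

The list starts out as `1.0`, `2.0`, `1.5`, and later entries settle ever closer to `(1 + sqrt 5) / 2`.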
Now we can solve $(S-\lambda I_2) \hat v = 0$ for $\lambda_0$ and $\lambda_1$, and if the two resulting vectors are linearly independent we may use them as the columns of our change-of-basis matrix. Gaussian elimination then yields:
$$ \hat v_0 = \begin{bmatrix} -2 \\ \sqrt 5 - 1 \end{bmatrix},\; \hat v_1 = \begin{bmatrix} 2 \\ \sqrt 5 + 1 \end{bmatrix} $$
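As a sanity check, we can verify directly that $\hat v_1$ satisfies the eigenvector equation for $\lambda_1$ (the check for $\hat v_0$ and $\lambda_0$ is analogous):
$$ S \hat v_1 = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ \sqrt 5 + 1 \end{bmatrix} = \begin{bmatrix} \sqrt 5 + 1 \\ \sqrt 5 + 3 \end{bmatrix} $$
$$ \lambda_1 \hat v_1 = \frac{1 + \sqrt 5}{2} \begin{bmatrix} 2 \\ \sqrt 5 + 1 \end{bmatrix} = \begin{bmatrix} 1 + \sqrt 5 \\ \frac{6 + 2 \sqrt 5}{2} \end{bmatrix} = \begin{bmatrix} \sqrt 5 + 1 \\ \sqrt 5 + 3 \end{bmatrix} $$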
These vectors are indeed linearly independent, and we can use them as basis vectors for our diagonalization. We can now write $S = BDB^{-1}$ where
$$ B = \begin{bmatrix} -2 & 2 \\ \sqrt 5 - 1 & \sqrt 5 + 1 \end{bmatrix}$$
$$ D = \begin{bmatrix} \frac{1 - \sqrt 5}{2} & 0 \\ 0 & \frac{1 + \sqrt 5}{2} \end{bmatrix}$$
We then have that
$$S^n = (BDB^{-1})^n = \underbrace{BDB^{-1} BDB^{-1} \dots BDB^{-1}}_n $$
$$ = \underbrace{BDIDI \dots DB^{-1}}_n = BD^nB^{-1}$$
Which is very nice since
$$D^n = \begin{bmatrix} \frac{1 - \sqrt 5}{2} & 0 \\ 0 & \frac{1 + \sqrt 5}{2} \end{bmatrix}^n = \begin{bmatrix} \left(\frac{1 - \sqrt 5}{2}\right)^n & 0 \\ 0 & \left(\frac{1 + \sqrt 5}{2}\right)^n \end{bmatrix}$$
After calculating $B^{-1}$ we can solve $\begin{bmatrix} f_n \\ f_{n+1} \end{bmatrix} = BD^nB^{-1} \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ for $f_n$ in order to get our closed expression.
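For readers who want the intermediate steps, here is one way to carry out the calculation, applying $BD^nB^{-1}$ to the starting window $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ and writing $\lambda_0$ and $\lambda_1$ for the two eigenvalues. First the inverse, using the usual formula for a $2 \times 2$ matrix:
$$ \textrm{det}(B) = -2(\sqrt 5 + 1) - 2(\sqrt 5 - 1) = -4 \sqrt 5 $$
$$ B^{-1} = \frac{1}{-4 \sqrt 5} \begin{bmatrix} \sqrt 5 + 1 & -2 \\ -(\sqrt 5 - 1) & -2 \end{bmatrix} $$
Then we apply the three matrices to the starting window from right to left:
$$ B^{-1} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \frac{1}{-4 \sqrt 5} \begin{bmatrix} -2 \\ -2 \end{bmatrix} = \frac{1}{2 \sqrt 5} \begin{bmatrix} 1 \\ 1 \end{bmatrix} $$
$$ D^n B^{-1} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \frac{1}{2 \sqrt 5} \begin{bmatrix} \lambda_0^n \\ \lambda_1^n \end{bmatrix} $$
$$ B D^n B^{-1} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \frac{1}{2 \sqrt 5} \begin{bmatrix} -2 \lambda_0^n + 2 \lambda_1^n \\ (\sqrt 5 - 1) \lambda_0^n + (\sqrt 5 + 1) \lambda_1^n \end{bmatrix} $$
Reading off the first entry and simplifying gives the closed expression.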
$$ f_n = \frac{1}{\sqrt 5} \left(\left(\frac{1 + \sqrt 5}{2}\right)^n - \left(\frac{1 - \sqrt 5}{2}\right)^n\right) $$
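As a quick numerical check, the closed formula can be compared against the recursive definition. This is a sketch only: `binet`, `fibs` and `agrees` are names introduced here, the arithmetic is done with `Double`, and the result is rounded to the nearest integer, so it can only be trusted for moderately small $n \geq 0$ before floating point error takes over.

```haskell
-- The closed formula, evaluated in floating point and rounded
-- to the nearest integer (assumes a small n >= 0).
binet :: Int -> Integer
binet n = round ((phi ^ n - psi ^ n) / sqrt 5)
  where
    phi = (1 + sqrt 5) / 2 :: Double
    psi = (1 - sqrt 5) / 2 :: Double

-- The Fibonacci numbers as an infinite lazy list, for comparison.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (drop 1 fibs)

-- Does the closed formula agree with the recursive definition at n?
agrees :: Int -> Bool
agrees n = binet n == fibs !! n
```

For example, `all agrees [0..50]` evaluates to `True`.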
# Final thoughts
When I first happened upon the closed formula for the $n$-th Fibonacci number it seemed so shockingly random: a formula with a bunch of square roots that always gives a recursively specified integer. After I learned this proof it doesn't feel as random anymore. Instead, I feel it would be more surprising if we carried out the whole diagonalization process and ended up with no roots. Perhaps more importantly, it opened my eyes to the usage of linear algebra as a powerful mathematical tool, and not just an application for geometry, flow balancing or computer graphics.
{% endkatexmm %}