switch to kramdown katex engine for markdown files

master
Rachel Lambda Samuelsson 2023-12-09 15:05:40 +01:00
parent a9b9444fc9
commit 5b0723ba31
10 changed files with 131 additions and 115 deletions

View File

@@ -21,6 +21,11 @@ katex:
rendering_options:
throw_error: true
kramdown:
math_engine: katex
math_engine_opts:
throw_on_error: true
sass:
sourcemap: always
sass_dir: assets/css

View File

@@ -1,26 +1,25 @@
---
layout: post
title: "The algebra in algebraic datatypes"
katex: True
---
In this post I will explore the algebraic properties of Haskell's types, give an informal description of category theory, F-algebras, homomorphisms, and catamorphisms, and present a roundabout way to write a `foldr` that recurses backwards.
<!--more-->
{% katexmm %}
# What do types have to do with algebra?
The first step to modeling Haskell's datatypes algebraically is to build an intuition for why such a thing should be possible in the first place. In order to motivate this we shall give an interpretation of some Haskell types into sums, products, and exponents on numbers.
We shall make the following interpretation:
* `Either a b`, the disjoint sum type, is interpreted as $a + b$.
* `Either a b`, the disjoint sum type, is interpreted as $$a + b$$.
* `(a, b)`, the product type, is interpreted as $a \cdot b$.
* `(a, b)`, the product type, is interpreted as $$a \cdot b$$.
* `a -> b`, the function type, is interpreted as $b^a$.
* `a -> b`, the function type, is interpreted as $$b^a$$.
We will motivate this interpretation by considering the cardinality, or the number of elements, of a type. For example, `()` has cardinality 1, and `Bool` has cardinality 2. Now then, what is the cardinality of `Either a b`, given that `a` and `b` have cardinalities $n$ and $m$? Well, there are two ways of constructing a value of `Either a b`: `Left :: a -> Either a b` or `Right :: b -> Either a b`. We can construct $n$ elements using `Left`, since we have $n$ elements of `a`, and $m$ elements using `Right`, as there are $m$ elements of `b`. In total, then, we can construct $m + n$ elements of `Either a b`. Likewise, in the case of `(a, b)` we have $n$ choices for the first element, and then for each element of type `a` another $m$ choices for the element of type `b`, yielding $n \cdot m$ elements in total. The function type being exponentiation might seem more arcane, but by defining equality of functions and using some basic combinatorics we can make perfect sense of it. If we consider two functions equal when they give the same output for every input (a principle known as function extensionality), then the way to count the elements of a function type is to enumerate how many different ways we can pick outputs for all our inputs. Consider, then, any function `f :: a -> b`: for each of the $n$ elements `αᵢ : a` we can choose $m$ different elements of `b` to send `f αᵢ` to. Thus we are making $n$ repeated choices of $m$ elements, yielding $m^n$ different options.
We will motivate this interpretation by considering the cardinality, or the number of elements, of a type. For example, `()` has cardinality 1, and `Bool` has cardinality 2. Now then, what is the cardinality of `Either a b`, given that `a` and `b` have cardinalities $$n$$ and $$m$$? Well, there are two ways of constructing a value of `Either a b`: `Left :: a -> Either a b` or `Right :: b -> Either a b`. We can construct $$n$$ elements using `Left`, since we have $$n$$ elements of `a`, and $$m$$ elements using `Right`, as there are $$m$$ elements of `b`. In total, then, we can construct $$m + n$$ elements of `Either a b`. Likewise, in the case of `(a, b)` we have $$n$$ choices for the first element, and then for each element of type `a` another $$m$$ choices for the element of type `b`, yielding $$n \cdot m$$ elements in total. The function type being exponentiation might seem more arcane, but by defining equality of functions and using some basic combinatorics we can make perfect sense of it. If we consider two functions equal when they give the same output for every input (a principle known as function extensionality), then the way to count the elements of a function type is to enumerate how many different ways we can pick outputs for all our inputs. Consider, then, any function `f :: a -> b`: for each of the $$n$$ elements `αᵢ : a` we can choose $$m$$ different elements of `b` to send `f αᵢ` to. Thus we are making $$n$$ repeated choices of $$m$$ elements, yielding $$m^n$$ different options.
If we let `| a |` denote the cardinality of `a`, then we can write what we just learned quite succinctly.
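As a quick sanity check of these counting rules (my own sketch, not from the original post), we can enumerate the inhabitants of some small standard types in Haskell; since `Bool` has 2 elements and `Ordering` has 3, the sum and product types should have 5 and 6 elements respectively:

```hs
-- |Bool| = 2 and |Ordering| = 3, so we expect:
--   |Either Bool Ordering| = 2 + 3 = 5
--   |(Bool, Ordering)|     = 2 * 3 = 6
allEither :: [Either Bool Ordering]
allEither = map Left [minBound ..] ++ map Right [minBound ..]

allPairs :: [(Bool, Ordering)]
allPairs = [ (b, o) | b <- [minBound ..], o <- [minBound ..] ]

-- length allEither == 5
-- length allPairs  == 6
```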
@@ -38,7 +37,7 @@ Now that we've established that we can interpret some of our types in numbers, w
First let's visit the distributive law of addition and multiplication:
* $a · (b + c) = a · b + a · c$
* $$a · (b + c) = a · b + a · c$$
Now, in Haskell we cannot prove equality of types; instead we shall construct an isomorphism, or bijection, between our types: a function with an inverse such that their composition is the identity. Constructing these functions corresponds to creating two functions
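For concreteness, here is a sketch (my own, with illustrative names) of what such a pair of functions witnessing the distributive law might look like; each clause mirrors one on the other side, which is what makes them inverses:

```hs
-- a * (b + c) = a * b + a * c, as an isomorphism of types
distribTo :: (a, Either b c) -> Either (a, b) (a, c)
distribTo (a, Left b)  = Left (a, b)
distribTo (a, Right c) = Right (a, c)

distribFrom :: Either (a, b) (a, c) -> (a, Either b c)
distribFrom (Left (a, b))  = (a, Left b)
distribFrom (Right (a, c)) = (a, Right c)
```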
@@ -70,9 +69,9 @@ Proving these are inverses is quite trivial (hint: compare the patterns) and lef
The really interesting identities, however, are those to do with exponents. We shall prove that the Haskell equivalents of the following exponent laws hold:
* $a^b \cdot a^c = a^{b+c}$
* $$a^b \cdot a^c = a^{b+c}$$
* $(a^b)^c = a^{c \cdot b}$
* $$(a^b)^c = a^{c \cdot b}$$
Corresponding to creating the following two Haskell functions, and their inverses
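A possible shape for these functions (my own sketch with illustrative names; the post's actual definitions may differ) uses `either` for the first law and currying for the second:

```hs
-- a^b * a^c = a^(b+c): a pair of functions is a function out of a sum
prodToSum :: (b -> a, c -> a) -> (Either b c -> a)
prodToSum (f, g) = either f g

sumToProd :: (Either b c -> a) -> (b -> a, c -> a)
sumToProd h = (h . Left, h . Right)

-- (a^b)^c = a^(c*b): currying and uncurrying
toCurried :: ((c, b) -> a) -> (c -> b -> a)
toCurried = curry

fromCurried :: (c -> b -> a) -> ((c, b) -> a)
fromCurried = uncurry
```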
@@ -106,14 +105,14 @@ In order to start formalising our notion of algebra in Haskell we will begin to
* For every ordered pair of objects, a set of arrows from the first object to the second
* For every object $a$ an arrow $id_a : a \to a$
* For every object $$a$$ an arrow $$id_a : a \to a$$
* A composition operation, denoted by $\circ$, such that if we have arrows $g : a \to b$ and $f : b \to c$ then there is another arrow $f \circ g : a \to c$
* A composition operation, denoted by $$\circ$$, such that if we have arrows $$g : a \to b$$ and $$f : b \to c$$ then there is another arrow $$f \circ g : a \to c$$
Additionally, the following requirements are made:
* For any arrow $f : a \to b$ we have that $f \circ id_a = f$ and $id_b \circ f = f$
* For three arrows $h : a \to b$, $g : b \to c$ and $f : c \to d$ we have that $f \circ (g \circ h) = (f \circ g) \circ h$
* For any arrow $$f : a \to b$$ we have that $$f \circ id_a = f$$ and $$id_b \circ f = f$$
* For three arrows $$h : a \to b$$, $$g : b \to c$$ and $$f : c \to d$$ we have that $$f \circ (g \circ h) = (f \circ g) \circ h$$
Now of course this all seems very abstract, and it is. The strength of category theory is how general it is, letting us make precise concepts such as sums and products, which pop up all over math. In order to make category theory less abstract we will use Haskell as an example, since it's what we'll mainly be working with in this post.
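In Haskell's category of types and functions, the identity arrows and composition are the familiar `id` and `(.)`. A tiny check of the identity laws at a sample point (my own illustration, not from the post):

```hs
f :: Int -> Int
f = (+ 1)

-- f . id == f and id . f == f, checked pointwise at 41:
identityLaws :: Bool
identityLaws = (f . id) 41 == f 41 && (id . f) 41 == f 41
-- identityLaws == True
```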
@@ -179,10 +178,6 @@ We do however, of course, want to be able to talk about elements of types still,
* category of endofunctors
* natural transformations
* relate monoid example from before
* multiplication given by composition instead of $\times$
* multiplication given by composition instead of $$\times$$
* General recursion
{% endkatexmm %}

View File

@@ -7,8 +7,6 @@ title: "A look at SECD machines"
The SECD machine is a virtual machine designed to evaluate lambda expressions. Its purpose is to be a more natural target for compilers than assembly while maintaining reasonable execution speed. In this blog post I will give you a quick intro to SECD machines and an overview of a simple implementation.
<!--more-->
{% katexmm %}
## The SECD Machine
SECD stands for Stack, Environment, Control, Dump, all of which but the environment are stacks in the SECD machine. The machine operates by reading instructions from the control stack, which operate on it and the other stacks. A lambda is implemented as its body coupled with its environment. Application is done with the "apply" instruction, which pops a lambda off the stack and adds the next element of the stack to the lambda's environment, binding the variable of the lambda. The previous stacks are then pushed onto the dump stack. When the lambda has been fully reduced, the return instruction saves the top of the stack, the result of the reduction, and restores the stacks from the dump stack.
@@ -21,7 +19,7 @@ $$\lambda x.x \xRightarrow[]{\text{De Bruijn}} \lambda \#0$$
$$\lambda f.\lambda g.\lambda x. f(gx) \xRightarrow[]{\text{De Bruijn}} \lambda\lambda\lambda \#2(\#1\#0)$$
The benefit of using De Bruijn notation is that, rather than a map, a dynamic array may be used for the environment, with variables being indices into said array. An additional benefit is that the machine does not need to concern itself with alpha equivalence. As a side-note, counting conventionally starts at $\#1$, not $\#0$, however in a virtual machine counting from $\#0$ is, of course, much more natural.
The benefit of using De Bruijn notation is that, rather than a map, a dynamic array may be used for the environment, with variables being indices into said array. An additional benefit is that the machine does not need to concern itself with alpha equivalence. As a side-note, counting conventionally starts at $$\#1$$, not $$\#0$$, however in a virtual machine counting from $$\#0$$ is, of course, much more natural.
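As an aside, a De Bruijn term syntax can be captured in a small Haskell datatype (a hypothetical representation of my own; the machine below works on instruction lists instead):

```hs
data Term
  = Var Int        -- #i, an index counting enclosing lambdas
  | Lam Term       -- λ body; the bound variable is implicit
  | App Term Term  -- application

-- λx.x becomes λ#0:
identityTerm :: Term
identityTerm = Lam (Var 0)

-- λf.λg.λx. f (g x) becomes λλλ #2 (#1 #0):
composeTerm :: Term
composeTerm = Lam (Lam (Lam (App (Var 2) (App (Var 1) (Var 0)))))
```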
The second improvement to be made is to "dump" the dump stack. Rather than using the dump stack, the current state will be pushed onto the stack as a closure, which the return instruction can then restore when exiting a lambda.
@@ -36,14 +34,14 @@ In this section a SECD machine implemented in Haskell will be introduced bit by
### Data structures
The machine will only be able to recognise two types of values: integers and closures.
```haskell
```hs
data Val
= I Int
| Clo Int [Val] [Inst] -- number of arguments, environment, instructions
deriving (Eq, Show)
```
The machine comes with the following instruction set.
```haskell
```hs
data Inst
= Const Int
| Global Int
@@ -78,6 +76,7 @@ data Inst
All stacks grow from right to left, that is, the leftmost element is at the top of the stack.
{% katexmm %}
<table>
<tr>
<th align=center colspan="3">Before</th><th align=center colspan="3">After</th>
@@ -113,6 +112,7 @@ All stacks grow from right to left, that is, the leftmost element is at the top
<td>Add : c </td><td>e</td><td>v : s</td><td>c</td><td>e</td><td>v + v : s</td>
</tr>
</table>
{% endkatexmm %}
### An example
@@ -124,23 +124,23 @@ It composes a lambda which increments its input by one with an identical lambda
$$(\lambda \lambda \lambda \#2(\#1\#0)) (\lambda 1 + \#1) (\lambda 1 + \#1)0$$
Now, in order to create an instruction tape for the SECD machine, the expression must be rewritten using Closure for lambdas and Local $\#i$ for variables, keeping in mind to explicitly apply to and return from lambdas. Additionally, it will have to be written in postfix form due to the nature of stack-based machines. After performing these actions the expression has been transformed into the following tape:
Now, in order to create an instruction tape for the SECD machine, the expression must be rewritten using Closure for lambdas and Local $$\#i$$ for variables, keeping in mind to explicitly apply to and return from lambdas. Additionally, it will have to be written in postfix form due to the nature of stack-based machines. After performing these actions the expression has been transformed into the following tape:
$Const 0 : Closure [ Const 1 : Local \#0 : Add : Ret ] : Closure [ Const 1 : Local \#0 : Add : Ret ] :$
$Closure^3 [ Local \#0 : Local \#1 : App : Local \#2 : App : Ret ] : App : App : App$
$$Const 0 : Closure [ Const 1 : Local \#0 : Add : Ret ] : Closure [ Const 1 : Local \#0 : Add : Ret ] :$$
$$Closure^3 [ Local \#0 : Local \#1 : App : Local \#2 : App : Ret ] : App : App : App$$
Reading this (or one of the previous lambda expressions) backwards, you should be able to convince yourself of the validity of this translation.
This can now directly be translated into a list of instructions for the Haskell machine implementation.
```haskell
```hs
[ Const 0, Closure 1 4, Const 1, Local 0, Add, Ret, Dup
, Closure 3 6, Local 0, Local 1, App, Local 2, App, Ret
, App, App, App]
```
This can then be evaluated in order to yield the normalized expression.
```haskell
```hs
λ eval [] [ Const 0, Closure 1 4, Const 1, Local 0, Add, Ret, Dup
, Closure 3 6, Local 0, Local 1, App, Local 2, App, Ret
, App, App, App]
@@ -158,7 +158,7 @@ While this is a functional implementation, it does have a few fatal flaws. First
The implementation itself is trivial and is little more than a Haskell translation of the instruction table above.
```haskell
```hs
module SECD where
data Inst
@@ -205,8 +205,6 @@ eval globals insts = go (SEC [] [] insts globals)
Add -> let (x1:x2:st) = stack in go (SEC (vadd x1 x2:st) env xs global)
```
{% endkatexmm %}
## Licensing Information
Feel free to use any of the code or concepts here in your own projects! The code can be considered free of copyright and all notions of liability and/or warranty.

View File

@@ -1,17 +1,16 @@
---
layout: post
title: "Why be pure?"
katex: True
---
Newcomers to Haskell are often told that Haskell is "pure", or that it is "mathematical". It is often not clear what this means, or why these properties would be of interest to begin with. In this blog post I aim to combat this by showing why purity can be helpful, both in writing programs and improving performance.
<!--more-->
{% katexmm %}
## What is purity?
Before we get into the why, we have to get into the what. Haskell is a purely functional language, but what does that mean? Purity tells us that functions in Haskell behave like mathematical functions. That is, functions in Haskell must always give equal output for equal input. In mathematics we'd hope that if we have a function $f$, and two values $x$ and $y$ such that $x = y$ then $f(x) = f(y)$ and the same is true for Haskell. This is the reason that you are not allowed to modify the value of variables in Haskell as you are in other programming languages such as Python. Consider the following Python program:
Before we get into the why, we have to get into the what. Haskell is a purely functional language, but what does that mean? Purity tells us that functions in Haskell behave like mathematical functions. That is, functions in Haskell must always give equal output for equal input. In mathematics we'd hope that if we have a function $$f$$, and two values $$x$$ and $$y$$ such that $$x = y$$ then $$f(x) = f(y)$$ and the same is true for Haskell. This is the reason that you are not allowed to modify the value of variables in Haskell as you are in other programming languages such as Python. Consider the following Python program:
```python
n = 0
@@ -35,7 +34,7 @@ The function `f` will give a different result each time we call it with the same
6
```
But then $f(2)$ is not always equal to $f(2)$, so Python is not pure.
But then $$f(2)$$ is not always equal to $$f(2)$$, so Python is not pure.
Now that we know what it means for Haskell to be pure, we can move on to the more interesting question. Why place these restrictions on our language? Requiring Haskell to be pure might seem arbitrary, or even counter-productive, but playing by the rules of mathematics will let us use the tools of mathematics!
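One concrete tool purity buys us is equational reasoning: equals may be substituted for equals anywhere. A small illustration of my own (the law used here, map fusion, holds for Haskell lists precisely because functions are pure):

```hs
-- map f (map g xs) == map (f . g) xs for all f, g, xs,
-- so two list traversals can safely be rewritten into one.
doubleThenShow :: [Int] -> [String]
doubleThenShow xs = map show (map (* 2) xs)

fused :: [Int] -> [String]
fused = map (show . (* 2))

-- both give ["2","4","6"] on [1,2,3]
```

In an impure language this rewrite could change behaviour (side effects would run a different number of times); in Haskell the two programs are interchangeable.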
@@ -114,5 +113,3 @@ How does this relate to Haskell and purity? Since Haskell is pure data can never
## The Takeaway
The purity of Haskell allows for mathematical reasoning about programs. This not only makes it possible to be more confident in the correctness of our programs but can be used in order to optimize code as well. In fact, the primary Haskell compiler, GHC, uses these guarantees in order to optimize programs. The restrictions imposed by purity turn into properties that programmers and compilers alike may use to their advantage.
{% endkatexmm %}

View File

@@ -1,40 +1,39 @@
---
layout: post
title: "Haskell and Logic"
katex: True
---
Haskell programs are proofs, don't you know? Hidden inside the types of Haskell lies a deep connection to mathematical logic. Given some basic Haskell knowledge and mathematical interest, you too may experience the sense of wonder I first did when discovering the Curry-Howard isomorphism.
<!--more-->
{% katexmm %}
## What is logic?
Before we can talk about logic in Haskell we have to talk about propositional logic. Propositional logic is a set of rules telling us how to construct propositions, and how to derive conclusions from true ones.
First, we shall get a feel for what a proposition is by breaking down some examples. Consider the following statements: "It is sunny outside", "The bank is open", "It is either sunny or rainy outside", and "If it is rainy then the bank is not open". The first two statements are "atomic"; they cannot be broken down into smaller propositions, unlike the latter two statements, which are built up from smaller propositions. Let us then specify the forms a proposition may take.
* The atomic propositions, which we shall call $a, b, c \dots$
* Truth and falsehood, written as $\top$ (truth) and $\bot$ (falsehood)
* The connectives, which given two propositions $P, Q$ let us talk about
* $P$ and $Q$ are true, written as $P \land Q$
* $P$ or $Q$ (or both) are true, written as $P \lor Q$
* Implication, which given two propositions $P, Q$ lets us talk about
* if $P$ is true then $Q$ is, written as $P \to Q$
* Negation, which given a proposition $P$ lets us talk about
* $P$ is false, written as $\lnot P$
* The atomic propositions, which we shall call $$a, b, c \dots$$
* Truth and falsehood, written as $$\top$$ (truth) and $$\bot$$ (falsehood)
* The connectives, which given two propositions $$P, Q$$ let us talk about
* $$P$$ and $$Q$$ are true, written as $$P \land Q$$
* $$P$$ or $$Q$$ (or both) are true, written as $$P \lor Q$$
* Implication, which given two propositions $$P, Q$$ lets us talk about
* if $$P$$ is true then $$Q$$ is, written as $$P \to Q$$
* Negation, which given a proposition $$P$$ lets us talk about
* $$P$$ is false, written as $$\lnot P$$
Combining these we can create larger propositions such as $a \to a \lor b$ or $a \lor (b \land \lnot c)$. We can now translate English non-atomic propositions into the language of propositional logic. "If it is rainy then the bank is not open" becomes $\textrm{it is rainy} \to \lnot (\textrm{bank is open})$.
Combining these we can create larger propositions such as $$a \to a \lor b$$ or $$a \lor (b \land \lnot c)$$. We can now translate English non-atomic propositions into the language of propositional logic. "If it is rainy then the bank is not open" becomes $$\textrm{it is rainy} \to \lnot (\textrm{bank is open})$$.
Given the knowledge that a proposition $P$ is true, we may make conclusions about the propositions used to construct $P$.
Given the knowledge that a proposition $$P$$ is true, we may make conclusions about the propositions used to construct $$P$$.
* If $P \land Q$ is true, then both $P$ and $Q$ are true
* If $P \lor Q$ is true, then at least one of $P$ or $Q$ is true
* If $P \to Q$ is true, then $Q$ must be true whenever $P$ is
* If $\lnot P$ is true, then $P$ is false
* If $$P \land Q$$ is true, then both $$P$$ and $$Q$$ are true
* If $$P \lor Q$$ is true, then at least one of $$P$$ or $$Q$$ is true
* If $$P \to Q$$ is true, then $$Q$$ must be true whenever $$P$$ is
* If $$\lnot P$$ is true, then $$P$$ is false
To prove things one more concept needs to be introduced: that of tautologies and contradictions. A tautology is a statement that is true no matter whether the atoms inside of it are true or not, and conversely, a contradiction is a statement that is always false. An example of a tautology is $a \to a \lor b$: if $a$ is true then $a \lor b$ is true since $a$ is, and if $a$ is false then the implication holds vacuously. An important tautology following the same reasoning is $\bot \to P$. An example of a contradiction is $a \land \lnot a$; this proposition states that $a$ is both true and false, which is impossible.
To prove things one more concept needs to be introduced: that of tautologies and contradictions. A tautology is a statement that is true no matter whether the atoms inside of it are true or not, and conversely, a contradiction is a statement that is always false. An example of a tautology is $$a \to a \lor b$$: if $$a$$ is true then $$a \lor b$$ is true since $$a$$ is, and if $$a$$ is false then the implication holds vacuously. An important tautology following the same reasoning is $$\bot \to P$$. An example of a contradiction is $$a \land \lnot a$$; this proposition states that $$a$$ is both true and false, which is impossible.
In logic we are usually interested in proving that a proposition is either a tautology or a contradiction, proving or disproving its truth. One usually does this by following the rules of the logic, repeatedly applying them to yield a proof. However there is another way to do logic, through Haskell!
@@ -42,33 +41,33 @@ In logic we are usually interested in proving that a proposition is either a tau
How does one translate from propositional logic to Haskell? We shall interpret types as propositions, and terms (elements) of a type as witnesses or proofs that the proposition (the type they inhabit) is true. The typechecking of Haskell will then double as a proof checker. This might seem abstract at first, but once you see how the building blocks of propositions translate into types it will become clear.
Atomic propositions ($a,b,c,\dots$) are interpreted as type variables `a`, `b`, `c`, etc. You might notice that there is no way to prove `a`, since one would have to create a proof `p :: a`, that is, a term inhabiting every type. This corresponds to how one cannot prove or disprove atomic propositions in propositional logic.
Atomic propositions ($$a,b,c,\dots$$) are interpreted as type variables `a`, `b`, `c`, etc. You might notice that there is no way to prove `a`, since one would have to create a proof `p :: a`, that is, a term inhabiting every type. This corresponds to how one cannot prove or disprove atomic propositions in propositional logic.
$\top$ may be interpreted as any type of which we know a term, that is, any proposition of which we know a proof. For the sake of readability we shall define a truth type, with a single constructor bearing witness to its truth.
$$\top$$ may be interpreted as any type of which we know a term, that is, any proposition of which we know a proof. For the sake of readability we shall define a truth type, with a single constructor bearing witness to its truth.
```hs
data Truth = TruthWitness
```
Translating $\bot$ might seem more tricky, but it turns out Haskell lets us define empty types without constructors, that is, a proposition with no proof. We thus define the following.
Translating $$\bot$$ might seem more tricky, but it turns out Haskell lets us define empty types without constructors, that is, a proposition with no proof. We thus define the following.
```hs
data Falsehood
```
The connectives might seem more tricky, but turn out to be fairly straightforward as well. A proof of $P \land Q$ is simply a proof of $P$ and a proof of $Q$, but wait, that's a pair! $P \land Q$ is thus interpreted as `(P, Q)`. Likewise, a proof of $P \lor Q$ is *either* a proof of $P$ or a proof of $Q$, that's right, it's the `Either` type! $P \lor Q$ is interpreted as `Either P Q`.
The connectives might seem more tricky, but turn out to be fairly straightforward as well. A proof of $$P \land Q$$ is simply a proof of $$P$$ and a proof of $$Q$$, but wait, that's a pair! $$P \land Q$$ is thus interpreted as `(P, Q)`. Likewise, a proof of $$P \lor Q$$ is *either* a proof of $$P$$ or a proof of $$Q$$, that's right, it's the `Either` type! $$P \lor Q$$ is interpreted as `Either P Q`.
The implication $P \to Q$ tells us that if we have a proof of $P$ then we have a proof of $Q$. So what we need is a way of turning terms of type $P$ into terms of type $Q$. That is, a function `P -> Q`, would you look at that, it's even the same arrow!
The implication $$P \to Q$$ tells us that if we have a proof of $$P$$ then we have a proof of $$Q$$. So what we need is a way of turning terms of type $$P$$ into terms of type $$Q$$. That is, a function `P -> Q`, would you look at that, it's even the same arrow!
Perhaps surprisingly, the trickiest part of the translation is negation. For $\lnot P$ to be true $P$ must be false, that is, there may not be any terms `p :: P` bearing witness to the truth of `P`. How can we express this within Haskell? The trick is to redefine $\lnot P$ as $P \to \bot$, that is, $\bot$ is true if $P$ is. But since we *know* that $\bot$ is false, this means $P$ cannot be true. Thus we interpret $\lnot P$ as `P -> Falsehood`. To aid readability we will make the definition below.
Perhaps surprisingly, the trickiest part of the translation is negation. For $$\lnot P$$ to be true $$P$$ must be false, that is, there may not be any terms `p :: P` bearing witness to the truth of `P`. How can we express this within Haskell? The trick is to redefine $$\lnot P$$ as $$P \to \bot$$, that is, $$\bot$$ is true if $$P$$ is. But since we *know* that $$\bot$$ is false, this means $$P$$ cannot be true. Thus we interpret $$\lnot P$$ as `P -> Falsehood`. To aid readability we will make the definition below.
```hs
type Not p = p -> Falsehood
```
Now that we've introduced our language, let us speak! We shall prove the above-mentioned tautology $a \to a \lor b$ by constructing a function `a -> Either a b`. If you've used the `Either` type before then the proof is trivial.
Now that we've introduced our language, let us speak! We shall prove the above-mentioned tautology $$a \to a \lor b$$ by constructing a function `a -> Either a b`. If you've used the `Either` type before then the proof is trivial.
```hs
proof1 :: a -> Either a b
proof1 = Left
```
Now let us prove that $a \land \lnot a$ is a contradiction; we shall do this by proving its negation $\lnot (a \land \lnot a)$. In doing so we need to construct a term of type `Not (a, Not a)`, that is, `(a, a -> Falsehood) -> Falsehood`. Thus, we must construct a function to `Falsehood`, which has no inhabitants; however, we are given a function `a -> Falsehood`, which we may use to get a term of the empty `Falsehood` type by applying the supplied value of type `a` to it.
Now let us prove that $$a \land \lnot a$$ is a contradiction; we shall do this by proving its negation $$\lnot (a \land \lnot a)$$. In doing so we need to construct a term of type `Not (a, Not a)`, that is, `(a, a -> Falsehood) -> Falsehood`. Thus, we must construct a function to `Falsehood`, which has no inhabitants; however, we are given a function `a -> Falsehood`, which we may use to get a term of the empty `Falsehood` type by applying the supplied value of type `a` to it.
```hs
proof2 :: Not (a, Not a) -- (a, a -> Falsehood) -> Falsehood
proof2 (a, f) = f a
@@ -81,12 +80,12 @@ I strongly encourage you to play around with this on your own, coming up with so
<tr> <td>Proposition</td> <td>Type</td> </tr>
<tr> <td>Proof</td> <td>Term (element of type)</td> </tr>
<tr> <td>Atom</td> <td>Type variable</td> </tr>
<tr> <td>$\top$</td> <td>Type with a known term</td> </tr>
<tr> <td>$\bot$</td> <td>Type with no terms</td> </tr>
<tr> <td>$P \land Q$</td> <td><code>(P, Q)</code></td> </tr>
<tr> <td>$P \lor Q$</td> <td><code>Either P Q</code></td> </tr>
<tr> <td>$P \to Q$</td> <td><code>P -> Q</code></td> </tr>
<tr> <td>$\lnot P$</td> <td><code>P -> EmptyType</code></td> </tr>
<tr> <td>$$\top$$</td> <td>Type with a known term</td> </tr>
<tr> <td>$$\bot$$</td> <td>Type with no terms</td> </tr>
<tr> <td>$$P \land Q$$</td> <td><code>(P, Q)</code></td> </tr>
<tr> <td>$$P \lor Q$$</td> <td><code>Either P Q</code></td> </tr>
<tr> <td>$$P \to Q$$</td> <td><code>P -> Q</code></td> </tr>
<tr> <td>$$\lnot P$$</td> <td><code>P -> EmptyType</code></td> </tr>
</table>
@@ -113,13 +112,13 @@ The programming languages studied by logicians guarantee termination and don't i
## Excluding the middle
The logic which we do in Haskell has a name: it's called constructive logic. The name is due to the fact that one must "construct" a proof, such as a function, in order to prove a theorem. Every theorem has an explicit construction that shows us how the proof is performed. This requirement, as it turns out, is quite strict, so strict that we can no longer prove all the things we could in classical logic. In particular, we will be unable to prove either $\lnot \lnot P \to P$ or $P \lor \lnot P$. Proving these classically is trivial: just consider the case where $P$ is true and the case where $P$ is false; in both cases these propositions hold. In Haskell, however, it's trickier. Think about it: how could one write these programs?
The logic which we do in Haskell has a name: it's called constructive logic. The name is due to the fact that one must "construct" a proof, such as a function, in order to prove a theorem. Every theorem has an explicit construction that shows us how the proof is performed. This requirement, as it turns out, is quite strict, so strict that we can no longer prove all the things we could in classical logic. In particular, we will be unable to prove either $$\lnot \lnot P \to P$$ or $$P \lor \lnot P$$. Proving these classically is trivial: just consider the case where $$P$$ is true and the case where $$P$$ is false; in both cases these propositions hold. In Haskell, however, it's trickier. Think about it: how could one write these programs?
```hs
impossible1 :: Not (Not a) -> a -- ((a -> Falsehood) -> Falsehood) -> a
impossible2 :: Either a (Not a) -- Either a (a -> Falsehood)
```
These two propositions have names, $P \lor \lnot P$ is called the law of excluded middle, and $\lnot \lnot P \to P$ is called double negation. Though it might not be obvious, double negation follows from excluded middle. By assuming excluded middle, we can even use Haskell to prove this.
These two propositions have names, $$P \lor \lnot P$$ is called the law of excluded middle, and $$\lnot \lnot P \to P$$ is called double negation. Though it might not be obvious, double negation follows from excluded middle. By assuming excluded middle, we can even use Haskell to prove this.
```hs
assume_excluded_middle :: Either a (Not a)
@@ -150,5 +149,3 @@ If you've made it this far, then congratulations! Even though I've tried to make
This blog post only covers propositional logic, but one can interpret predicate logic and even higher-order logic within programming languages. This is done with dependent type theory. In dependent type theories, types may themselves *depend* upon values, letting one create types such as `is-even n` which depend on a natural number `n`. This type would have terms which are witnesses that `n` is even. These programming languages, or proof assistants as they are usually called, enable you to prove properties of both programs and mathematical objects. In doing so they provide a way to automatically check mathematical proofs for correctness. There are already many mathematicians using these tools to create formalized proofs of mathematical theorems.
If you are interested in learning more about dependent type theory, then you might be interested in downloading and playing around with a proof assistant. I recommend [agda](https://agda.readthedocs.io/en/latest/getting-started/installation.html) to those familiar with Haskell; a lengthy [list of agda tutorials](https://agda.readthedocs.io/en/latest/getting-started/tutorial-list.html) is included in its documentation. If you wish to learn more about how doing math with types rather than sets might lead to new insights and connections to topology, then I encourage you to learn about homotopy type theory, for which there are some great resources: [HoTTEST Summer School 2022 lectures](https://www.youtube.com/playlist?list=PLtIZ5qxwSNnzpNqfXzJjlHI9yCAzRzKtx), [HoTTEST Summer School 2022 GitHub](https://github.com/martinescardo/HoTTEST-Summer-School), [Introduction to Homotopy Type Theory](https://arxiv.org/abs/2212.11082), [1lab](https://1lab.dev/1Lab.intro.html), [HoTT Book](https://homotopytypetheory.org/book/).
{% endkatexmm %}


@ -1,21 +1,16 @@
---
layout: post
title: "A favorite proof of mine"
katex: True
---
{% katexmm %}
There are a lot of proofs in mathematics. Many of them serve to verify what we intuitively know to be true, some of them shed light on new methods, and some reveal new ways to view old ideas. Some proofs leave us with a headache, some leave us bored, and some leave us with wonder. In this blog post I will share a beautiful proof leading to a closed formula for the $n$-th Fibonacci number, taking us on a detour into functional programming and linear algebra.
{% endkatexmm %}
There are a lot of proofs in mathematics. Many of them serve to verify what we intuitively know to be true, some of them shed light on new methods, and some reveal new ways to view old ideas. Some proofs leave us with a headache, some leave us bored, and some leave us with wonder. In this blog post I will share a beautiful proof leading to a closed formula for the $$n$$-th Fibonacci number, taking us on a detour into functional programming and linear algebra.
<!--more-->
{% katexmm %}
# The Fibonacci Numbers
The Fibonacci numbers are a sequence of numbers starting with a zero followed by a one where each consequent number is the sum of the last two: $0, 1, 1, 2, 3, 5, 8, 13 \dots$. If we wanted to be more precise we could define a mathematical sequence $\{f\}_{n=0}^{\infty}$ by the following recursive definition:
The Fibonacci numbers are a sequence of numbers starting with a zero followed by a one, where each subsequent number is the sum of the previous two: $$0, 1, 1, 2, 3, 5, 8, 13 \dots$$. If we wanted to be more precise we could define a mathematical sequence $$\{f\}_{n=0}^{\infty}$$ by the following recursive definition:
$$
f_n = \begin{cases}
@ -35,11 +30,11 @@ Likewise, in [Chapter 1.2.2 Tree Recursion](https://mitp-content-server.mit.edu/
> Another common pattern of computation is called tree recursion. As an example, consider computing the sequence of Fibonacci numbers
With this in mind, it might come as a surprise that there is a closed, non-recursive, formula for the $n$-th Fibonacci number. Perhaps more surprising is that we will discover this formula by using the ideas presented in the above chapter of SICP.
With this in mind, it might come as a surprise that there is a closed, non-recursive formula for the $$n$$-th Fibonacci number. Perhaps more surprising is that we will discover this formula by using the ideas presented in the above chapter of SICP.
# Programmatically calculating the $n$-th Fibonacci number
# Programmatically calculating the $$n$$-th Fibonacci number
A naive way of calculating the $n$-th Fibonacci number is to use the definition above. Check if $n = 0$, if $n = 1$, and otherwise calculating $f_{n-2}$ and $f_{n-1}$. This corresponds to the following Haskell code:
A naive way of calculating the $$n$$-th Fibonacci number is to use the definition above. Check if $$n = 0$$, if $$n = 1$$, and otherwise calculating $$f_{n-2}$$ and $$f_{n-1}$$. This corresponds to the following Haskell code:
```
fib :: Integer -> Integer
fib 0 = 0
@ -49,7 +44,7 @@ fib n = fib (n-2) + fib (n-1)
However, there is an issue with this method: many Fibonacci numbers will be calculated numerous times, as for each Fibonacci number evaluated we split into two paths, evaluating the previous and twice-previous Fibonacci numbers. The reader who prefers visuals might appreciate Figure 1.5 from the SICP chapter.
How might we fix this then? A human calculating the $n$-th Fibonacci number might construct a list of Fibonacci numbers, calculating each Fibonacci number only once. While it is possible to do this on the computer it is superfluous to carry around all previous numbers, as we only need the previous two to calculate the next one. We might think of this as a 2-slot window, moving along the Fibonacci numbers, taking $n$ steps to arrive at $f_n$. In code we could represent this as follows:
How might we fix this then? A human calculating the $$n$$-th Fibonacci number might construct a list of Fibonacci numbers, calculating each Fibonacci number only once. While it is possible to do this on the computer, it is superfluous to carry around all previous numbers, as we only need the previous two to calculate the next one. We might think of this as a 2-slot window moving along the Fibonacci numbers, taking $$n$$ steps to arrive at $$f_n$$. In code we could represent this as follows:
```
-- steps -> f n-2 -> f n-1 -> f n
window :: Integer -> Integer -> Integer -> Integer
@ -60,13 +55,18 @@ fib :: Integer -> Integer
fib n = window n 0 1
```
In each step we move the window by replacing the first slot in the window by what was previously in the second slot and filling the new second slot of the window with the sum of the previous two slots. This is then repeated $n$ times, and then the first slot of the window is returned.
In each step we slide the window: the first slot takes what was previously in the second slot, and the new second slot is filled with the sum of the previous two slots. This is repeated $$n$$ times, after which the first slot of the window is returned.
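The hunks above elide the body of `window`; a complete, runnable version of the sliding window (a reconstruction consistent with the surrounding description) looks like this:

```haskell
-- steps -> f n-2 -> f n-1 -> f n
window :: Integer -> Integer -> Integer -> Integer
window 0 a _ = a                        -- no steps left: the first slot holds f n
window n a b = window (n - 1) b (a + b) -- slide the window one step

fib :: Integer -> Integer
fib n = window n 0 1
```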
# Mathematically calculating the $n$-th Fibonacci number
# Mathematically calculating the $$n$$-th Fibonacci number
What does this have to do with mathematics, and this beautiful proof that I have promised? We shall begin to translate this moving window into the language of mathematics, our window is a pair of numbers, so why not represent it as a vector. Furthermore, we may view sliding our window one step as a function $S$ from vectors to vectors. This poses an interesting question: is this function a linear transformation?
What does this have to do with mathematics, and this beautiful proof that I have promised? We shall begin to translate this moving window into the language of mathematics: our window is a pair of numbers, so why not represent it as a vector? Furthermore, we may view sliding our window one step as a function $$S$$ from vectors to vectors. This poses an interesting question: is this function a linear transformation?
$$ S\left(\begin{bmatrix} a \\ b \end{bmatrix}\right) + S\left(\begin{bmatrix} c \\ d \end{bmatrix}\right) = \begin{bmatrix} b \\ a + b \end{bmatrix} + \begin{bmatrix} d \\ c + d \end{bmatrix} $$
$$
S\left(\begin{bmatrix} a \\ b \end{bmatrix}\right)
+ S\left(\begin{bmatrix} c \\ d \end{bmatrix}\right)
= \begin{bmatrix} b \\ a + b \end{bmatrix}
+ \begin{bmatrix} d \\ c + d \end{bmatrix}
$$
$$ = \begin{bmatrix} b + d \\ a + b + c + d \end{bmatrix} = $$
@ -78,37 +78,37 @@ It is! This is great news as it means we can represent our step function by a ma
$$ S = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} $$
Then to calculate the $n$-th Fibonacci number we take the starting window $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ and multiply it by $S^n$. Now that the sliding window has been expressed purely in the language of linear algebra we may apply the tools of linear algebra.
Then to calculate the $$n$$-th Fibonacci number we take the starting window $$\begin{bmatrix} 0 \\ 1 \end{bmatrix}$$ and multiply it by $$S^n$$. Now that the sliding window has been expressed purely in the language of linear algebra we may apply the tools of linear algebra.
# Applying the tools of linear algebra
If you're familiar with linear algebra there might be a part of your brain yelling "diagonalization" right now. We've translated our problem into linear algebra, but even for a small matrix calculating $S^n$ can become costly for high $n$. Diagonalization is a technique in which we express a matrix in a base where all base vectors are eigenvectors of the original matrix. The benefit of doing this is that it turns exponentiation of matrices, which is hard to calculate into exponentiation of scalars, which is much easier to calculate.
If you're familiar with linear algebra there might be a part of your brain yelling "diagonalization" right now. We've translated our problem into linear algebra, but even for a small matrix, calculating $$S^n$$ can become costly for high $$n$$. Diagonalization is a technique in which we express a matrix in a basis where all basis vectors are eigenvectors of the original matrix. The benefit of doing this is that it turns exponentiation of matrices, which is hard to calculate, into exponentiation of scalars, which is much easier to calculate.
An eigenvector for our matrix $S$ is a vector $\hat v$ for which $S \hat v = \lambda \hat v$ for some scalar $\lambda$, which we call an eigenvalue. If there are any such vectors we can find them using their definition.
An eigenvector for our matrix $$S$$ is a vector $$\hat v$$ for which $$S \hat v = \lambda \hat v$$ for some scalar $$\lambda$$, which we call an eigenvalue. If there are any such vectors we can find them using their definition.
$$ S \hat v = \lambda \hat v = \lambda I_2 \hat v $$
Subtracting $\lambda I_2 v$ from both sides yields:
Subtracting $$\lambda I_2 \hat v$$ from both sides yields:
$$ 0 = S \hat v - \lambda I_2 \hat v = (S-\lambda I_2) \hat v $$
An equation of the form $0 = A \hat u$ will only have non-trivial solutions if the column vectors of $A$ are linearly dependent, that is if $\textrm{det}(A) = 0$. Thus we can find all scalars $\lambda$ for which there are non-trivial vector solutions by solving $\textrm{det}(S-\lambda I_2) = 0$. Because of this property the polynomial $\textrm{det}(A-\lambda I)$ is called the characteristic polynomial of $A$.
An equation of the form $$0 = A \hat u$$ will only have non-trivial solutions if the column vectors of $$A$$ are linearly dependent, that is if $$\textrm{det}(A) = 0$$. Thus we can find all scalars $$\lambda$$ for which there are non-trivial vector solutions by solving $$\textrm{det}(S-\lambda I_2) = 0$$. Because of this property the polynomial $$\textrm{det}(A-\lambda I)$$ is called the characteristic polynomial of $$A$$.
In our case we have the following:
$$ \textrm{det}(S-\lambda I_2) = \begin{vmatrix} - \lambda & 1 \\ 1 & 1 - \lambda \end{vmatrix} = \lambda^2 - \lambda - 1 = 0$$
Solving for $\lambda$ yields two eigenvalues:
Solving for $$\lambda$$ yields two eigenvalues:
$$ \lambda_0 = \frac{1 - \sqrt 5}{2} ,\; \lambda_1 = \frac{1 + \sqrt 5}{2}$$
Would you look at that, $\frac{1 + \sqrt 5}{2}$, the golden ratio! Some of you might already know that the golden ratio is connected to the Fibonacci numbers, in fact, as you get further and further into the sequence of the Fibonacci numbers the ratio $\frac{f_{n+1}}{f_n}$ approaches $\frac{1 + \sqrt 5}{2}$.
Would you look at that, $$\frac{1 + \sqrt 5}{2}$$, the golden ratio! Some of you might already know that the golden ratio is connected to the Fibonacci numbers, in fact, as you get further and further into the sequence of the Fibonacci numbers the ratio $$\frac{f_{n+1}}{f_n}$$ approaches $$\frac{1 + \sqrt 5}{2}$$.
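That convergence is easy to observe numerically; a small sketch (the `fib` here is the iterative version from earlier, inlined so the snippet is self-contained):

```haskell
fib :: Integer -> Integer
fib n = go n 0 1
  where
    go 0 a _ = a
    go k a b = go (k - 1) b (a + b)

-- ratio of consecutive Fibonacci numbers, which approaches the golden ratio
ratio :: Integer -> Double
ratio n = fromIntegral (fib (n + 1)) / fromIntegral (fib n)

goldenRatio :: Double
goldenRatio = (1 + sqrt 5) / 2
```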
Now we can solve $(S-\lambda I_2) \hat v = 0$ for $\lambda_0$ and $\lambda_1$, and if the two resulting vectors are linearly independent we may use them as the basis of our diagonalization matrix. Gauss elimination yields:
Now we can solve $$(S-\lambda I_2) \hat v = 0$$ for $$\lambda_0$$ and $$\lambda_1$$, and if the two resulting vectors are linearly independent we may use them as the basis of our diagonalization matrix. Gauss elimination yields:
$$ \hat v_0 = \begin{bmatrix} -2 \\ \sqrt 5 - 1 \end{bmatrix},\; \hat v_1 = \begin{bmatrix} 2 \\ \sqrt 5 + 1 \end{bmatrix} $$
These vectors are indeed linearly independent, and we can use them as basis vectors for our diagonal matrix. We will now to write $S = BDB^{-1}$ where
These vectors are indeed linearly independent, and we can use them as basis vectors for our diagonal matrix. We will now write $$S = BDB^{-1}$$ where
$$ B = \begin{bmatrix} -2 & 2 \\ \sqrt 5 - 1 & \sqrt 5 + 1 \end{bmatrix}$$
$$ D = \begin{bmatrix} \frac{1 - \sqrt 5}{2} & 0 \\ 0 & \frac{1 + \sqrt 5}{2} \end{bmatrix}$$
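For reference, the inverse we will need shortly follows from the usual $$2 \times 2$$ inverse formula, since $$\det B = -2(\sqrt 5 + 1) - 2(\sqrt 5 - 1) = -4\sqrt 5$$ (a quick check; the original post may compute this elsewhere):

$$ B^{-1} = \frac{1}{-4\sqrt 5} \begin{bmatrix} \sqrt 5 + 1 & -2 \\ 1 - \sqrt 5 & -2 \end{bmatrix} $$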
@ -121,18 +121,16 @@ Which is very nice since
$$D^n = \begin{bmatrix} \frac{1 - \sqrt 5}{2} & 0 \\ 0 & \frac{1 + \sqrt 5}{2} \end{bmatrix}^n = \begin{bmatrix} \left(\frac{1 - \sqrt 5}{2}\right)^n & 0 \\ 0 & \left(\frac{1 + \sqrt 5}{2}\right)^n \end{bmatrix}$$
After calculating $B^{-1}$ we can solve $\begin{bmatrix} f_{n+1} \\ f_n \end{bmatrix} = BD^nB^{-1}$ for $f_n$ to get our closed expression.
After calculating $$B^{-1}$$ we can solve $$\begin{bmatrix} f_{n+1} \\ f_n \end{bmatrix} = BD^nB^{-1}$$ for $$f_n$$ to get our closed expression.
$$ f_n = \frac{1}{\sqrt 5} \left(\left(\frac{1 + \sqrt 5}{2}\right)^n - \left(\frac{1 - \sqrt 5}{2}\right)^n\right) $$
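As a sanity check, the closed formula can be evaluated directly in floating point (a sketch; reliable only for modest $$n$$, since `Double` loses precision as the powers grow):

```haskell
-- Binet's formula, rounded back to an integer
binet :: Integer -> Integer
binet n = round ((phi ^^ n - psi ^^ n) / sqrt 5)
  where
    phi = (1 + sqrt 5) / 2 :: Double
    psi = (1 - sqrt 5) / 2 :: Double
```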
# Final thoughts
Whenever I first happened upon the closed formula for the $n$-th Fibonacci number it seemed so shockingly random, a formula with bunch of square roots always giving me a recursively specified integer. After I learned this proof it doesn't feel as random anymore, instead, I feel it would be more surprising if we carried out the whole diagonalization process and ended up with no roots. Perhaps more importantly, it opened my eyes to the usage of linear algebra as a powerful mathematical tool, and not just something to be applied in geometry, flow balancing or computer graphics.
When I first happened upon the closed formula for the $$n$$-th Fibonacci number it seemed so shockingly random: a formula with a bunch of square roots always giving me a recursively specified integer. After learning this proof it doesn't feel as random anymore; instead, I feel it would be more surprising if we carried out the whole diagonalization process and ended up with no roots. Perhaps more importantly, it opened my eyes to the use of linear algebra as a powerful mathematical tool, and not just something to be applied in geometry, flow balancing or computer graphics.
# Addendum
It was pointed out to me [on mastodon](https://mastodon.vierkantor.com/@Vierkantor/109978590835441958) that this technique is of interest even if it is not possible to diagonalize the stepping matrix. This is because using fast binary exponentiation one can perform matrix exponentiation in logarithmic time. Thus any linearly recursive algorithm can be calculated in logarithmic time!
Fast binary exponentiation uses the identity $A^{2k} = (A \cdot A)^k$, thus we can split the exponent in 2 when even, rather than performing $2k$ multiplications. Recursively doing this each time the exponent is even yields logarithmic time exponentiation.
{% endkatexmm %}
Fast binary exponentiation uses the identity $$A^{2k} = (A \cdot A)^k$$; thus we can halve the exponent when it is even, rather than performing $$2k$$ multiplications. Recursively doing this each time the exponent is even yields logarithmic-time exponentiation.
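Putting the addendum into code, a logarithmic-time Fibonacci via fast binary exponentiation of the step matrix might look like this (a sketch; the tuple representation of matrices is an illustrative choice):

```haskell
-- a 2x2 integer matrix [[a, b], [c, d]] as a 4-tuple
type M2 = (Integer, Integer, Integer, Integer)

mul :: M2 -> M2 -> M2
mul (a, b, c, d) (e, f, g, h) =
  (a*e + b*g, a*f + b*h, c*e + d*g, c*f + d*h)

-- fast binary exponentiation: A^(2k) = (A * A)^k
pow :: M2 -> Integer -> M2
pow _ 0 = (1, 0, 0, 1) -- identity matrix
pow m n
  | even n    = pow (mul m m) (n `div` 2)
  | otherwise = mul m (pow m (n - 1))

-- S^n = [[f (n-1), f n], [f n, f (n+1)]], so read off the top-right entry
fib :: Integer -> Integer
fib n = let (_, b, _, _) = pow (0, 1, 1, 1) n in b
```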


@ -3,7 +3,6 @@ layout: centered
title: "about"
---
{% katexmm %}
<article class="aba">
<div>
<h2>me</h2>
@ -22,4 +21,3 @@ title: "about"
<figcaption>me with π</figcaption>
</figure>
</article>
{% endkatexmm %}


@ -80,20 +80,47 @@ with pkgs; let
version = "0.1.0";
});
katex = (buildRubyGem {
gemName = "katex";
dependencies = [ rubyPackages.execjs ];
groups = ["default"];
platforms = [];
source = {
remotes = ["https://rubygems.org"];
sha256 = "f17360dd98b7b42288cb35dfa7d405ce9ec7e290c9d8a06d7f0fc8774fbfa6eb";
type = "gem";
};
version = "0.10.0";
});
kramdown-math-katex = (buildRubyGem {
gemName = "kramdown-math-katex";
dependencies = [ katex rubyPackages.kramdown ];
groups = ["default"];
platforms = [];
source = {
remotes = ["https://rubygems.org"];
sha256 = "7119b299fed5ec49b9e2107d18e88f90acf4bbdf1b82a7568fd488790bb04cdc";
type = "gem";
};
version = "1.0.1";
});
rubyEnv = ruby.withPackages (ps: with ps; [
jekyll
jekyll-feed
jekyll-katex
jekyll-minifier
jekyll-agda
execjs
cssminify2
htmlcompressor
json-minify
uglifier
kramdown
kramdown-parser-gfm
kramdown-math-katex
katex
execjs
uglifier
htmlcompressor
cssminify2
json-minify
webrick
]);
in
rubyEnv.gems
[ nodejs ] ++ rubyEnv.gems


@ -11,6 +11,9 @@
deps = import ./deps.nix { inherit pkgs; };
in {
devShells.default = pkgs.mkShell {
shellHook = ''
export EXECJS_RUNTIME=Node
'';
packages = deps;
};
packages.default = pkgs.stdenv.mkDerivation {
@ -18,7 +21,7 @@
buildInputs = deps;
src = ./.;
buildPhase = ''
JEKYLL_ENV=production jekyll build -d $out
EXECJS_RUNTIME=Node JEKYLL_ENV=production jekyll build -d $out
'';
};
});


@ -3,7 +3,6 @@ layout: centered
title: "work"
---
{% katexmm %}
<h2>projects</h2>
<ul>
@ -21,4 +20,3 @@ title: "work"
<tr><td>Teaching Assistant</td><td>TDA555 - Introduction to functional programming</td><td>2022-08-22 - 2022-11-04</td></tr>
<tr><td>Teaching Assistant</td><td>TDA555 - Introduction to functional programming</td><td>2023-08-22 - 2023-11-04</td></tr>
</table>
{% endkatexmm %}