---
layout: post
title: "A look at SECD machines"
---

The SECD machine is a virtual machine designed to evaluate lambda expressions. Its purpose is to be a more natural target for compilers than assembly while maintaining reasonable execution speed. In this blog post I will give you a quick intro to SECD machines and an overview of a simple implementation.

## The SECD Machine

SECD stands for Stack, Environment, Control, Dump; all of these except the environment are stacks in the SECD machine. The machine operates by reading instructions from the control stack; these instructions operate on the control stack itself and on the other stacks. A lambda is implemented as its body coupled with its environment. Application is done with the "apply" instruction, which pops a lambda off the stack and adds the next element of the stack to the lambda's environment, binding the variable of the lambda. The previous stacks are then pushed onto the dump stack. When the lambda has been fully reduced, the return instruction saves the top of the stack (the result of the reduction) and restores the stacks from the dump stack.

## The Modern SECD Machine

This approach, while sound, suffers from some performance issues. Thankfully, there are a number of improvements which can be made.

The first is the use of De Bruijn indices. In De Bruijn index notation each variable is a number indicating which lambda it was bound at (counting outwards). Here are some examples:

$$\lambda x.x \xRightarrow[]{\text{De Bruijn}} \lambda \#0$$

$$\lambda f.\lambda g.\lambda x. f(gx) \xRightarrow[]{\text{De Bruijn}} \lambda\lambda\lambda \#2(\#1\#0)$$

The benefit of using De Bruijn notation is that, rather than a map, a dynamic array may be used for the environment, with variables being indices into said array. An additional benefit is that the machine does not need to concern itself with alpha equivalence.
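
To make the index notation concrete, here is a minimal sketch of the conversion from named terms to indexed terms. The types `Term` and `DB` and the function `toDeBruijn` are hypothetical names chosen for this illustration; they are not part of the machine described in this post.

```hs
import Data.List (elemIndex)

-- Named lambda terms (hypothetical type, for illustration only).
data Term = Var String | Lam String Term | Ap Term Term
  deriving (Eq, Show)

-- The same terms with variable names replaced by indices.
data DB = DVar Int | DLam DB | DAp DB DB
  deriving (Eq, Show)

-- Convert a closed term; the result is Nothing if a free variable is found.
toDeBruijn :: Term -> Maybe DB
toDeBruijn = go []
  where
    -- The list holds the binders in scope, innermost first, so a
    -- variable's position in it is its index counted outwards from #0.
    go scope (Var x)   = DVar <$> elemIndex x scope
    go scope (Lam x b) = DLam <$> go (x : scope) b
    go scope (Ap f a)  = DAp <$> go scope f <*> go scope a
```

Because the scope list is searched from the innermost binder, a shadowed name resolves to the nearest enclosing lambda.
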
As a side note, De Bruijn indices conventionally start at $$\#1$$, not $$\#0$$; in a virtual machine, however, counting from $$\#0$$ is much more natural.

The second improvement is to "dump" the dump stack. Rather than using the dump stack, the current state is pushed onto the stack as a closure, which the return instruction can then restore when exiting a lambda.

The third improvement is to introduce a new, static "stack". The rationale is that many programs include top-level definitions which are in scope throughout the program. Including these definitions in the environment whenever a closure is created is a waste of both memory and processing power.

The fourth improvement is to allow multi-variable lambdas. In the SECD machine, using a multi-variable lambda rather than multiple single-variable lambdas corresponds to not nesting closures, which would lead to redundant copies of the same environment being made. Note that this does not sacrifice currying, as will become clear in later sections.

## An implementation of a simple SECD machine

In this section a SECD machine implemented in Haskell will be introduced bit by bit. This machine is by no means an efficient or powerful implementation; it serves only to demonstrate the core concepts of the SECD machine.

### Data structures

The machine recognises only two types of values: integers and closures.

```hs
data Val = I Int
         | Clo Int [Val] [Inst] -- number of arguments, environment, instructions
         deriving (Eq, Show)
```

The machine comes with the following instruction set.

```hs
data Inst = Const Int
          | Global Int
          | Local Int
          | Closure Int Int -- number of args, number of instructions
          | App
          | Ret
          | Dup
          | Add
          deriving (Eq, Show)
```

### The instructions

* Const pushes a constant onto the stack.
* Global pushes a global variable onto the stack.
* Local pushes a local variable onto the stack.
* Closure creates a closure taking a set number of arguments and consuming a set number of instructions from the control stack. The resulting closure is pushed onto the stack.
* App pops a closure and an argument from the stack. If the closure takes more than one argument, the argument is pushed onto the closure's environment and the closure is pushed back onto the stack. If the closure takes only one argument, the argument is pushed onto the closure's environment and the closure is entered rather than pushed back: its instructions become the control stack and its environment becomes the current environment. A closure formed from the old control stack and environment is then pushed onto the stack.
* Ret pops the closure in the second position of the stack and installs it as the current control stack and environment. The element at the top of the stack remains untouched, yielding the result of an application.
* Dup duplicates the element at the top of the stack. It is only included as a matter of convenience.
* Add pops the top two elements off the stack, adds them, and pushes the result back onto the stack.

#### Instruction Table

All stacks grow from right to left; that is, the leftmost element is at the top of the stack.

{% katexmm %}
Before | | | After | | |
---|---|---|---|---|---|
Control | Env | Stack | Control | Env | Stack |
Const i : c | e | s | c | e | i : s |
Global i : c | e | s | c | e | Globals[i] : s |
Local i : c | e | s | c | e | e[i] : s |
Closure a n : $c_1$ ... $c_n$ : c | e | s | c | e | Closure$^a$ {e} [$c_1$ ... $c_n$] : s |
App : c | e | Closure$^1$ {e'} [c'] : v : s | c' | v : e' | Closure$^0$ {e} [c] : s |
App : c | e | Closure$^n$ {e'} [c'] : v : s, with $n > 1$ | c | e | Closure$^{n - 1}$ {v : e'} [c'] : s |
Ret : c | e | v : Closure$^0$ {e'} [c'] : s | c' | e' | v : s |
Dup : c | e | v : s | c | e | v : v : s |
Add : c | e | $v_1$ : $v_2$ : s | c | e | $v_1 + v_2$ : s |
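
The transitions in the table can be read almost directly as a step function. The following is a minimal sketch under some assumptions, not the implementation itself: the names `State`, `nth`, `step` and `run` are introduced here for illustration, a closure counter of 1 is taken to mean "one argument left", and any state the table does not cover yields `Nothing`.

```hs
data Val = I Int
         | Clo Int [Val] [Inst] -- number of arguments, environment, instructions
         deriving (Eq, Show)

data Inst = Const Int | Global Int | Local Int
          | Closure Int Int -- number of args, number of instructions
          | App | Ret | Dup | Add
          deriving (Eq, Show)

-- Hypothetical machine state: (control, environment, stack).
type State = ([Inst], [Val], [Val])

-- Safe list indexing (helper assumed for this sketch).
nth :: [a] -> Int -> Maybe a
nth xs i
  | i >= 0, (x : _) <- drop i xs = Just x
  | otherwise = Nothing

-- One transition of the table; the first argument holds the globals.
step :: [Val] -> State -> Maybe State
step _  (Const i : c, e, s)  = Just (c, e, I i : s)
step gs (Global i : c, e, s) = fmap (\v -> (c, e, v : s)) (nth gs i)
step _  (Local i : c, e, s)  = fmap (\v -> (c, e, v : s)) (nth e i)
step _  (Closure a n : c, e, s) =
  -- The next n instructions become the body of the new closure.
  let (body, rest) = splitAt n c
  in Just (rest, e, Clo a e body : s)
step _  (App : c, e, Clo n e' c' : v : s)
  -- Last argument: enter the closure and save the caller as a closure.
  | n == 1 = Just (c', v : e', Clo 0 e c : s)
  -- More arguments expected: extend the environment and push back.
  | n > 1  = Just (c, e, Clo (n - 1) (v : e') c' : s)
step _  (Ret : _, _, v : Clo 0 e' c' : s) = Just (c', e', v : s)
step _  (Dup : c, e, v : s)               = Just (c, e, v : v : s)
step _  (Add : c, e, I x : I y : s)       = Just (c, e, I (x + y) : s)
step _  _                                 = Nothing

-- Run until the control stack is empty and return the final stack.
run :: [Val] -> [Inst] -> Maybe [Val]
run globals prog = go (prog, [], [])
  where
    go ([], _, s) = Just s
    go st         = step globals st >>= go
```

As a usage example, `run [] [Const 3, Const 2, Closure 2 4, Local 0, Local 1, Add, Ret, App, App]` applies a two-argument adding closure to 2 and then to 3, one `App` at a time, and evaluates to `Just [I 5]`.
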