Machinebased interpreters
The CESK machine is a machinebased interpreter.
(Most would actually call it a semantics, since it is formally defined.)
At a highlevel, a machinebased interpreter has four components:
 $\mathit{Prog}$  the set of programs.
 $\Sigma$  the set of machine states.
 $\mathit{inject} : \mathit{Prog} \to \Sigma$  a program to initialstate injection function.
 $\mathit{step} : \Sigma \rightharpoonup \Sigma$  a (partial) state to state transition function.
Given a program $p \in \mathit{Prog}$, the interpreter first injects it into an initial machine state $\varsigma_0$:
\[ \varsigma_0 = \mathit{inject}(p)\text. \]The algorithm for running the interpreter is then simple:
ς := ς_{0} while ς is defined for step: ς := step(ς)
A note on determinism
The structure of the $\mathit{step}$ function assumes deterministic evaluation. Nondeterministic evaluation requires a function that yields multiple potential successor states: $\mathit{step} : \Sigma \rightarrow \mathcal{P}(\Sigma)$.
C, E, S and K
The CESK machine takes its name from the four components that comprise a state of its execution: the control string, the environment, the store and the continuation.
C
Depending on the language being interpreted, the control string could be as simple as a program counter, a statement or an expression.
In this article, the control string is an expression.
E
The environment is a structure, almost always a map, that associates variables with an address in the store.
The environment can be implemented as a purely functional map (hash or treebased) or even directly as a firstclass function.
S
The store, which some might liken to a heap or memory, is a map from addresses to values.
Like the environment, the store can be a map (hash or treebased) or a firstclass function.
K
The continuation component is a representation of the program stack, often times represented exactly as a list of frames, or as an implicitly linked list.
ANormal Form
ANormal Form is a normalized variant of the lambdacalculus.
Transforming a language to ANF is straightforward, and it simplifies the structure of an interpreter.
Here's a sample BNF grammar for a reasonable variant on ANF:
lam ::= (λ (var_{1} ... var_{N}) exp) aexp ::= lam  var  #t  #f  integer  (prim aexp_{1} ... aexp_{N}) cexp ::= (aexp_{0} aexp_{1} ... aexp_{N})  (if aexp exp exp)  (call/cc aexp)  (set! var aexp)  (letrec ((var_{1} aexp_{1}) ... (var_{N} aexp_{N})) exp) exp ::= aexp  cexp  (let ((var exp)) exp) prim ::= +    *  =
There are three kinds of expressions:
 Atomic expressions (aexp) are those for which evaluation must always terminate, never cause an error and never produce a side effect.

Complex expressions (cexp) may not terminate, may produce an error and may have a side effect.
However, a complex expression may defer execution to only one other complex expression.
For instance,
letrec
defers directly to its body, and if defers to only one of its arms.  Expressions (exp) can be atomic, complex or letbound. A letbound expression will first defer execution to the binding expression, and then resume execution in the body.
This structure forces order of evaluation to be specified syntactically.
For instance, the meaning of the expression ((f x) (g y))
is undefined until we know whether
(f x)
or (g y)
is executed first.
In ANF, this expression is illegal, and must be written:
(let ((fx (f x))) (let ((gy (g y))) (fx gy)))or
(let ((gy (g y))) (let ((fx (f x))) (fx gy)))
so that there is no ambiguity.
A formal definition
A formal definition of the CESK machine guides the code.
If you're unfamiliar with formal mathematical notation, you may want to review my article on the connection between discrete mathematics and code.
If you're only interested in running code, skip ahead.
The statespace, $\Sigma$, of the CESK machine for ANF has four components:
\[ \varsigma \in \Sigma = \mathsf{Exp} \times \mathit{Env} \times \mathit{Store} \times \mathit{Kont} \text. \]In case it's not clear, $\mathsf{Exp}$ is the set of of all expressions defined by the earlier grammar. Also, the notation $\varsigma \in \Sigma$ is a hint that the symbol $\varsigma$ will be used to denote members of the set $\Sigma$.
Environments
The environment in a machine state is a partial function that maps a variable to its address:
\[ \rho \in \mathit{Env} = \mathsf{Var} \rightharpoonup \mathit{Addr} \]It has to be a partial function, because not all variables are in every scope.
Once again, the hint $\rho \in \mathit{Env}$ indicates that the symbol $\rho$ will be used to denote environments.
Stores
A store maps addresses to values:
\[ \sigma \in \mathit{Store} = \mathit{Addr} \rightharpoonup \mathit{Value} \text. \]In a CESK machine, variable lookup is a twostage process: first to an address (through some environment), then to a value (through the store).
Values
There are five kinds of values in this machinevoid, booleans, integers, closures and firstclass continuations:
\[ \mathit{val} \in \mathit{Value} ::= \mathbf{void} \mathrel{} \mathit{z} \mathrel{} \mathtt{\#t} \mathrel{} \mathtt{\#f} \mathrel{} \mathbf{clo}(\mathit{lam}, \rho) \mathrel{} \mathbf{cont}(\kappa) \]In the set of values, $z$ is an integer, while $\mathtt{\#t}$ and $\mathtt{\#f}$ are booleans.
A closure pairs a lambda term
with an environment to define the values of its free variables.
The environment is necessary because a term like
(λ () x)
is undefined, unless an environment
specifies the value of x
.
Continuations are included in values because
the language includes call/cc
,
which enables the creation of firstclass continuations.
Continuations
A continuation is effectively the program stack.
Creating a continuation allows us to divert to a complex subcomputation and return later.
So, a continuation needs enough information to resume execution.
For this machine, diverting to a subcomputation only happens in letbound expressions.
Given a letbound expression (let ([v exp]) body)
,
execution will first go to exp
, which means
that when it finishes evaluating exp
,
the result will bind to v
,
and execution will resume with body
.
In a CESK machine, the assumption is that the current computation is always executing on behalf of some continuation awaiting its result. (The special initial continuation, which awaits the result of the program, is called $\mathbf{halt}$.)
Consequently, continuations nest within continuations.
Finally, every continuation must contain the local environment that knows the addresses of the variables in scope.
Putting this all together lets us formally define the space of continuations:
\[ \kappa \in \mathit{Kont} ::= \mathbf{letk}(v, \rho, e, \kappa) \mathrel{} \mathbf{halt} \]Evaluating atomic expressions
Atomic expressions ($\mathit{aexp}$ in the grammar) are easy to evaluate with an auxilary semantic function, $\mathcal{A} : \mathsf{AExp} \times \mathit{Env} \times \mathit{Store} \rightharpoonup \mathit{Value}$:
Integers evaluate to themselves:
\[ \mathcal{A}(z, \rho, \sigma) = z \text. \]Booleans do too:
\[ \mathcal{A}( \mathtt{\#t}, \rho, \sigma) = \#t \text; \] \[ \mathcal{A}( \mathtt{\#f}, \rho, \sigma) = \#f \text. \]Lambda terms become closures:
\[ \mathcal{A}( \mathit{lam}, \rho, \sigma) = \mathbf{clo}( \mathit{lam}, \rho ) \text. \]Primitive expressions are evaluated recursively:
\[ \mathcal{A}( \mathtt{(} \mathit{prim}\; \mathit{aexp}_1 \ldots \mathit{aexp}_n \mathtt{)}, \rho, \sigma) = \\ \mathcal{O}(\mathit{prim}) \langle \mathcal{A}( \mathit{aexp}_1, \rho, \sigma), \ldots, \mathcal{A}( \mathit{aexp}_n, \rho, \sigma) \rangle \text. \]where $\mathcal{O} : \mathit{Prim} \to (\mathit{Value}^* \rightharpoonup \mathit{Value})$ maps a primitive operation (by name) to its corresponding operation.
Stepping forward
To define the step function, $\mathit{step} : \Sigma \rightharpoonup \Sigma$, for this machine, we need a case for each expression type.
Procedure call
In a procedure call, the $\mathit{step}$ function first evaluates the expression for procedure to be invoked, and then the expressions for the arguments to be supplied.
Then it applies that procedure to those arguments.
\[ \mathit{step}( \mathtt{(} \mathit{aexp}_0 \; \mathit{aexp}_1 \ldots \mathit{aexp}_n \mathtt{)}, \rho, \sigma, \kappa ) = \\ applyproc(proc,\langle \mathit{value}_1, \ldots, \mathit{value}_n \rangle,\sigma,\kappa) \text{, where} \\ \begin{array}{l} \mathit{proc} = \mathcal{A}(\mathit{aexp}_0, \rho, \sigma) \\ \mathit{value}_i = \mathcal{A}(\mathit{aexp}_i, \rho, \sigma) \end{array} \] and $applyproc : \mathit{Value} \times \mathit{Value}^* \times \mathit{Store} \times \mathit{Kont} \rightharpoonup \Sigma$ is an auxiliary function (defined below) that applies a procedure to a value.Return
When the expression under evaluation is an atomic expression, it indicates that the current subcomputation is finished and we need to return the result to the current continuation, which has been patiently awaiting it:
\[ \mathit{step}(\mathit{aexp}, \rho, \sigma, \kappa) = \mathit{applykont}(\kappa, \mathcal{A}(\mathit{aexp}, \rho, \sigma), \sigma) \text, \]where the auxilary function $\mathit{applykont} : \mathit{Kont} \times \mathit{Value} \times \mathit{Store} \rightharpoonup \mathit{State}$ (defined below) applies a continuation to a value.
Conditionals
Conditional evaluation is straightforward: the condition is evaluated, and the right expression is chosen for the next state.
\[ \mathit{step}( \mathtt{(if}\; \mathit{aexp}\; \mathit{e}_{\mathrm{true}} \; \mathit{e}_{\mathrm{false}} \mathtt{)}, \rho, \sigma, \kappa) = \begin{cases} ( \mathit{e}_{\mathrm{true}}, \rho, \sigma, \kappa ) & \mathcal{A}(\mathit{aexp}, \rho, \sigma) \neq \mathtt{\#f} \\ ( \mathit{e}_{\mathrm{false}}, \rho, \sigma, \kappa ) & \text{otherwise.} \end{cases} \]Let
Evaluating let
will force the creation of a continuation.
Since execution first evaluates the bound expression, the continuation will contain
enough information to resume execution in the body of the let
.
where $\kappa' = \mathbf{letk}(v,\mathit{body}, \rho, \kappa)$.
Mutation
The CESK approach makes mutation straightforward: look up the address to be changed, and then overwrite that address in the store. \[ \mathit{step}( \mathtt{(set!}\; \mathit{v}\; \mathit{aexp} \mathtt{)}, \rho, \sigma, \kappa) = \mathit{applykont}(\kappa,\mathbf{void},\sigma') \text, \] where $\sigma' = \sigma[\rho(v) \mapsto \mathcal{A}(\mathit{aexp}, \rho, \sigma)]$.Notation. Given a function (or partial function) $f : X \to Y$, the function $f[x \mapsto y]$ is identical to $f$ except that $x$ yields $y$:
\[ (f[x \mapsto y])(x) = y \] \[ (f[x \mapsto y])(x') = f(x') \text{ if } x \neq x' \text. \]Recursion
Handling recursion requires establishing selfreference.
In a language like Scheme, the construct letrec
is often compiled into "lets and sets";
that is:
(letrec ([v1 exp1] ... [vN expN]) body)
becomes:
(let ([v1 (void)] ... [vN (void)]) (set! v1 exp1) ... (set! vN expN) body)
A CESK machine can fake this by extending the environment first, and then evaluating the expressions in the context of the extended environment:
\[ \mathit{step}( \mathtt{(letrec\; (} \mathtt{[}\mathit{v}_1\; \mathit{aexp}_1\mathtt{]} \ldots \mathtt{[}\mathit{v}_n\; \mathit{aexp}_n\mathtt{]} \mathtt{)}\; \mathit{body} \mathtt{)}, \rho, \sigma, \kappa) = (\mathit{body}, \rho', \sigma', \kappa)\text, \]where:
\[ a_1,\ldots,a_n \text{ are fresh addresses in } \sigma \\ \rho' = \rho[v_i \mapsto a_i] \\ \mathit{value}_i = \mathcal{A}(\mathit{aexp}_i, \rho', \sigma) \\ \sigma' = \sigma[a_i \mapsto \mathit{value}_i] \text. \]Firstclass continuations
Firstclass continuations are a powerful construct, since they allow the simulation of so many other control constructs. For instance, exceptions are merely syntactic sugar on top of continuations.
And, continuations can do many other things too.
The procedure call/cc
captures the current continuation as a firstclass procedure:
Applying procedures
The auxiliar function $applyproc : \mathit{Value} \times \mathit{Value}^* \times \mathit{Store} \times \mathit{Kont} \rightharpoonup \Sigma$ applies a procedure to a value.
\[ \mathit{applyproc}(\mathbf{clo}( \mathtt{(\lambda\; (\mathit{v}_1 \ldots \mathit{v_n})\; \mathit{body})},\rho) \langle \mathit{value}_1, \ldots \mathit{value}_n \rangle, \sigma, \kappa ) = \\ (\mathit{body}, \rho', \sigma', \kappa) \text{, where } \] \[ a_1,\ldots,a_n \text{ are fresh addresses in } \sigma \\ \rho' = \rho[v_i \mapsto a_i] \\ \sigma' = \sigma[a_i \mapsto \mathit{value}_i] \text. \]Applying continuations
The auxilary function $\mathit{applykont} : \mathit{Kont} \times \mathit{Value} \times \mathit{Store} \rightharpoonup \mathit{State}$ applies a continuation to a return value:
\[ \mathit{applykont}( \mathbf{letk}(v,e,\rho,\kappa), \mathit{value}, \sigma) = (e, \rho[v \mapsto \mathit{a}], \sigma[a \mapsto \mathit{value}], \kappa) \text, \]where $a \not\in \mathit{dom}(\sigma)$ is a fresh address.
As running code
I've transliterated the math here directly into working Racket code for a CESK interpreter.
Further reading
There are a few good books on implementing compilers and interpreters for functional languages.
The classic MIT text, Structure and Interpretation of Computer Programs, is worth the read:
Lisp in Small Pieces is a consistent recommendation in the courses I teach:
For advanced techniques, Appel's Compiling with Continuations remains my favorite reference: