A-Normal Form
Syntactically, A-Normal form partitions expressions into two forms: atomic expressions and complex expressions.
An expression is atomic if:
- it is guaranteed to terminate;
- it causes no side effects;
- it causes no control effects; and
- it never produces an error.
In A-Normal Form, all non-atomic (complex) expressions must be let-bound, or else appear in tail position.
The following grammar describes a minimal A-Normal form, adapted from Flanagan et al.’s The Essence of Compiling with Continuations:
<aexp> ::= NUMBER | STRING | VAR | BOOLEAN | PRIMOP
| (lambda (VAR ...) <exp>)
<cexp> ::= (<aexp> <aexp> ...)
| (if <aexp> <exp> <exp>)
<exp> ::= (let ([VAR <cexp>]) <exp>)
| <cexp>
| <aexp>
Examples
Neither the function terms nor the arguments can be applications themselves:
((f g) (h x) 3)
=>
(let ([t0 (f g))
(let ([t1 (h x))
(t0 t1 3)))
Factorial:
(define (f n)
(if (= n 0)
1
(* n (f (- n 1)))))
(f 20)
=>
(define f
(λ (n)
(let ((g1478 (= n 0)))
(if g1478
1
(let ((g1479 (- n 1)))
(let ((g1480 (f g1479)))
(* n g1480)))))))
(f 20)
Connection to assembly code
Juxtaposing some assembly pseudocode with code in A-Normal form hints at the connection between the two:
(define (celsius F) ; celsius: mov F, a0
(let ((t1 (/ 5 9))) ; div t1, 5, 9
(let ((t2 (- F 32))) ; sub t2, F, 32
(* t1 t2)))) ; mul rv, t1, t2
; ret rv
The normalization process imparts a clear order to code.
Transforming to A-Normal Form
In the original paper, Flanagan et al. provide an A-Normalization algorithm in Scheme.
Remarkably, the code in that paper still works!
Not only does it work, but the code produces efficient output, and its structure is readily extended to new language forms.
Here it is with only cosmetic embellishment in Racket:
(define (Value? M)
(match M
[`(quote ,_) #t]
[(? number?) #t]
[(? boolean?) #t]
[(? string?) #t]
[(? char?) #t]
[(? symbol?) #t]
[(or '+ '- '* '/ '=) #t]
[else #f]))
(define (normalize-term M) (normalize M (λ (x) x)))
(define (normalize M k)
(match M
[`(λ ,params ,body)
(k `(λ ,params ,(normalize-term body)))]
[`(let ([,x ,M1]) ,M2)
(normalize M1 (λ (N1)
`(let ([,x ,N1])
,(normalize M2 k))))]
[`(if ,M1 ,M2 ,M3)
(normalize-name M1 (λ (t)
(k `(if ,t ,(normalize-term M2)
,(normalize-term M3)))))]
[`(,Fn . ,M*)
(normalize-name Fn (λ (t)
(normalize-name* M* (λ (t*)
(k `(,t . ,t*))))))]
[(? Value?) (k M)]))
(define (normalize-name M k)
(normalize M (λ (N)
(if (Value? N) (k N)
(let ([t (gensym)])
`(let ([,t ,N]) ,(k t)))))))
(define (normalize-name* M* k)
(if (null? M*)
(k '())
(normalize-name (car M*) (λ (t)
(normalize-name* (cdr M*) (λ (t*)
(k `(,t . ,t*))))))))
With side effects and a top-level
It’s not hard to extend the original Flanagan et al. normalizer to handle top-level defines and side effects:
;; Input language:
;; <prog> ::= <dec> ...
;; <dec> ::= (define (<var> <name> ...) <exp>)
;; | (define <var> <exp>)
;; | <exp>
;; <exp> ::= (let ([<var> <exp>] ...) <exp>)
;; | (if <exp> <exp> <exp>)
;; | (set! <var> <exp>)
;; | (λ (<name> ...) <exp>)
;; | <number>
;; | <boolean>
;; | <string>
;; | <var>
;; Output language:
;; <prog> ::= <dec> ...
;; <dec> ::= (define <var> <exp>)
;; | (begin <dec> ...)
;; | <exp>
;; <aexp> ::= (λ (<name> ...) <exp>)
;; | <number>
;; | <boolean>
;; | <string>
;; | <var>
;; | (void)
;; <cexp> ::= (<aexp> <aexp> ...)
;; | (if <aexp> <exp> <exp>)
;; | (set! <var> <exp>)
;; <exp> ::= (let ([<var> <cexp>]) <exp>)
;; | <aexp>
;; | <cexp>
(define (atomic? exp)
(match exp
[`(quote ,_) #t]
[(? number?) #t]
[(? boolean?) #t]
[(? string?) #t]
[(? char?) #t]
[(? symbol?) #t]
[(or '+ '- '* '/ '=) #t]
[else #f]))
;; Expression normalization:
(define (normalize-term exp) (normalize exp (λ (x) x)))
(define (normalize exp k)
(match exp
[`(λ ,params ,body)
(k `(λ ,params ,(normalize-term body)))]
[`(let () ,exp)
(normalize exp k)]
[`(let ([,x ,exp1] . ,clause) ,exp2)
(normalize exp1 (λ (aexp1)
`(let ([,x ,aexp1])
,(normalize `(let (,@clause) ,exp2) k))))]
[`(if ,exp1 ,exp2 ,exp3)
(normalize-name exp1 (λ (t)
(k `(if ,t ,(normalize-term exp2)
,(normalize-term exp3)))))]
[`(set! ,v ,exp)
(normalize-name exp (λ (t)
`(let ([,(gensym '_) (set! ,v ,t)])
,(k '(void)))))]
[`(,f . ,e*)
(normalize-name f (λ (t)
(normalize-name* e* (λ (t*)
(k `(,t . ,t*))))))]
[(? atomic?)
(k exp)]))
(define (normalize-name exp k)
(normalize exp (λ (aexp)
(if (atomic? aexp) (k aexp)
(let ([t (gensym)])
`(let ([,t ,aexp]) ,(k t)))))))
(define (normalize-name* exp* k)
(if (null? exp*)
(k '())
(normalize-name (car exp*) (λ (t)
(normalize-name* (cdr exp*) (λ (t*)
(k `(,t . ,t*))))))))
;; Top-level normalization:
(define (normalize-define def)
(match def
[`(define (,f . ,params) ,body)
`(define ,f ,(normalize-term `(λ ,params ,body)))]
[`(define ,v ,exp)
`(begin ,@(flatten-top (normalize-term exp) v))]))
(define (flatten-top exp v)
(match exp
[`(let ([,x ,cexp]) ,exp)
(cons `(define ,x ,cexp)
(flatten-top exp v))]
[else
`((define ,v ,exp))]))
(define (normalize-program decs)
(match decs
['()
'()]
[(cons `(define . ,_) rest)
(cons (normalize-define (car decs))
(normalize-program rest))]
[(cons exp rest)
(cons (normalize-term exp)
(normalize-program rest))]))
Code
- The classic a-normalizer
- The extended a-normalizer
- An a-normalizer for expanded Racket by Jay McCarthy
More resources
For more on the compilation of functional languages, I recommend:
- Andrew Appel’s Compiling with Continuations; and
- Christian Queinnec’s Lisp in Small Pieces.
- My post on continuation-passing style.
Related articles
- 26 languages in 25 days: Reflections on language design
- 26 languages in 25 days: Strategy, tactics, logistics
- Tree transformations: Desugaring Scheme
- Lexical analysis in Racket
- Grammar: The language of languages (BNF, EBNF, ABNF)
- What is static program analysis?
- Implementing Java as a CESK machine, in Java
- Writing an interpreter, CESK-style
- Order theory for computer scientists
- HOWTO: Translate math into code
- Writing CEK-style interpreters in Haskell
- Closure conversion: How to compile lambda
- How to compile with continuations
- Understand exceptions by implementing them
- Compiling up to the λ-calculus
- Parsing with derivatives (Yacc is dead: An update)
- By example: Continuation-passing style in JavaScript
- 7 lines of code, 3 minutes: Implement a programming language
- Architectures for interpreters
- First-class macros from meta-circular evaluators
- Programming with continuations by example
- Compiling Scheme to C
- Compiling to Java
- Church encodings in Scheme
- Non-termination without loops, iteration or recursion in Javascript
- Memoizing recursive functions in Javascript with the Y combinator
- Advanced programming languages
- Recommended books and papers for grad students
- Desugaring regular operations in context-free grammars
- Parsing S-Expressions in Scala
- Standalone lexers with lex: synopsis, examples, and pitfalls
- An interpreter for Lambdo
- Parsing BibTeX into S-Expressions, JSON, XML and BibTeX
- Low-level web programming in Racket
- Understanding and implementing laziness
- Higher-order list operations
- 99 ways to say '(I love you) in Racket
- Hacking the OS X Address Book with C, Racket and LaTeX
- 2011 Scheme Workshop: Accepting Submissions
- Lexical analysis and syntax-highlighting in JavaScript
- k-CFA: Analyzing types and control-flow in dynamic languages
- Fast vector-structs in Scheme from syntax-rules
- Matching regular expressions with derivatives
- Fermat and Solovay-Strassen primality tests in Scheme
- An implementation of RSA in Scheme
Twitter: @mattmight
Instagram: @mattmight
LinkedIn: matthewmight
Mastodon: @mattmight@mathstodon.xyz
Sub-reddit: /r/mattmight