A-Normal Form
Syntactically, A-Normal form partitions expressions into two forms: atomic expressions and complex expressions.
An expression is atomic if:
- it is guaranteed to terminate;
- it causes no side effects;
- it causes no control effects; and
- it never produces an error.
In A-Normal Form, all non-atomic (complex) expressions must be let-bound, or else appear in tail position.
The following grammar describes a minimal A-Normal form, adapted from Flanagan et al.'s The Essence of Compiling with Continuations:
<aexp> ::= NUMBER | STRING | VAR | BOOLEAN | PRIMOP | (lambda (VAR ...) <exp>) <cexp> ::= (<aexp> <aexp> ...) | (if <aexp> <exp> <exp>) <exp> ::= (let ([VAR <cexp>]) <exp>) | <cexp> | <aexp>
Examples
Neither the function terms nor the arguments can be applications themselves:
((f g) (h x) 3) => (let ([t0 (f g)) (let ([t1 (h x)) (t0 t1 3)))
Factorial:
(define (f n) (if (= n 0) 1 (* n (f (- n 1))))) (f 20) => (define f (λ (n) (let ((g1478 (= n 0))) (if g1478 1 (let ((g1479 (- n 1))) (let ((g1480 (f g1479))) (* n g1480))))))) (f 20)
Connection to assembly code
Juxtaposing some assembly pseudocode with code in A-Normal form hints at the connection between the two:
(define (celsius F) ; celsius: mov F, a0 (let ((t1 (/ 5 9))) ; div t1, 5, 9 (let ((t2 (- F 32))) ; sub t2, F, 32 (* t1 t2)))) ; mul rv, t1, t2 ; ret rv
The normalization process imparts a clear order to code.
Transforming to A-Normal Form
In the original paper, Flanagan et al. provide an A-Normalization algorithm in Scheme.
Remarkably, the code in that paper still works!
Not only does it work, but the code produces efficient output, and its structure is readily extended to new language forms.
Here it is with only cosmetic embellishment in Racket:
(define (Value? M) (match M [`(quote ,_) #t] [(? number?) #t] [(? boolean?) #t] [(? string?) #t] [(? char?) #t] [(? symbol?) #t] [(or '+ '- '* '/ '=) #t] [else #f])) (define (normalize-term M) (normalize M (λ (x) x))) (define (normalize M k) (match M [`(λ ,params ,body) (k `(λ ,params ,(normalize-term body)))] [`(let ([,x ,M1]) ,M2) (normalize M1 (λ (N1) `(let ([,x ,N1]) ,(normalize M2 k))))] [`(if ,M1 ,M2 ,M3) (normalize-name M1 (λ (t) (k `(if ,t ,(normalize-term M2) ,(normalize-term M3)))))] [`(,Fn . ,M*) (normalize-name Fn (λ (t) (normalize-name* M* (λ (t*) (k `(,t . ,t*))))))] [(? Value?) (k M)])) (define (normalize-name M k) (normalize M (λ (N) (if (Value? N) (k N) (let ([t (gensym)]) `(let ([,t ,N]) ,(k t))))))) (define (normalize-name* M* k) (if (null? M*) (k '()) (normalize-name (car M*) (λ (t) (normalize-name* (cdr M*) (λ (t*) (k `(,t . ,t*))))))))
With side effects and a top-level
It's not hard to extend the original Flanagan et al. normalizer to handle top-level defines and side effects:
;; Input language: ;; <prog> ::= <dec> ... ;; <dec> ::= (define (<var> <name> ...) <exp>) ;; | (define <var> <exp>) ;; | <exp> ;; <exp> ::= (let ([<var> <exp>] ...) <exp>) ;; | (if <exp> <exp> <exp>) ;; | (set! <var> <exp>) ;; | (λ (<name> ...) <exp>) ;; | <number> ;; | <boolean> ;; | <string> ;; | <var> ;; Output language: ;; <prog> ::= <dec> ... ;; <dec> ::= (define <var> <exp>) ;; | (begin <dec> ...) ;; | <exp> ;; <aexp> ::= (λ (<name> ...) <exp>) ;; | <number> ;; | <boolean> ;; | <string> ;; | <var> ;; | (void) ;; <cexp> ::= (<aexp> <aexp> ...) ;; | (if <aexp> <exp> <exp>) ;; | (set! <var> <exp>) ;; <exp> ::= (let ([<var> <cexp>]) <exp>) ;; | <aexp> ;; | <cexp> (define (atomic? exp) (match exp [`(quote ,_) #t] [(? number?) #t] [(? boolean?) #t] [(? string?) #t] [(? char?) #t] [(? symbol?) #t] [(or '+ '- '* '/ '=) #t] [else #f])) ;; Expression normalization: (define (normalize-term exp) (normalize exp (λ (x) x))) (define (normalize exp k) (match exp [`(λ ,params ,body) (k `(λ ,params ,(normalize-term body)))] [`(let () ,exp) (normalize exp k)] [`(let ([,x ,exp1] . ,clause) ,exp2) (normalize exp1 (λ (aexp1) `(let ([,x ,aexp1]) ,(normalize `(let (,@clause) ,exp2) k))))] [`(if ,exp1 ,exp2 ,exp3) (normalize-name exp1 (λ (t) (k `(if ,t ,(normalize-term exp2) ,(normalize-term exp3)))))] [`(set! ,v ,exp) (normalize-name exp (λ (t) `(let ([,(gensym '_) (set! ,v ,t)]) ,(k '(void)))))] [`(,f . ,e*) (normalize-name f (λ (t) (normalize-name* e* (λ (t*) (k `(,t . ,t*))))))] [(? atomic?) (k exp)])) (define (normalize-name exp k) (normalize exp (λ (aexp) (if (atomic? aexp) (k aexp) (let ([t (gensym)]) `(let ([,t ,aexp]) ,(k t))))))) (define (normalize-name* exp* k) (if (null? exp*) (k '()) (normalize-name (car exp*) (λ (t) (normalize-name* (cdr exp*) (λ (t*) (k `(,t . ,t*)))))))) ;; Top-level normalization: (define (normalize-define def) (match def [`(define (,f . ,params) ,body) `(define ,f ,(normalize-term `(λ ,params ,body)))] [`(define ,v ,exp) `(begin ,@(flatten-top (normalize-term exp) v))])) (define (flatten-top exp v) (match exp [`(let ([,x ,cexp]) ,exp) (cons `(define ,x ,cexp) (flatten-top exp v))] [else `((define ,v ,exp))])) (define (normalize-program decs) (match decs ['() '()] [(cons `(define . ,_) rest) (cons (normalize-define (car decs)) (normalize-program rest))] [(cons exp rest) (cons (normalize-term exp) (normalize-program rest))]))
Code
- The classic a-normalizer.
- The extended a-normalizer.
- An a-normalizer for expanded Racket by Jay McCarthy.
More resources
For more on the compilation of functional languages, I recommend:
- Andrew Appel's Compiling with Continuations; and
- Christian Queinnec's Lisp in Small Pieces.
- My post on continuation-passing style.