C++ templates: Creating a compile-time higher-order meta-programming language

[article index] [] [@mattmight] [rss]

For the Halloween lecture in my advanced compilers class, I scare students with C++ template meta-programming.

To prove that C++ templates are Turing-complete, we constructed a higher-order functional programming language out of them.

The language supports higher-order functions, literal values (raw types), natural numbers (as types), booleans and conditionals.

Specifically, we made an SICP-like eval/apply interpeter out of templates.

This embedded language shows that any computable function can be computed at compile-time in C++.

How: Template specialization

The Turing-completeness of templates in C++ is an accident arising from the collision of two features: templates and template specialization.

These two features allow C++ templates to act like an untyped term-rewriting language.

A template declares a data type parameterized by some constants and by some types; for example:

template <typename A>
struct ListNode
{
  A data ;
  ListNode<A>* next ;
} ; 

A template specialization allows the programmer to override the definition of a template for some combination of template parameters.

For example, we could override the definition of ListNode when parameterized with the type unsigned int:

template <>
struct ListNode<unsigned int>
{
  /* Don't let them use unsigned ints. */
  int data ;
  ListNode<int>* next ;
} ;

The standard example of why you would want to specialize is so that you could have Vector<bool> implemented as a true bit-vector, instead of wasting one word per element.

Factorial example

The canonical example of compile-time computation is factorial in templates:

template <int N>
struct Factorial 
{
  enum { value = N * Factorial<N-1>::value };
};
 
template <>
struct Factorial<0> 
{
  enum { value = 1 } ;
};

Without the template specialization, a reference to Factorial<4>::value would re-write itself forever, eventually causing the template equivalent of a stack overflow.

Note the use of enumerations to force compile-time evaluation of the member constant value.

With this template in place, code could refer to Factorial<4>::value and get 24 computed at compile-time.

Encoding and pattern-matching data structures

In C++ template meta-programming, types are used to encode structured data, and specializations are used to destructure that data with pattern-matching.

For example, we could encode the natural numbers using a Zero type, and a Succ<Number> template type:

struct Zero 
{
  enum { value = 0 } ;
} ;

template <typename N>
struct Succ 
{
  enum { value = N::value + 1 } ;
} ;

We could then make a template that matches the value 1 encoded as Succ<Zero>:

template <typename N>
struct MatchOne
{
  enum { value = 0 } ;
} ;


template <>
struct MatchOne<Succ<Zero> >
{
  enum { value = 1 } ;
} ;

so that the following code works:

cout << MatchOne<Zero >::value << endl; 
// Prints: 0

cout << MatchOne<Succ<Zero> >::value << endl; 
// Prints: 1

cout << MatchOne<Succ<Succ<Zero> > >::value << endl; 
// Prints: 0

Proving Turing-completeness

To prove Turing-completeness, we have to show that C++ templates can be implemented by a Turing machine (gcc takes care of this), and that C++ templates can implement a Turing machine.

It's well known that the λ-calculus (which actually predates the Turing machine) is Turing-complete, so just have to show that C++ templates can implement the λ-calculus.

In fact, in an effort to make this (a little) more than a toy exercise, the evaluator can handle conditionals, booleans and literal values too.

One could actually do some programming with this language.

The commented code below explains how this is done.

We don't have to utilize much of the C++ language to pull this off.

We don't need templates that generate templates, embedded templates or even templates that accept templates.

We use templates, template specialization, struct, typename and typedef.

And, it's barely 200 lines of code with comments included.

An abstract syntax in templates

We'll use integers for variables names (although we can easily provide aliases for them).

We need to use instantiated template types to represent program terms:

// Anonymous functions:
template 
struct Lambda {} ;

// Function call/Application:
template 
struct App {} ; 

// References:
template 
struct Ref {} ;

// Conditionals:
template 
struct If {} ;

// Literals:
template 
struct Lit {} ;

Environments as templates

Environments, which map names to values, are type-level linked lists:

// EmptyEnv is the empty environment:
struct EmptyEnv ;

// Bindings<Name,Value,Env> is a type than encodes the environment Env
// extended with a binding for Name => Value:
template <int Name, typename Value, typename Env>
struct Binding {} ;

The template metafunction EnvLookup accepts a name and an environment to yield a value:

// EnvLookup<Name,Env> :: result looks up the value of Name in Env:
template <int Name, typename Env>
struct EnvLookup {} ;

template <int Name>
struct EnvLookup <Name,EmptyEnv> {} ; // Name not found.

template <int Name, typename Value, typename Env>
struct EnvLookup <Name, Binding<Name,Value,Env> > 
{
  Value typedef result ;
} ;

template <int Name, int Name2, typename Value2, typename Env>
struct EnvLookup <Name, Binding<Name2,Value2,Env> >
{
  typename EnvLookup<Name,Env> :: result typedef result ;
} ;

EnvLookup illustrates an important convention: to define a type-level meta-function, we typedef its return value to the field result.

Accessing the field result forces the expansion and yields the return value.

Type-level values

Of course, run-time "values" in a template meta-program are actually types:

// Closures:
template 
struct Closure {} ;

// Booleans:
struct True {} ;
struct False {} ;

Eval and apply as meta-functions

Finally, we define eval and apply as type-level meta-functions:

// Eval<Exp,Env> :: result is the value of expression Exp in
// environment Env.
template <typename Exp, typename Env>
struct Eval {} ;

// Apply<Proc,Value> :: result is the value of applying Proc to Value.
template <typename Proc, typename Value>
struct Apply {} ;



// Literals evaluate to themselves:
template <typename T, typename Env>
struct Eval <Lit<T>, Env>
{
  T typedef result ;
} ;

// Variable references are looked up in the current environment:
template <int Name, typename Env>
struct Eval <Ref<Name>, Env>
{
  typename EnvLookup<Name, Env> :: result typedef result ;
} ;

// Lambda terms evaluate into closures:
template <int Name, typename Body, typename Env>
struct Eval <Lambda<Name,Body>, Env>
{
  Closure<Lambda<Name, Body>, Env> typedef result ;
} ;

// Applications apply the value of the function expression to the
// value of the argument expression:
template <typename Fun, typename Arg, typename Env>
struct Eval<App<Fun, Arg> , Env> {
  typename Apply<typename Eval<Fun,Env> :: result ,
                 typename Eval<Arg,Env> :: result > :: result 
           typedef result ;
} ;

// Branch true:
template <typename Then, typename Else, typename Env>
struct Eval<If<True,Then,Else>,Env> {
  typename Eval<Then,Env> :: result typedef result ;
} ;

// Branch false:
template <typename Then, typename Else, typename Env>
struct Eval<If<False,Then,Else>,Env> {
  typename Eval<Else,Env> :: result typedef result ;
} ;

// Evaluate the condition:
template <typename Cond, typename Then, typename Else, typename Env>
struct Eval<If<Cond,Then,Else>,Env> {
  typename Eval<If<typename Eval<Cond,Env> :: result, 
                   Then,
                   Else>,
                Env> :: result 
           typedef result ;
} ;


// Transition to the body of the lambda term inside the closure:
template <int Name, typename Body, typename Env, typename Value>
struct Apply<Closure<Lambda<Name,Body>, Env>, Value> {
  typename Eval<Body, Binding<Name,Value,Env> > :: result 
           typedef result ;
} ;

All together

Now, we can construct and execute simple template meta-programs:

  // Testing ((lambda (x) x) 2):
  enum { X } ;
  
  int x = Eval<App<Lambda<X,Ref<X> >,Lit<Succ<Succ<Zero>
     > > >,EmptyEnv> :: result :: value ;

  assert(x == 2) ;


  // Testing (if #f 0 1):
  int y = Eval<If<Lit<False>,Lit<Zero>,Lit<Succ<Zero>
     > >,EmptyEnv> :: result :: value ;
  
  assert(y == 1) ;

Code

The code is available.

Resources and related pages

  • C++ Templates: The Complete Guide is the definitive guide to C++ templates and template meta-programming:
    It's written by two C++ standards committee members. The afternotes in each chapter which mention the proceedings of the standards committee are fascinating. In case you're wondering, the Turing-completeness of templates was first an accident, and then a feature.
  • There's another implementation of a lambda-calculus-based language for C++ templates out there.
  • Structure and Interpretation of Computer Programs, the classic MIT textbook, covers eval/apply interpreters in Chapter 4:

Exercise

If you would like to practice some of the principles on display in this article, I'm providing a challenge problem: implementing addition as a template meta-program.

The stub code is available.

You need to create (two) specializations of the Add struct to make it work.