What is template meta-programming?
Template meta-programming allows a programmer to perform arbitrary computations and program transformations at compile-time.
It's all thanks to an accident of design.
The combination of templates and partial template specialization allows one to inject a purely functional term-rewriting system into the template-expansion phase of the C++ compiler.
(I gave a lengthier introduction to this topic in an earlier post on embedding an interpreter for the lambda calculus in C++ templates.)
Template meta-programming tip
I advocate a style of template meta-programming that trusts (but verifies) that the
compiler will perform certain optimizations, such as inlining functions
marked inline and folding constants.
Consequently, I don't have to perform all calculations at the template level; this saves labor, and meta-programs end up with a more natural look.
When debugging performance, it's useful to use the gcc flags
-Winline -finline-functions -O2
GCC will report any failed attempt at inlining a function marked inline.
The DSEL
As with previous posts, we'll create a domain-specific embedded language (DSEL) for describing anonymous closures.
Since this is a proof of concept, this DSEL has only five kinds of expressions: sum, product, exponentiation, function arguments and literal values.
More could be added without difficulty.
To implement DSELs, the trick is to overload operations so that instead of executing those operations, they construct a data structure representing the computation.
That structure can then be reinterpreted with the semantics of the DSEL.
What makes template meta-programming interesting is that this reinterpretation can happen at compile-time, and that structure can be packed into the type, rather than just a run-time value.
(Of course, this isn't surprising to Lisp programmers.)
We'll need a struct to describe each kind of non-literal expression:
// E1 + E2 template <typename R1, typename R2> struct Sum ; // E1 * E2 template <typename R1, typename R2> struct Prod ; // E1 ^ E2 template <typename R1, typename R2> struct Pow ; // nth argument, takes type T: template <typename T, int n> struct Arg ;
It's useful to predeclare a few variables to stand in as arguments, but it's also possible to define them on the fly using what become no-op functions:
Arg<double,0> $0d ; Arg<double,1> $1d ; Arg<int,0> $0i ; Arg<int,1> $1i ; // $<T,position>() : Arg<T,position> template <typename T, int position> static inline Arg<T,position> $() ;
The goal is to give an expression like
a * $0d + 3
a type like
Sum< Prod< double , Arg< double , 0 > >, int >
so that the type, which is amenable to inspection by meta-programs, contains enough information about the structure of the computation to reconstitute that structure.
But, the type doesn't contain the value of a or even the
constant 3; how can a meta-program reconstitute the computation
from its type?
The answer is that these values must be packed into the run-time struct
created by the expression a*$0d + 3, so that a meta-program can
recurse over this struct as it unfolds the type back into a
computation.
Take a look at the definition of the struct for Sum:
template <typename R1, typename R2>
struct Sum : public Exp<Sum<R1,R2> > {
const R1 lhs ;
const R2 rhs ;
Sum<R1, R2> (R1 l, R2 r) : lhs(l), rhs(r) {}
} ;
So, a Sum contains the run-time values of both its operands (if any).
What's nice about this approach is that if an operand is an argument, it's a struct of size zero, so it disappears. If everything is known statically for the entire closure, the whole struct disappears; this approach is pay-as-you-go.
Sum inherits from Exp, and
oddly, it passes its own type as the parameter.
This cyclic trickery gives the base class a handle on the derived class.
This handle is useful for pushing all of the syntactic sugar in the
DSEL into the definition of the struct Exp:
template <typename T>
struct Exp {
// Operators build up a syntax tree:
template <typename M>
inline Sum<T,M> operator + (M m) ;
template <typename M>
inline Prod<T,M> operator * (M m) ;
template <typename M>
inline Pow<T,M> operator ^ (M m) ;
// Application inlines the function:
inline
typename Infer<T>::type
operator () () ;
template <typename A0>
inline
typename Infer<T>::type
operator () (A0 arg0) ;
template <typename A0, typename A1>
inline
typename Infer<T>::type
operator () (A0 arg0, A1 arg1) ;
} ;
The overloads for sum, product and exponentation (what was XOR) are straightforward.
The overloads for application, which allow a static closure to be applied as if it were a regular function, take some explaining.
A meta-program for type inference
It's annoying to have to write down the return type of an anonymous function. It clutters things up.
To get around this, a template meta-program can infer the return type of a static closure.
If T is the type of a static closure expression,
then Infer<T>::type is the type that closure
will return if invoked.
As the portion of Infer that handles inference of types over
sums shows, it is a straightforward meta-program:
template <typename E>
struct Infer {
// If it's not a syntax expression,
// then it's a free value, so its
// type is itself:
typedef E type ;
} ;
// Arguments carry their type:
template <typename T, int n>
struct Infer< Arg<T,n> > {
typedef T type ;
} ;
// type(double + double) => double
template < >
struct Infer< Sum<double,double> > {
typedef double type ;
} ;
// type(double + int) => double
template < >
struct Infer< Sum<double,int> > {
typedef double type ;
} ;
// type(int + double) => double
template < >
struct Infer< Sum<int,double> > {
typedef double type ;
} ;
// type(int + int) => int
template < >
struct Infer< Sum<int,int> > {
typedef int type ;
} ;
// type(E1 + E2) => type(type(E1) + type(E2))
template <typename E1, typename E2>
struct Infer < Sum<E1,E2> > {
typename Infer<E1>::type typedef E1T ;
typename Infer<E2>::type typedef E2T ;
typename Infer<Sum<E1T,E2T> >::type typedef type ;
} ;
The rules for Prod and Pow are similar.
The meta-compiler
What makes anonymous closures actually work is the meta-compiler,
Inline, which recurs over the type of the anonymous closure to
produce its computation:
template <typename T>
struct Inline {
// In the base case, we have a run-time value
// stored in the struct built by the static
// closure, so we inline it here.
static inline T at (const T dynamicValue) {
return dynamicValue ;
}
template <typename A0>
static inline T at (const T dynamicValue, A0 arg0) {
return dynamicValue ;
}
template <typename A0, typename A1>
static inline T at (const T dynamicValue,
A0 arg0,
A1 arg1) {
return dynamicValue ;
}
} ;
// compile($0, x0, ...) => x1
template <typename X1>
struct Inline < Arg<X1,0> > {
template <typename A0>
static inline A0 at (const Arg<X1,0> rtVal, A0 arg0) {
return arg0 ;
}
template <typename A0, typename A1>
static inline A0 at (const Arg<X1,0> rtVal, A0 arg0,
A1 arg1) {
return arg0 ;
}
} ;
// compile($1, x0, x1, ...) => x1
// ...
// compile(E1 + E2, ...) =>
// compile(E1, ...) + compile(E2, ...)
template <typename E1, typename E2>
struct Inline <Sum<E1,E2> > {
typedef Sum<E1,E2> Base ;
typename Infer<Base>::type typedef T ;
static inline T at (const Sum<E1,E2> rtVal) {
return Inline<E1>::at(rtVal.lhs) +
Inline<E2>::at(rtVal.rhs) ;
} ;
template <typename A0>
static inline T at (const Sum<E1,E2> rtVal, A0 arg0) {
return Inline<E1>::at(rtVal.lhs, arg0) +
Inline<E2>::at(rtVal.rhs, arg0) ;
} ;
template <typename A0, typename A1>
static inline T at (const Sum<E1,E2> rtVal, A0 arg0,
A1 arg1) {
return Inline<E1>::at(rtVal.lhs, arg0, arg1) +
Inline<E2>::at(rtVal.rhs, arg0, arg1) ;
} ;
} ;
The rules for Prod and Pow
look much like those for Sum.
In action
Here's the code to INTEGRATE,
as motivated by the introduction:
template < typename F >
static inline
double INTEGRATE (F f, double a, double b, int n) {
double delta = (b - a) / n ;
double area = 0.0 ;
double x = a ;
for (int i = 0; i < n; ++i, x += delta) {
double y = f(x) ;
area += (y * delta) ;
}
return area ;
}
This function assumes that a static closure will be passed in as the first argument (or some other function that overloads apply).
And, the static closure will unfold itself
at the point of application: f(x).
Here are a couple more examples of how static closures can be used:
double j = ($1i + 10 + $0d)(39.0, 50) ; Arg<double,0> x ; double m = 0.37 ; double b = 0.41 ; double A = INTEGRATE(x*m + b, 0.0, 3.0, 2000) ;
Source code
The full source code, with examples, is available: SC.cpp.
More resources
- Boost::Proto.
-
The C++ Programming Language
by C++ creator Bjarne Stroustrup.
- My post on embedding lambdas in C++.
- My lambda-calculus interpreter as a C++ template metaprogram.
-
C++ Templates: The Complete Guide
, which I used for these posts.