A fast yet portable and reflectable struct library for Scheme using syntax-rules and macro-defining macros

This article covers the implementation of a fast yet flexible struct system in Scheme using syntax-rules and macro-defining macros. I was inspired by Oleg's article on how to build "identifiers" with syntax-rules.

Oleg points out that a common criticism of syntax-rules macros--that they cannot produce new identifiers--is true but misguided. Using macro-defining macros, we can achieve substantially the same effect as creating new identifiers. Oleg proves this by building a basic record library using R5RS scheme. For example, we can replace (make-foo field-value ...) with (make foo field-value ...) and (foo->x struct) with (foo x struct). The library below is based on Scheme's vectors, so it should have performance on par with using a struct in C.

Even the overhead of computing the index of the field is melted away at compile-time. The information necessary for run-time reflection is also computed purely at compile-time, since the quote form must (by definition) yield a compile-time global data structure. For instance, (make foo field-value ...) expands into (vector '(information about the struct and its fields) field-value ...). And, (foo x struct) will become (vector-ref struct 3) at compile-time. [Actually, it will expand into (vector-ref obj (+ 1 (+ 1 (+ 1 0))), but all Scheme compilers will fold the constants at compile time.] At the same time, (foo get 'x struct) can use the information stored in the struct to reflectivly compute the index of field x at run-time. Thus, programmers don't have to choose between reflectivity and performance.

All of the code below uses pure R5RS and syntax-rules.

Vector-based struct library interface

A syntax-rules macro cannot introduce new identifiers, but it can bind given identifiers to macros. Thus, the core trick for the vector struct library is that (define-vector-struct name field ...) expands into (define-syntax name ...).

The efficient, compile-time interface for the vector struct library is then:

  • (make name field-values ...) to allocate a new vector struct.
  • (name fields) to get a compile-time list of the field names as symbols.
  • (name index-of field) to compute the offset of field at compile time.
  • (name field struct) to inline the access to field.
  • (name field set! struct value) to inline the mutation of field.
  • (name ? struct) to inline the type test.

For convenience, Curried forms of the above are also provided; for example:

  • (name field) is the accessor for field.
  • (name field set!) is the mutator for field.
  • (name ?) is the predicate for the struct name.

It is also possible to look up the a field at run-time given its symbol. Thus, for convenience, there are also run-time variants of the above:

  • (name get field-exp struct) evaluates field-exp to a symbol, and then fetches that field.
  • (name set! field-exp struct value) evaluates field-exp to a symbol, and then mutates that field.

The expected Curried forms of the above are also provided.

Finally, one need not even know the type of the struct to interact with it. The name of the struct and its field names are embedded in the first field of the vector as a compile-time quoted list constant. Hence, the library provides the following reflective methods:

  • (vector-struct? value) is true if the supplied value is a vector-struct.
  • (vector-struct->name struct) returns the name of the vector struct.
  • (vector-struct->fields struct) returns a list of field names.
  • (vector-struct-get struct field-exp) returns the value of the field.
  • (vector-struct-set! struct field-exp value) mutates the field.

This library provides a fast, efficient implementation of structs based on vectors. Where speed is concerned, programmers can use the first set of commands, which desugars directly into (vector ...), (vector-ref ...) and (vector-set! ...) with no compile-time overhead. Where more run-time flexibility is required, programmers can choose the second set commands, which offers the ability to look-up and mutate fields based on run-timed-generated symbol names when the type of the struct is still known. The third (purely reflective) set of commands, while least efficient, is perfectly suited for applications where run-time reflection is convenient (e.g. databases and web programming).

Code

matt.might.net is powered by linode | legal information