A fast yet portable and reflectable struct library for Scheme using syntax-rules and macro-defining macros
This article covers the implementation of a fast yet flexible struct system in Scheme using syntax-rules and macro-defining macros. I was inspired by Oleg's article on how to build "identifiers" with syntax-rules.
Oleg points out that a common criticism of syntax-rules
macros--that they cannot produce new identifiers--is true but
misguided.
Using macro-defining macros, we can achieve substantially the same
effect as creating new identifiers.
Oleg proves this by building a basic record library using R5RS scheme.
For example, we can replace (make-foo field-value ...)
with
(make foo field-value ...)
and (foo->x struct)
with
(foo x struct)
.
The library below is based on Scheme's vectors, so it should have
performance on par with using a struct
in C.
Even the overhead of computing the index of the field is
melted away at compile-time.
The information necessary for run-time reflection is also computed
purely at compile-time, since the quote
form must (by
definition) yield a compile-time global data structure.
For instance, (make foo field-value ...)
expands into
(vector '(information about the struct and its fields) field-value ...)
.
And, (foo x struct)
will become (vector-ref struct
3)
at compile-time.
[Actually, it will expand into (vector-ref obj (+ 1 (+ 1 (+ 1
0)))
, but all Scheme compilers will fold the constants at
compile time.]
At the same time, (foo get 'x struct)
can use the
information stored in the struct to reflectivly compute the index of
field x
at run-time.
Thus, programmers don't have to choose between reflectivity and performance.
All of the code below uses pure R5RS and syntax-rules.
Vector-based struct library interface
A syntax-rules macro cannot introduce new identifiers, but it can
bind given identifiers to macros.
Thus, the core trick for the vector struct library is that
(define-vector-struct name field ...)
expands into
(define-syntax name ...)
.
The efficient, compile-time interface for the vector struct library is then:
(make name field-values ...)
to allocate a new vector struct.(name fields)
to get a compile-time list of the field names as symbols.(name index-of field)
to compute the offset of field at compile time.(name field struct)
to inline the access to field.(name field set! struct value)
to inline the mutation of field.(name ? struct)
to inline the type test.
For convenience, Curried forms of the above are also provided; for example:
(name field)
is the accessor for field.(name field set!)
is the mutator for field.(name ?)
is the predicate for the struct name.
It is also possible to look up the a field at run-time given its symbol. Thus, for convenience, there are also run-time variants of the above:
(name get field-exp struct)
evaluates field-exp to a symbol, and then fetches that field.(name set! field-exp struct value)
evaluates field-exp to a symbol, and then mutates that field.
The expected Curried forms of the above are also provided.
Finally, one need not even know the type of the struct to interact with it. The name of the struct and its field names are embedded in the first field of the vector as a compile-time quoted list constant. Hence, the library provides the following reflective methods:
(vector-struct? value)
is true if the supplied value is a vector-struct.(vector-struct->name struct)
returns the name of the vector struct.(vector-struct->fields struct)
returns a list of field names.(vector-struct-get struct field-exp)
returns the value of the field.(vector-struct-set! struct field-exp value)
mutates the field.
This library provides a fast, efficient implementation of structs based on vectors.
Where speed is concerned, programmers can use the first set of
commands, which desugars directly into (vector ...)
,
(vector-ref ...)
and (vector-set! ...)
with
no compile-time overhead.
Where more run-time flexibility is required, programmers can choose
the second set commands, which offers the ability to look-up and
mutate fields based on run-timed-generated symbol names when the type
of the struct is still known.
The third (purely reflective) set of commands, while least efficient,
is perfectly suited for applications where run-time reflection is
convenient (e.g. databases and web programming).