Lazy-list-based streams in Scala

Lazy lists are a concise and expressive solution to the problem of how to represent character and token sequences in a compiler. Lazy lists compute and then cache their tails on demand, instead of building the entire list all at once. This means that lazy lists can represent infinite lists, and they introduce pay-as-you-go memory consumption.

Because lazy lists don't compute their tails until needed, you can even throw an exception in the tail of the list without causing an error; for example, if *>> is the lazy list cons operator:

 val myInts = 3 *>> (throw new Exception("END OF LIST!") *>> LazyNil)
then this code won't throw an exception! (Until the program looks at the tail of myInts.)

Under the hood, a lazy list is either the empty lazy list, or it's a lazy cons cell. A lazy cons cell has a value for the head of the lazy list, but either a computation or a cached value for the tail of the list. The first time the tail is accessed, the stored computation is executed to produce the tail of the list, and then cached. Each subsequent access to the tail gets the cached value.

In a compiler, the lexical analysis phase converts a sequence of characters into a sequence of tokens. Lazy lists are an excellent data structure for representing both of these sequences.

A bad option for encoding the input sequence of characters would be to use a C-like getchar() call. Occasionally, lexers need to do back-tracking, and the ANSI C standard guarantees only one character can be un-getchar'd. The parser may also want to do look-ahead peeking and backtracking, and lazy lists of tokens provide a convenient way to do this as well. Reading the entire program into an array or an ordinary list isn't a great option either, since it forces the entire input program to be held in memory all at once. (This is less of a concern for modern machines.)

Scala happens to be a great language for implementing lazy lists in a natural way, thanks to its support for by-name parameters, lazy fields, custom operators and first-class view patterns. The demonstratitve implementation of lazy lists (below) shows off these features.

Scala also supports streams in the Scala API, but without the syntactic sugar.

Exercise

Implement a CharLazyList extension to LazyList class, which has three extra fields: position, line, column. Have CharLazyList update these fields automatically.

matt.might.net is powered by linode | legal information