Understand and implement laziness with examples in Scala, JavaScript, Swift and Racket

[article index] [] [@mattmight] [+mattmight] [rss]

At first glance, laziness seems to be about efficiency.

In computing, laziness means delaying the computation of a value until the value is necessary.

(And, if the value is never necessary, it is never computed.)

In reality, laziness is more about expressiveness: laziness allows the expression of programs that would otherwise not terminate.

In some languages (like Scala, Swift and Perl), laziness also makes it possible to add new constructs to a language without using macros.

Since most languages do not provide support for laziness, this article explores how to implement laziness in languages that support closures.

We’ll look at manual transformations from languages that have native support for laziness (Scala) into (1) languages that do not support laziness (JavaScript) and (2) languages that have partial support (Swift).

We’ll also use macro-generating macros in Racket to implement native and transparent support for multiple kinds of laziness in a strict language.

What is laziness in languages?

Most languages are strict languages: they compute the value of an expression immediately.

Some languages (like Haskell) are lazy: every expression’s evaluation waits for its (first) use.

Other languages (like Scala) are strict by default, but lazy if explicitly specified for given variables or parameters.

As such, Scala (even if you’ve never seen the language), makes it easy to demonstrate the difference.

The following Scala program:

val x = { print ("foo") ; 10 }
print ("bar")
print (x)

will print foo, then bar then 10.

But, the following Scala program:

lazy val x = { print ("foo") ; 10 }
print ("bar")
print (x)

will print bar, then foo, then 10, since it delays the computation of x until it’s actually needed.

What is laziness?

Of what is laziness made?

Laziness is made of lambdas – anonymous functions closed over their lexical scope.

Let’s translate the Scala example from earlier into JavaScript, a language that lacks laziness:

var x = ( console.log("foo"), 10 )
console.log("bar") 
console.log(x)

As expected, this prints foo, then bar, then 10.

But, what if we wanted to make x lazy?

We can do that by wrapping the definition of x in an anonymous function, and then the use of x in an invocation of that function:

var x = function () { return ( console.log("foo"), 10 ) }
console.log("bar") 
console.log(x())

This prints bar, then foo then 10.

To demonstrate a subtlety, let’s change the Scala example from earlier:

lazy val x = { print ("foo") ; 10 }
print ("bar")
print (x)
print (x)

which will now print bar, then foo, then 10, then 10.

It did not print foo twice, because lazy cached the result.

But, of course:

var x = function () { return ( console.log("foo"), 10 ) }
console.log("bar") 
console.log(x())
console.log(x())

does print foo twice – once for each call to x.

By-name versus by-need

There are two reasonable interpretations of laziness.

In the by-name interpretation, the computation of a value is delayed until necessary, and the computation is repeated each time it is used.

In the by-need interpretation, the computation of a value is delayed until necessary, and then the result is cached for subsequent uses.

In Scala, the lazy keyword provides by-need laziness.

Simply wrapping values in anonymous functions provides by-name laziness.

Laziness in Swift

Swift provides a @lazy attribute for variables, but it only works on fields for classes and structs.

Swift’s block notation for closures makes it a little less verbose to fake laziness in Swift.

The prior example in Swift would be:

let x : () -> Int = { print("foo"); return 10 }
print("bar")
print(x())
print(x())

As in JavaScript, this prints foo twice.

Swift also provides an @auto_closure attribute for types, which can provide by-name variables, but @auto_closure is more useful for by-name parameters, to be covered momentarily.

To use @auto_closure for a by-name variable, we would have to lift the side effects out, leading to the following:

func footen() -> Int {
  print("foo")
  return 10 
}

let x : @auto_closure () -> Int = footen()
print("bar")
print(x())
print(x())

which would end up printing foo twice.

Clearly, the explicit closure form was more convenient.

Implementing by-need laziness

To get by-need laziness in JavaScript, we can create a helper function, memothunk that will cache the value of a no-argument anonymous procedure it receives:

function memothunk(f) {
  var cached = false ;
  var result ;
  return function () {
    if (cached)
      return result ;
    cached = true ;
    result = f() ;
    return result ;
  } ;
}

We can change the prior example:

var x = memothunk(function () { return ( console.log("foo"), 10 ) })
console.log("bar") 
console.log(x())
console.log(x())

so that it prints out bar, then foo, then 10, then 10.

In Swift, we can implement memothunk as well:

func memothunk<T> (f : () -> T) -> (() -> T) {
    var cache : T? = nil
    return {
        if let x = cache {
            return x
        } else {
            cache = f()
            return cache!
        }
    }
}

and run the analogous code:

let x : () -> Int = memothunk({print("foo") ; return 10})
print("bar")
print(x())
print(x())

which prints bar, then foo, then 10, then 10.

Implementing laziness with eta-expansion

Implementing laziness in a language like JavaScript requires modifying both the point of definition and the point of use.

The exception to this case is when the computation being suspended already produces a procedure.

In this case, we can achieve laziness through eta-expansion at the point of definition.

In eta-expansion, a function-producing expression is wrapped in an anonymous function that passes its arguments directly to the expression.

Consider an expression exp that evaluates to a function that takes one argument.

In JavaScript, the function (x) { return ( exp )(x) } behaves identically to exp, in the absence of side effects in exp and assuming exp terminates.

Eta-expansion plays a key role in the derivation of a Y combinator that works in strict languages like JavaScript.

Lazy parameters

Some languages allow the definition of explictly lazy parameters to functions.

Lazy parameters are not evaluated until they’re used, and in languages that allow explicitly marked lazy parameters, lazy parameters are by-name.

In contrast, in lazy languages like Haskell, every parameter is implicitly by-need.

Consider this code in Scala:

def printTwice(x : Int) {
  print ("twice")
  print (x) 
  print (x)
}

printTwice( { print("foo") ; 10 } )

This will print foo, then twice, then 10, then 10.

If, however, we mark the parameter x as by-name (with =>):

def printTwice(x : => Int) {
  print ("twice")
  print (x) 
  print (x)
}

printTwice( { print("foo") ; 10 } )

then it prints twice, then foo, then 10, then foo, then 10.

To be clear, we don’t have to supply a code block; we can pass anything of type Int, so that:

printTwice(10)

prints twice, then 10, then 10.

Lazy parameters in Swift

In Apple’s language Swift, by-name parameters are marked with @auto_closure, so that:

func printTwice(x : @auto_closure () -> Int) {
    print("twice")
    print(x())
    print(x())
}

printTwice(10)

will implicitly wrap a closure around the argument 10, printing twice, then 10, then 10.

Of course, if the evaluation of the argument has side effects, then those side effects are repeated. The following code:

func ten() -> Int {
    print("ten")
    return 10
}

printTwice(ten())

prints twice, then ten, then 10, then ten, then 10.

From a language design standpoint, it seems that Swift’s designers wanted to allow lazy parameters, but still force the programmer to consider the fact that the argument will be wrapped in a procedure, which carries a cost.

Lazy parameters in JavasScript

Unfortunately, lazy parameters require care in JavaScript. There is no way to modify a funtion so that, whenever it is called, its arguments are automatically wrapped in closures.

Rather, the user must remember to wrap the arguments in closures at every point of use.

Extending a language with by-name parameters

By-name parameters are a convenient way of extending a language with new constructs.

For example, suppose Scala didn’t have a while construct.

We could add one with laziness (and Currying):

def myWhile (cond : => Boolean) (body : => Unit) {
  if (cond) { body ; myWhile (cond) (body) }
}

and then use it:

var x = 3

myWhile (x > 0) {
  print (x)
  x = x - 1   
}

to print 3, then 2, then 1.

Or, in Swift, we might add a togglable assert construct:

var AssertsEnabled : Bool = true ;

func assert(cond : @auto_closure () -> Bool) {
    if (AssertsEnabled && cond()) {
        println("Assert failed!") ;
        exit(EXIT_FAILURE) ;
    }
}

In this way, it’s fine to use assert(expensiveTest()) and know that the cost of that test will only be paid if assertions are enabled.

At first, Swift’s support for extending the language does not seem to be as fluid as Scala’s. This is the closest I could get to a custom while form:

func myWhile (cond : @auto_closure () -> Bool) (body : () -> ()) {
    if (cond()) { body() ; myWhile(cond)(body: body) }
}

var y : Int = 3

myWhile (y > 0) (body: {
  print (y)
  y = y - 1
})

But, on twitter, @jspahrsummers pointed out that (somewhat confusingly) uncurrying actually makes it work in Swift:

func myWhile (cond : @auto_closure () -> Bool, body : () -> ()) {
    if (cond()) { body() ; myWhile(cond, body) }
}

var y : Int = 3

myWhile (y > 0) {
  print (y)
  y = y - 1
}

Implementing laziness in Racket

Racket has implicit support for manual by-need laziness in the form of promises.

In Racket, we can wrap delay around any expression to have it converted into a promise.

Applying force to a promise carries out the computation, or if the computation has already been completed, it returns the cached value.

The examples from earlier look like this in Racket:

(define x (delay (begin (display "foo") 10)))
(display "bar")
(display (force x))
(display (force x))

which prints out bar, then foo, then 10, then 10.

It would be nice, of course, if we could write this as:

(define/lazy x (begin (display "foo") 10))
(display "bar")
(display x)
(display x)

and get the same effect.

This is Racket. So, of course we can.

We need to create the define/lazy.

To understand how this is possible, you first need to know that even variables in Racket can be macro-expanded.

For instance, we can define a macro called three, which will expand into 3, so that the following code:

(define-syntax (three stx) #'3)
(display three)

prints 3.

With this in mind, we can take a (failed) shot at a macro-generating macro:

(define-syntax (define/lazy stx)
  (syntax-case stx ()
    [(_ var value)
     (with-syntax ([($var)   (generate-temporaries #'(var))])
       #'(begin
           (define-syntax (var _)
             #'(force $var))
           (define $var (delay value))))]))

The define/lazy macro defines a macro for the orginal variable var, and it assigns a special temporary variable $var to the delay of the original expression, value.

The macro definition forces the temporary variable $var.

This code works just fine for the prior example, but there’s a subtle mistake in this code that shows up if we try to do something else.

Suppose we try to lazily define a procedure:

(define/lazy f (λ (x) (+ x 1)))
(display (f 10))

We’d expect this to print 11, but to my initial horror, it printed #<procedure>.

Examining the original code reveals why: when we use f in a function position, it becomes a classic macro expansion, so that instead of expanding just f, it macro expands (f 10).

Oops.

To fix this, we need to have syntax-case discriminate between these two cases:

(define-syntax (define/lazy stx)
  (syntax-case stx ()
    [(_ var value)
     (with-syntax ([($var)   (generate-temporaries #'(var))])
       #'(begin
           (define-syntax (var stx*)
             (syntax-case stx* ()
               [(f arg (... ...)) #'((force f) arg (... ...))]
               [_ #'(force $var)]))
           (define $var (delay value))))]))

With this definition, this code:

(define/lazy f (λ (x) (+ x 1)))
(display (f 10))

behaves exactly as expected, printing 11.

Implementing by-need in Racket

Using the same techniques as above, it is straightforward to create a define/by-need form in Racket that turns all of the parameters in a function definition into by-need parameters.

In this case, this code:

(define/by-need (h a b c)
  (+ a b))

(h 10 20 (error "fail"))

prints 30 because (error "fail") never gets evaluated.

To make define/by-need work, the redefinition of the function will macro-expand the parameters as in the definition of define/lazy:

(define-syntax (define/by-need stx)
  (syntax-case stx ()
    [(_ (f v ...) body ...)
     (with-syntax ([($f) (generate-temporaries #'(f))])
       #'(begin
           (define-syntax f
             (syntax-rules ()
               [(_ arg (... ...))
                ($f (delay arg) (... ...))]))
           (define ($f v ...)
             (let-syntax ([v (lambda (stx*)
                               (syntax-case stx* ()
                                 [(_ arg (... ...)) 
                                  #'((force v) arg (... ...))]
                                 [_ #'(force v)]))] ...)
               body ...))))]))

Related pages

For more on functional programming (for non-functional programmers), I have recommendations:

  • For learning a purely lazy language (like Haskell), there's Real World Haskell by Bryan O'Sullivan, John Goerzen and Don Stewart:
  • For learning Scala, Martin Odersky has written an excellent guide with co-authors Lex Spoon and Bill Venners:
  • And, for purely functional data structures, there's Okasaki's classic:
    Purely Functional Data Structures by Chris Okasaki

From my blog:

Exercises

  1. Implement define/by-name in Racket.

  2. Implement lambda/by-need in Racket.

  3. Get the by-name Y-combinator to work for factorial:

     (define/by-something (y f) (f (y f)))
    
     (define fact (y (lambda/something (f)
                      (lambda/something (n)
                        (if (= n 0)
                            1
                            (* n (f (- n 1)))))))
    
     (display (fact 5)) ; prints 120
    

    Hint: Think about which expression needs to be delayed.