You don't understand exceptions, but you should

[article index] [] [@mattmight] [+mattmight] [rss]

When writing a compiler, it's hard to get exceptions right.

Exceptions seem straightforward: just unwind the stack until you hit a handler--right?

No. That's "not even wrong."

Exceptions are not fundamentally built upon stacks.

(After all, stackless implementations still have exception-handling.)

Trying to understand exceptions in terms of the stack is like trying to explain the world in purely Newtonian terms: it's a reasonable model, but it fails to capture (rare but important) corner-case behavior.

A "quantum" understanding of exceptions requires escape continuations.

Exceptions interact nontrivially with other language features like while and for loops (or rather, with break and continue), and even with return.

When you throw in finally blocks, the complexity skyrockets.

Consider some Python:

  def f():
    try: 
      return 10
    finally:
      print("got here!")
    return 20

The function above returns 10, but prints "got here!"

  def f():
    try: 
      raise Exception()
    except:
      return 10
    finally:
      print("got here!")
    return 20

The function above returns 10, but still prints "got here!"

  def f():
    try: 
      raise Exception()
    except:
      return 10
    finally:
      print("got here!")
      return 20

The function above returns 20 (and still prints "got here!")

  while True:
    try:
      continue
    finally:
      print "got here!"

This is an infinite loop that prints "got here!" forever.

In this article, I'll explain how to implement return, while, break, continue, try, catch, throw and finally in terms of efficient escape continuations.

To understand exceptions is to implement exceptions.

The provided code uses Racket macros to add all of these features to the language. It desugars them in terms of a single construct: call/ec.

(The techniques are not Racket-specific; for example, if you're compiling to C, you can fake call/ec with setjmp and longjmp.)

My basic model for exception-handling is lifted directly from Andrew Appel's masterwork, Compiling with Continuations.

Why Python and Racket?

My undergrad compilers class has to implement a compiler for Python.

The intermediate representation used by their compiler is a cut-down Racket extended with Pythonic control-flow constructs.

That's why you'll find motivating examples in Python that get desugared into a Racket-like code.

A note on call/ec

This code makes heavy use of call/ec.

Not all languages support escape continuations, but many support the (more general) full continuations. Full continuations work in place of escape continuations with a mild performance penalty.

You can also simulate call/ec in C with setjmp and longjmp.

Implementing return

Newcomers to functional programming are often struck by the lack of a return statement (and, I suppose, by the lack of statements entirely).

There are natural coding patterns that exploit the ability to bail out of a function early, like the following:

  def f():
    if easy case:
       return this
    if also easy case:
       return that

    do something 
    complicated
    down here

Fortunately, it's easy to add a return statement to functional languages that support continuations.

A return statement is a "second-class escape continuation."

Conversely, think of an escape continuation as a first-class return statement.

That is, imagine that return was just a special procedure bound every time you entered a function, and that to return from the function, you passed the return value to this special procedure.

In Racket, we can grab the escape continuation for a function with call-with-escape-continuation, or call/ec for short.

If we want to add a define/return form to the language so that the following works:

(define/return (max a b)
  (when (> a b)
    (return a))
  
  (return b))

we can create a macro that desugars it into:

(define (max a b)
 (call/ec (lambda (return)
  (when (> a b)
    (return a))
  
  (return b))))

In Racket:

(define-syntax (define/return stx)
  (syntax-case stx ()
    [(_ f-params body ...)
     ; =>
     (with-syntax ([return (datum->syntax #'f-params 'return)])
     #'(define f-params 
         (call/ec
          (λ (return) body ...))))]))

It's just as easy to add a lambda/return form.

Implementing while

A while loop seems easy to implement in terms of tail-recursion or gotos.

In fact, it is.

But, break and continue complicate matters--especially when it comes time to implement exceptions.

Fortunately, break and continue, like return, are also escape continuations.

It's also best to implement these features as escape continuations.

Take the general while loop form in Python for example:

while cond:
  body
else:
  otherwise

This would be desugared in Racket as:

(call/ec (λ (break)
  (letrec ([loop (λ ()
       (when cond
          (call/ec (λ (continue)
                      body))
          (loop)))])
   (loop)
   otherwise)))

A call to break bails out of the loop, skipping over the else case.

A call to continue jumps to the next iteration of the loop.

A macro to transform (while cond body else) forms is:

(define-syntax (while stx)
  (syntax-case stx ()
    [(_ cond body else)
     ; =>
     (with-syntax ([break    (datum->syntax #'body 'break)]
                   [continue (datum->syntax #'body 'continue)])
       #'(call/ec (λ (break)
                    (letrec ([loop (λ ()
                                     (when cond
                                       (call/ec (λ (continue)
                                                  body))
                                       (loop)))])
                                                  
                      (loop)
                      else))))]))

Implementing try

Exceptions are the canonical example of escape continuations.

A reasonable approach for implementing exceptions is to create a special "handler" cell in memory. That cell contains the escape continuation to invoke once an exception is raised.

Let's start by compiling a (try body handler) form. This is a try form without finally code.

We're going to add two global procedures to help--current-handler and set-current-handler!:

 (define $current-handler 
         (lambda (ex) (error "top-level exception!")))

 (define (current-handler) $current-handler)

 (define (set-current-handler! handler)
         (set! $current-handler handler))

The variable $current-handler holds a pointer to the current exception handler.

An isolated desugaring of the try form is not hard:

(let ([$old (current-handler)])
   (call/ec (lambda (ec)
               (set-current-handler! 
                  (lambda (ex)
                    (set-current-handler! $old)
                    (ec (handler ex))))
                (let ([rv body])
                   (set-current-handler! $old)
                    rv))))

This code:

  • saves the old exception handler in $old;
  • captures the current escape continuation in ec;
  • sets a new handler that invokes ec; and
  • runs the body.

Whether the body returns naturally, or by invoking an exception, the old handler gets restored.

But, what happens if we have code like the following?

def f():
  try:
    return 10
  except:
    print("you should never see me")

This function leaves the wrong handler installed after it returns!

This is because return bypasses the clean-up routines we established with the desugaring of try, leaving the handler installed.

To fix this, we need to bind new versions of return (and break and continue) that perform handler clean-up before exiting the loop:

(let ([$old (current-handler)])
   (let* ([return   (λ args 
                       (set-current-handler! $old)
                       (apply return args))]
          [continue (λ ()
                       (set-current-handler! $old)
                       (continue))]
          [break    (λ ()
                       (set-current-handler! $old)
                       (break))])
       (call/ec 
         (λ (ec)
            (set-current-handler! 
               (λ (ex) (set-current-handler! $old)
                       (ec (handler ex))))
            (let ([rv body])
                 (set-current-handler! $old)
                 rv)))))

But, if we want to allow a finally expression, through an expanded try form (try body handler finally), things get more complicated.

The finally code must run when we leave the dynamic extent of the try.

It must run whether we return out of it, whether we break out of it, whether we continue out of it, or where we raise out of it.

But, what it must do after the finally code depends on how we got into it:

  • If we fall through by exiting the try body normally, it should continue after the finally.
  • If we return out, then after the finally code executes, it should immediately return the provided return value.
  • If we break out, then it should run the finally code and jump out of the enclosing loop form.
  • If we continue out, then it should run the finally code and jump to the top of the enclosing loop form.
  • If we raise out, then it should run the finally code and re-raise the exception--unless the exception was handled, in which case it should fall through the bottom of finally.

We can make all of this possible by creating a pointer to a thunk that should perform the post-finally action.

We need to capture and redirect all escaping continuations to pass through finally by having them install their ordinary behavior in that thunk.

There's a further wrinkle in that if during the course of the exception handling, we throw another exception, we still have to run through the finally code.

In other words, the exception handler has to install an exception handler.

Here's the desugaring to make it all happen:

(let* ([$val (void)]
       [$fin (λ () $val)]
       [$old (current-handler)])
  (call/ec
   (λ (fin)
    (let* ([return   (λ args
                       (set! $fin (λ () (apply return args)))
                       (fin))]
           [continue (λ ()
                       (set! $fin continue)
                       (fin))]
           [break    (λ ()
                       (set! $fin break)
                       (fin))])
         (call/ec 
           (λ (ec)
              (set-current-handler! 
                (λ (ex)
                  ; if the handler
                  ; throws an exception,
                  ; re-throw it after
                  ; the finally block:
                  (set-current-handler! 
                    (λ (ex*)
                       (set! $fin (λ () (throw ex*))) 
                       (ec #f)))
                    (ec (let ([rv (handler ex)])
                        (set! $fin (λ () rv))))))
              (let ([rv body])
                 (set! $fin (λ () rv))
                 (fin)))))))
 (set-current-handler! $old)
 (set! $val finally)
 ($fin))) 

We also need to consider a try form without an exception handler, (try body finally). Fortunately, all it has to do is re-raise the exception:

(let* ([$val (void)]
       [$fin (λ () $val)]
       [$old (current-handler)])
   (call/ec
     (λ (fin)
       (let* ([return   (λ args
                         (set! $fin (λ () (apply return args)))
                         (fin))]
              [continue (λ ()
                         (set! $fin continue)
                         (fin))]
              [break    (λ ()
                         (set! $fin break)
                         (fin))])
          (call/ec 
            (λ (ec)
              (set-current-handler! 
                (λ (ex)
                  ; re-throw after finally:
                  (set! $fin (λ () (throw ex)))
                  (fin)))
               (let ([rv body])
                  (set! $fin (λ () rv))
                  (fin)))))))
   (set-current-handler! $old)
   (set! $val finally)
   ($fin)))

Implementing throw

Compared to all the excitement above, throw desugars with a wimper:

(define (throw ex)
  ((current-handler) ex))

Code

Macro-expanders for all of the above are available in compiling-with-exceptions.

More resources


[article index] [] [@mattmight] [+mattmight] [rss]