Implementing yield

First, let's see what yield is by looking at some usage examples.

foo = with-yield(λ(yield){
  yield(1);
  yield(2);
  yield(3);
  "DONE";
});

println(foo());  # prints 1
println(foo());  # prints 2
println(foo());  # prints 3
println(foo());  # prints DONE

When called, yield stops execution of the function and returns a value. Pretty much like return so far. But when you call the function again, it resumes from where the last yield left off, as you can see in the above example. It prints 1, 2, 3, DONE.

A more real-world usage:

fib = with-yield(λ(yield){
  let loop (a = 1, b = 1) {
    yield(b);
    loop(b, a + b);
  };
});

## print the first 50 fibonacci numbers
let loop (i = 0) {
  if i < 50 {
    println(fib());
    loop(i + 1);
  };
};

The fib function contains an infinite loop. There is no termination condition. But yield interrupts that loop and returns the current number. Calling the function again resumes from where it left off, so to print the first 50 numbers then we just need to call the function 50 times.

This program will also run much faster than the duble-recursive version of fib, because it keeps track of the previous two values internally and just returns the next one.

A failed implementation attempt

I wouldn't waste your time with a failed attempt unless there were valuable lessons to learn from it, so please bear with me.

To implement yield it seems we need to save the initial continuation of the function. We need that so we can exit early, like we do for return. However, before actually returning from a yield we also have to save yield's own continuation, in order to be able to return later to that point inside the function. That's what it seems, at least. We can attempt an implementation like this:

with-yield = λ(func) {
  ## with-yield returns a function of no arguments
  λ() {
    CallCC(λ(kret){        # kret is the "return" continuation
      let (yield) {

        ## define yield
        yield = λ(value) { # yield takes a value to return
          CallCC(λ(kyld){  # kyld is yield's own continuation…
            func = kyld;   # …save it for later
            kret(value);   # and return.
          });
        };

        ## finally, call the current func, passing it the yield.
        func(yield);
      };
    });
  };
};

This was my first stab at implementing yield and it all seemed to make sense. But if we try to run the first sample above for this definition of with-yield, it will be an endless loop. Try it below if you wish, but be prepared to refresh the page (I actually modified Execute in this page to use setTimeout instead of exceptions for clearing the stack, so it won't freeze your browser). If you click Run you'll briefly notice in the output that it prints 1, 2, 3, followed by an endless stream of "DONE".

The reason why it loops is subtle. But first, if you want to see something even more bizarre, edit the code above and change the last 4 println lines to this:

println(foo());
foo();

The output will be exactly the same. Again: if you ran this example you should refresh your page, otherwise it will run endlessly, draining the battery and eventually all your computer's RAM.

Problem 1: yield never changes

One problem to notice is that the continuation we save for the first yield (kyld, which becomes the next func to call, if you follow the implementation of with-yield) embeds this computation:

  yield(2);
  yield(3);
  "DONE";

But who is yield in this continuation? It's the same as the first yield, which actually returns to the first kret continuation ever saved, which is to print the result and continue from there, resulting in a loop. Fortunately there's an easy fix for that. Clearly enough, we should create the yield closure only once, since we cannot change it in the function after the first call; in order to return to the proper exit point we maintain a new return variable:

with-yield = λ(func) {
  let (return, yield) {
    yield = λ(value) {
      CallCC(λ(kyld){
        func = kyld;
        return(value);       # return is set below, on each call to the function
      });                    # it's always the next return point.
    };                       #
    λ(val) {                 #
      CallCC(λ(kret){        #
        return = kret;       # <- here
        func(val || yield);
      });
    };
  };
};

Additionally, recognizing that it doesn't make sense to pass yield each time to the function, but only the first time, this new version allows passing another value on each invocation (except for the first). That will be yield's own return value.

The following code still uses the “foo” example, but it has println(foo()) only three times:

It appears to work properly. It's a clear improvement over the first version, which looped on code like print(foo()); foo(). But what happens if you add another println(foo())? LOOP!

Problem 2: WTF?

The reason for looping this time is a little more profound. It has to do with the unlimited nature of our continuations: they encode the whole future of the computation and that happens to include returning from the first call to foo(). What happens when we return from it? — we start all over:

println(foo()); ## yield 1 <----------------- HERE -------------------+
println(foo()); ## yield 2                                            |
println(foo()); ## yield 3                                            |
println(foo()); ## we reached "DONE", so the first foo() RETURNS -->--+

Let's look at this line from with-yield:

        func(val || yield);
        #...

When func exits via an invocation of yield, it calls a continuation, so the execution doesn't reach the #... line. But by the time it finishes, for reaching the end of the function (the "DONE" line), func will have been effectively reduced to a function which returns "DONE" to the original continuation, which is to call the first print. The foo() on the second line just loops over, but all those "DONE" lines are printed by the first print line. You can verify this by running the following code:

println(foo());
println("bar");
println(foo());
println(foo());
foo();

The output will be: 1, bar, 2, 3, DONE, bar, DONE, bar, ....

So for a possible fix, we simply must set func to something else when it returns “naturally” (that is to say simply, “finishes”). We'll make it a do-nothing function which just returns "no more continuations".

        val = func(val || yield);
        func = λ() "NO MORE CONTINUATIONS";
        kret(val);

It no longer loops now, but be prepared to be confused when running the code below:

We would have hoped for 1, 2, 3, DONE, but we also get "NO MORE CONTINUATIONS" three times. To figure out what happens we can interleave some prints, like this:

print("A. "); println(foo());
print("B. "); println(foo());
print("C. "); println(foo());
print("D. "); println(foo());

## and the output is:
A. 1
B. 2
C. 3
D. DONE
B. NO MORE CONTINUATIONS
C. NO MORE CONTINUATIONS
D. NO MORE CONTINUATIONS
***Result: false

Which shows that the problem remained: a natural exit from the function still goes back to the first continuation; but because the function is now a no-op, the "B." line won't trigger a loop.

Our implementation is still useful, provided that the function never finishes and only exits via yield. Here's for the Fibonacci example:

But if we want a solid implementation (so, one that doesn't return to the first call site when the function finishes), we need another concept called “delimited continuations”.

Delimited continuations: reset and shift

We're going to implement yield in terms of two other operators, reset and shift. They provide “delimited continuations” — these are continuations that return, like an ordinary function. reset creates a frame and shift intercepts the continuation only upto that frame, instead of “the whole future of the program” like CallCC does.

Both reset and shift take a single function argument. During the execution of the reset function, a call to shift will allow us to return a value to the place where reset was called.

First, let's see how with-yield will look:

with-yield = λ(func) {
  let (yield) {
    ## yield uses shift to return a value to the innermost
    ## reset call.  before doing that, it saves its own
    ## continuation in `func` — so calling func() again will
    ## resume from where it left off.
    yield = λ(val) {
      shift(λ(k){
        func = k;  # save the next continuation
        val;       # return the currently yielded value
      });
    };
    ## from with-yield we return a function with no arguments
    ## which uses reset to call the original function, passing
    ## to it the yield (initially) or any other value
    ## on subsequent calls
    λ(val) {
      reset( λ() func(val || yield) );
    };
  }
};

Note that each call to the function is now embedded in a reset. This ensures that we don't enclose the whole continuation of the program, but only the part up to this reset. By definition it is now impossible to loop. When the function finishes naturally, the program will just continue execution instead of going back to the place of the first call.

You can run our troublesome program below and you'll see that it finally behaves as expected. It prints nothing more nor less than it should, and it doesn't loop.

Implementation of reset / shift

These operators are quite tricky to implement, although the code is short. Prepare for some headaches. I'll give you two solutions. I am not the author of this method and I can tell you I stared at the code for a long time before it clicked. I found it in this Scheme program by Oleg Kiselyov. I recommend these articles to understand more about this mind-blowing concept.

As primitives

In this page they are implemented as primitive functions. In primitives, the current continuation is explicit (we don't need CallCC), and this makes it somewhat easier to understand what's going on under the covers. Here's the complete implementation (left):

var pstack = [];

function _goto(f) {
    f(function KGOTO(r){
        var h = pstack.pop();
        h(r);
    });
}

globalEnv.def("reset", function(KRESET, th){
    pstack.push(KRESET);
    _goto(th);
});

globalEnv.def("shift", function(KSHIFT, f){
    _goto(function(KGOTO){
        f(KGOTO, function SK(k1, v){
            pstack.push(k1);
            KSHIFT(v);
        });
    });
});

It can be useful to have with-yield by the side, where we used reset and shift.

with-yield = λ(func) {
  let (yield) {
    yield = λ(val) {
      shift(λ(SK){
        func = SK;
        val;         ## return val
      });
    };
    λ(val) {
      reset( λ() func(val || yield) );
    };
  }
};

The key is that both reset and shift use _goto, which isn't a “normal” function. It has no continuation of its own. _goto receives a normal function and invokes it with a continuation (KGOTO). Any continuations caught during the execution of f (even by CallCC) can only capture “the future of the computation” upto this KGOTO, since there's nothing above it. Therefore, no matter if f exits normally, or by calling a continuation, it will eventually get to run KGOTO — which takes the next continuation from the stack and applies it on the result.

reset pushes its own continuation (KRESET) to the stack before _goto(th). If it didn't do that, the program would stop when the function exits, because nothing else happens after _goto. So when the function exits, in any way, KGOTO will resume to this KRESET.

Finally, shift calls the function with the KGOTO continuation, so if it exits normally then KGOTO will resume from the top of pstack, and it passes to it a SK function which is able to return to the point where shift itself was called (by using shift's own continuation, KSHIFT). SK is a delimited continuation — it has the ability to return a value, like a function — hence we need to push its k1 continuation to the stack as well. To illustrate the difference I included in this page a shift2 primitive which is just like the shift above but without this line: pstack.push(k1);. Try this sample:

shift gives us a continuation (k) which is the delimited continuation upto the enclosing reset. In this case the continuation is to add 1 to whatever the result of shift was:

1 + [?]

When we're calling k, we're replacing the ? above with a value. We call it twice k(k(2)). We supply 2 and continue the program, hence the inner k(2) will return 3. Therefore what follows is k(3) which supplies a 3, also in place of the question mark, yielding 4 as the final result.

shift2 is incorrect: the inner k(2) never returns.

CallCC-based

As stated, if we had no means to define primitive functions and CallCC was all we had, it'd still be possible to implement delimited continuations. The code below does pretty much the same as the primitives version, but since we don't have JS arrays it uses lists to maintain the stack (as we saw some time back, we can implement lists directly in λanguage, although here they are provided as primitives). The code is a bit longer because we need those CallCC-s in order to fetch the current continuation (which we get as an explicit argument in primitives).

The trickiest part in the code below is how it implements goto, and why it has to do it like this, but I'll leave to you the pleasure of figuring it out.

pstack = NIL;

goto = false;

reset = λ(th) {
  CallCC(λ(k){
    pstack = cons(k, pstack);
    goto(th);
  });
};

shift = λ(f) {
  CallCC(λ(k){
    goto(λ(){
      f(λ(v){
        CallCC(λ(k1){
          pstack = cons(k1, pstack);
          k(v);
        });
      });
    });
  });
};

let (v = CallCC( λ(k){ goto = k; k(false) } )) {
  if v then let (r = v(), h = car(pstack)) {
    pstack = cdr(pstack);
    h(r);
  }
};