A few improvements
Here we'll do what we can by slight modifications of the CPS transformer and code generator.
"prog"
nodes
We'll start with a simple one, the cps_prog
— the idea is
that when the first expression has no side effects, we need not repeat it
in the output. The following helper function will decide if an AST node
can or cannot have side effects. It's not perfect; the golden rule here
is to err on the safe side. If we're not sure, then it does have side
effects.
For example, it's not easy to tell if "assign"
or "call"
nodes have no side effects. Assignment is a side effect by definition,
but it can be discarded if that variable is never used again, or
just immediately assigned again. A function call on the other hand might
have no side effects if the called function is “pure”, but there is no
easy way to attest this. Since we can't easily be sure, we assume they do
have side effects.
function has_side_effects(exp) {
switch (exp.type) {
case "call":
case "assign":
return true;
case "num":
case "str":
case "bool":
case "var":
case "lambda":
return false;
case "binary":
return has_side_effects(exp.left)
|| has_side_effects(exp.right);
case "if":
return has_side_effects(exp.cond)
|| has_side_effects(exp.then)
|| (exp.else && has_side_effects(exp.else));
case "let":
for (var i = 0; i < exp.vars.length; ++i) {
var v = exp.vars[i];
if (v.def && has_side_effects(v.def))
return true;
}
return has_side_effects(exp.body);
case "prog":
for (var i = 0; i < exp.prog.length; ++i)
if (has_side_effects(exp.prog[i]))
return true;
return false;
}
return true;
}
I hope this function is clear without further comments. Having it,
our cps_prog
becomes this:
function cps_prog(exp, k) {
return (function loop(body){
if (body.length == 0) return k(FALSE);
if (body.length == 1) return cps(body[0], k);
if (!has_side_effects(body[0]))
return loop(body.slice(1));
return cps(body[0], function(first){
if (has_side_effects(first)) return {
type: "prog",
prog: [ first, loop(body.slice(1)) ]
};
return loop(body.slice(1));
});
})(exp.prog);
}
So, the first two cases are the same: if the list is empty then we must
return false
, and if there's a single expression we must compile
it because we want its result. But if there are at least two expressions,
then we check: can we determine at compile time that the first one has no
side effects? If so, then we drop it—we don't even compile it—just loop
over the rest. And if it does have a side effect, we still check if its
result has a side effect in the continuation (because it might be
just a plain variable) and only include it if so.
"if"
nodes
The previous cps_if
passed the k
on compiling
both branches, resulting in massive code growth for consecutive
if
-s. The (quite expensive) fix is to wrap the rest of the
program in a continuation, and we'll transform the "if"
node into
an IIFE which
receives that continuation as an argument. The body of this IIFE will
just call the continuation with the result of the if
expression.
Here's a sample λ-to-λ transformation to show what we're after:
a = if foo then 1 else 2;
print(a);
|
→ |
(λ (ifcont){
if foo then ifcont(1) else ifcont(2);
})(λ (ifret){
a = ifret;
print(a);
});
|
This does add some overhead, but at least the growth is linear, not
exponential (i.e. you don't see print(a)
twice in the output).
Here's the new cps_if
:
function cps_if(exp, k) {
return cps(exp.cond, function(cond){
var cvar = gensym("I");
var cast = make_continuation(k);
k = function(ifresult) {
return {
type: "call",
func: { type: "var", value: cvar },
args: [ ifresult ]
};
};
return {
type: "call",
func: {
type: "lambda",
vars: [ cvar ],
body: {
type: "if",
cond: cond,
then: cps(exp.then, k),
else: cps(exp.else || FALSE, k)
}
},
args: [ cast ]
};
});
}
We have to invent a name for the continuation (cvar
). After
compiling the rest of the program into the cast
continuation
we turn k
into a function which returns a "call"
node for invoking the cast
(which will be available in the
value of cvar
) with the argument which will contain the
result of the if
expression. That is the k
that we
now pass to both branches.
Adding stack guards and dropping returns
For this we need a small change in the code generator. The
new js_lambda
becomes:
function js_lambda(exp) {
var code = "(function ";
var CC = exp.name || "β_CC";
code += make_var(CC);
code += "(" + exp.vars.map(make_var).join(", ") + ")";
code += "{ GUARD(arguments, " + CC + "); " + js(exp.body) + " })";
return code;
}
If the function we compile is unnamed, we name it "β_CC", so we can pass
it to GUARD
. I have reversed GUARD
's arguments, because
they will add quite a lot to the output (see, we got rid of
return
-s, but must place these guards instead) — and by providing
a longer constant string of characters we enable better compression of the
output code after gzip.
After these “improvements”, the output for the fib function looks “turrible”:
fib = λ(n) {
if n < 2 then n
else
fib(n - 1) +
fib(n - 2);
};
|
→ |
Let's try to run it:
By my measurements, it appears to be about 6 times faster than the
straight evaluator and 30 times faster
than the CPS evaluator.
It's also roughly 30 times slower than the hand-coded JS. The thing which
slows it down, besides the overhead introduced by the CPS transformation,
is guarding the stack. Unfortunately, there is no way around this. If I
drop the GUARD
calls from the JS generator, the example above
aborts with stack overflow.
Another bloat is added by the way we handle if
-s. It's totally
not necessary in the example above, since the if
expression is the
last thing in the function (its continuation just calls fib's
continuation).
In the next section we'll write an optimizer that will further improve the quality of the output code.