JavaScript code generator
Generating JS code from the AST is very simple. Following the shape of
our first evaluator,
make_js
is a recursive function which receives an AST. The
environment isn't so useful this time because we're not executing the AST;
instead we transform it into the equivalent JavaScript code, and return it
as a string.
The entry point looks like this:
function make_js(exp) {
return js(exp);
function js(exp) {
switch (exp.type) {
case "num" :
case "str" :
case "bool" : return js_atom (exp);
case "var" : return js_var (exp);
case "binary" : return js_binary (exp);
case "assign" : return js_assign (exp);
case "let" : return js_let (exp);
case "lambda" : return js_lambda (exp);
case "if" : return js_if (exp);
case "prog" : return js_prog (exp);
case "call" : return js_call (exp);
default:
throw new Error("Dunno how to make_js for " + JSON.stringify(exp));
}
}
// NOTE, all the functions below will be embedded here.
}
The js
function dispatches based on the type of the current
node, and just calls one of the other generators.
The “atomic” values transform into the appropriate JS constant, and
JSON.stringify
saves us from doing any work:
function js_atom(exp) {
return JSON.stringify(exp.value); // cheating ;-)
}
For now, we'll output variable names as they are, which isn't
exactly correct: in λanguage we permit variable names to contain for
example the minus character, which is not valid in JS. To make it easy to
fix this later, all names go through a make_var
function, so
we can change later in a single place:
function make_var(name) {
return name;
}
function js_var(exp) {
return make_var(exp.value);
}
For binary operators, we just compile the left
and
the right
and join them with the operator. Note that we
fully parenthesize everything and not worry about beautifying the output —
we can use UglifyJS for that.
function js_binary(exp) {
return "(" + js(exp.left) + exp.operator + js(exp.right) + ")";
}
A careful reader will notice some bugs here, compared to the evaluator: in
the evaluator we ensured that numerical operators receive operands of the
proper types, and that there is no division by zero; also, the semantics
of &&
and ||
(concerning what's considered falsy) should
be a bit different than in plain JS. Let's worry later about this.
// assign nodes are compiled the same as binary
function js_assign(exp) {
return js_binary(exp);
}
"lambda"
nodes are pretty simple as well. Just output a
JavaScript function
expression. We parenthesize, to avoid any
possible errors, output the function
keyword followed by the name
if exists, then list of arguments. The body is a single expression and we
return
its result.
function js_lambda(exp) {
var code = "(function ";
if (exp.name)
code += make_var(exp.name);
code += "(" + exp.vars.map(make_var).join(", ") + ") {";
code += "return " + js(exp.body) + " })";
return code;
}
"let"
nodes are handled with IIFE-s. This is overkill and we could do
better, but we won't bother right now. To handle the node with IIFE is
pretty simple: if vars
is empty, then just compile the body.
Otherwise create the IIFE node (it's a "call"
node) for the first
variable/definition (if the definition is missing, we pass the
FALSE
node) — and finally, compile the IIFE node instead. That's
called “desugaring”.
function js_let(exp) {
if (exp.vars.length == 0)
return js(exp.body);
var iife = {
type: "call",
func: {
type: "lambda",
vars: [ exp.vars[0].name ],
body: {
type: "let",
vars: exp.vars.slice(1),
body: exp.body
}
},
args: [ exp.vars[0].def || FALSE ]
};
return "(" + js(iife) + ")";
}
To compile an "if"
node, we use JavaScript's ternary operator.
We maintain the fact that the only falsy value in the language is
false
(so we only execute the "then"
node if the result
of the condition is not false). If the "else"
node is missing,
just compile FALSE
.
Unlike the evaluator, which explored only the "then"
or the
"else"
depending on the condition, here we must recurse to both
branches, to generate code — obviously, as we can't know the value of the
condition at this time.
function js_if(exp) {
return "("
+ js(exp.cond) + " !== false"
+ " ? " + js(exp.then)
+ " : " + js(exp.else || FALSE)
+ ")";
}
Finally, "prog"
and "call"
— very trivial nodes. Note
that for "prog"
we use JavaScript's sequencing operator (the
comma), in order to get a single expression out. Fortunately the comma
has similar semantics (it returns the value of the last expression) which
makes it perfect for our need.
function js_prog(exp) {
return "(" + exp.prog.map(js).join(", ") + ")";
}
function js_call(exp) {
return js(exp.func) + "(" + exp.args.map(js).join(", ") + ")";
}
That's it!
Primitives and usage
This time the generated program actually uses plain JavaScript variables, so primitives will simply be global JS variables or functions. Here's a basic usage example:
// the "print" primitive
window.print = function(txt) {
console.log(txt);
};
// some test code here
var code = "sum = lambda(x, y) x + y; print(sum(2, 3));";
// get the AST
var ast = parse(TokenStream(InputStream(code)));
// get JS code
var code = make_js(ast);
// additionally, if you want to see the beautified JS code using UglifyJS
// (or possibly other tools such as acorn/esprima + escodegen):
console.log( UglifyJS.parse(code).print_to_string({ beautify: true }) );
// execute it
eval(code); // prints 5
Below you can actually see it in action.
Test
Remember how the first version of our
evaluator (the fastest we had so far) took well over a second for
computing fib(27)
? (and that was about 300 times slower than the
hand-written JS). Here is it after compiling to JS:
It's as fast as the hand-written JS code. You can see the generated code
when you click "Run" in your browser's console.log
.
It might be tempting to stop here — it's “as fast as JS”. But it's also “as good as JS”, and that's not good enough. There are no continuations and recursion is again limited. What we've done here could be called at best a “transpiler”.
A compiler transforms a source language into a target language. For example, C++ to assembler. Or λanguage to JavaScript. The real work a compiler does is not code generation, but support for the features that make the source language more powerful than the target language.
Next we're going to work on transforming the AST to continuation-passing style.