JavaScript code generator

Generating JS code from the AST is very simple. Following the shape of our first evaluator, make_js is a recursive function which receives an AST. The environment isn't so useful this time because we're not executing the AST; instead we transform it into the equivalent JavaScript code, and return it as a string.

The entry point looks like this:

function make_js(exp) {
    return js(exp);

    function js(exp) {
        switch (exp.type) {
          case "num"    :
          case "str"    :
          case "bool"   : return js_atom   (exp);
          case "var"    : return js_var    (exp);
          case "binary" : return js_binary (exp);
          case "assign" : return js_assign (exp);
          case "let"    : return js_let    (exp);
          case "lambda" : return js_lambda (exp);
          case "if"     : return js_if     (exp);
          case "prog"   : return js_prog   (exp);
          case "call"   : return js_call   (exp);
          default:
            throw new Error("Dunno how to make_js for " + JSON.stringify(exp));
        }
    }

    // NOTE, all the functions below will be embedded here.
}

The js function dispatches based on the type of the current node, and just calls one of the other generators.

The “atomic” values transform into the appropriate JS constant, and JSON.stringify saves us from doing any work:

function js_atom(exp) {
    return JSON.stringify(exp.value); // cheating ;-)
}

For now, we'll output variable names as they are, which isn't exactly correct: in λanguage we permit variable names to contain for example the minus character, which is not valid in JS. To make it easy to fix this later, all names go through a make_var function, so we can change later in a single place:

function make_var(name) {
    return name;
}
function js_var(exp) {
    return make_var(exp.value);
}

For binary operators, we just compile the left and the right and join them with the operator. Note that we fully parenthesize everything and not worry about beautifying the output — we can use UglifyJS for that.

function js_binary(exp) {
    return "(" + js(exp.left) + exp.operator + js(exp.right) + ")";
}

A careful reader will notice some bugs here, compared to the evaluator: in the evaluator we ensured that numerical operators receive operands of the proper types, and that there is no division by zero; also, the semantics of && and || (concerning what's considered falsy) should be a bit different than in plain JS. Let's worry later about this.

// assign nodes are compiled the same as binary
function js_assign(exp) {
    return js_binary(exp);
}

"lambda" nodes are pretty simple as well. Just output a JavaScript function expression. We parenthesize, to avoid any possible errors, output the function keyword followed by the name if exists, then list of arguments. The body is a single expression and we return its result.

function js_lambda(exp) {
    var code = "(function ";
    if (exp.name)
        code += make_var(exp.name);
    code += "(" + exp.vars.map(make_var).join(", ") + ") {";
    code += "return " + js(exp.body) + " })";
    return code;
}

"let" nodes are handled with IIFE-s. This is overkill and we could do better, but we won't bother right now. To handle the node with IIFE is pretty simple: if vars is empty, then just compile the body. Otherwise create the IIFE node (it's a "call" node) for the first variable/definition (if the definition is missing, we pass the FALSE node) — and finally, compile the IIFE node instead. That's called “desugaring”.

function js_let(exp) {
    if (exp.vars.length == 0)
        return js(exp.body);
    var iife = {
        type: "call",
        func: {
            type: "lambda",
            vars: [ exp.vars[0].name ],
            body: {
                type: "let",
                vars: exp.vars.slice(1),
                body: exp.body
            }
        },
        args: [ exp.vars[0].def || FALSE ]
    };
    return "(" + js(iife) + ")";
}

To compile an "if" node, we use JavaScript's ternary operator. We maintain the fact that the only falsy value in the language is false (so we only execute the "then" node if the result of the condition is not false). If the "else" node is missing, just compile FALSE.

Unlike the evaluator, which explored only the "then" or the "else" depending on the condition, here we must recurse to both branches, to generate code — obviously, as we can't know the value of the condition at this time.

function js_if(exp) {
    return "("
        +      js(exp.cond) + " !== false"
        +      " ? " + js(exp.then)
        +      " : " + js(exp.else || FALSE)
        +  ")";
}

Finally, "prog" and "call" — very trivial nodes. Note that for "prog" we use JavaScript's sequencing operator (the comma), in order to get a single expression out. Fortunately the comma has similar semantics (it returns the value of the last expression) which makes it perfect for our need.

function js_prog(exp) {
    return "(" + exp.prog.map(js).join(", ") + ")";
}
function js_call(exp) {
    return js(exp.func) + "(" + exp.args.map(js).join(", ") + ")";
}

That's it!

Primitives and usage

This time the generated program actually uses plain JavaScript variables, so primitives will simply be global JS variables or functions. Here's a basic usage example:

// the "print" primitive
window.print = function(txt) {
  console.log(txt);
};

// some test code here
var code = "sum = lambda(x, y) x + y; print(sum(2, 3));";

// get the AST
var ast = parse(TokenStream(InputStream(code)));

// get JS code
var code = make_js(ast);

// additionally, if you want to see the beautified JS code using UglifyJS
// (or possibly other tools such as acorn/esprima + escodegen):
console.log( UglifyJS.parse(code).print_to_string({ beautify: true }) );

// execute it
eval(code); // prints 5

Below you can actually see it in action.

Test

Remember how the first version of our evaluator (the fastest we had so far) took well over a second for computing fib(27)? (and that was about 300 times slower than the hand-written JS). Here is it after compiling to JS:

It's as fast as the hand-written JS code. You can see the generated code when you click "Run" in your browser's console.log.

It might be tempting to stop here — it's “as fast as JS”. But it's also “as good as JS”, and that's not good enough. There are no continuations and recursion is again limited. What we've done here could be called at best a “transpiler”.

A compiler transforms a source language into a target language. For example, C++ to assembler. Or λanguage to JavaScript. The real work a compiler does is not code generation, but support for the features that make the source language more powerful than the target language.

Next we're going to work on transforming the AST to continuation-passing style.

Welcome! (login)