Update: There is a bug which slipped in all of
interpreter, CPS evaluator and CPS compiler: expressions like a()
&& b()
or a() || b()
always evaluate both sides. This is
wrong, of course, whether or not b()
is evaluated depends on the
falsy value of a()
. The fix is pretty simple (convert
||
and &&
binary operations to "if"
nodes) but
I didn't update these pages yet. Shame on me.
A simple interpreter
To recap: so far we wrote 3
functions: InputStream
, TokenStream
and parse
. To get an AST from a piece of code now we can do
the following:
var ast = parse(TokenStream(InputStream(code)));
Writing an interpreter is easier than the parser. We just have to walk the AST, executing expressions in their normal order.
The environment
The key to correct execution is to properly maintain the environment — a
structure holding variable bindings. It will be passed as an argument to
our evaluate
function. Each time we enter a "lambda"
node we must extend the environment with new variables (function's
arguments) and initialize them with values passed at run time. If an
argument shadows a variable from the outer scope (I'll use
words scope and environment interchangeably here) we
must be careful to restore the previous value when we leave the function.
The simplest way to implement this is using JavaScript's prototype inheritance. When we enter a function we'll create a new environment, set its prototype to the outer (parent) environment and evaluate the function body in the new one. This way when we exit we need not do anything — the outer env will already contain any shadowed bindings.
Here's the definition of the Environment object:
function Environment(parent) {
this.vars = Object.create(parent ? parent.vars : null);
this.parent = parent;
}
Environment.prototype = {
extend: function() {
return new Environment(this);
},
lookup: function(name) {
var scope = this;
while (scope) {
if (Object.prototype.hasOwnProperty.call(scope.vars, name))
return scope;
scope = scope.parent;
}
},
get: function(name) {
if (name in this.vars)
return this.vars[name];
throw new Error("Undefined variable " + name);
},
set: function(name, value) {
var scope = this.lookup(name);
// let's not allow defining globals from a nested environment
if (!scope && this.parent)
throw new Error("Undefined variable " + name);
return (scope || this).vars[name] = value;
},
def: function(name, value) {
return this.vars[name] = value;
}
};
An Environment object has a parent
, which points to the
parent scope. The parent will be null for the global scope. And it has
a vars
property which holds the variable bindings. This is
initialized as Object.create(null)
for the toplevel (global)
scope, or Object.create(parent.vars)
for subscopes, in order to
“see” the current bindings via prototypal inheritance.
There are the following methods:
extend()
— to create a subscope.lookup(name)
— to find the scope where the variable with the given name is defined.get(name)
— to get the current value of a variable. Throws an error if the variable is not defined.set(name, value)
— to set the value of a variable. This needs to lookup the actual scope where the variable is defined. If it's not found and we're not in the global scope, throws an error.def(name, value)
— this creates (or shadows, or overwrites) a variable in the current scope.
The evaluate
function
Now that we have the Environment we can jump to the main problem. This
function will be a big switch
statement, dispatching by node
type, containing logic for evaluating each kind of node. I'll comment on
each case:
function evaluate(exp, env) {
switch (exp.type) {
For constant nodes, we just return their value:
case "num":
case "str":
case "bool":
return exp.value;
Variables are fetched from the environment. Remember that "var"
tokens contain the name in the value
property:
case "var":
return env.get(exp.value);
For assignment, we need to check if the left side is a "var"
token (if not, throw an error; we don't support assignment to anything
else for now). Then we use env.set
to set the value. Note
that the value needs to be computed first by calling evaluate
recursively.
case "assign":
if (exp.left.type != "var")
throw new Error("Cannot assign to " + JSON.stringify(exp.left));
return env.set(exp.left.value, evaluate(exp.right, env));
A "binary"
node needs to apply an operator to two operands.
We'll write the apply_op
function later, it's quite trivial.
Again, we need to call the evaluator recursively to compute
the left
and right
operands:
case "binary":
return apply_op(exp.operator,
evaluate(exp.left, env),
evaluate(exp.right, env));
A "lambda"
node will actually result in a JavaScript closure, so
it will be callable from JavaScript just like an ordinary function. I
wrote a make_lambda
function, which I'll define later:
case "lambda":
return make_lambda(env, exp);
Evaluating an "if"
node is simple: first evaluate the condition.
If it's not false
then evaluate the "then"
branch and
return its value. Otherwise, evaluate the "else"
branch, if
present, or return false
.
case "if":
var cond = evaluate(exp.cond, env);
if (cond !== false) return evaluate(exp.then, env);
return exp.else ? evaluate(exp.else, env) : false;
A "prog"
is a sequence of expressions. We just evaluate them in
order and return the value of the last one. For an empty sequence, the
return value is initialized to false
.
case "prog":
var val = false;
exp.prog.forEach(function(exp){ val = evaluate(exp, env) });
return val;
For a "call"
node we need to call a function. First we evaluate
the func
, which should return a normal JS function, then we
evaluate the args
and apply that function.
case "call":
var func = evaluate(exp.func, env);
return func.apply(null, exp.args.map(function(arg){
return evaluate(arg, env);
}));
We should never get here, but just in case we add new node types in the parser and we forget to update the evaluator, let's throw a clear error.
default:
throw new Error("I don't know how to evaluate " + exp.type);
}
}
This was the core of the evaluator and as you can see it's really simple.
We still need to write two more functions, let's start
with apply_op
as it's the easiest one:
function apply_op(op, a, b) {
function num(x) {
if (typeof x != "number")
throw new Error("Expected number but got " + x);
return x;
}
function div(x) {
if (num(x) == 0)
throw new Error("Divide by zero");
return x;
}
switch (op) {
case "+" : return num(a) + num(b);
case "-" : return num(a) - num(b);
case "*" : return num(a) * num(b);
case "/" : return num(a) / div(b);
case "%" : return num(a) % div(b);
case "&&" : return a !== false && b;
case "||" : return a !== false ? a : b;
case "<" : return num(a) < num(b);
case ">" : return num(a) > num(b);
case "<=" : return num(a) <= num(b);
case ">=" : return num(a) >= num(b);
case "==" : return a === b;
case "!=" : return a !== b;
}
throw new Error("Can't apply operator " + op);
}
It receives the operator and the arguments. Just a boring switch to apply
it. Unlike JavaScript, which applies any operator to any arguments and
moves on whether that makes any sense or not, we require that the operands
for numeric operators be numbers, and that a divizor is not zero, using
the small helpers num
and div
. For strings
we'll define something else.
make_lambda
The make_lambda
is a bit subtle:
function make_lambda(env, exp) {
function lambda() {
var names = exp.vars;
var scope = env.extend();
for (var i = 0; i < names.length; ++i)
scope.def(names[i], i < arguments.length ? arguments[i] : false);
return evaluate(exp.body, scope);
}
return lambda;
}
As you can see, it returns a plain JavaScript function that encloses
over the environment and the expression to evaluate. It's important
to understand that nothing happens when this closure is created —
but when it's called, it will extend the environment that it
saved at creation time with the new bindings of arguments/values (if less
values are passed than the function's argument list, the missing ones will
get the value false
). And then it just evaluates the body in the
new scope.
Primitive functions
You can observe that our language does not provide any means to interact
with the outside world. In some code examples I've used
some print
and println
functions, but they are
not defined anywhere. These have to be defined as primitive functions
(that is, we'll write them in JavaScript and will insert them into the
global environment).
To put it all together now, here's a test program:
// some test code here
var code = "sum = lambda(x, y) x + y; print(sum(2, 3));";
// remember, parse takes a TokenStream which takes an InputStream
var ast = parse(TokenStream(InputStream(code)));
// create the global environment
var globalEnv = new Environment();
// define the "print" primitive function
globalEnv.def("print", function(txt){
console.log(txt);
});
// run the evaluator
evaluate(ast, globalEnv); // will print 5
Download
You can download the code that we wrote so far. It's runnable with NodeJS — just pass the code to evaluate at standard input, e.g.:
echo 'sum = lambda(x, y) x + y; println(sum(2, 3));' | node lambda-eval1.js
In the next section we'll play a bit with our λanguage.
How to implement a PL
- Introduction
- λanguage description
- Writing a parser
- Simple interpreter
- CPS Evaluator
- Compiling to JS
- Wrapping up
- Real samples
Download the code so far:
lambda-eval1.js