Implementation notes
This project is inspired by Eric Bergstrome's great entry to the International Lisp Games Expo 2010. I had tried a couple of times before to write a Lisp interpreter in JavaScript, but I was disappointed by the performance (or lack thereof), so I gave up. Then I found Eric's compiler, which made me believe it could be done. I grabbed a copy of PAIP and rushed to read chapter 23, which is very well written and probably the smoothest introduction to writing a Lisp compiler.
I should warn that I'm no expert at this; it's the first time I've done it. If nothing else, it was worth it for the mind-blowing experience of getting it done.
Bootstrapping
What puzzled me about all the materials that discuss implementing a Lisp is that they're written in Lisp; it's as if you need to already have a working Lisp in order to implement one. Since all I had was JavaScript, I proceeded to write a Lisp reader and a simple compiler (both in JavaScript) and then wrote the virtual machine (the part that actually runs the compiled code).
Writing the compiler in JavaScript poses an interesting problem that you wouldn't have if it were written in Lisp: how do you execute macros? Since macros are Lisp functions, you need to compile and execute them, but the compiler, being in JavaScript, has no access to an execution environment. My take on this was to support a very light kind of macros in the JavaScript compiler and to instantiate a VM each time we need to macro-expand; in the process, though, I understood the importance of having the compiler written in Lisp for a proper implementation.
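Schematically, the idea is something like the sketch below. This is not the actual SLip code; the names (macros, makeVM) are made up for illustration, and the real machine lives in js/machine.js.

// Sketch only: a JS-side compiler expanding macros by spinning up
// a short-lived VM. All names here are illustrative.
var macros = {};   // macro name -> expander (a compiled Lisp function)

// Stand-in for "instantiate a VM"; the real machine is in js/machine.js.
function makeVM() {
    return {
        run: function (expander, args) {
            return expander.apply(null, args);
        }
    };
}

function macroexpand(form) {
    if (!(form instanceof Array)) return form;    // atoms expand to themselves
    var expander = macros[form[0]];
    if (!expander) return form;                   // not a macro call
    var vm = makeVM();                            // fresh, throw-away VM
    var expansion = vm.run(expander, form.slice(1));
    return macroexpand(expansion);                // keep expanding until stable
}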
So next I wrote compiler.lisp, which implements the reader and the compiler in Lisp. The bootstrapping sequence is: load compiler.js and use it to compile compiler.lisp, then have compiler.lisp compile itself and generate a FASL. The VM can then execute this FASL and you no longer need compiler.js. I'm also storing the FASLs in the Git repository, so currently I don't need compiler.js at all; even when the compiler changes, I can use the old FASL to compile a new one. That's called a “self-hosting compiler”.
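Pictured as code, the sequence is roughly the following; every function name below is hypothetical and only meant to show the order of the steps.

// Hypothetical bootstrap driver; the names are made up.
var vm = new Machine();                          // the VM from js/machine.js

// 1. The throw-away JS compiler compiles compiler.lisp ...
var code = jsCompileFile("compiler.lisp");
// 2. ... and the VM runs it, so the Lisp reader/compiler now exist in the image.
vm.run(code);
// 3. The Lisp compiler then compiles its own source and dumps a FASL.
var fasl = vm.call("compile-file", "compiler.lisp");
saveFile("compiler.fasl", fasl);
// 4. From here on, loading compiler.fasl is enough; compiler.js is no longer needed.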
What I should have done instead is write compiler.lisp directly and generate a FASL of it by running it in a Common Lisp implementation. That would have spared me the effort of writing a throw-away compiler.js, but then again it's nice to have no dependencies.
Eval-when?
Working on this project made me finally understand Common Lisp's eval-when. There are good technical reasons to have such a thing, but I couldn't see many practical reasons for it.
In Common Lisp there are a few stages that happen when compiling a file. The file is parsed (that's “read time”) and compiled (that's “compile time”), and you get a FASL. The code is not executed yet, but some things need to run at compile time in order to make macros work (not to mention reader customization). What gets evaluated at compile time is decided by the eval-when directive, and it takes months or years to fully grasp the complexity of this evaluation model.
In SLip things are simpler: when you compile a file, each expression is read, compiled and executed, in sequence. This means that a macro, for example, can freely use functions that were previously defined in the same file. It behaves as if each toplevel expression were wrapped in:
(eval-when (:compile-toplevel :load-toplevel :execute) ... )
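In other words, SLip-style file compilation is conceptually just a loop like the sketch below (the names read, compile and vm.run are illustrative, not the real API):

// Illustrative sketch: each toplevel form is read, compiled and executed
// before the next one is even read.
function compileFile(stream, vm) {
    var fasl = [];
    var form;
    while ((form = read(stream)) !== EOF) {    // read time
        var code = compile(form);              // compile time
        vm.run(code);                          // execute it right away
        fasl.push(code);                       // and keep it for the FASL
    }
    return fasl;
}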
Serialization (FASL)
The compiler creates an array of operations that is fed to a peephole optimizer. The operations at this point are themselves JS arrays, where the first element is the “opcode” (I'm using textual names rather than bytes) and the rest are the operation's arguments.
After the peephole optimizer, the resulting array is passed to an assembler, which turns each operation into a JS object (execution seems to be much faster than working with arrays). All these steps, including the VM itself, are defined in js/machine.js.
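For instance, a lexical variable reference might go through that step roughly like this; the exact arguments and the run method are my guess at the shape, not the real definitions from js/machine.js.

// Rough illustration of the assembly step; the actual opcode layout differs.
var op = ["LGET", 1, 0];   // before assembly: opcode name plus arguments

function assemble(op) {
    switch (op[0]) {
      case "LGET":
        // after assembly: an object with named fields and its own run() method
        return {
            name: "LGET", i: op[1], j: op[2],
            run: function (machine) {
                machine.push(machine.env.get(this.i, this.j));
            }
        };
      // ... one case per opcode ...
    }
}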
I had to figure out a way to save the compiled code (once the application is done, you want to distribute FASLs, not source, because compiling the source takes a fair amount of time). Here's what compiler.fasl looks like. I preferred to dump a JavaScript program instead of plain JSON for two reasons: we get smaller output, and it's faster to load (instead of traversing the JSON to re-create operation objects, we can simply run that JS). The FASLs are run in a context where all those names are defined, like PRIM, LGET, etc.
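Here's a toy comparison of the two loading strategies; the constructors below are invented for illustration, not the names the real FASL uses.

// Toy comparison only; not the real FASL format.
function CONST(v) { return { name: "CONST", value: v }; }
function RET()    { return { name: "RET" }; }

// JSON route: parse, then walk the data to rebuild the operation objects.
var ops1 = JSON.parse('[["CONST",42],["RET"]]').map(function (a) {
    return a[0] === "CONST" ? CONST(a[1]) : RET();
});

// JS-program route: the dump already calls the constructors, so loading
// it means just evaluating it in a scope where those names are defined.
var ops2 = eval('[CONST(42), RET()]');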
The FASL, as you can see, is pretty big; there are some tricks I could use to heavily reduce its size, but for now I'm not interested in this kind of optimization because FASLs compress extremely well with gzip.
The “virtual machine” and threads
The VM maintains the execution context, which consists of a stack, the current lexical environment, the current dynamic environment, and the currently running function. Originally I designed it to run continuously (blocking), but then I added a simple scheduler so we can execute multiple things in parallel. Using setTimeout, we schedule each task to run 200 instructions and then move on to the next, and so on for at most 50 milliseconds; then we take a break so the browser has a chance to update and respond to user actions. Of course, this is slower than running continuously, but since JS lacks real threads it's almost a must-have for responsive interfaces.
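The loop is conceptually something like the sketch below, with the numbers from above; vm.run and vm.finished are assumed interfaces here, not the actual ones from js/machine.js.

// Illustrative scheduler sketch; the real one lives in js/machine.js.
var QUANTUM = 200;   // instructions per thread per turn
var BUDGET  = 50;    // milliseconds before yielding back to the browser

function schedule(threads) {
    var start = Date.now();
    while (threads.length > 0 && Date.now() - start < BUDGET) {
        var vm = threads.shift();
        vm.run(QUANTUM);                        // execute up to 200 instructions
        if (!vm.finished) threads.push(vm);     // round-robin: back in the queue
    }
    if (threads.length > 0)
        setTimeout(function () { schedule(threads); }, 0);  // let the UI breathe
}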
When a thread is spawned, it creates a new VM that inherits the dynamic environment of the calling thread; however, any dynamic bindings made in the new thread affect only that thread. Thus dynamic variables behave like they do in Common Lisp implementations: once you bind one, it's “thread safe” because the binding cannot be seen by sibling threads.
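One way to picture the inheritance is the toy model below; it is not the actual SLip representation, just the general idea.

// Toy model of per-thread dynamic bindings; not the actual representation.
function spawn(parent) {
    // The child sees the parent's current values through the prototype
    // chain, but anything it binds lands in its own layer only.
    return { denv: Object.create(parent.denv) };
}

var main  = { denv: { "*standard-output*": "console" } };
var child = spawn(main);
child.denv["*standard-output*"] = "buffer";   // visible only to the child
// main.denv["*standard-output*"] is still "console"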