Multiple inheritance and multiple dispatch in JavaScript
Multiple inheritance and multiple dispatch are core features in CLOS. The standard JavaScript object system supports neither, but they can be implemented. I thought I'd write this article, along with working code, for the JavaScript fellows who don't know the concept. Maybe someone smarter than me will come up with an efficient implementation.
Single dispatch systems (like JS)
In most object-oriented languages, a class can have methods. Calling a method like foo.bar()
involves a
"runtime dispatch", meaning that the compiler cannot generally tell what method is needed by statically analyzing the
code; this detail can only be determined when the code runs. Example:
function Foo() {}
Foo.prototype.speak = function(){ alert("Hi, Foo here") };
function Moo() {}
Moo.prototype.speak = function(){ alert("Moo, man!") };
function make_object() {
return Math.random() < 0.5 ? new Foo() : new Moo();
}
var obj = make_object();
obj.speak(); // which method do we call here?
What does this code output? We don't know, it's random. So at run time, the interpreter (or virtual machine, or
the compiled program—the difference doesn't matter here) will have to determine the type of obj
, usually
available as obj.constructor
, and locate a speak
method in its prototype.
This is called “single dispatch”—the type of a single variable (obj
) is sufficient to decide which method
to call.
Multimethods (multiple dispatch)
These dispatch based on the types of more than one argument. Thus, a method no longer belongs to a single class. Instead, it belongs to a “generic function” — an object maintaining a list of “specialized” methods. When the generic is invoked, it determines what are the matching ("applicable") methods for the given arguments, and invokes the best match.
Let's see an example. We want to write an add
function which can add numbers or arrays. Adding a number to
an array will return a new array with each element of the original array added to that number. Adding two arrays will
return a new array with the added elements.
// create the generic function
var add = MM.defgeneric("add");
// define the implementation for adding numbers, e.g. add(3, 4)
add.defmethod([ MM.JS_Number, MM.JS_Number ], function(a, b){
return a + b;
});
// implementation for add([ 1, 2, 3 ], 3)
// we specialize the second arg on MM.Object
add.defmethod([ MM.JS_Array, MM.Object ], function(a, b){
return a.map(function(el){ return add(el, b) });
});
// implementation for add(3, [ 1, 2, 3 ])
// specialize the first arg on MM.Object
add.defmethod([ MM.Object, MM.JS_Array ], function(b, a){
return a.map(function(el){ return add(b, el) });
});
// implementation for add([ 1, 2, 3 ], [ 1, 2, 3 ])
add.defmethod([ MM.JS_Array, MM.JS_Array ], function(a, b){
return a.map(function(el, i){ return add(el, b[i]) });
});
console.log(add(3, 4)); // 7
console.log(add(3, [ 1, 2, 3 ])); // [ 4, 5, 6 ]
console.log(add(10, [ 2, 3, [ 4, 5, 6 ] ])); // [ 12, 13, [ 14, 15, 16 ] ]
console.log(add([ 2, 3, [ 4, 5, 6 ] ], 10)); // [ 12, 13, [ 14, 15, 16 ] ]
console.log(add([ 1, 2, 3 ], [ 4, 5, 6 ])); // [ 5, 7, 9 ]
console.log(add(1, "foo")); // FAILS
console.log(add([ 1, 2, 3 ], [ 1, 2 ])); // FAILS
add
looks like a plain JS function. However, add.defmethod
allows us to define specialized
implementations. defmethod
takes two arguments—an array of types, and a function to execute when the generic
is called with matching arguments.
So add
responds differently, depending on the argument types. Of course, we could do that with a bunch of
if/else
-s, but the ability to have separate implementations, while allowing others to customize behavior for
types we didn't consider, is the actual point of object-oriented programming.
Finally, the last two lines will fail with “No applicable methods in add”. There is no implementation to
add
a number and a string, and on the last line the arrays have different lengths so eventually add
would be called with an undefined
argument. In a way, we have a limited form of run-time type checking:
rather than returning NaN
-sense, it throws an error, which is arguably a better behavior.
Notice the second and third specializations there, they list a MM.Object
argument, instead of
JS_Number
; MM.Object
is the base class for all objects. This opens the possibility for someone else
to extend our method, say, if needed for strings:
add.defmethod([ MM.JS_String, MM.JS_String ], function(a, b){
return a + b;
});
// and now it'll work for strings too:
console.log(add([ "f", "b", "m" ], "oo")); // [ 'foo', 'boo', 'moo' ]
We didn't have to define explicit methods for adding strings and arrays this time. They're handled thanks to
inheritance by the specializations on MM.Object
above. And the following still fails, because there
is no implementation for adding numbers and strings:
console.log(add(1, "foo"));
console.log(add("foo", 1));
Classes and inheritance
This clearly departs from the standard JavaScript object system. Our objects are still JS objects, but prototypal inheritance is of no use, since we no longer store methods in an object's prototype. Here is an example where we define a few classes:
// a base class
var Person = MM.defclass("Person");
// Man and Woman inherit from Person (second arg to defclass)
var Man = MM.defclass("Man", [ Person ]);
var Woman = MM.defclass("Woman", [ Person ]);
//---- constructor
//
// MM.initialize is a generic function called by the system when an object is created.
// Its responsibility is to initialize the new object instance — like a constructor.
// Here we specialize it for the base Person class.
MM.initialize.defmethod([ Person ], function(person, args){
person.name = args.name;
});
// notice that our constructor takes two arguments, but we specialize only on the first.
// Whatever other arguments are received, if the first one is a Person then our method
// will be called with all arguments.
//
//----
// now create two objects with MM.new
var jane = MM.new(Woman, { name: "Jane" });
var john = MM.new(Man, { name: "John" });
// or, a shortcut syntax is available:
var alice = Woman.new({ name: "Alice" });
// create a generic function
var say_hi = MM.defgeneric("say_hi");
// ... and define its implementation for Man
say_hi.defmethod([ Man ], function(man){
console.log("Yo, " + man.name + " my man, what's up?");
});
// ... and for Woman.
say_hi.defmethod([ Woman ], function(woman){
console.log("Good morning miss " + woman.name + "!");
});
say_hi(john);
say_hi(jane);
It should be obvious what the output is going to be.
One nice thing is that we need not call the base class constructor from the sub-classes. We didn't even define
constructors for the sub-classes; since all persons have a name, that can be handled in the base class and we did
that by specializing MM.initialize
on Person
. Although we created Man
and Woman
objects, the inheritance mechanism figured out that it's appropriate to use the Person
constructor.
Oh, do you want to define a method for our previous add
function so that you can add two Person
-s?
That's totally doable:
add.defmethod([ Person, Person ], function(a, b){
return a.name + " married " + b.name;
});
console.log(add(john, [ jane, alice ]));
// [ 'John married Jane', 'John married Alice' ]
"Next" methods
So if we wanted to do custom things when a Man
or Woman
objects are created, we can further
specialize the constructor MM.initialize
:
MM.initialize.defmethod([ Man ], function(){
this(); // call the next available method
console.log("A boy is born");
});
MM.initialize.defmethod([ Woman ], function(){
this();
console.log("A girl is born");
});
var john = Man.new({ name: "John" });
var jane = Woman.new({ name: "Jane" });
This is just logging the fact that a Man
or a Woman
is created. Notice a call to this()
in both. It will invoke the next available method, which in this case is the implementation of
MM.initialize
for Person
, to set person.name. Conveniently, I don't have to pass any arguments—by
default it'll send the original arguments.
The nice thing is that I don't have to know or care what's the base class. There could be more than one base class,
by the way. The system knows what is the next most specific method, and to call it I only write
this()
1. Farewell,
BaseClass.prototype.method.apply(this, arguments)
.
Multiple inheritance
Another feature of the system is that a class can inherit from multiple base classes, here is a dumb example:
var ColoredObject = MM.defclass("ColoredObject");
MM.initialize.defmethod([ ColoredObject ], function(obj, props){
this();
obj.color = props.color;
});
var Vehicle = MM.defclass("Vehicle");
MM.initialize.defmethod([ Vehicle ], function(vehicle, props){
this();
vehicle.speed = props.speed;
vehicle.power = props.power;
});
var Car = MM.defclass("Car", [ ColoredObject, Vehicle ]);
MM.initialize.defmethod([ Car ], function(car, props){
this();
car.brand = props.brand;
});
var car = Car.new({
brand: "BMW",
color: "black",
power: "140 HP",
speed: "200 km/h"
});
console.log(car);
console.log(MM.is_a(car, Car),
MM.is_a(car, Vehicle),
MM.is_a(car, ColoredObject));
Even though in the Car
constructor we only set the brand, the object will get all properties because each
constructor invokes the next one by calling this()
. Also, the last output will be true true true
,
since the object is really an instance of all those classes.
Method combination
Let's say we have a Server
object for which a handle_request
generic function is implemented. And
it's in a library which we don't wish to change, but we'd like to add some logging. Here's one way:
// define our own class to customize the behavior
var MyServer = MM.defclass("MyServer", [ Server ]);
var server = MyServer.new(...); // create the object
handle_request.defmethod([ MyServer, Request ], function(server, request){
// do the logging
console.log(request.url);
// and call the next method
return this();
});
A shortcut for that is to define a "before"
method:
handle_request.defmethod("before", [ MyServer, Request ], function(server, request){
console.log(request.url);
});
The "before"
combinator tells the system that we'd like this method to be called before any other available
methods. We don't need to worry about calling this()
or returning a value, it will happen automatically.
With this trick we could do the same without defining a subclass:
var server = Server.new(...); // instantiate the original class
handle_request.defmethod("before", [ Server, Request ], function(server, request){
console.log(request.url);
});
Without subclassing, using "before"
is required; otherwise we'd simply override the
original method, leaving no way to call it later. The equivalent in JavaScript would be to set
Server.prototype.handle_request
without saving the original value.
Similar to "before"
there is an "after"
combinator which arranges for the new method to be
executed after the next most specific one. Back to our car example, we could have specialized the constructors with
"after"
, and no need to call this()
:
MM.initialize.defmethod("after", [ ColoredObject ], function(obj, props){
obj.color = props.color;
});
MM.initialize.defmethod("after", [ Vehicle ], function(vehicle, props){
vehicle.speed = props.speed;
vehicle.power = props.power;
});
MM.initialize.defmethod("after", [ Car ], function(car, props){
// car.color, car.speed and car.power are already initialized here.
car.brand = props.brand;
});
This is the recommended way to specialize MM.initialize
, because you want any base classes fully
initialized first.
The "around"
combinator
Supposing we'd like to ensure that a request is authenticated before handling it, we could do the following:
// define a new class to handle authentication
var Auth = MM.defclass("Auth");
// this generic function will take an Auth object and a Request
// and will return true if the request is authorized.
var authenticate = MM.defgeneric("authenticate");
authenticate.defmethod([ Auth, Request ], function(auth, request){
// handle authentication and return true or false
});
// the following implementation for handle_request will ensure that
// a request is authentic before passing it over to the next method.
handle_request.defmethod("around", [ Auth, Request ], function(auth, request){
if (authenticate(auth, request)) {
return this(); // call next method
} else {
request.error("403 Forbidden"); // return some error page
}
});
// now inherit MyServer from *both* Server and Auth
var MyServer = MM.defclass("MyServer", [ Server, Auth ]);
Notice the endless possibilities of customization here. The Auth
and Server
objects are
completely unrelated; no one inherits from the other. Yet we can still define an implementation of
handle_request
for Auth
objects. Our implementation can just validate the request, and if it's
authentic it calls the next method with this()
. Since MyServer
inherits from both, the next method
will be the one defined on the other base class (Server
).
The point is that the Auth
library might be in a separate module, independent from Server
,
and by inheriting from both we have a server + authentication with no additional work2.
It might seem that "around"
is like the default ("primary") combination (where you specify no combinator)
but it's subtly different. In the example above, if "around"
was not specified then the Auth
's
implementation for handle_request
would be ignored and the default one called instead. The reason is that
Server
appears first in the list of MyServer
's super classes—so its implementation would be more specific.
To fix this we could declare MyServer
inherits from [ Auth, Server ]
, thus making the
authentication handler more specific, but "around"
is really the best option because the system makes sure
that it takes priority, regardless of the class precedence order. An "around"
method wraps any
"before"
, primary and "after"
methods which may be defined. The ordering is:
- Execute any
"around"
methods, in most-specific-first order. - Execute any
"before"
methods, in most-specific-first order. - Execute any primary (standard) methods, in most-specific-first order.
- Execute any
"after"
methods, in least-specific-first order (so these are reversed).
The last three happen only if all "around"
methods invoked this()
.
Finally, let's say that the server library has three request classes: base class (Request
) and two
sub-classes, PlainRequest
and EncryptedRequest
, and it instantiates the right one before passing it
to handle_request
. Then if we wanted to allow requests only over SSL, thanks to multiple dispatch we could
just add these lines:
handle_request.defmethod("around", [ MyServer, PlainRequest ], function(server, request){
redirect_to_encrypted_url(request);
// by not calling this() we end the processing here.
});
So instead of adding if
-s here and there, if the class hierarchy is properly granularized we can (re)define
the behavior of the application by adding appropriate methods to the generic functions.
Another scenario for "around"
is initialization/cleanup for resources that might be needed over the course of
a method. For example Auth
could provide a database connection and an user object to any next methods:
handle_request.defmethod("around", [ Auth, Request ], function(auth, request){
var db = database_connect(db_credentials);
var user = get_authenticated_user(db, request);
if (user) {
// set these properties for next methods
request.db = db;
request.user = user;
// call next method
var result = this();
// cleanup
database_disconnect(db);
delete request.db;
delete request.user;
return result;
} else {
database_disconnect(db);
request.error("403 Forbidden"); // return some error page
}
});
handle_request.defmethod([ MyServer, Request ], function(server, request){
// request.db and request.user will be magically available here.
});
What about asynchronous programming?
All this should integrate pretty well with continuation-passing style. The usual “niceties” apply: return
and exceptions are unusable, and you have to nest callbacks. The example above would look like this (barring error handling):
handle_request.defmethod("around", [ Auth, Request ], function(auth, request, callback){
var next = this;
database_connect(db_credentials, function(db){
get_authenticated_user(db, request, function(user){
if (user) {
request.db = db;
request.user = user;
next(auth, request, function(result){
database_disconnect(db);
delete request.db;
delete request.user;
callback(result);
});
} else {
database_disconnect(db);
request.error("403 Forbidden");
callback();
}
});
});
});
Implementation notes
The implementation is a stripped-down version of TinyCLOS. (it has no MOP; but if you know the term I'm surprised you've read this far, since this article was nothing new to you)
Code: mm.js.
This implementation is slow. Too many things happen when a generic function is invoked. The system looks at the classes of each argument, and their super classes, to search for applicable methods. A list of methods is built, then sorted by how specific those methods are in relation to the given arguments, and also based on their combinators. In short, this will never beat a plain JS function.
A simple improvement would be to cache the resulted “effective method” (which is just a plain function that calls
the applicable methods for the given arguments). But a much better idea that should be tried is to abuse the
JavaScript JIT. This could be done by assigning an unique tag to each class, and building a new Function()
that represents the entire generic function each time defmethod
is called. For instance:
var add = MM.defgeneric("add");
add.defmethod([ MM.JS_Number, MM.JS_Number ], function(a, b){
return a + b;
});
After running the above code, the VM would contain something like the following two functions:
// i.e. 1 is the type tag for JS_Number
function $impl_add_1_1(a, b) {
return a + b;
}
function add(a, b) {
// I'm wondering if judicious tagging of types could help implementing is_a in O(1).
// The current implementation of is_a does an array.indexOf lookup.
if (MM.is_a(a, JS_Number) && MM.is_a(b, JS_Number)) {
return $impl_add_1_1(a, b);
}
throw new Error("No applicable methods in add");
}
I'm pretty sure this would yield much better results, but I didn't try it yet. It's not exactly trivial to get it
right because in general there might be multiple applicable methods (i.e. we need to send the right argument for
this()
to work).
API
With NodeJS:
var MM = require("./mm.js");
In browser, just load the file and it'll create a MM
global.
To create a generic function, use MM.defgeneric
, as in examples above. It takes an optional argument—a
string name of the generic function. That's useful during development.
var add = MM.defgeneric("add");
add
is now a reference to the generic's main dispatch function. This is just for convenience; the generic
is represented internally by an object accessible as add.O$_generic
. But you really don't need to care about that.
To add a new method to a generic, you can do one of the following (equivalent):
MM.defmethod(add, [ MM.JS_Number, MM.JS_Number ], function(a, b){
return a + b;
});
// syntactic sugar:
add.defmethod([ MM.JS_Number, MM.JS_Number ], function(a, b){
return a + b;
});
defmethod
takes an optional method combinator. You can place it before or after the types array; the
following will all work the same:
MM.defmethod(add, "before", [ MM.JS_Number, MM.JS_Number ], function(){ ... });
MM.defmethod(add, [ MM.JS_Number, MM.JS_Number ], "before", function(){ ... });
add.defmethod("before", [ MM.JS_Number, MM.JS_Number ], function(){ ... });
add.defmethod([ MM.JS_Number, MM.JS_Number ], "before", function(){ ... });
It might be interesting to note that MM.defmethod
is itself a generic function, so the
way it supports the above argument ordering is simply via specialization (yes, there are a couple
defmethod.defmethod(...)
calls in my code).
In a method function you can call this()
to invoke the next more specific method, unless the combinator is
"before"
or "after"
. this
will be undefined in such case; before/after methods need not
worry about calling this()
and they cannot alter the returned value.
To define a class, use MM.defclass
. It takes zero, one or two arguments. The first argument is the class
name as a string (useful for debugging) and the second is an array of direct super-classes.
var Server = MM.defclass("Server");
var HTTP = MM.defclass("HTTP");
var HTTPServer = MM.defclass("HTTPServer", [ Server, HTTP ]);
To customize object initialization (constructor) you need to specialize MM.initialize
for that class:
MM.initialize.defmethod([ HTTPServer ], function(server, ...){
// init your server here.
});
And to instantiate an object, one of the following:
var http_server = HTTPServer.new(...);
var http_server = MM.new(HTTPServer, ...);
Classes defined this way can be used with defmethod
for specializing a generic function. A class which
does not specify a list of direct super-classes will inherit by default from MM.Object
. Internally, there
is this hierarchy (exported under the MM
namespace): Top ← Object ← Class
(a class is really just
another object).
instanceof
-like functionality is provided by MM.is_a
:
MM.is_a(http_server, HTTPServer); // true
MM.is_a(http_server, Server); // true
MM.is_a(http_server, HTTP); // true
MM.is_a(http_server, MM.Object); // true
MM.is_a(http_server, MM.Class); // false
To get a reference to the class of an existing object instance, you can use MM.class_of
:
var cls = MM.class_of(http_server); // cls === HTTPServer
Objects created by our system do not inherit prototypically from the standard JavaScript Object
. Therefore
do not expect them to have properties like toString
or hasOwnProperty
, or JS's instanceof
to
work. They are created with an almost empty prototype—the only property a MM Object
has is
O$_class
which points to the class of an object. That's read-only—you cannot alter the class of an object at
run time. Hopefully that name is ugly enough so you won't need to use it for your own properties.
As seen in some samples, a set of classes is provided so that we can specialize generic functions by basic
JavaScript types: JS_Value
, JS_Null
, JS_Object
, JS_Number
, JS_Array
,
JS_String
, JS_Boolean
, JS_RegExp
, JS_Date
and JS_Function
.
Except JS_Null
, all the others inherit from JS_Value
(which in turn inherits from
MM.Object
). JS_Null
is special, because in JS (typeof null == "object") === true
and I
really don't like this, so it inherits from nothing. JS_Null
matches null
or undefined
, but
nothing else.
Sample
printer.js is a simple implementation of a JSON serializer. It
demonstrates how you can use multiple dispatch to implement printing to a custom OutputStream
object, and how
you can specialize a couple of methods in order to get indented output via a IndentedOutputStream
object.
this.next()
for better readability. In
Common Lisp, calling the next method is spelled (call-next-method)
instanceof
and there's still no good way of doing the above, because you get no automatic method
chaining the way we do here with this()
.
Attached files:
- mm.js — TinyCLOS in JS
- printer.js — sample (JSON printer)