In fairness to NodeJS

Alright, so maybe I exaggerated a bit. My point was that the only solution to callback hell which still allows asynchronous and parallel computing is a language with explicit continuations. I did not try to hide the ugliness of manually written async code in Node, but perhaps we can improve on it.

For one thing, there are those if (err) throw new Error(err) checks all over the place. I'll just drop them for a fair comparison (in λanguage they are handled by wrappers around the Node API), but if you've done any programming in NodeJS you know that they can't really be dropped (unless similar wrappers around the API are implemented). On the other hand, I'll write a parallelEach helper, because async.eachLimit from the async library won't work here, for reasons explained below.

The code does look prettier this time, and it's probably the best we can do in NodeJS, but it's still not as good as the λanguage version.

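var fs = require("fs");
var path = require("path");

// Recursively copy srcdir into destdir; error checks are omitted on purpose.
function copyTree(srcdir, destdir, callback) {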
  fs.mkdir(destdir, function(err){
    fs.readdir(srcdir, function(err, files){
      parallelEach(files, function(f, next){
        var fullname = path.join(srcdir, f);
        var dest = path.join(destdir, f);
        fs.lstat(fullname, function(err, stat){
          if (stat.isSymbolicLink()) {
            fs.readlink(fullname, function(err, target){
              fs.symlink(target, dest, next);
            });
          } else if (stat.isDirectory()) {
            copyTree(fullname, dest, next);
          } else if (stat.isFile()) {
            fs.readFile(fullname, function(err, data){
              fs.writeFile(dest, data, next);
            });
          } else {
            next();
          }
        });
      }, callback);
    });
  });
}
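The error checks dropped above could come back without cluttering every callback if the API functions were wrapped. What follows is only a minimal sketch of that idea, not the wrappers λanguage actually uses: the name throwing is hypothetical, and it assumes the usual Node convention where the callback is the last argument and receives the error first.

```javascript
// Hypothetical helper (not part of Node): wraps a Node-style async function
// so that its callback no longer receives `err`. Any error is thrown instead.
function throwing(fn) {
  return function () {
    var args = Array.prototype.slice.call(arguments);
    var cb = args.pop(); // assumption: the callback is the last argument
    args.push(function (err) {
      if (err) throw (err instanceof Error ? err : new Error(err));
      // Pass the remaining results on, without the leading error slot.
      cb.apply(null, Array.prototype.slice.call(arguments, 1));
    });
    return fn.apply(this, args);
  };
}
```

With this, throwing(fs.readdir)(srcdir, function(files){ ... }) would receive only the file list. Note that the callbacks in copyTree would then need to lose their err parameter.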

The reason why async.eachLimit doesn't work (I actually did try it) is that it doesn't allow for a global limit. Each time we recurse into copyTree we'd enter another async.eachLimit with its own limit, which means we can potentially have twice as many tasks running (and they can recurse as well). We quickly hit the “EMFILE” (too many open files) error. So I wrote this abstraction myself; it's not very complicated:

// Global budget of tasks allowed to run at once, shared by every parallelEach call.
var PCALLS = 1000;

function parallelEach(a, f, callback) {
  if (a.length == 0) {
    callback();
    return;
  }
  var count = a.length;
  (function loop(i){
    if (i < a.length) {
      if (PCALLS <= 0) {
        setTimeout(function(){
          loop(i);
        }, 5);
      } else {
        PCALLS--;
        f(a[i], function(err){
          if (err) throw new Error(err);
          PCALLS++;
          if (--count == 0)
            callback();
        });
        loop(i + 1);
      }
    }
  })(0);
}
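The same global limit could also be kept without polling through setTimeout, by queueing tasks until a slot frees up. This is a sketch of that alternative, not the code used in this article; makeLimiter and its methods are hypothetical names, and each task is assumed to call done when it finishes.

```javascript
// Hypothetical helper: a shared limiter that runs at most `max` tasks at once.
// Each task is a function taking a `done` callback.
function makeLimiter(max) {
  var running = 0;  // tasks currently in flight
  var waiting = []; // tasks queued until a slot frees up

  function start(task) {
    running++;
    var finished = false;
    task(function done() {
      if (finished) return; // ignore duplicate calls to done
      finished = true;
      running--;
      // A slot just freed up: start the oldest queued task, if any.
      if (waiting.length > 0) start(waiting.shift());
    });
  }

  return {
    run: function (task) {
      if (running < max) start(task);
      else waiting.push(task);
    },
    running: function () { return running; },
    pending: function () { return waiting.length; }
  };
}
```

A single limiter created once, for example var limiter = makeLimiter(1000), would be shared by all recursive calls, which is what makes the limit global. As with any shared limit, a task that keeps its slot while waiting on nested tasks can stall if every slot ends up held that way.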