Using UglifyJS for code refactoring

  • Published: 2013-03-20
  • Modified: 2013-03-20 10:46
  • By: Mishoo
  • Tags: uglifyjs, javascript
  • Comments: 3

Somebody asked me the following question:

I want to do AST transformations on my JavaScript files. Specifically, I want to take all throw "string" expressions and replace them with throw new Error("string") expressions.

The problem is I want to make no other changes to my source. I want to not change anything: no comments, no indentation, no whitespace, etc. I just want to match certain AST subtrees (a throw node with a string node as a child) and replace them with something different.

I thought I'd write this blog post to show how.

First, the AST loses some detail, for example the parens in the following expression: a = (x * y) + z. Because multiplication has higher precedence than addition, the parens are unnecessary. In fact, parens have no place in the AST at all: they're just hints for the parser to get the correct code [1]. Comments and whitespace don't belong in the AST either, although I went to some lengths to provide an option to keep certain comments. So it's just not possible to solve this problem with a plain AST transformation followed by code generation.
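To make the point about parens concrete, here is a minimal sketch using a toy AST. The node shapes and the bin, sym and print helpers are made up for illustration and are not UglifyJS's real classes. The tree stores only operators and operands, so the printer has to work out from precedence where parens are needed:

```javascript
// Toy AST, not UglifyJS's real node classes: binary nodes carry an operator
// and two operands, and nothing records whether the source had parens.
var PRECEDENCE = { "+": 1, "-": 1, "*": 2, "/": 2 };

function bin(op, left, right) {
    return { type: "Binary", op: op, left: left, right: right };
}

function sym(name) {
    return { type: "Symbol", name: name };
}

// Print an expression, adding parens only where precedence requires them.
function print(node, parentPrec, isRight) {
    if (node.type === "Symbol") return node.name;
    var prec = PRECEDENCE[node.op];
    var out = print(node.left, prec, false) + " " + node.op + " "
            + print(node.right, prec, true);
    // A child needs parens if it binds less tightly than its parent, or
    // equally tightly on the right-hand side (operators are left-associative).
    var needParens = prec < parentPrec || (isRight && prec === parentPrec);
    return needParens ? "(" + out + ")" : out;
}

// Both `(x * y) + z` and `x * y + z` parse to this same tree.
var sum = bin("+", bin("*", sym("x"), sym("y")), sym("z"));
// Here the grouping differs, so the tree differs too.
var product = bin("*", sym("x"), bin("+", sym("y"), sym("z")));

console.log(print(sum, 0, false));     // x * y + z
console.log(print(product, 0, false)); // x * (y + z)
```

Whatever parens the original source had around x * y are gone by the time the tree exists, which is why regenerating code from the AST cannot preserve them.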

The parser, however, will keep in each node some information about its location in the original source. Therefore, with a simple AST walker you can save in an array the starting and ending positions for all throw "string" nodes, and then iterate through that array (backwards is easier!) and do the replacements on the original code.

It's quite simple. Here's a sample:

#! /usr/bin/env node

var U2 = require("uglify-js");

function replace_throw_string(code) {
    var ast = U2.parse(code);
    // accumulate `throw "string"` nodes in this array
    var throw_string_nodes = [];
    ast.walk(new U2.TreeWalker(function(node){
        if (node instanceof U2.AST_Throw
            && node.value instanceof U2.AST_String) {
            throw_string_nodes.push(node);
        }
    }));
    // now go through the nodes backwards and replace code
    for (var i = throw_string_nodes.length; --i >= 0;) {
        var node = throw_string_nodes[i];
        var start_pos = node.start.pos;
        var end_pos = node.end.endpos;
        var replacement = new U2.AST_Throw({
            value: new U2.AST_New({
                expression: new U2.AST_SymbolRef({ name: "Error" }),
                args: [ node.value ]
            })
        }).print_to_string({ beautify: true });
        code = splice_string(code, start_pos, end_pos, replacement);
    }
    return code;
}

function splice_string(str, begin, end, replacement) {
    return str.slice(0, begin) + replacement + str.slice(end);
}

// test it

function test() {
    if (foo) throw bar;
    if (moo /* foo */) {
      throw "foo";
    }
    throw "bar";
}

console.log(replace_throw_string(test.toString()));

If we did the replacements with a forward iteration, then after replacing the first node the positions of all subsequent nodes would no longer be valid, because the replacement text is longer than the text it replaces. The cheapest solution is therefore to do the replacements backwards.
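For comparison, here is a rough sketch of what a forward pass would need: a running shift that corrects every later position. The apply_edits_forward helper and the { start, end, text } edit objects are hypothetical and use no UglifyJS calls; the offsets are hard-coded for the sample string.

```javascript
// Hypothetical helper: apply edits of the form { start, end, text } to `code`
// front to back, where start/end are offsets into the ORIGINAL string.
function apply_edits_forward(code, edits) {
    var shift = 0; // how far later offsets have moved so far
    edits.slice()
        .sort(function(a, b){ return a.start - b.start; })
        .forEach(function(edit){
            var begin = edit.start + shift;
            var end = edit.end + shift;
            code = code.slice(0, begin) + edit.text + code.slice(end);
            // each edit grows or shrinks the string; remember by how much
            shift += edit.text.length - (edit.end - edit.start);
        });
    return code;
}

var source = 'throw "a"; throw "b";';
var edits = [
    { start: 0,  end: 10, text: 'throw new Error("a");' },
    { start: 11, end: 21, text: 'throw new Error("b");' }
];

console.log(apply_edits_forward(source, edits));
// throw new Error("a"); throw new Error("b");
```

It works, but the extra bookkeeping is exactly what going backwards avoids: an edit near the end of the string never moves anything that comes before it.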

One other aspect is that for more complicated expressions, it's pretty annoying to match the AST manually. Maybe someday I'll implement some pattern matching in UglifyJS.
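To give a rough idea of what such pattern matching could look like, here is a sketch over plain-object trees. Everything in it is an assumption made for illustration: the matches and is_string helpers, the type/value fields and the pattern format are hypothetical, and none of it is part of UglifyJS, whose nodes are class instances checked with instanceof.

```javascript
// Hypothetical matcher over plain-object trees. UglifyJS provides nothing
// like this; real nodes are class instances, checked with `instanceof`.
function matches(node, pattern) {
    // a function in the pattern acts as a predicate on the node
    if (typeof pattern === "function") return pattern(node);
    // primitives must be strictly equal
    if (pattern === null || typeof pattern !== "object") return node === pattern;
    if (node === null || typeof node !== "object") return false;
    // objects match if every key named in the pattern matches recursively
    return Object.keys(pattern).every(function(key){
        return matches(node[key], pattern[key]);
    });
}

function is_string(value) { return typeof value === "string"; }

// "a throw node with a string node as a child"
var throw_string = {
    type: "Throw",
    value: { type: "String", value: is_string }
};

var node_a = { type: "Throw", value: { type: "String", value: "foo" } };
var node_b = { type: "Throw", value: { type: "SymbolRef", name: "bar" } };

console.log(matches(node_a, throw_string)); // true
console.log(matches(node_b, throw_string)); // false
```

The pattern reads much like the sentence describing it, which is the appeal over a chain of hand-written instanceof checks.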

Footnotes
1. This applies to all programming languages, not just JavaScript.
3 comments

# Sly1024
2013-04-02 01:53
Though having an AST can be useful, in this case I would just do: code.replace(/throw\s*\"([^\"]*)\"/g, 'throw new Error("$1")'); This doesn't work if the 'throw "string"' is inside a string, of course.
# hohohoman
2013-04-16 10:51
A programmer had a problem. "I'll use a regular expression," he thought...
# saravana
2015-01-11 08:02
it helped me understand ast generation logic better. i am gonna revisit uglify source code again