These pages are old. They apply to UglifyJS v2. Version 3 has evolved a lot in the meantime to support most of ES6. Please check the documentation in the official repository for up-to-date information. Big thanks to all contributors, especially to Alex Lam S.L., who has maintained this project for years!

UglifyJS — the code generator

The code generator is a recursive process of getting back source code from an AST returned by the parser. Every AST node has a “print” method that takes an OutputStream and dumps the code from that node into it. The stream object supports a lot of options that control the output. You can specify whether you'd like human-readable (indented) output, the indentation level, whether to quote all properties in object literals, etc.

SYNOPSIS

var stream = UglifyJS.OutputStream({ ...options... });
ast.print(stream);
var code = stream.toString();
alert(code);

The options argument is a JS object. Here are the default options:

indent_start  : 0,     // start indentation on every line (only when `beautify`)
indent_level  : 4,     // indentation level (only when `beautify`)
quote_keys    : false, // quote all keys in object literals?
space_colon   : true,  // add a space after colon signs?
ascii_only    : false, // output ASCII-safe? (encodes Unicode characters as ASCII)
inline_script : false, // escape "</script"?
width         : 80,    // informative maximum line width (for beautified output)
max_line_len  : 32000, // maximum line length (for non-beautified output)
ie_proof      : true,  // output IE-safe code?
beautify      : false, // beautify output?
source_map    : null,  // output a source map
bracketize    : false, // use brackets every time?
comments      : false, // output comments?
semicolons    : true,  // use semicolons to separate statements? (otherwise, newlines)

Most of these should be obvious. The ones worth additional discussion are source_map and comments.
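Since these are defaults, you'd normally pass only the options you want to change. The snippet below is a minimal sketch of that idea, not UglifyJS's own code: DEFAULTS is an abbreviated copy of the list above, and withDefaults is a hypothetical helper invented for the illustration.

```javascript
// A few of the defaults listed above (abbreviated for the example).
var DEFAULTS = {
    indent_start: 0,
    indent_level: 4,
    quote_keys: false,
    beautify: false,
    comments: false
};

// Hypothetical helper (not part of UglifyJS): copy the defaults,
// then apply whatever the caller chose to override.
function withDefaults(overrides) {
    return Object.assign({}, DEFAULTS, overrides);
}

var opts = withDefaults({ beautify: true, indent_level: 2 });
console.log(opts.beautify);     // true  (overridden)
console.log(opts.indent_level); // 2     (overridden)
console.log(opts.quote_keys);   // false (default kept)
```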

Source map

The output stream keeps track of the current line/column in the output and can trivially generate a source mapping to the original code via Mozilla's source-map library. To use this functionality, you must load this library (it's automatically require-d by UglifyJS in the NodeJS version, but in a browser you must load it yourself) and make it available via the global MOZ_SourceMap variable.

Next, in the code generator options you'd pass a UglifyJS.SourceMap object (that's a thin wrapper around the source-map library), like this:

var source_map = UglifyJS.SourceMap({ ...source_map_options... });
var stream = UglifyJS.OutputStream({
    ...
    source_map: source_map
});
ast.print(stream);

var code = stream.toString();
var map = source_map.toString(); // json output for your source map

The source_map_options argument is an optional JS object that you may pass to specify additional properties for your source map:

file : null, // the compressed file name
root : null, // the root URL to the original sources
orig : null, // the input source map

The orig option is useful when you compress code that was generated from some other source (possibly another programming language). If you have an input source map, pass it in this argument (either as a JS object, or as a JSON string) and UglifyJS will generate a mapping that maps back to the original source (as opposed to the compiled code that you are compressing).

Comments

The code generator can keep certain comments in the output. If you pass comments: true it'll keep all comments. You can pass a RegExp to retain only comments whose body matches that regexp. You can pass a function for custom filtering. For example, when --comments is passed with no argument the command-line tool will keep all comments containing "@license", "@preserve" or "@cc_on". Here is the function that it uses to filter them:

function(node, comment) {
    var text = comment.value;
    var type = comment.type;
    if (type == "comment2") {
        // multiline comment
        return /@preserve|@license|@cc_on/i.test(text);
    }
}

The code generator will pass two arguments: the node that the current comment is attached to, and the comment token.
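To get a feel for how such a filter behaves, you can call it by hand. The sketch below gives the filter a name (keepImportantComments, made up for this example) and feeds it hand-made stand-ins for comment tokens. The { type, value } shape follows the function above; the "comment1" type for single-line comments is an assumption.

```javascript
// The filter described above, named so it can be called directly.
function keepImportantComments(node, comment) {
    var text = comment.value;
    var type = comment.type;
    if (type == "comment2") {
        // multiline comment
        return /@preserve|@license|@cc_on/i.test(text);
    }
    // any other comment type falls through and returns undefined (dropped)
}

// Hand-made stand-ins for comment tokens; real tokens come from the parser.
var samples = [
    { type: "comment2", value: "! @license MIT " },
    { type: "comment2", value: " just a note " },
    { type: "comment1", value: " @preserve (single-line, so not kept)" }
];

// The node argument is unused by this filter, so null is enough here.
var kept = samples.filter(function(comment) {
    return keepImportantComments(null, comment);
});

console.log(kept.length);   // 1
console.log(kept[0].value); // "! @license MIT "
```

Note that this particular filter only ever keeps multiline comments, whatever their text says.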

Note that some comments might still be lost, due to compressor optimizations that cut whole nodes from the tree (for example unused function declarations). The safest place to put comments that you want to keep is at toplevel (not nested in brackets or functions).

The `OutputStream` object

You don't need to know this.
I'm just dropping it here for whoever might find it useful.

I'm including a few notes about the OutputStream object, since it seems to be useful for all sorts of code generation, not only JavaScript. It implements a simple stream with functions to print text, to print a string (quotes it and escapes characters that need to be escaped in a string), to output indented blocks etc.

To start with, here's an example. The --ast-help argument to uglifyjs will output a description of the AST hierarchy. The output looks like this:

AST_Node (start end) "Base class of all AST nodes" {
    AST_Statement "Base class of all statements" {
        AST_Debugger "Represents a debugger statement"
        AST_Directive (value scope) 'Represents a directive, like "use strict";'
        AST_SimpleStatement (body) "A statement consisting of an expression, i.e. a = 1 + 2"
        AST_Block (body) "A body of statements (usually bracketed)" {
            AST_BlockStatement "A block statement"
            AST_Scope (directives variables functions uses_with uses_eval parent_scope enclosed cname) "Base class for all statements introducing a lexical scope" {
                AST_Toplevel (globals) "The toplevel scope"
                AST_Lambda (name argnames uses_arguments) "Base class for functions" {
                    AST_Function "A function expression"
                    AST_Defun "A function definition"
                }
            }
            AST_Switch (expression) "A `switch` statement"
            AST_SwitchBranch "Base class for `switch` branches" {
                AST_Default "A `default` switch branch"
                AST_Case (expression) "A `case` switch branch"
            }
            AST_Try (bcatch bfinally) "A `try` statement"
            AST_Catch (argname) "A `catch` node; only makes sense as part of a `try` statement"
            AST_Finally "A `finally` node; only makes sense as part of a `try` statement"
        }
        AST_EmptyStatement "The empty statement (empty block or simply a semicolon)"
        AST_StatementWithBody (body) "Base class for all statements that contain one nested body: `For`, `ForIn`, `Do`, `While`, `With`" {
            AST_LabeledStatement (label) "Statement with a label"
            AST_DWLoop (condition) "Base class for do/while statements" {
                AST_Do "A `do` statement"
                AST_While "A `while` statement"
            }
            AST_For (init condition step) "A `for` statement"
            AST_ForIn (init name object) "A `for ... in` statement"
            AST_With (expression) "A `with` statement"
            AST_If (condition alternative) "A `if` statement"
        }
        AST_Jump "Base class for “jumps” (for now that's `return`, `throw`, `break` and `continue`)" {
            AST_Exit (value) "Base class for “exits” (`return` and `throw`)" {
                AST_Return "A `return` statement"
                AST_Throw "A `throw` statement"
            }
            AST_LoopControl (label) "Base class for loop control statements (`break` and `continue`)" {
                AST_Break "A `break` statement"
                AST_Continue "A `continue` statement"
            }
        }
        AST_Definitions (definitions) "Base class for `var` or `const` nodes (variable declarations/initializations)" {
            AST_Var "A `var` statement"
            AST_Const "A `const` statement"
        }
    }
    AST_VarDef (name value) "A variable declaration; only appears in a AST_Definitions node"
    AST_Call (expression args) "A function call expression" {
        AST_New "An object instantiation.  Derives from a function call since it has exactly the same properties"
    }
    AST_Seq (car cdr) "A sequence expression (two comma-separated expressions)"
    AST_PropAccess (expression property) 'Base class for property access expressions, i.e. `a.foo` or `a["foo"]`' {
        AST_Dot "A dotted property access expression"
        AST_Sub 'Index-style property access, i.e. `a["foo"]`'
    }
    AST_Unary (operator expression) "Base class for unary expressions" {
        AST_UnaryPrefix "Unary prefix expression, i.e. `typeof i` or `++i`"
        AST_UnaryPostfix "Unary postfix expression, i.e. `i++`"
    }
    AST_Binary (left operator right) "Binary expression, i.e. `a + b`" {
        AST_Assign "An assignment expression — `a = b + 5`"
    }
    AST_Conditional (condition consequent alternative) "Conditional expression using the ternary operator, i.e. `a ? b : c`"
    AST_Array (elements) "An array literal"
    AST_Object (properties) "An object literal"
    AST_ObjectProperty (key value) "Base class for literal object properties" {
        AST_ObjectKeyVal "A key: value object property"
        AST_ObjectSetter "An object setter property"
        AST_ObjectGetter "An object getter property"
    }
    AST_Symbol (scope name thedef) "Base class for all symbols" {
        AST_SymbolDeclaration (init) "A declaration symbol (symbol in var/const, function name or argument, symbol in catch)" {
            AST_SymbolVar "Symbol defining a variable" {
                AST_SymbolFunarg "Symbol naming a function argument"
            }
            AST_SymbolConst "A constant declaration"
            AST_SymbolDefun "Symbol defining a function"
            AST_SymbolLambda "Symbol naming a function expression"
            AST_SymbolCatch "Symbol naming the exception in catch"
        }
        AST_Label (references) "Symbol naming a label (declaration)"
        AST_SymbolRef "Reference to some symbol (not definition/declaration)" {
            AST_LabelRef "Reference to a label symbol"
        }
        AST_This "The `this` symbol"
    }
    AST_Constant "Base class for all constants" {
        AST_String (value) "A string literal"
        AST_Number (value) "A number literal"
        AST_RegExp (value) "A regexp literal"
        AST_Atom "Base class for atoms" {
            AST_Null "The `null` atom"
            AST_NaN "The impossible value"
            AST_Undefined "The `undefined` value"
            AST_Infinity "The `Infinity` value"
            AST_Boolean "Base class for booleans" {
                AST_False "The `false` atom"
                AST_True "The `true` atom"
            }
        }
    }
}

As you can see, it resembles JavaScript, but it's not really JavaScript. The OutputStream object made it pretty simple to get the output. Here's the function that generates it:

exports.describe_ast = function() {
    var out = UglifyJS.OutputStream({ beautify: true });
    function doitem(ctor) {
        out.print("AST_" + ctor.TYPE);
        var props = ctor.SELF_PROPS.filter(function(prop){
            return !/^\$/.test(prop);
        });
        if (props.length > 0) {
            out.space();
            out.with_parens(function(){
                props.forEach(function(prop, i){
                    if (i) out.space();
                    out.print(prop);
                });
            });
        }
        if (ctor.documentation) {
            out.space();
            out.print_string(ctor.documentation);
        }
        if (ctor.SUBCLASSES.length > 0) {
            out.space();
            out.with_block(function(){
                ctor.SUBCLASSES.forEach(function(ctor, i){
                    out.indent();
                    doitem(ctor);
                    out.newline();
                });
            });
        }
    };
    doitem(UglifyJS.AST_Node);
    return out + "";
};
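If you'd like to see the mechanics without UglifyJS loaded, here is a toy stand-in. It is a simplified sketch, not the real OutputStream: ToyStream always beautifies, implements only the handful of methods used above, and the tree it prints is a made-up two-level hierarchy.

```javascript
// Toy stand-in for a beautifying OutputStream; NOT UglifyJS's implementation.
function ToyStream(indentLevel) {
    var out = "";        // text produced so far
    var indentation = 0; // current indentation width, in columns
    var self = {
        print: function(str) { out += str; },
        space: function() { out += " "; },
        newline: function() { out += "\n"; },
        indent: function() { out += " ".repeat(indentation); },
        with_block: function(func) {
            self.print("{");
            self.newline();
            indentation += indentLevel; // enter the next indentation level
            func();
            indentation -= indentLevel; // restore the previous level
            self.indent();
            self.print("}");
        },
        toString: function() { return out; }
    };
    return self;
}

// Hypothetical hierarchy, shaped loosely like the AST description.
var tree = { name: "Node", children: [
    { name: "Statement", children: [ { name: "Debugger", children: [] } ] },
    { name: "Call", children: [] }
] };

// Same recursion as doitem: name first, then a block with one child per line.
function describe(out, item) {
    out.print(item.name);
    if (item.children.length > 0) {
        out.space();
        out.with_block(function() {
            item.children.forEach(function(child) {
                out.indent();
                describe(out, child);
                out.newline();
            });
        });
    }
}

var toy = ToyStream(4);
describe(toy, tree);
console.log(toy.toString());
// Node {
//     Statement {
//         Debugger
//     }
//     Call
// }
```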

Construct an OutputStream as described in the synopsis above. The object exports the following methods:

get(), toString() — return the output so far as a string
indent(half) — insert one indentation string (usually 4 characters). Optionally pass true to indent half the width (I'm using that for case and default lines in switch blocks). If beautify is off, this function does nothing.
indentation() — return the current indentation width (not level; for example if we're in level 2 and indent_level is 4, this method would return 8).
current_width() — return the width of the current line text minus indentation.
should_break() — return true if current_width() is bigger than options.width (assuming options.width is non-null, non-zero).
newline() — if beautification is on, this inserts a newline. Otherwise it does nothing.
print(str) — include the given string into the output, adjusting current_line, current_col and current_pos accordingly.
space() — if beautification is on this always includes a space character. Otherwise it saves a hint somewhere that a space might be needed at current point. The space will go in at the next output but only when absolutely required, for example it will insert the space in return 10 but not in return"stuff".
comma() — inserts a comma, and calls space() — that is, if beautification is on you'll get a space after the comma.
colon() — inserts a colon, and calls space() if options.space_colon is set.
last() — returns the last printed chunk.
semicolon() — if beautification is on it always inserts a semicolon. Otherwise it saves a hint that a semicolon might be needed at current point. The semicolon is inserted when the next output comes in, only if required to not break the JS syntax.
force_semicolon() — always inserts a semicolon and clears the hint that a semicolon might be needed.
to_ascii(str) — encodes any non-ASCII characters in string with JavaScript's conventions (using \uCODE).
print_name(name) — prints an identifier. If options.ascii_only is set, non-ASCII chars will be encoded with JavaScript conventions.
print_string(str) — prints a string. It adds quotes automatically. It prefers double-quotes, but will actually count any quotes in the string and will use single-quotes if the output proves to be shorter (depending on how many backslashes it has to insert). It encodes to ASCII if options.ascii_only is set.
next_indent() — returns the width of the next indentation level. For example if current level is 2 and options.indent_level is 4, it'll return 12.
with_indent(col, func) — sets the current indentation to col (column), calls the function and thereafter restores the previous indentation level. If beautification is off it simply calls func.
with_block(func) — this is used to output blocks in curly brackets. It'll print an open bracket at current point, then call newline() and with the next indentation level it calls your func. Lastly, it'll print an indented closing bracket. As usual, if beautification is off you'll just get {x} where x is whatever func outputs.
with_parens(func) — adds parens around the output that your function prints.
with_square(func) — adds square brackets around the output that your function prints.
add_mapping(token, name) — if options.source_map is set, this will generate a source mapping between the given token (which should be an AST_Token-like object) and the current line/col. The name is optional; in most cases it will be inferred from the token.
option(name) — returns the option with the given name.
line() — returns the current line in the output (1-based).
col() — returns the current column in the output (zero-based).
push_node(node) — push the given node into an internal stack. This is used to keep track of current node's parent(s).
pop_node() — pops the top of the stack and returns it.
stack() — returns that internal stack.
parent(n) — returns the n-th parent node (where zero means the direct parent).
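A few of these descriptions are easier to follow with code in hand. The functions below are simplified sketches of the behaviour described for to_ascii, print_string, indentation() and next_indent(); they are not the library's implementation. The names toAscii, quoteString, indentationWidth and nextIndent are invented for the example, only backslash and newline escaping is handled, and indent_start is assumed to be 0.

```javascript
// Sketch of to_ascii: escape every non-ASCII UTF-16 code unit as \uXXXX.
function toAscii(str) {
    return str.replace(/[\u0080-\uffff]/g, function(ch) {
        var code = ch.charCodeAt(0).toString(16);
        return "\\u" + "0000".slice(code.length) + code; // pad to 4 hex digits
    });
}

// Sketch of print_string's quote choice: count both kinds of quotes and
// pick the delimiter that needs fewer backslashes (double quotes on a tie).
function quoteString(str) {
    var dq = 0, sq = 0;
    var escaped = str.replace(/[\\\n"']/g, function(ch) {
        if (ch === "\\") return "\\\\";
        if (ch === "\n") return "\\n";
        if (ch === '"') dq++;
        if (ch === "'") sq++;
        return ch; // quotes are escaped later, once the delimiter is known
    });
    if (dq > sq) return "'" + escaped.replace(/'/g, "\\'") + "'";
    return '"' + escaped.replace(/"/g, '\\"') + '"';
}

// Sketch of indentation() and next_indent(), assuming indent_start is 0.
function indentationWidth(level, indentLevel) {
    return level * indentLevel;
}
function nextIndent(level, indentLevel) {
    return indentationWidth(level, indentLevel) + indentLevel;
}

console.log(toAscii("caf\u00e9"));     // caf\u00e9
console.log(quoteString('say "hi"'));  // 'say "hi"'
console.log(quoteString("it's"));      // "it's"
console.log(indentationWidth(2, 4));   // 8
console.log(nextIndent(2, 4));         // 12
```

The last two calls reproduce the numbers quoted in the list above: level 2 with indent_level 4 gives a width of 8, and the next level starts at 12.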
