Implementing Operator Overloading in JavaScript

Paper.js comes with its own flavor of JavaScript called PaperScript. PaperScript does two things: It executes your code in its own scope, where all Paper.js classes appear to be global (without polluting the actual global scope), but more importantly it adds operator overloading to the JavaScript language.

Operator overloading in PaperScript works by amending the code that you have written, and compiling it using new Function(). This approach is preferred over the somewhat evil eval(), since code dynamically compiled as its own function with no connection to the local scope can still be fully optimized by the JavaScript engine. Before looking at how this translation is accomplished, here an example of what happens behind the scenes:

Whenever you have a piece of code that contains a mathematical operator, this operation is replaced with a special function call. You can try it out on sketch.paperjs.org:

var point1 = new Point(100, 100);
var vector = new Point({ angle: 45, length: 100 });
var point2 = point1 + vector * 2;
console.log(point2);
debugger;

When you execute this PaperScript code with the developer console open, the debugger; statement should trigger a debugging session, where you can see how the executed code actually looks:

(function(_$_, $_, …) {
    var point1 = new Point(100, 100);
    var vector = new Point({ angle: 45, length: 100 });
    var point2 = _$_(point1, "+", _$_(vector, "*", 2));
    console.log(point2);
    debugger;
});

As you can see, all mathematical expressions where replaced by calls to _$_(left, operator, right). _$_ and $_ are the functions that handle the overloading of all binary and unary expressions, and they are passed as arguments to the compiled function when it is called to execute the translated code.

So let’s look at the code responsible for the definition of _$_ inside PaperScript.js:

var binaryOperators = {
    '+': '__add',
    '-': '__subtract',
    '*': '__multiply',
    '/': '__divide',
    '%': '__modulo',
    '==': 'equals',
    '!=': 'equals'
};
 
function _$_(left, operator, right) {
    var handler = binaryOperators[operator];
    if (left && left[handler]) {
        var res = left[handler](right);
        return operator === '!=' ? !res : res;
    }
    switch (operator) {
    case '+': return left + right;
    case '-': return left - right;
    case '*': return left * right;
    case '/': return left / right;
    case '%': return left % right;
    case '==': return left == right;
    case '!=': return left != right;
    }
}

The binaryOperators lookup table maps the operator string to a method name. You can see that '+' maps to '__add', '*' maps to '__multiply', etc. _$_ then uses this information to check if the left value is an object that defines such a method, and if so, it performs the operation through it.

But you won’t find the definitions of these methods directly in the Paper.js Point class. There the mathematical methods are called add() and multiply(), without the double underscores, for use in JavaScript directly.

The adding of the '__' versions of these methods happens in this piece of code, through some Straps.js magic that is further explained in the post The .each() Side-car:

// Inject underscored math methods as aliases to Point, Size and Color.
var fields = Base.each(
    ['add', 'subtract', 'multiply', 'divide', 'modulo', 'negate'],
    function(name) {
        // Create an alias for each math method to be injected into the
        // classes using Straps.js' #inject() 
        this['__' + name] = '#' + name;
    }, 
    {}
);
Point.inject(fields);
Size.inject(fields);
Color.inject(fields);

You may think that replacing all mathematical operators with function calls will result in a huge decrease in performance. I was surprised to find out that this was rarely the case. For example, when profiling the Tadpoles example on Safari which uses a lot of vector mathematics calculations, I could measure 5% of the execution time spent in _$_. Most examples see less than that.

I decided that a 5% performance loss was an acceptable price to be paid for a huge simplification in the language when dealing with vector mathematics and geometry, especially when teaching computer graphics and vector geometry to design students.

That’s Great, But How Does It Actually Work?

Now let's look at how the code is modified to replace these mathematical operators with calls to _$_:

At first I was using the internal JS parser that came with UglifyJS by Marijn Haverbeke . This wasn’t designed to be used separately, so I manually extracted the useful parts from the project. Using this parser, I was able to translate JavaScript into an Abstract Syntax Tree (AST), a tree structure that represents the JavaScript code. I then wrote code that walked this tree and replaced the branches where such operations would happen with function calls to _$_ and $_. In the end, the tree was translated back to JavaScript code again, using another component isolated form UglifyJS. This worked well for quite a while, but it came with a few downsides:

Due to the translation to AST and back, the line numbers of the code would change, so error messages would happen in different places than you would expect in the code you had written that caused the issue. This obviously wasn’t very helpful.
The code would be stripped of all comments, resulting in even bigger shifts in line numbers. Also, if you looked at the result in the debugger, you wouldn’t see any comments that you had written, and perhaps not recognize your coding style.

So about a year ago, I finally addressed these issues by switching to a different approach:

In the meantime, different JavaScript parsers have emerged that would translate the JavaScript code into a standardized format, called the Mozilla Parser AST.

Esprima by Ariya Hidayat is one of them, a more recent one is Acorn, again by Marijn Haverbeke. Acorn is about half the size of Esprima, and Marijn claims that it runs faster than Esprima, especially when preserving source location data in the AST.

And preserving source location turned out to be they key ingredient for the new approach: Instead of mingling with the AST, I planned to modify the source code directly. But for this a parser was needed that provided very precise information about the exact location of each node of the AST inside the original source code, so that the actual code could be modified and replaced based on the values found in the AST, rather than directly changing the AST and translating it back to new code. The original UglifyJS parser wasn’t capable of providing that kind of fine-grained information.

You can see it all in action the compile() function, which I have linked to directly since it is too large for reproduction here.

The walkAST() function recursively walks the AST and finds nodes that need replacing. It does so in a children-first way, so by the time it processes a node, all its child nodes will already have been processed.

If an expression is found that needs processing (e.g. a 'BinaryExpression'), getCode() is called for both sides of the expression (both child nodes of the node we’re processing). getCode() returns the already processed and potentially modified code that the given AST node represents. replaceCode() is then used to replace the expression with new code that calls _$_(left, operator, right) instead of using the mathematical operator directly. replaceCode() keeps track of such code modifications / insertions in an array, and getCode() takes these insertions into account when determining the offsets inside the modified code.

This results in a translation of the source code that preserves coding style, line numbers, comments and almost all readability, while adding operator overloading to JavaScript in probably the least invasive way possible.

And thanks to Acorn, all that happens at great speed, and with acceptable memory footprint (when minified, Acorn.js consumes about 25kb).