Shameless plug!

Motorola Mobility just released the Montage Javascript Framework, which I've been working on for the past year.

Montage

Please check it out at Tetsubo.org! We love feedback!

I love rich HTML5 graphics

Especially WebGL!

But today's graphics APIs demand high-performance mathematics, something Javascript hasn't provided natively.

To try and fill that gap, I created the glMatrix library in 2010.

glMatrix

Stupidly fast javascript vector and matrix library

What makes it fast?

  • Functional API, not OO
  • Cache anything we look up more than once
  • Inline code rather than reuse functions
  • Unroll all the loops!
  • My CS professor would have hated this library

glMatrix

Functional API

var viewMat = new Mat4();
viewMat.identity();
viewMat.rotateX(Math.PI * 0.5);
viewMat.translate(10, 0, 0);

VS.

var viewMat = mat4.create();
mat4.identity(viewMat);
mat4.rotateX(viewMat, Math.PI * 0.5);
mat4.translate(viewMat, [10, 0, 0]);

This allows glMatrix to work on ANY array-like object with the right number of elements
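
To make that concrete, here is a minimal sketch of a function-style helper. `scaleVec3` is a hypothetical name invented for this example and is not part of glMatrix. Because it only indexes into its arguments, a plain Array and a Float32Array are interchangeable:

```javascript
// Hypothetical helper for illustration only (not a glMatrix function).
// It touches its arguments only through numeric indices.
function scaleVec3(vec, s, dest) {
    dest[0] = vec[0] * s;
    dest[1] = vec[1] * s;
    dest[2] = vec[2] * s;
    return dest;
}

// Same function, two different storage types.
var plain = scaleVec3([1, 2, 3], 2, [0, 0, 0]);
var typed = scaleVec3(new Float32Array([1, 2, 3]), 2, new Float32Array(3));
```

An OO design would have to pick one storage type inside its class; the functional form leaves that choice to the caller.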

glMatrix

Caching values

mat4.multiplyVec3 = function (mat, vec, dest) {
    if (!dest) { dest = vec; }

    var x = vec[0], y = vec[1], z = vec[2];

    dest[0] = mat[0] * x + mat[4] * y + mat[8] * z + mat[12];
    dest[1] = mat[1] * x + mat[5] * y + mat[9] * z + mat[13];
    dest[2] = mat[2] * x + mat[6] * y + mat[10] * z + mat[14];

    return dest;
};
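
A side effect worth spelling out: reading x, y and z into locals before writing is also what makes the in-place case (`dest = vec`) safe. The sketch below compares the cached form with `multiplyVec3Naive`, a deliberately broken variant written only for this comparison:

```javascript
// Cached version, same shape as the slide: read everything first, then write.
function multiplyVec3Cached(mat, vec, dest) {
    if (!dest) { dest = vec; }
    var x = vec[0], y = vec[1], z = vec[2];
    dest[0] = mat[0] * x + mat[4] * y + mat[8] * z + mat[12];
    dest[1] = mat[1] * x + mat[5] * y + mat[9] * z + mat[13];
    dest[2] = mat[2] * x + mat[6] * y + mat[10] * z + mat[14];
    return dest;
}

// Hypothetical naive version: re-reads vec after it has been partly overwritten.
function multiplyVec3Naive(mat, vec, dest) {
    if (!dest) { dest = vec; }
    dest[0] = mat[0] * vec[0] + mat[4] * vec[1] + mat[8] * vec[2] + mat[12];
    dest[1] = mat[1] * vec[0] + mat[5] * vec[1] + mat[9] * vec[2] + mat[13];
    dest[2] = mat[2] * vec[0] + mat[6] * vec[1] + mat[10] * vec[2] + mat[14];
    return dest;
}

// Column-major matrix that swaps x and y.
var swapXY = [0, 1, 0, 0,
              1, 0, 0, 0,
              0, 0, 1, 0,
              0, 0, 0, 1];

var cached = multiplyVec3Cached(swapXY, [1, 2, 3]); // [2, 1, 3]
var naive = multiplyVec3Naive(swapXY, [1, 2, 3]);   // [2, 2, 3]: y was computed from the overwritten x
```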

glMatrix

Inline Code

vec3.length = function (vec) {
    var x = vec[0], y = vec[1], z = vec[2];
    return Math.sqrt(x * x + y * y + z * z);
};

vec3.normalize = function (vec, dest) {
    var x = vec[0], y = vec[1], z = vec[2],
        len = Math.sqrt(x * x + y * y + z * z);

    len = 1 / len; // Edge cases trimmed for space...
    dest[0] = x * len;
    dest[1] = y * len;
    dest[2] = z * len;
    return dest;
};

glMatrix

Loop unrolling

mat3.multiply = function (mat, mat2, dest) {
    var a00 = mat[0], a01 = mat[1], a02 = mat[2],
        a10 = mat[3], a11 = mat[4], a12 = mat[5],
        a20 = mat[6], a21 = mat[7], a22 = mat[8],

        b00 = mat2[0], b01 = mat2[1], b02 = mat2[2],
        b10 = mat2[3], b11 = mat2[4], b12 = mat2[5],
        b20 = mat2[6], b21 = mat2[7], b22 = mat2[8];

    dest[0] = b00 * a00 + b01 * a10 + b02 * a20;
    dest[1] = b00 * a01 + b01 * a11 + b02 * a21;
    dest[2] = b00 * a02 + b01 * a12 + b02 * a22;

    dest[3] = b10 * a00 + b11 * a10 + b12 * a20;
    dest[4] = b10 * a01 + b11 * a11 + b12 * a21;
    dest[5] = b10 * a02 + b11 * a12 + b12 * a22;

    dest[6] = b20 * a00 + b21 * a10 + b22 * a20;
    dest[7] = b20 * a01 + b21 * a11 + b22 * a21;
    dest[8] = b20 * a02 + b21 * a12 + b22 * a22;

    return dest;
};
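
For contrast, this is roughly what is being unrolled. It is a sketch of a looped equivalent, under the assumption that `dest` does not alias either input; `mat3MultiplyLooped` is a name made up for this example:

```javascript
// Looped equivalent of the unrolled multiply above.
// Assumes dest is a different array from mat and mat2.
function mat3MultiplyLooped(mat, mat2, dest) {
    for (var i = 0; i < 3; ++i) {
        for (var j = 0; j < 3; ++j) {
            var sum = 0;
            for (var k = 0; k < 3; ++k) {
                sum += mat2[i * 3 + k] * mat[k * 3 + j];
            }
            dest[i * 3 + j] = sum;
        }
    }
    return dest;
}

var a = [1, 2, 3, 4, 5, 6, 7, 8, 9];
var b = [0, 1, 0, 1, 0, 0, 0, 0, 1];
var product = mat3MultiplyLooped(a, b, new Array(9));
// product is [4, 5, 6, 1, 2, 3, 7, 8, 9]
```

Both forms perform the same 27 multiplies. The unrolled version simply drops the loop counters and index arithmetic, and reads each element into a local exactly once.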

glMatrix

There were two additional design decisions that make glMatrix what it is

  • Encourage object reuse
  • Encourage use of Typed Arrays (Float32Array)

And that last one has caused some confusion...

"If I wrote a physics engine..."

April 2012

My little rant about frustrating trends in JS Physics libraries

Key points:

  • Every library having their own vector classes is harmful
  • var myVec = new PhysicsLibSpecificVec3(); // Bad!
  • Vectors should be "dumb objects"
  • var myVec = {x: 0, y: 0, z: 0}; // Better!
  • Preferably use Typed Arrays instead of JS objects
  • var myVec = new Float32Array(3); // Best?

Because Typed Arrays are faster... right?

"Surprisingly typed arrays perform really really bad for me." - AlteredQualia, Three.js

"Tried replacing the CANNON.Vec3 with [Typed Arrays]. Sadly that time was wasted... The code ran significantly slower!" - Stefan Hedman, Cannon.js

Ouch.

This goes completely against my recommendations, but they had benchmarks to prove it.

Benchmark

What's going on?

  • Are they terrible developers?
  • Is there some massive discrepancy between our systems?
  • Maybe I'm a liar that wants everyone's code to suck?

No sense in speculating. Let's find out!

Simple 2D Vector Test

25K particles rendered with WebGL

  • Simple motion logic
  • gl.POINTS primitives
  • Single draw call

Gives us a basic but fun way to test "real world" performance
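
A sketch of how the "single draw call" part might look, assuming each particle exposes an array-like, two-element `pos` and that a vertex buffer has already been created, sized, and wired to a shader attribute. `packPositions` and `drawParticles` are hypothetical names; the `gl` calls themselves are standard WebGL:

```javascript
// Copy every particle position into one flat Float32Array that was allocated once.
function packPositions(particles, out) {
    for (var i = 0; i < particles.length; ++i) {
        out[i * 2] = particles[i].pos[0];
        out[i * 2 + 1] = particles[i].pos[1];
    }
    return out;
}

// Upload the packed positions and draw all points with one call.
// Assumes `buffer` was given enough storage with gl.bufferData beforehand.
function drawParticles(gl, buffer, packed, count) {
    gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
    gl.bufferSubData(gl.ARRAY_BUFFER, 0, packed);
    gl.drawArrays(gl.POINTS, 0, count);
}
```

With rendering reduced to one upload and one draw, nearly all of the per-frame cost left to measure is the vector math in the update loop.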

Version 1: Vector Class

// Simple Vector Class
var Vec2 = function(x, y) {
    this.x = x;
    this.y = y;
}

Vec2.prototype.add = function(v) {
    return new Vec2(this.x + v.x, this.y + v.y);
}

Vec2.prototype.subtract = function(v) {
    return new Vec2(this.x - v.x, this.y - v.y);
}

Vec2.prototype.scale = function(v) {
    return new Vec2(this.x * v, this.y * v);
}

Vec2.prototype.length = function() {
    return Math.sqrt((this.x * this.x) + (this.y * this.y));
}

Vec2.prototype.normalize = function() {
    var iLen = 1 / this.length();
    return new Vec2(this.x * iLen, this.y * iLen);
}

Version 1: Vector Class (cont.)

ParticleSystem.prototype.update = function() {
    for(var i = 0; i < this.particles.length; ++i) {
        var p = this.particles[i];

        var dir = this.attractor.subtract(p.pos);
        var dist = Math.max(1, dir.length());

        dir = dir.normalize().scale(dist/512);
        p.vel = p.vel.add(dir);

        if(p.vel.length() > maxVel) {
            p.vel = p.vel.normalize().scale(maxVel);
        }

        p.pos = p.pos.add(p.vel);

        if(p.pos.x < -this.extX) { p.pos.x = -this.extX; p.vel.x *= -1; }
        if(p.pos.x >  this.extX) { p.pos.x =  this.extX; p.vel.x *= -1; }
        if(p.pos.y < -this.extY) { p.pos.y = -this.extY; p.vel.y *= -1; }
        if(p.pos.y >  this.extY) { p.pos.y =  this.extY; p.vel.y *= -1; }
    }
}

Vector Class: Demo 1

Version 2: Typed Arrays

var Vec2 = {}

Vec2.create = function(a, b) {
    return new Float32Array([a, b]);
} 

Vec2.add = function(a, b) {
    return new Float32Array([a[0] + b[0], a[1] + b[1]]);
}

Vec2.subtract = function(a, b) {
    return new Float32Array([a[0] - b[0], a[1] - b[1]]);
}

Vec2.scale = function(a, v) {
    return new Float32Array([a[0] * v, a[1] * v]);
}

Vec2.length = function(a) {
    return Math.sqrt((a[0] * a[0]) + (a[1] * a[1]));
}

Vec2.normalize = function(a) {
    var iLen = 1 / Vec2.length(a);
    return new Float32Array([a[0] * iLen, a[1] * iLen]);
}

Version 2: Typed Arrays (cont.)

ParticleSystem.prototype.update = function() {
    for(var i = 0; i < this.particles.length; ++i) {
        var p = this.particles[i];

        var dir = Vec2.subtract(this.attractor, p.pos);
        var dist = Math.max(1, Vec2.length(dir));

        dir = Vec2.scale(Vec2.normalize(dir), dist/512);
        p.vel = Vec2.add(p.vel, dir);

        if(Vec2.length(p.vel) > maxVel) {
            p.vel = Vec2.scale(Vec2.normalize(p.vel), maxVel);
        }

        p.pos = Vec2.add(p.pos, p.vel);

        if(p.pos[0] < -this.extX) { p.pos[0] = -this.extX; p.vel[0] *= -1; }
        if(p.pos[0] >  this.extX) { p.pos[0] =  this.extX; p.vel[0] *= -1; }
        if(p.pos[1] < -this.extY) { p.pos[1] = -this.extY; p.vel[1] *= -1; }
        if(p.pos[1] >  this.extY) { p.pos[1] =  this.extY; p.vel[1] *= -1; }
    }
}

Typed Array Vectors: Demo 1

WTF?!?

Vector Classes Win?

Obviously using Vector Classes is a big performance improvement over Typed Arrays.

So that's it, end of story, goodnight!

...But can we do better?

That demo performs smoothly but it's also really simple.

We're eating up a large chunk of our 16ms per-frame budget on just moving points around in a simple pattern

Where are the potential improvement points?

Garbage creation

"Avoid vector objects if at all possible. ...[Y]ou can easily end up with hundreds of these created every frame." - Ashley Gullen, Scirra


Spot the "leaks"

ParticleSystem.prototype.update = function() {
    for(var i = 0; i < this.particles.length; ++i) {
        var p = this.particles[i];

        var dir = this.attractor.subtract(p.pos);
        var dist = Math.max(1, dir.length());

        dir = dir.normalize().scale(dist/512);
        p.vel = p.vel.add(dir);

        if(p.vel.length() > maxVel) {
            p.vel = p.vel.normalize().scale(maxVel);
        }

        p.pos = p.pos.add(p.vel);

        if(p.pos.x < -this.extX) { p.pos.x = -this.extX; p.vel.x *= -1; }
        if(p.pos.x >  this.extX) { p.pos.x =  this.extX; p.vel.x *= -1; }
        if(p.pos.y < -this.extY) { p.pos.y = -this.extY; p.vel.y *= -1; }
        if(p.pos.y >  this.extY) { p.pos.y =  this.extY; p.vel.y *= -1; }
    }
}

Each iteration of the loop creates up to 7 new vector objects (5 unconditionally, plus 2 more whenever the velocity is clamped)

The loop iterates 25000 times per frame, ~60 frames per second
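
Multiplying that out, as a back-of-the-envelope estimate using the worst-case figures from this slide:

```javascript
var vectorsPerIteration = 7;   // worst case: the velocity clamp branch is taken
var particles = 25000;
var framesPerSecond = 60;

var allocationsPerFrame = vectorsPerIteration * particles;         // 175,000
var allocationsPerSecond = allocationsPerFrame * framesPerSecond;  // 10,500,000
```

That is on the order of ten million short-lived objects per second for the garbage collector to clean up.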

Refactor with Object reuse in mind

var Vec2 = function(x, y) {
    this.x = x;
    this.y = y;
}

Vec2.prototype.addSelf = function(v) {
    this.x += v.x; this.y += v.y;
}

Vec2.prototype.subtractSelf = function(v) {
    this.x -= v.x; this.y -= v.y;
}

Vec2.prototype.scaleSelf = function(v) {
    this.x *= v; this.y *= v;
}

Vec2.prototype.length = function() {
    return Math.sqrt((this.x * this.x) + (this.y * this.y));
}

Vec2.prototype.normalizeSelf = function() {
    var iLen = 1 / this.length();
    this.x *= iLen; this.y *= iLen;
}

Refactor Continued

var dir = new Vec2(0, 0);
ParticleSystem.prototype.update = function() {
    for(var i = 0; i < this.particles.length; ++i) {
        var p = this.particles[i];

        dir.x = this.attractor.x;
        dir.y = this.attractor.y;
        dir.subtractSelf(p.pos);
        var dist = Math.max(1, dir.length());

        dir.normalizeSelf();
        dir.scaleSelf(dist/512);
        p.vel.addSelf(dir);

        if(p.vel.length() > maxVel) {
            p.vel.normalizeSelf();
            p.vel.scaleSelf(maxVel);
        }

        p.pos.addSelf(p.vel);

        // Bounds check is the same as previous sample
    }
}

Vector Class: Demo 2

Non-destructive operations

A simple API improvement

What about times where the result of the operation should be stored in a different vector?

// Currently you either have to copy the value...
dir.x = this.attractor.x;
dir.y = this.attractor.y;
dir.subtractSelf(p.pos);

// Or create a new one
dir = this.attractor.subtract(p.pos);

Solution? Allow operations to take an "output" vector

Vec2.prototype.subtract = function(v, out) {
    out.x = this.x - v.x;
    out.y = this.y - v.y;
}

// Allows for a simpler/faster code flow:
this.attractor.subtract(p.pos, dir);
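
A self-contained sketch of the pattern follows. Returning `out` and passing the receiver itself as `out` are additions made for this example, not something the slide shows:

```javascript
var Vec2 = function(x, y) {
    this.x = x;
    this.y = y;
};

// Writes (this - v) into `out`; nothing is allocated here.
Vec2.prototype.subtract = function(v, out) {
    out.x = this.x - v.x;
    out.y = this.y - v.y;
    return out; // returned for convenience (an addition for this sketch)
};

var attractor = new Vec2(10, 5);
var pos = new Vec2(4, 1);
var dir = new Vec2(0, 0);       // scratch vector, allocated once

attractor.subtract(pos, dir);   // dir is now (6, 4); attractor is unchanged
pos.subtract(pos, pos);         // in-place use: pos is now (0, 0)
```

Because each component is computed independently, the same method covers both the destructive and non-destructive cases, so the separate `...Self` variants become optional.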

Cool trick!

Hey! What if we did that with the Typed Array version?

Refactoring for great justice!

Vec2.create = function(a, b) {
    return new Float32Array([a, b]);
} 

Vec2.add = function(a, b, out) {
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
}

Vec2.subtract = function(a, b, out) {
    out[0] = a[0] - b[0];
    out[1] = a[1] - b[1];
}

Vec2.scale = function(a, v, out) {
    out[0] = a[0] * v;
    out[1] = a[1] * v;
}

Vec2.normalize = function(a, out) {
    var iLen = 1 / Vec2.length(a);
    out[0] = a[0] * iLen;
    out[1] = a[1] * iLen;
}

Hey! This looks kinda familiar....

Hope this works!

var dir = Vec2.create(0, 0);
ParticleSystem.prototype.update = function() {
    for(var i = 0; i < this.particles.length; ++i) {
        var p = this.particles[i];

        Vec2.subtract(this.attractor, p.pos, dir);
        var dist = Math.max(1, Vec2.length(dir));

        Vec2.normalize(dir, dir);
        Vec2.scale(dir, dist/512, dir);
        Vec2.add(p.vel, dir, p.vel);

        if(Vec2.length(p.vel) > maxVel) {
            Vec2.normalize(p.vel, p.vel);
            Vec2.scale(p.vel, maxVel, p.vel);
        }

        Vec2.add(p.pos, p.vel, p.pos);

        // Bounds check is the same as previous sample
    }
}

Typed Array Vectors: Demo 2

Great Success!

Glee!
source: the interwebs

Know your bottlenecks!

Object creation isn't fast, but it's liveable

Object property referencing is slower than we would like

Float32Array creation is incredibly expensive

Float32Array indexing is BLAZING fast
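
These claims are easy to check on your own engine. Below is a rough micro-benchmark sketch; `timeIt`, the iteration count, and the four test bodies are all choices made for this example. Results vary by browser and version, so treat the numbers as a starting point for investigation:

```javascript
// Runs fn `iterations` times and returns the elapsed milliseconds.
function timeIt(fn, iterations) {
    var start = Date.now();
    var sink = 0;
    for (var i = 0; i < iterations; ++i) {
        sink += fn(i);
    }
    var elapsed = Date.now() - start;
    // Consume `sink` so the engine cannot discard the work entirely.
    if (sink !== sink) { throw new Error("unexpected NaN"); }
    return elapsed;
}

var obj = { x: 1, y: 2 };
var arr = new Float32Array([1, 2]);
var N = 200000;

var results = {
    objectCreate: timeIt(function(i) { return ({ x: i, y: i }).x; }, N),
    typedCreate:  timeIt(function(i) { return new Float32Array([i, i])[0]; }, N),
    objectAccess: timeIt(function(i) { obj.x = i; return obj.x + obj.y; }, N),
    typedAccess:  timeIt(function(i) { arr[0] = i; return arr[0] + arr[1]; }, N)
};

console.log(results);
```

Keeping creation and access in separate cases is the point: the first demo mixed the two, which is how allocation cost ended up being blamed on Typed Array indexing.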

Which should you use?

Depends on the needs of your code base/programmers!

  • Do you need to create a lot of temporary vectors? Use objects!
  • Are you obsessive about object reuse? Use Typed Arrays!

Environmental factors at play

Some APIs (like WebGL) require you to pass Typed Arrays.


gl.uniformMatrix4fv(projectionMatUniform, false, new Float32Array([1, 0, 0, 0...]));

If you can't avoid the Typed Array creation, might as well embrace it!

At the very least keep a cache of them around for these scenarios.
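
One way to keep such a cache, sketched under the assumption that the matrix values live in a plain 16-element array elsewhere in your code. `matrixScratch` and `uploadMatrix4` are hypothetical names; `Float32Array.prototype.set` and `gl.uniformMatrix4fv` are the real APIs:

```javascript
// Allocated once and reused for every upload.
var matrixScratch = new Float32Array(16);

// Copies `values` into the cached Typed Array and hands that to WebGL.
function uploadMatrix4(gl, location, values) {
    matrixScratch.set(values);   // copies 16 numbers, no allocation
    gl.uniformMatrix4fv(location, false, matrixScratch);
}
```

The copy is not free, but it replaces a Typed Array allocation per call with 16 indexed writes.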

Sometimes you just can't help it

Often the choice of vector/matrix classes is made for you by your graphics/physics library.

var obj = new THREE.Object3D();
obj.matrixAutoUpdate = false; // Apparently this is important #SubversiveGripe
obj.applyMatrix(new THREE.Matrix4(1, 0, 0, 0...));

In these scenarios don't fight it. Try to keep conversions between types to a minimum, and always look for ways to reuse objects!

Bottom line:

Speed in your math libraries is only partially about the library that you use.

Intelligent object management will yield far more performance than an unrolled matrix multiply ever could.

Benchmark often, and be sure that you know what you're actually benchmarking

Make sure that what you're doing works for you. Don't blindly take advice from random bloggers!

Only YOU can prevent crappy Javascript!

<Thank You!>

Questions?