MinnaHTML.js Benchmarking for Speed in Node.js
Monday, 13th August 2012, 17:55
One design decision I made when creating MinnaHTML for Node.js was to have each object contain a data property, and for every other property to represent an attribute of that object. Anything that isn't an attribute goes in the data property object. This keeps things nicely out of the way and unlikely to conflict with attributes for the foreseeable future.
Unless, that is, someone needs an attribute called data, which I considered unlikely because people tend to prefer the "data-" prefix instead.
But there was another way to do it, which I am now happy to report turns out to be slower, though I'm not entirely sure why. It relies on the Object.defineProperty() method, which lets you set a number of flags on a property, including enumerable. If enumerable is true, the property shows up when you run a for..in loop over the object; if false, it doesn't. So this looks like a way to do away with the data property object I'd added and move everything to the root.
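To make the enumerable flag concrete, here is a minimal sketch. The object and its property names are made up for illustration and are not taken from MinnaHTML itself.

```javascript
// Illustrative object: two ordinary properties standing in for HTML attributes.
var tag = { id: "header", title: "Main banner" };

// Add an internal property and hide it from enumeration in one step.
Object.defineProperty(tag, "_tag", {
  value: "div",
  writable: true,
  enumerable: false,
  configurable: true
});

// Collect whatever for..in reports for this object.
var seen = [];
for (var key in tag) {
  seen.push(key);
}

console.log(seen);     // only the attribute names; "_tag" is skipped
console.log(tag._tag); // the hidden property is still readable directly
```

The hidden property behaves like any other when accessed by name. It is only left out of enumeration, which is exactly what an HTML generator looping over attributes would want.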
Doing so is quite easy: converting it all is just a case of searching for every instance of ".data." and replacing it with ".". But complications immediately arise. The first is content, which is also a valid attribute for some tags. The second is the requirement to set every property's enumerable flag to false, which has to be done in the constructor due to the inheritance model we are using.
To avoid conflicts I also renamed all the variables that were originally part of the Data object, giving each an underscore prefix. That makes them easy to spot if they somehow leak into the generated HTML, and avoids clashes with attributes like "content".
But surely with V8 the overhead of doing that could be less than the speed gained by not having to check whether each property is an object when building the HTML at the end? On the face of it that looks very possible, but you should never trust instincts when you can benchmark and prove it.
So, in the constructor for the Base object, the following:
this.data = new Data(parent, tag, cid);
is replaced by:
this._tag = tag;
Object.defineProperty(this, "_tag", { enumerable: false });
this._cid = cid;
Object.defineProperty(this, "_cid", { enumerable: false });
this._parent = parent;
Object.defineProperty(this, "_parent", { enumerable: false });
this._content = null;
Object.defineProperty(this, "_content", { enumerable: false });
this._readycount = 0;
Object.defineProperty(this, "_readycount", { enumerable: false });
this._readycallback = null;
Object.defineProperty(this, "_readycallback", { enumerable: false });
this._abort = false;
Object.defineProperty(this, "_abort", { enumerable: false });
this._childlist = new Array();
Object.defineProperty(this, "_childlist", { enumerable: false });
So this is twice as many lines as the Data() constructor uses, minus a function call. But hopefully the cost is offset by not needing a check later on, when building up the HTML from the tree.
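The check in question can be pictured roughly like this. MinnaHTML's real generator isn't shown in this post, so both functions below are hypothetical stand-ins, and their names and the sample objects are assumptions made purely for illustration.

```javascript
// Hypothetical sketch, NOT MinnaHTML's actual generator code.

// Original layout: attributes at the root, internals inside a "data" object,
// so the loop has to skip anything that is not a plain attribute value.
function attributesWithCheck(node) {
  var out = "";
  for (var name in node) {
    var type = typeof node[name];
    if (type === "object" || type === "function") continue;
    out += " " + name + "=\"" + node[name] + "\"";
  }
  return out;
}

// Alternative layout: internals are non-enumerable, so for..in never sees
// them and no type check is needed.
function attributesWithoutCheck(node) {
  var out = "";
  for (var name in node) {
    out += " " + name + "=\"" + node[name] + "\"";
  }
  return out;
}

var original = { src: "/images/headerlogo.png", data: { tag: "img" } };

var alternative = { src: "/images/headerlogo.png" };
Object.defineProperty(alternative, "_tag", {
  value: "img",
  writable: true,
  enumerable: false
});

console.log(attributesWithCheck(original));
console.log(attributesWithoutCheck(alternative));
```

Both calls produce the same attribute string. The only difference is where the cost is paid: once per property at generation time, or once per hidden property at construction time.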
Now to benchmark it, using a simple technique. We'll just create a Date() object which will set the time to now (in the movie), run a loop a significant number of times that calls a function which creates a bunch of objects and then generates the HTML from them. Then create another Date() object and do a quick bit of maths.
var mh = require("minnahtml").mh;
var runtimes = 10000;
var dtStart = new Date();
testFunction();
for (var runcount = 0; runcount < runtimes; runcount++)
    testFunction();
var dtEnd = new Date();
console.log("Runs: " + runtimes + ", Total Time: " + (dtEnd - dtStart) + "ms, Avg Time: " + ((dtEnd - dtStart) / runtimes) + "ms");
So pretty simple. The reason we call testFunction() before the loop is to give V8 a chance to compile the code for the function. It isn't strictly necessary to do this, provided you are comparing two results using an identical testing method.
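If you want to reuse the same approach for several test functions, it can be wrapped up as a small helper. This is only a sketch: the benchmark() name and its return shape are my own invention, and it uses process.hrtime() rather than Date for finer resolution. Unlike the listing above, it leaves the warm-up call out of the timed region.

```javascript
// Hypothetical helper: times `fn` over `runs` iterations after one warm-up call.
function benchmark(fn, runs) {
  fn(); // warm-up call, excluded from the timing

  var start = process.hrtime();
  for (var i = 0; i < runs; i++) {
    fn();
  }
  var diff = process.hrtime(start); // [seconds, nanoseconds] since `start`

  var totalMs = diff[0] * 1000 + diff[1] / 1e6;
  return { runs: runs, totalMs: totalMs, avgMs: totalMs / runs };
}

// Example usage with a trivial function under test.
var result = benchmark(function () {
  var obj = {};
  obj.dog = "dog";
}, 100000);

console.log("Runs: " + result.runs + ", Total Time: " + result.totalMs +
            "ms, Avg Time: " + result.avgMs + "ms");
```

As with the Date version, the absolute numbers matter less than comparing two runs made with exactly the same method.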
And now for our test function itself:
function testFunction()
{
    // Declared with var so they don't leak out as implicit globals
    var hPage = new mh.Html();
    var hHead = new mh.Head(hPage);
    var hBody = new mh.Body(hPage);
    new mh.Link(hHead, null, { rel: "shortcut icon", type: "image/ico", href: "/favicon.ico" } );
    new mh.Meta(hHead, null, { "http-equiv": "content-type", content: "text/html", charset: "utf-8" } );
    new mh.StyleSheet(hHead).href = "/styles/standard.css";
    new mh.Script(hHead).src = "/scripts/jquery172.js";
    new mh.Script(hHead).src = "/scripts/general.js";
    new mh.Div(hBody, "header");
    new mh.Image(hBody.header, "banner", { src: "/images/headerlogo.png" });
    new mh.Div(hBody, "mainarea");
    new mh.Paragraph(hBody.mainarea)._content = "This is a test page " + Math.random();
    new mh.Div(hBody, "footer");
    new mh.Div(hBody.footer, "credits");
    new mh.Paragraph(hBody.footer.credits)._content = "Thanks for viewing";
    return hPage.generateHtml();
}
I threw in a Math.random() just so the content was different each time, in case V8 had some sneaky optimisation method I didn't know about. But basically, this is a pretty simplistic page to test it on, no database calls, very little content. It should test the speed of MinnaHTML.js in isolation well enough.
And the results... surprised me.
I made my initial decisions based on things other than performance, yet it turns out the way I chose to do it is faster. The overhead of marking properties as non-enumerable is actually significant.
Here is the output from the new defineProperty() version:
Runs: 10000, Total Time: 10677ms, Avg Time: 1.0677ms
And here is the original version:
Runs: 10000, Total Time: 6505ms, Avg Time: 0.6505ms
I ran these tests a few times, and the difference was always about the same, with the new version taking roughly 1.6 times as long. Maybe a later version of V8 will close the gap, but for now, as of Node.js 0.8.6, the old way is better.
One final thing to find out: what is the actual overhead of setting this flag dynamically? Here is a baseline function that only assigns properties:
function testFunction()
{
    var obj = new Object();
    obj.dog = "dog";
    obj.cat = "cat";
    obj.ten = 10;
}
With a test run of 9 million iterations, we get:
Runs: 9000000, Total Time: 394ms, Avg Time: 0.000043777777777777775ms
And now the same function with a single defineProperty() call added:
function testFunction()
{
    var obj = new Object();
    obj.dog = "dog";
    obj.cat = "cat";
    Object.defineProperty(obj, "cat", { enumerable: false });
    obj.ten = 10;
}
The results are:
Runs: 9000000, Total Time: 34838ms, Avg Time: 0.003870888888888889ms
That is a HUGE performance hit, roughly 88 times slower. Changing property flags on the fly like this carries a heavy penalty in V8. Can we set them a different way perhaps? Maybe, but all my attempts to do so resulted in properties being enumerable that shouldn't be.
If there is a way that I'm not aware of, I'd like to know it.
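One candidate that might be worth running through the same benchmark is to create the hidden properties with their values in a single Object.defineProperties() call, instead of assigning each one and then changing its flag. I have not measured this, so treat it purely as an untested sketch. The HiddenNode constructor is a hypothetical stand-in for the Base object, not MinnaHTML code.

```javascript
// Hypothetical constructor, loosely modelled on the Base object above.
// Properties created through a descriptor default to enumerable: false,
// but also to writable: false, so writable has to be switched on explicitly.
function HiddenNode(parent, tag, cid) {
  Object.defineProperties(this, {
    _tag:       { value: tag,    writable: true, configurable: true },
    _cid:       { value: cid,    writable: true, configurable: true },
    _parent:    { value: parent, writable: true, configurable: true },
    _childlist: { value: [],     writable: true, configurable: true }
  });
}

var n = new HiddenNode(null, "div", "header");
n.id = "header";     // an ordinary, enumerable attribute
n._tag = "section";  // hidden properties remain writable

console.log(Object.keys(n)); // lists enumerable own properties only
console.log(n._tag);
```

Whether this is any faster in V8 than the assign-then-hide approach is an open question until it is benchmarked.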