Object Creation
A comment posted before the Flash Player 10.1 series of articles asked about the performance differences between creating an object with o = new Object() and with o = {}. Below I’ll look into the generated bytecode for these two approaches and test their relative performance to see if either approach is faster than the other.
First of all, let’s look at two extremely simple functions demonstrating the approaches:
public function curlyBraces(): void
{
    var o:Object = {};
}

public function newOperator(): void
{
    var o:Object = new Object();
}
Now let’s look at the bytecode generated by MXMLC 4:
function curlyBraces():void /* disp_id 0*/
{
  // local_count=2 max_scope=1 max_stack=1 code_len=8
  0   getlocal0
  1   pushscope
  2   newobject {0}
  4   coerce Object
  6   setlocal1
  7   returnvoid
}

function newOperator():void /* disp_id 0*/
{
  // local_count=2 max_scope=1 max_stack=1 code_len=11
  0   getlocal0
  1   pushscope
  2   findpropstrict Object
  4   constructprop Object (0)
  7   coerce Object
  9   setlocal1
  10  returnvoid
}
The disassembled versions of both of these functions are extremely simple and aren’t very different. Nevertheless, for AS3 code that does the exact same thing, they are different. The “curly braces” approach uses a special-purpose instruction, newobject, then simply coerces the result to an Object and stores it in the local variable o. On the other hand, the “new operator” approach uses findpropstrict to locate the Object class on the scope chain, calls its constructor with constructprop, and then does a coerce and setlocal1 to set the local variable just like the “curly braces” approach did. So is it faster to use newobject or findpropstrict and constructprop? Let’s look at a simple performance test to find out:
package
{
    import flash.display.*;
    import flash.text.*;
    import flash.utils.*;

    /**
    *   An app to test the speed of different ways to create an object
    *   @author Jackson Dunstan (jacksondunstan.com)
    */
    public class ObjectCreation extends Sprite
    {
        public function ObjectCreation()
        {
            var logger:TextField = new TextField();
            logger.autoSize = TextFieldAutoSize.LEFT;
            addChild(logger);
            function log(msg:*): void { logger.appendText(msg+"\n"); }

            var i:int;
            var beforeTime:int;
            var afterTime:int;
            var o:Object;
            const REPS:int = 10000000;

            beforeTime = getTimer();
            for (i = 0; i < REPS; ++i)
            {
                o = {};
            }
            afterTime = getTimer();
            log("Curly braces: " + (afterTime-beforeTime));

            beforeTime = getTimer();
            for (i = 0; i < REPS; ++i)
            {
                o = new Object();
            }
            afterTime = getTimer();
            log("New operator: " + (afterTime-beforeTime));
        }
    }
}
Here we simply run each approach ten million times and measure how long it takes with getTimer, which reports milliseconds. Let’s look at the results:
Environment | Curly Braces (ms) | New Operator (ms)
---|---|---
3.0 GHz Intel Core 2 Duo, Windows XP | 1842 | 1254
2.0 GHz Intel Core 2 Duo, Mac OS X | 3218 | 2341
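As a quick check on how the two columns compare, dividing the curly-braces time by the new-operator time in each row gives the relative speed. The times are taken to be milliseconds, since that is the unit getTimer reports:

```latex
\[
\frac{1842\ \text{ms}}{1254\ \text{ms}} \approx 1.47
\qquad
\frac{3218\ \text{ms}}{2341\ \text{ms}} \approx 1.37
\]
```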
In a surprise (to me, at least) result, the “curly braces” approach with its one special-purpose instruction (newobject) is definitively beaten by the “new operator” approach with its two general-purpose instructions (findpropstrict and constructprop). The “new operator” approach claims a 1.47x speedup on Windows XP and a 1.37x speedup on Mac OS X. So if you’re looking to be performance-conscious and don’t mind a bit of extra typing, by all means prefer o = new Object() over o = {} when constructing empty objects. Also, if you have any AS3 questions like the comment this article is in response to, by all means feel free to ask in the comments section below.
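For anyone re-running the comparison, a single timed pass can be skewed by whatever else the machine is doing. The following is a minimal sketch, not part of the original test app, of a helper that times a loop body several times and keeps the fastest run. The class name BenchUtil, the method bestOf, and the trial count are all hypothetical names chosen for illustration. Only getTimer and int.MAX_VALUE come from the Flash API.

```actionscript
package
{
    import flash.utils.getTimer;

    /**
    *   Hypothetical timing helper (an assumption for illustration,
    *   not code from the article's test app)
    */
    public class BenchUtil
    {
        /**
        *   Run body(reps) the given number of times and return the
        *   shortest elapsed time in milliseconds
        */
        public static function bestOf(trials:int, reps:int, body:Function): int
        {
            var best:int = int.MAX_VALUE;
            for (var t:int = 0; t < trials; ++t)
            {
                var beforeTime:int = getTimer();
                body(reps);
                var elapsed:int = getTimer() - beforeTime;
                if (elapsed < best)
                {
                    best = elapsed;
                }
            }
            return best;
        }
    }
}
```

It could be called with something like BenchUtil.bestOf(5, 10000000, function(reps:int): void { var o:Object; for (var i:int = 0; i < reps; ++i) { o = {}; } }). Moving the loop into a function changes how the loop is compiled, so timings from this sketch should not be expected to match the table above exactly.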
#1 by Simon Richardson on August 23rd, 2010 ·
Do you report these as bugs? We should! I think all the data you’ve collated should be written up as a unit test to check performance against the compilers…
#2 by jackson on August 23rd, 2010 ·
I’ve now submitted this as an issue to Adobe’s bug database. Feel free to vote on it and/or submit any of my previous articles as bugs as you see fit.
#3 by bwhiting on August 25th, 2010 ·
here’s one for you to try
p.s. thanks for looking at my stuff, it’s much appreciated :D
#4 by jackson on August 25th, 2010 ·
I’m going to do a followup article to this one and I’ll include your object-creation strategy. Thanks for the tip!
#5 by Edwin Smith on September 8th, 2010 ·
We would welcome this as a microbenchmark, and any more you have time to write. Another good way to submit bugs and AS3 microbenchmarks is directly into the Bugzilla system we use for tracking AS3 Interpreter/JIT bugs.
Companion to the JIRA bug:
https://bugzilla.mozilla.org/show_bug.cgi?id=594465
Link for new Tamarin bugs (if you don’t know the component, just guess).
https://bugzilla.mozilla.org/enter_bug.cgi?product=Tamarin
#6 by skyboy on October 21st, 2010 ·
This combined with another article where you pre-allocated some objects helped me speed up a JSON class by an estimated 428,571.5%
I would give a real speed increase if the class actually succeeded before the modification; It would time out at 60 seconds after it started, and only get around 7% through the total length of the string (~500,000), and now it gets through the entire thing three fold (two iterations to count maximum objects (string:”}”) and arrays (string:”]”)) in 2 seconds and below.
http://github.com/skyboy/AS3-Utilities/blob/master/skyboy/text/JSON.as
#7 by jackson on October 21st, 2010 ·
It’s great to hear that these articles have helped you get your code working fast! I briefly looked over your JSON class and have a few suggestions if you want to further optimize the class:
isSpace and isNumber may be suffering from poorly-generated logical operator performance due to some weird compiler bytecode output
#8 by skyboy on October 22nd, 2010 ·
The one switch statement I use can’t be avoided without duplicating code, and falling into poor logic operator traps, but luckily it is also one of the least used items (parses the character after a \ in a string).
I’ll probably change the isSpace function to avoid that, but the isNumber function isn’t an often-used function, so I may just leave that as it is.
Only three variables were initialized to the default values, but one of them was in the most used function, which can be called as often as 20x more.
My test-case string gives these results:
Characters: 446,884
Array: 1,162
Object: 2,322
Number: 3,483
String: 20,898
Time taken: 1,608 ms
And can be found here if you want to run some tests of your own against other common JSON libraries: http://www.kongregate.com/badges.json
#9 by jackson on October 22nd, 2010 ·
Looks like there are two switches, one in decode and one in handleString, and three initializations to default values: handleString, handleNumber, handleNumber2 (all Boolean). Just a minor nit-pick. :) Keep up the good optimizing!
#10 by skyboy on October 22nd, 2010 ·
Well, I ran some tests, and as it turns out, the OR operator approach I’m using is being optimized somewhere, either nJIT or the compiler (latest version as of 4 days ago)
I tested each 4 times, 10,000,000 reps, with the code 0x0B, and these are the results:
Averages are:
So there is no optimization to be done for either isSpace or isNumber. I had already replaced the switch in decode with an if-else chain in my local version, so I didn’t spot that when I glanced over it.
I don’t see anywhere else I can optimize without delving into bytecode at this point. It’s too bad there isn’t a statement similar to __asm in MSVC++.
#11 by jackson on October 22nd, 2010 ·
I recommend checking out Joa Ebert’s Apparat tool as it has an __asm function in it. It’s also got macro expansion, which would be a suitable replacement for inlining your isSpace, isNumber, and other simple functions that get called a lot while maintaining some semblance of readability and maintainability.
#12 by skyboy on October 22nd, 2010 ·
As interesting as that tool is, it’s not much help for a self-contained utility class provided to anyone/everyone.
I’ve also been looking into an AIR/AS3 solution for writing AS3 bytecode and compiling it into an SWF / editing existing SWF bytecode by hand, with some minor automatic optimization, possibly expanding it into being able to compile normal AS3 as well.
#13 by jackson on October 22nd, 2010 ·
That’s definitely true. Adding Apparat as a dependency will probably hamper adoption of your library as it will no longer be pure AS3. In that case, you could still experiment with manual inlining of isSpace and isNumber. If they are truly performance hotspots then you might gain a lot of performance as function calls, regardless of how simple, are quite expensive in AS3.
#14 by skyboy on October 23rd, 2010 ·
True, but with the latest improvements, it takes around 1200-1300ms in the debug player, and 600-800ms in the release player to cover the string posted above (which has actually gotten longer), so inlining those might not have a significant impact anymore.
I’ve also added encoding finally, so it’s now a complete class, and with more-correct string handling it takes on average 400ms longer than decoding on the debug player, while with only handling additional “s in the string it takes around 200ms total.
However, the overall performance, being able to decode then encode 30,000 objects in under 4 seconds in the debug player, is very pleasing, and far, far, far better than what it used to take.
#15 by skyboy on October 24th, 2010 ·
I just tested with a variant of the class with all isSpace/isObject/etc. calls inlined (except isLit) and the normal one:
And got these results:
In one of the tests, the normal class was faster than the inlined version, so the speed gain is pretty minimal (-0.3% – 19%). The most used function that does some real work is the handleString function, and probably 10-70% (depending on the length found) of its time is spent in the call to String.fromCharCode.apply, so I don’t see a way to really speed this up without there being another function added to the String class that takes an Array argument instead of var args.
I also spent some time thinking about why creating the objects on the fly is 400,000% slower than making them all before they’re used, and I’ve decided that it’s either some caching on the player’s end, instruction caching on the CPU, or a combination of the two. It would be interesting to see how well this class performs on newer CPUs, mine’s a minimum of 8 years old now.
#16 by skyboy on October 24th, 2010 ·
I just ran a second test, and got even more surprising results.
Manually removing the Object wrapper by calling valueOf brought both the variant and normal class down to the exact same level. This could be another article in itself.
Also: Both tests done in the latest standalone release player, with async tracing and writing to a file turned on (see: mm.cfg)
#17 by jackson on October 24th, 2010 ·
I looked through the source again, but I think it’s a bit outdated now since I didn’t see any “Object wrapper”. Can you post updated code? I’d like to see your optimization. :)
#18 by skyboy on October 24th, 2010 ·
The source is virtually the same as what I have now (I reordered some variables as best I could to get the most used ones into the set/set local_x group). The Object wrapper is on the String object, calling valueOf returns a reference to the primitive type; I did this before testing in the second post, so both versions got that.
var a:String = String(e.target.data).valueOf();
Instead of:
var a:String = e.target.data;
#19 by skyboy on July 31st, 2011 ·
I just read the recently patched bug 639495, which stated that the newobject opcode (what {} uses) was allocating more space than it needed to, which probably led to slower initialization; either in FP11 or the next version this test should show even performance.