Flash Player 10.1 Performance: Part 1
I’ve posted a lot of articles about AS3 performance on this site. Because those tests all ran on Adobe’s Flash Player VM, any of their numbers may have changed with the release of Flash Player 10.1 and its attendant optimizations.
Introduction
All tests below are compiled with MXMLC 4.0. They are run on a 3.0 GHz Intel Core 2 Duo with the release version of Flash Player 10.1 on Windows XP. No changes were made to the original tests so the performance results can be directly compared.
Free Lists
Flash Player 10.0 Performance:
Approach | Unallocated | Pre-allocated |
---|---|---|
Vector | 88 | 12 |
Direct | 74 | 126 |
Linked List | 83 | 6 |
Linked List (recycling nodes) | 88 | 33 |
Flash Player 10.1 Performance:
Approach | Unallocated | Pre-allocated |
---|---|---|
Vector | 34 | 12 |
Direct | 26 | 65 |
Linked List | 31 | 6 |
Linked List (recycling nodes) | 33 | 35 |
Here we see a 2-3x performance improvement in the allocation of objects, but barely any improvement in the pre-allocated versions (only Direct improves, from 126 to 65). As a result, the largest benefit of pre-allocation (Linked List, 83 vs. 6 in 10.0 and 31 vs. 6 in 10.1) drops from ~13x to ~5x. This is still a dramatic speedup! It’s also good to see the overall improvement in object allocation as most programmers don’t want to implement a pre-allocation scheme and, really, shouldn’t need to in the first place.
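For readers who haven’t seen the technique, here is a minimal sketch of a pre-allocated free list backed by a Vector. It is not the original test code: the class name, the use of `Point` as the pooled type, and the pool size are all assumptions made for illustration.

```actionscript
package
{
    import flash.display.Sprite;
    import flash.geom.Point;

    // Hypothetical example class; not taken from the original tests
    public class FreeListSketch extends Sprite
    {
        // Pool of ready-to-use objects (Point is an arbitrary choice)
        private var freeList:Vector.<Point> = new Vector.<Point>();

        public function FreeListSketch()
        {
            // Pay the allocation cost once, up front
            preallocate(1000);

            // Later, take an object from the pool instead of calling "new"
            var p:Point = acquire();
            p.x = 1;
            p.y = 2;

            // Hand it back when finished so it can be reused
            release(p);
        }

        private function preallocate(count:int):void
        {
            for (var i:int = 0; i < count; ++i)
            {
                freeList.push(new Point());
            }
        }

        private function acquire():Point
        {
            // Fall back to a fresh allocation if the pool is empty
            return freeList.length > 0 ? freeList.pop() : new Point();
        }

        private function release(p:Point):void
        {
            freeList.push(p);
        }
    }
}
```

The "Unallocated" column corresponds roughly to the fallback path where `acquire()` has to call `new`, while "Pre-allocated" corresponds to the path where the pool already holds an object.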
Namespaces As Function Pointers
Version | Explicit Namespace | Namespace Variable | No Namespace |
---|---|---|---|
Flash Player 10.0 | 113 | 4745 | 112 |
Flash Player 10.1 | 80 | 5285 | 80 |
This test shows a modest (~11%) slowdown for access through a Namespace variable (4745 to 5285), but a hefty (40%) speedup in regular access (113 to 80). Since regular access is by far more common than access through a Namespace variable, this is a big win on the whole.
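As a rough illustration of the three columns, the sketch below calls a function through an explicit namespace, through a Namespace variable, and with no namespace at all. The namespace name `variant` and both functions are hypothetical, and I’m assuming function calls here (per the section title) rather than reproducing the original test.

```actionscript
package
{
    import flash.display.Sprite;

    // Hypothetical example class; not taken from the original tests
    public class NamespaceSketch extends Sprite
    {
        // Namespace declared only for this illustration
        public namespace variant;

        variant function work():int { return 1; }
        public function plainWork():int { return 1; }

        public function NamespaceSketch()
        {
            // Explicit namespace: the qualifier is named in the source
            var a:int = variant::work();

            // Namespace variable: the qualifier is only known at runtime
            var ns:Namespace = variant;
            var b:int = ns::work();

            // No namespace
            var c:int = plainWork();
        }
    }
}
```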
XOR Swap
Version | Assign Swap | XOR Swap |
---|---|---|
Flash Player 10.0 | 236 | 276 |
Flash Player 10.1 | 236 | 270 |
While the XOR swap is a tiny bit (2%) faster in 10.1 than it was in 10.0, these numbers are essentially the same, and the assign swap is still the quicker of the two.
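For reference, the two swaps being compared look something like the following. This is a sketch with made-up values and a hypothetical class name, not the original benchmark.

```actionscript
package
{
    import flash.display.Sprite;

    // Hypothetical example class; not taken from the original tests
    public class SwapSketch extends Sprite
    {
        public function SwapSketch()
        {
            var a:int = 3;
            var b:int = 5;

            // Assign swap: uses a temporary variable
            var temp:int = a;
            a = b;
            b = temp;
            // a is now 5, b is now 3

            // XOR swap: no temporary, three XOR-assignments
            a ^= b;
            b ^= a;
            a ^= b;
            // a is back to 3, b is back to 5
        }
    }
}
```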
Runnables as Function Pointers
Version | Function Object | Runnable | Direct |
---|---|---|---|
Flash Player 10.0 | 613 | 43 | 41 |
Flash Player 10.1 | 178 | 64 | 59 |
Firstly, calling via a Function object is ~3.5x faster! This should particularly impact the performance gains made by signal libraries like TurboSignals. Further impacting TurboSignals is the 49% drop in performance for calls through runnables. This is very sad as calls to an object’s methods are the bread and butter of AS3’s object-oriented system. Perhaps even more core to AS3 are direct function calls, which suffer a 43% performance hit. Overall, this category of tests seems to show a big loss of performance.
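To make the three kinds of call concrete, here is a minimal sketch. The `Runnable` interface and `Counter` class are hypothetical stand-ins written for this example, and treating "Direct" as a call through the concrete type is my assumption about what that column measures.

```actionscript
package
{
    import flash.display.Sprite;

    // Hypothetical example class; not taken from the original tests
    public class CallSketch extends Sprite
    {
        public function CallSketch()
        {
            var counter:Counter = new Counter();

            // Function object: call through a method closure
            var func:Function = counter.run;
            func();

            // Runnable: call through an interface-typed reference
            var runnable:Runnable = counter;
            runnable.run();

            // Direct: call the method on the concrete type
            counter.run();

            trace(counter.count); // 3
        }
    }
}

// Hypothetical single-method interface used in place of a Function
interface Runnable
{
    function run():void;
}

class Counter implements Runnable
{
    public var count:int = 0;

    public function run():void
    {
        count++;
    }
}
```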
Loop Speed
Flash Player 10.0 Performance:
Collection | For-each | For-in | For |
---|---|---|---|
Array | 377 | 8730 | 203 |
Fixed Vector | 455 | 8983 | 105 |
Variable Vector | 448 | 8940 | 104 |
Object | 721 | 8805 | 592 |
Dictionary (strong keys) | 652 | 8816 | 487 |
Dictionary (weak keys) | 666 | 8884 | 487 |
BMD w/ alpha getPixel32 | n/a | n/a | 290 |
BMD w/o alpha getPixel32 | n/a | n/a | 222 |
BMD w/ alpha getPixel | n/a | n/a | 300 |
BMD w/o alpha getPixel | n/a | n/a | 228 |
Flash Player 10.1 Performance:
Collection | For-each | For-in | For |
---|---|---|---|
Array | 186 | 4424 | 76 |
Fixed Vector | 239 | 4545 | 66 |
Variable Vector | 238 | 4477 | 66 |
Object | 498 | 4418 | 246 |
Dictionary (strong keys) | 465 | 4568 | 250 |
Dictionary (weak keys) | 558 | 4645 | 255 |
BMD w/ alpha getPixel32 | n/a | n/a | 181 |
BMD w/o alpha getPixel32 | n/a | n/a | 160 |
BMD w/ alpha getPixel | n/a | n/a | 182 |
BMD w/o alpha getPixel | n/a | n/a | 157 |
Most of these numbers show a performance increase of roughly 100%! There are some exceptions, but even these were decent speed boosts.
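The three loop styles in the table columns can be sketched as follows for the Array case. The array contents and class name are invented for illustration; the original tests iterate much larger collections.

```actionscript
package
{
    import flash.display.Sprite;

    // Hypothetical example class; not taken from the original tests
    public class LoopSketch extends Sprite
    {
        public function LoopSketch()
        {
            var values:Array = [10, 20, 30];
            var sum:int = 0;

            // For-each: iterates over the values themselves
            for each (var value:int in values)
            {
                sum += value;
            }

            // For-in: iterates over the keys, which then need a lookup
            for (var key:String in values)
            {
                sum += values[key];
            }

            // For: plain indexed loop with the length cached
            var len:int = values.length;
            for (var i:int = 0; i < len; ++i)
            {
                sum += values[i];
            }
        }
    }
}
```

The extra key-to-value lookup in the for-in version is one plausible reason that column is so much slower than the other two, though I haven’t verified that against the generated bytecode.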
More To Come
This ends the first part of the series. I’ll reserve any general conclusions until the series has concluded, but for now I have to say that performance is mostly looking up. Having already done a very high-level test in my recent AS3 vs. JavaScript performance test followup and found a 1.66x speedup in 10.1, it looks like the numbers are agreeing. What we’re seeing in this series is a much closer look into how that speedup came about. Stay tuned for part two!
#1 by David Wagner on July 5th, 2010 ·
Great stuff!
For people such as myself who are far too lazy to do this sort of research, your articles are slices of fried gold :)
#2 by DieTapete on July 5th, 2010 ·
Thank you so much for doing all the testing!
BTW, I recognized that creating Objects with “new Object()” instead of using “{}” is faster in 10.1.
#3 by jackson on July 11th, 2010 ·
Interesting. I’ll have to take a look at the generated bytecode for those and see if there’s anything related to my recent writeup on Arrays and Vectors.
#4 by Nicolas on July 9th, 2010 ·
Thanks Jack for pointing out the major differences between the two versions. I’ll go re-run all my tests now as well.
#5 by Elparole on September 5th, 2012 ·
Hi,
I’ve been reading your articles for a few months and appreciate your great work.
I am wondering why the ‘for each’ loop is sometimes faster than ‘for’ when iterating array elements. This was already mentioned by jardilio in a comment on this article -> http://jacksondunstan.com/articles/358. Replacing ‘a = ?’ with ‘a += ?’ gives different results and different conclusions on which loop is fastest. Do you have any idea why this is happening?
#6 by jackson on September 5th, 2012 ·
Hi Elparole,
Thanks for the kind words about the site; I’m glad you’re enjoying it. As to the question about `for` versus `for-each`, this is definitely something I need to re-visit as I’ve just taken a second look at the results and found that there may have been a testing methodology error. Look for an upcoming article on the subject!
Thanks for the tip!