I’ve posted a lot of articles about AS3 performance on this site. Because all of that code runs on Adobe’s Flash Player VM, any of those numbers may have changed with the release of Flash Player 10.1 and its attendant optimizations.
Introduction
All tests below are compiled with MXMLC 4.0. They are run on a 3.0 GHz Intel Core 2 Duo with the release version of Flash Player 10.1 on Windows XP. No changes were made to the original tests, so the performance results can be directly compared.
Free Lists
Flash Player 10.0 Performance:
| Approach | Unallocated | Pre-allocated |
|---|---|---|
| Vector | 88 | 12 |
| Direct | 74 | 126 |
| Linked List | 83 | 6 |
| Linked List (recycling nodes) | 88 | 33 |
Flash Player 10.1 Performance:
| Approach | Unallocated | Pre-allocated |
|---|---|---|
| Vector | 34 | 12 |
| Direct | 26 | 65 |
| Linked List | 31 | 6 |
| Linked List (recycling nodes) | 33 | 35 |
Here we see a 2-3x performance improvement in the allocation of objects, but barely any improvement in most of the pre-allocated versions (the Direct approach, whose time roughly halves, is the exception). As a result, the benefit of pre-allocation for the linked list is reduced from ~14x to ~5x. This is still a dramatic speedup! It’s also good to see the overall improvement in object allocation, as most programmers don’t want to implement a pre-allocation scheme and, really, shouldn’t need to in the first place.
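For readers who have not seen the technique, here is a minimal sketch of a pre-allocated linked-list free list. It is written in TypeScript rather than AS3 and is not the code from the original tests; `Particle` and `ParticlePool` are hypothetical names made up for illustration.

```typescript
// Hypothetical pooled type; the classes used in the original tests are not shown here.
class Particle {
    x = 0;
    y = 0;
    next: Particle | null = null; // link used only while the object sits in the free list
}

// A minimal linked-list free list: release() pushes onto the list, acquire() pops from it.
class ParticlePool {
    private head: Particle | null = null;

    // Pre-allocate `count` objects up front so later acquire() calls can skip `new`.
    constructor(count: number) {
        for (let i = 0; i < count; i++) {
            this.release(new Particle());
        }
    }

    acquire(): Particle {
        const p = this.head;
        if (p === null) {
            return new Particle(); // pool exhausted: fall back to a normal allocation
        }
        this.head = p.next;
        p.next = null;
        return p;
    }

    release(p: Particle): void {
        p.next = this.head;
        this.head = p;
    }
}

const pool = new ParticlePool(2);
const a = pool.acquire();
pool.release(a);
const b = pool.acquire(); // the released instance is handed out again instead of allocating
```

The "Unallocated" column corresponds roughly to the fallback path in `acquire()`, where every request pays for a fresh allocation.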
Namespaces As Function Pointers
| | Explicit Namespace | Namespace Variable | No Namespace |
|---|---|---|---|
| Flash Player 10.0 | 113 | 4745 | 112 |
| Flash Player 10.1 | 80 | 5285 | 80 |
This test shows a modest (~10%) performance decrease with the Namespace variable, but a hefty (~40%) performance increase in regular variable access. Since regular variable access is by far more common than access through a Namespace variable, this is a big win on the whole.
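Percentages like these depend on whether you compare elapsed times or speeds. Here is a small TypeScript sketch for deriving both from the table values; the helper names are made up for illustration, and the table's units don't matter because only ratios are used.

```typescript
// Relative change in speed (work done per unit of time), from two elapsed times.
function speedChangePercent(before: number, after: number): number {
    return (before / after - 1) * 100;
}

// Relative change in elapsed time.
function timeChangePercent(before: number, after: number): number {
    return (after / before - 1) * 100;
}

// Explicit namespace: 113 -> 80
const explicitSpeed = speedChangePercent(113, 80);    // about +41% (faster)

// Namespace variable: 4745 -> 5285
const variableSpeed = speedChangePercent(4745, 5285); // about -10% (slower)
const variableTime = timeChangePercent(4745, 5285);   // about +11% (more time)
```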
XOR Swap
| | Assign Swap | XOR Swap |
|---|---|---|
| Flash Player 10.0 | 236 | 276 |
| Flash Player 10.1 | 236 | 270 |
While the XOR swap is a tiny bit (2%) faster in 10.1 than it was in 10.0, these numbers are essentially the same, and the XOR swap is still slower than the plain assign swap.
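As a reminder of what is being compared, here is a sketch of the two swaps in TypeScript rather than AS3. It is not the original test code, and the tuple-based helpers are just one way to make the swap observable.

```typescript
// Assign swap: exchange two values through a temporary variable.
function assignSwap(pair: [number, number]): void {
    const temp = pair[0];
    pair[0] = pair[1];
    pair[1] = temp;
}

// XOR swap: no temporary, but it only works for 32-bit integer values
// and the two slots must be distinct storage locations.
function xorSwap(pair: [number, number]): void {
    pair[0] ^= pair[1];
    pair[1] ^= pair[0];
    pair[0] ^= pair[1];
}

const assigned: [number, number] = [3, 7];
assignSwap(assigned); // assigned is now [7, 3]

const xored: [number, number] = [3, 7];
xorSwap(xored);       // xored is now [7, 3]
```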
Runnables as Function Pointers
| | Function Object | Runnable | Direct |
|---|---|---|---|
| Flash Player 10.0 | 613 | 43 | 41 |
| Flash Player 10.1 | 178 | 64 | 59 |
Firstly, calling via a Function object is ~3.5x faster! This should particularly impact the performance gains made by signal libraries like TurboSignals. Further impacting TurboSignals is the slowdown for calls through runnables, which now take 49% longer. This is very sad, as calls to an object’s methods are the bread and butter of AS3’s object-oriented system. Perhaps even more core to AS3 are direct function calls, which take about 44% longer. Overall, outside of Function objects, this category of tests seems to show a big loss of performance.
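The three call styles in the table look roughly like this. This is a TypeScript analogue rather than the original AS3 test, and `Runnable` and `Counter` are hypothetical names; note also that AS3 binds method closures automatically, while TypeScript needs an explicit `bind`.

```typescript
// Hypothetical interface standing in for the "runnable" pattern.
interface Runnable {
    run(): void;
}

class Counter implements Runnable {
    count = 0;
    run(): void {
        this.count++;
    }
}

const counter = new Counter();

// 1. Direct: call the method through a reference of the concrete class type.
counter.run();

// 2. Runnable: call the same method through an interface-typed reference.
const runnable: Runnable = counter;
runnable.run();

// 3. Function object: store the method in a variable and call through it.
const fn: () => void = counter.run.bind(counter);
fn();
```

A signal library can hold its listeners either as a list of `Runnable` objects or as a list of function values, which is why both rows matter for something like TurboSignals.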
Loop Speed
Flash Player 10.0 Performance:
| Collection | For-each | For-in | For |
|---|---|---|---|
| Array | 377 | 8730 | 203 |
| Fixed Vector | 455 | 8983 | 105 |
| Variable Vector | 448 | 8940 | 104 |
| Object | 721 | 8805 | 592 |
| Dictionary (strong keys) | 652 | 8816 | 487 |
| Dictionary (weak keys) | 666 | 8884 | 487 |
| BMD w/ alpha getPixel32 | n/a | n/a | 290 |
| BMD w/o alpha getPixel32 | n/a | n/a | 222 |
| BMD w/ alpha getPixel | n/a | n/a | 300 |
| BMD w/o alpha getPixel | n/a | n/a | 228 |
Flash Player 10.1 Performance:
| Collection | For-each | For-in | For |
|---|---|---|---|
| Array | 186 | 4424 | 76 |
| Fixed Vector | 239 | 4545 | 66 |
| Variable Vector | 238 | 4477 | 66 |
| Object | 498 | 4418 | 246 |
| Dictionary (strong keys) | 465 | 4568 | 250 |
| Dictionary (weak keys) | 558 | 4645 | 255 |
| BMD w/ alpha getPixel32 | n/a | n/a | 181 |
| BMD w/o alpha getPixel32 | n/a | n/a | 160 |
| BMD w/ alpha getPixel | n/a | n/a | 182 |
| BMD w/o alpha getPixel | n/a | n/a | 157 |
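Most of these numbers show a performance increase of roughly 100%, which is to say the tests take about half the time! There are some exceptions, such as for-each over Object and Dictionary and the BitmapData (BMD) reads, but even these were decent speed boosts.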
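For reference, the three loop styles in the tables map onto TypeScript roughly as follows. This is an analogue only: AS3's `for each` iterates values and corresponds to `for...of`, while `for...in` iterates keys in both languages.

```typescript
const values: number[] = [10, 20, 30];

// "For-each": iterate over the values directly.
let sumForEach = 0;
for (const v of values) {
    sumForEach += v;
}

// "For-in": iterate over the keys, then look each value up.
// TypeScript hands array keys back as strings, hence the Number() conversion.
let sumForIn = 0;
for (const key in values) {
    sumForIn += values[Number(key)];
}

// "For": a plain index loop.
let sumFor = 0;
for (let i = 0; i < values.length; i++) {
    sumFor += values[i];
}
```

All three loops visit the same elements; the tables measure only how much each style costs per iteration.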
More To Come
This ends the first part of the series. I’ll reserve any general conclusions until the series has concluded, but for now I have to say that performance is mostly looking up. Having already done a very high-level test in my recent AS3 vs. JavaScript performance test followup and found a 1.66x speedup in 10.1, it looks like the numbers are agreeing. What we’re seeing in this series is a much closer look into how that speedup came about. Stay tuned for part two!

#1 by David Wagner on July 5th, 2010
Great stuff!
For people such as myself who are far too lazy to do this sort of research, your articles are slices of fried gold :)
#2 by DieTapete on July 5th, 2010
Thank you so much for doing all the testing!
BTW, I recognized that creating Objects with “new Object()” instead of using “{}” is faster in 10.1.
#3 by jackson on July 11th, 2010
Interesting. I’ll have to take a look at the generated bytecode for those and see if there’s anything related to my recent writeup on Arrays and Vectors.
#4 by Nicolas on July 9th, 2010
Thanks Jack for pointing out the major differences between the two versions. I’ll go re-run all my tests now as well.