From a performance perspective, a lot changed in Flash Player 10.1 (see parts 1, 2, 3, 4, 5, and 6). Flash Player 10.2 was officially released last week, so it’s time to update this site’s many performance tests for the new player. This time around I’ll be updating more tests per part of this series, so hopefully everything will be updated a lot more quickly than last time. Read on for the updates!

Test Environment

All tests in this performance update use the same environment:

  • Flex SDK (MXMLC) 4.1.0.16076, compiling in release mode (no debugging or verbose stack traces)
  • Release version of Flash Player 10.1.102.64 or 10.2.152.26
  • 2.8 GHz Intel Xeon W3530
  • Windows 7

Free Lists

Original Article

Flash Player 10.1 Performance:

| Approach | Unallocated | Pre-allocated |
|---|---|---|
| Vector | 32 | 16 |
| Direct | 15 | 47 |
| Linked List | 31 | 15 |
| Linked List (recycling nodes) | 31 | 15 |

Flash Player 10.2 Performance:

| Approach | Unallocated | Pre-allocated |
|---|---|---|
| Vector | 22 | 9 |
| Direct | 21 | 50 |
| Linked List | 22 | 5 |
| Linked List (recycling nodes) | 22 | 27 |

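Direct allocation seems to have taken a performance hit, which is a real shame because it happens all the time. On the plus side, we seem to be able to make up for it by using free lists: both the Vector and the linked list approaches are faster in 10.2. The node-recycling variant, though, no longer pays off. With pre-allocation it is now much slower than the plain linked list (27 vs. 5).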
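For anyone who hasn’t seen the technique, a free list keeps released objects around so they can be handed out again instead of allocating new ones. Here is a minimal sketch in TypeScript rather than ActionScript. It is not the original test code: `Particle`, `ParticlePool`, `acquire`, `release`, and `size` are hypothetical names, and the `preallocate` argument is only my guess at what the "Pre-allocated" column corresponds to.

```typescript
// Hypothetical pooled object; any class with resettable state would do.
class Particle {
    x = 0;
    y = 0;
}

// Array-backed free list: released objects are kept for later reuse.
class ParticlePool {
    private free: Particle[] = [];

    // Optionally pre-allocate so the first acquire() calls never hit `new`.
    constructor(preallocate = 0) {
        for (let i = 0; i < preallocate; i++) {
            this.free.push(new Particle());
        }
    }

    // Reuse a released object if one is available, else allocate directly.
    acquire(): Particle {
        const p = this.free.pop();
        return p !== undefined ? p : new Particle();
    }

    // Reset state and put the object back on the free list.
    release(p: Particle): void {
        p.x = 0;
        p.y = 0;
        this.free.push(p);
    }

    // Number of objects currently waiting to be reused.
    size(): number {
        return this.free.length;
    }
}
```

A linked-list version would store the released objects in nodes instead of an Array, but the acquire/release contract stays the same.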

Namespaces As Function Pointers

Original Article

| Player | Explicit Namespace | Namespace Variable | No Namespace |
|---|---|---|---|
| Flash Player 10.1 | 109 | 7046 | 100 |
| Flash Player 10.2 | 103 | 6191 | 103 |

There’s not much change here, but calls through namespace variables are about 14% faster (7046 down to 6191). They are still roughly 60x slower than the other two approaches.

XOR Swap

Original Article

| Player | Assign Swap | XOR Swap |
|---|---|---|
| Flash Player 10.1 | 233 | 302 |
| Flash Player 10.2 | 233 | 302 |

No change here: XOR swap is still slower and less readable.
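As a reminder of what is being compared, here are the two swaps sketched in TypeScript. Operating on a two-element tuple is my own choice to make the functions testable; the original test may well have swapped local variables, and the function names are hypothetical.

```typescript
// Swap via a temporary variable: one extra local, three assignments.
function assignSwap(pair: [number, number]): void {
    const temp = pair[0];
    pair[0] = pair[1];
    pair[1] = temp;
}

// Swap via XOR: no temporary, but only correct for 32-bit integers,
// because bitwise operators truncate any other number.
function xorSwap(pair: [number, number]): void {
    pair[0] ^= pair[1];
    pair[1] ^= pair[0];
    pair[0] ^= pair[1];
}
```

The XOR version also silently breaks on non-integer values, which is one more reason to prefer the plain assignment.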

Runnables as Function Pointers

Original Article

| Player | Function Object | Runnable | Direct |
|---|---|---|---|
| Flash Player 10.1 | 163 | 50 | 57 |
| Flash Player 10.2 | 204 | 54 | 54 |

Method call speed is pretty much unchanged, but calls through Function objects are now 25% slower. This is a real bummer since they are the basis of most callback and signal/event systems (except TurboSignals, which uses runnables).
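The difference between the two callback styles can be sketched like this in TypeScript. It is an illustration of the idea, not the original test or the TurboSignals API; `Runnable`, `Counter`, `callViaFunction`, and `callViaRunnable` are hypothetical names.

```typescript
// Hypothetical single-method interface standing in for a callback.
interface Runnable {
    run(): void;
}

class Counter implements Runnable {
    count = 0;

    run(): void {
        this.count++;
    }
}

// Callback stored as a function value (a Function object in AS3 terms).
function callViaFunction(callback: () => void, times: number): void {
    for (let i = 0; i < times; i++) {
        callback();
    }
}

// Callback stored as an object that implements Runnable.
function callViaRunnable(target: Runnable, times: number): void {
    for (let i = 0; i < times; i++) {
        target.run();
    }
}
```

The runnable style costs a small class per callback, which is the trade-off for avoiding the Function object call.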

Loop Speed

Original Article

Flash Player 10.1 Performance:

| Collection | For-each | For-in | For |
|---|---|---|---|
| Array | 190 | 4259 | 85 |
| Fixed Vector | 255 | 4241 | 69 |
| Variable Vector | 256 | 4280 | 70 |
| Object | 506 | 4323 | 234 |
| Dictionary (strong keys) | 511 | 4504 | 287 |
| Dictionary (weak keys) | 579 | 4579 | 270 |
| BMD w/ alpha getPixel32 | n/a | n/a | 183 |
| BMD w/o alpha getPixel32 | n/a | n/a | 167 |
| BMD w/ alpha getPixel | n/a | n/a | 182 |
| BMD w/o alpha getPixel | n/a | n/a | 171 |
| ByteArray | n/a | n/a | 109 |

Flash Player 10.2 Performance:

| Collection | For-each | For-in | For |
|---|---|---|---|
| Array | 206 | 4564 | 69 |
| Fixed Vector | 251 | 4709 | 70 |
| Variable Vector | 254 | 4706 | 67 |
| Object | 532 | 4630 | 253 |
| Dictionary (strong keys) | 568 | 4850 | 261 |
| Dictionary (weak keys) | 568 | 4901 | 261 |
| BMD w/ alpha getPixel32 | n/a | n/a | 185 |
| BMD w/o alpha getPixel32 | n/a | n/a | 168 |
| BMD w/ alpha getPixel | n/a | n/a | 184 |
| BMD w/o alpha getPixel | n/a | n/a | 172 |
| ByteArray | n/a | n/a | 95 |

There are a lot of figures here and they vary a little from test to test, but overall not much has changed. One notable exception is that for-in loops are slower across the board, by roughly 7% to 11%.
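For readers coming from other languages, the three loop styles can be sketched in TypeScript, where `for...of` plays the role of AS3’s for-each and `for...in` iterates keys. This is an analogy rather than the original test code, and the function names are hypothetical.

```typescript
// Counterpart of AS3 "for each": iterates the values directly.
function sumForEach(items: number[]): number {
    let sum = 0;
    for (const value of items) {
        sum += value;
    }
    return sum;
}

// Counterpart of AS3 "for-in": iterates the keys (as strings here),
// so every value needs an extra lookup.
function sumForIn(items: number[]): number {
    let sum = 0;
    for (const key in items) {
        sum += items[Number(key)];
    }
    return sum;
}

// Plain indexed loop.
function sumFor(items: number[]): number {
    let sum = 0;
    for (let i = 0; i < items.length; i++) {
        sum += items[i];
    }
    return sum;
}
```

All three visit the same elements; the tables above only measure how expensive each way of getting to them is.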

Try/Catch Slowdown

Original Article

| Player | Try/Catch | No Try/Catch |
|---|---|---|
| Flash Player 10.1 | 735 | 704 |
| Flash Player 10.2 | 770 | 735 |

With or without the try/catch, both versions are about 5% slower in 10.2.

Building XML

Original Article

| Player | XML Class | String Class |
|---|---|---|
| Flash Player 10.1 | 57 | 1 |
| Flash Player 10.2 | 48 | 1 |

XML is now about 19% faster, but still nearly 50x slower than just using a String.
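As a sanity check on percentages like these, here is the arithmetic as a small TypeScript sketch. It assumes the table figures are elapsed times (lower is better), which the tables themselves don’t state; `percentFaster` and `timesSlower` are hypothetical helper names.

```typescript
// Assumption: figures are elapsed times, so a smaller number means faster.

// How much more work per unit of time the new result does, in percent.
function percentFaster(oldTime: number, newTime: number): number {
    return (oldTime / newTime - 1) * 100;
}

// How many times longer the slow approach takes than the fast one.
function timesSlower(slowTime: number, fastTime: number): number {
    return slowTime / fastTime;
}

const xmlSpeedup = percentFaster(57, 48); // XML Class, 10.1 vs. 10.2
const xmlVsString = timesSlower(48, 1);   // XML Class vs. String Class in 10.2
```

With the XML figures, `xmlSpeedup` works out to 18.75 and `xmlVsString` to 48, which lines up with "about 19%" and "nearly 50x".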

Shape vs. Sprite

Original Article

| Player | Shape FPS | Sprite FPS | Shape Memory | Sprite Memory |
|---|---|---|---|---|
| Flash Player 10.1 | 60 | 60 | 34524 | 50908 |
| Flash Player 10.2 | 60 | 60 | 35776 | 55228 |

There’s no change in performance, as it was already capped at 60 FPS. As for memory, Shape is using about 4% more and Sprite is using about 8% more.

Function Performance

Original Article

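| Player | Plain | Local | Var | Method | Static | Override | super | Interface Direct | Interface via Interface | Interface via Class |
|---|---|---|---|---|---|---|---|---|---|---|
| Flash Player 10.1 | 259 | 215 | 216 | 54 | 62 | 52 | 57 | 54 | 118 | 54 |
| Flash Player 10.2 | 321 | 257 | 271 | 53 | 60 | 56 | 57 | 55 | 51 | 53 |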

As we saw in the runnables test above, Function objects (plain, local, var) are slower in 10.2, here by roughly 20% to 25%. On the plus side, calling an interface method via an interface object no longer carries a 2x performance slowdown and is now just as fast as an interface method call directly or via a class instance. This is a big win for anyone who uses a lot of interfaces!

Simple Regular Expressions

Original Article

| Player | String.lastIndexOf() | String.indexOf() | RegExp.test() | RegExp.exec() |
|---|---|---|---|---|
| Flash Player 10.1 | 3 | 3 | 97 | 95 |
| Flash Player 10.2 | 3 | 3 | 94 | 91 |

There may be a very slight boost to regular expression speed here, but it may also just be statistical variance.

Beware of Getters and Setters

Original Article

| Player | Sprite | Point | MySprite | MyPoint |
|---|---|---|---|---|
| Flash Player 10.1 | 183 | 18 | 32 | 78 |
| Flash Player 10.2 | 182 | 26 | 27 | 70 |

These are strange results! The non-getter field access (Point.x) got slower by 44% and the getter field access (MyPoint.x) got faster by 11%. The 4.3x performance boost for using variables instead of getters/setters is now narrowed to only 2.7x, which is a shame as it is now harder to improve field access performance.
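To make the comparison concrete, here is roughly what the two kinds of field access look like, sketched in TypeScript, along with the ratios computed from the Point and MyPoint columns. `FieldPoint` and `AccessorPoint` are hypothetical stand-ins for Point and MyPoint, not the original classes.

```typescript
// Plain public field, comparable to Point.x.
class FieldPoint {
    x = 0;
}

// Same data behind a getter/setter pair, comparable to MyPoint.x.
class AccessorPoint {
    private _x = 0;

    get x(): number {
        return this._x;
    }

    set x(value: number) {
        this._x = value;
    }
}

// Getter time divided by plain field time, from the table figures.
const getterRatio101 = 78 / 18; // Flash Player 10.1, roughly 4.3
const getterRatio102 = 70 / 26; // Flash Player 10.2, roughly 2.7
```

Callers write `p.x` either way, so swapping a getter for a public variable doesn’t change the calling code, only the cost.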

Var Args Is Slow

Original Article

| Player | Pre-Allocated Array | Dynamically-Allocated Array | Var Args |
|---|---|---|---|
| Flash Player 10.1 | 16 | 109 | 109 |
| Flash Player 10.2 | 12 | 170 | 7 |

Wow, var args has been amazingly optimized in Flash Player 10.2! At 7 versus 109, it’s now even faster than a pre-allocated Array, which suggests it’s probably not even using an Array behind the scenes anymore. This is great news for any fan of var args! Dynamically allocating the Array, on the other hand, got about 56% slower (109 to 170).
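The three ways of passing arguments look roughly like this in TypeScript, where a rest parameter is the counterpart of AS3’s `...args`. The function and variable names are hypothetical, and this is a sketch of the idea rather than the original test.

```typescript
// Takes an Array the caller already has.
function sumArray(values: number[]): number {
    let sum = 0;
    for (let i = 0; i < values.length; i++) {
        sum += values[i];
    }
    return sum;
}

// Takes any number of arguments via a rest parameter (AS3: ...args).
function sumVarArgs(...values: number[]): number {
    return sumArray(values);
}

const reusable: number[] = [1, 2, 3];
const fromPreAllocated = sumArray(reusable); // Array built once and reused
const fromDynamic = sumArray([1, 2, 3]);     // new Array created for the call
const fromVarArgs = sumVarArgs(1, 2, 3);     // arguments passed directly
```

All three calls compute the same sum; the test only measures how the arguments get there.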

Faster isNaN()

Original Article

Since this article has been superseded by this followup article, I won’t be updating Faster isNaN() anymore.

Inlining Math Functions

Original Article

| Function | Player 10.1 | Player 10.2 |
|---|---|---|
| abs | 10 inline, 15 Math | 9 inline, 14 Math |
| ceil | 14 inline, 18 Math | 13 inline, 17 Math |
| floor | 13 inline, 18 Math | 13 inline, 18 Math |
| max | 258 inline, 46 Math | 61 inline, 46 Math |
| min | 249 inline, 46 Math | 60 inline, 47 Math |
| max2 | 14 inline, 18 Math | 14 inline, 21 Math |
| min2 | 14 inline, 16 Math | 14 inline, 20 Math |

The only real change here is a roughly 4x speedup for the inlined versions of min and max. They’re still slower than the regular Math versions, so there’s not much point in inlining those two.
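For context, "inlining" a Math function means writing the equivalent expression at the call site instead of calling `Math.abs()` and friends. Here is a TypeScript sketch of what such expressions can look like. These are assumed forms, not the expressions from the original test, and they are wrapped in hypothetically named functions only so they can be shown and checked in isolation.

```typescript
// Assumed inline forms; the original test's exact expressions may differ.

// Inline absolute value. Unlike Math.abs, this returns -0 for an input of -0.
function inlineAbs(x: number): number {
    return x < 0 ? -x : x;
}

// Inline two-argument max. NaN handling differs from Math.max:
// a NaN first argument is not propagated.
function inlineMax(a: number, b: number): number {
    return a > b ? a : b;
}

// Inline two-argument min, with the same NaN caveat as inlineMax.
function inlineMin(a: number, b: number): number {
    return a < b ? a : b;
}
```

The edge-case differences are worth keeping in mind before replacing a Math call purely for speed.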

Map Performance

Original Article

| Class | Player 10.1 | Player 10.2 |
|---|---|---|
| Array | 47 hit, 222 miss | 44 hit, 247 miss |
| Vector Dynamic | 42 hit, 7090 miss | 42 hit, 6470 miss |
| Vector Fixed | 41 hit, 7106 miss | 43 hit, 6455 miss |
| Object | 150 hit, 182 miss | 137 hit, 206 miss |
| Dictionary Strong | 141 hit, 242 miss | 148 hit, 276 miss |
| Dictionary Weak | 141 hit, 249 miss | 146 hit, 278 miss |
| BitmapData no alpha, getPixel | 93 hit, 78 miss | 96 hit, 75 miss |
| BitmapData no alpha, getPixel32 | 94 hit, 94 miss | 93 hit, 90 miss |
| BitmapData alpha, getPixel | 93 hit, 76 miss | 92 hit, 74 miss |
| BitmapData alpha, getPixel32 | 85 hit, 67 miss | 90 hit, 73 miss |
| ByteArray | 57 hit, 50 miss | 53 hit, 49 miss |

The performance penalty for a miss on a Vector, which comes from the Error that gets thrown, has been reduced by about 9%. Misses on Array, Object, and Dictionary are roughly 11% to 14% slower. Otherwise, nothing much has changed.

More To Come

I’ll reserve any general conclusions until the series has concluded, but for now the performance is quite mixed. Stay tuned for part two!