Flash Player 10.2 Performance: Part 2
Today’s article follows up on last week’s article, which began re-running many of the performance tests on this site with the newly released Flash Player 10.2. Last week I got through half of the performance tests I did in my Flash Player 10.1 followup (see part 1, 2, 3, 4, 5, 6) and today I’ll cover the second half. What’s faster and what’s slower in Flash Player 10.2? Read on for a ton of performance test updates!
Test Environment
All tests in this performance update use the same environment:
- Flex SDK (MXMLC) 4.1.0.16076, compiling in release mode (no debugging or verbose stack traces)
- Release version of Flash Player 10.1.102.64 or 10.2.152.26
- 2.8 GHz Intel Xeon W3530
- Windows 7
BitmapData Alpha Performance
Light Functions (1000000 iterations)
Function | Flash Player 10.1 (Alpha) | Flash Player 10.1 (No Alpha) | Flash Player 10.2 (Alpha) | Flash Player 10.2 (No Alpha) | Change |
---|---|---|---|---|---|
floodFill | 78 | 62 | 88 | 82 | Slower |
generateFilterRect | 249 | 250 | 272 | 268 | Slower |
getColorBoundsRect | 281 | 281 | 307 | 283 | Alpha slower |
getPixel | 16 | 15 | 15 | 12 | |
getPixel32 | 15 | 16 | 16 | 13 | |
scroll | 30 | 31 | 31 | 29 | |
setPixel | 31 | 31 | 32 | 31 | |
setPixel32 | 31 | 31 | 33 | 31 |
Heavy Functions (1000 iterations)
Function | Flash Player 10.1 (Alpha) | Flash Player 10.1 (No Alpha) | Flash Player 10.2 (Alpha) | Flash Player 10.2 (No Alpha) | Change |
---|---|---|---|---|---|
colorTransform | 0 | 0 | 0 | 0 | |
clone | 78 | 63 | 59 | 62 | Alpha faster |
fillRect | 0 | 0 | 0 | 0 | |
getPixels | 421 | 390 | 279 | 250 | Faster |
getVector | 47 | 32 | 67 | 62 | Slower |
histogram | 125 | 93 | 125 | 94 | |
noise | 218 | 203 | 218 | 202 | |
perlinNoise | 811 | 796 | 796 | 780 | Slightly faster |
setPixels | 32 | 15 | 31 | 16 | |
setVector | 46 | 16 | 42 | 16 |
See the “Change” column for remarks on the change. Four are slower and three are faster, so the `BitmapData` results are quite mixed and really depend on your usage of the class.
For Vs. While
Function | Flash Player 10.1 | Flash Player 10.2 |
---|---|---|
for (forward) | 2443 | 2496 |
for (backward) | 2120 | 2116 |
while (forward) | 2488 | 2465 |
while (backward) | 2124 | 2120 |
There’s no change in `for` or `while` loop speed, forward or backward.
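The test source isn’t reproduced in this article, so here is a minimal sketch of the four loop forms being compared. It is written in TypeScript rather than AS3, and all of the function names, including the `timeIt` helper, are hypothetical.

```typescript
// Hypothetical helpers: each sums an array using one of the four loop forms.
function sumForForward(values: number[]): number {
  let total = 0;
  for (let i = 0; i < values.length; i++) total += values[i];
  return total;
}

function sumForBackward(values: number[]): number {
  let total = 0;
  for (let i = values.length - 1; i >= 0; i--) total += values[i];
  return total;
}

function sumWhileForward(values: number[]): number {
  let total = 0;
  let i = 0;
  while (i < values.length) {
    total += values[i];
    i++;
  }
  return total;
}

function sumWhileBackward(values: number[]): number {
  let total = 0;
  let i = values.length;
  // The post-decrement tests the old value, then the body sees the new one.
  while (i-- > 0) total += values[i];
  return total;
}

// Assumed timing helper: wall-clock milliseconds for a single call.
function timeIt(work: () => void): number {
  const start = Date.now();
  work();
  return Date.now() - start;
}
```

All four forms do the same work, which is why only the loop overhead itself can show up in a benchmark like this.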
Fast Line Drawing
Function | Flash Player 10.1 | Flash Player 10.2 | Change |
---|---|---|---|
efla | 62 | 63 | |
lineTo (low) | 63 | 54 | 16% faster |
lineTo (medium) | 187 | 271 | 45% slower |
lineTo (high) | 601 | 604 | |
lineTo (best) | 1035 | 1080 |
Low and medium quality `lineTo` are both affected, with the bigger change unfortunately in the negative direction. Still, low quality `lineTo` now beats `efla`, so the hand-written routine is no longer needed at that quality setting.
Linked Lists: Part 3
Function | Operations | Flash Player 10.1 (Array) | Flash Player 10.1 (List) | Flash Player 10.2 (Array) | Flash Player 10.2 (List) | Change |
---|---|---|---|---|---|---|
traverse | 100 | 7 | 8 | 6 | 9 | |
elementAt/[] | 100000 | 1 | 998 | 0 | 957 | List faster |
concat | 10 | 0 | 46 | 5 | 36 | Array slower, list faster |
every | 10 | 0 | 0 | 3 | 8 | Both slower |
filter | 10 | 16 | 31 | 6 | 21 | Both faster |
forEach | 10 | 0 | 0 | 3 | 8 | Both slower |
indexOf | 100 | 16 | 15 | 12 | 25 | Array faster, list slower |
join | 1 | 16 | 0 | 7 | 9 | Array faster, list slower |
lastIndexOf | 100 | 16 | 15 | 11 | 23 | Array faster, list slower |
map | 1 | 0 | 0 | 9 | 2 | Both slower |
pop | 100000 | 0 | 0 | 1 | 2 | |
push | 100000 | 16 | 31 | 3 | 20 | Both faster |
reverse | 100 | 16 | 93 | 7 | 103 | Array faster, list slower |
shift | 10000 | 498 | 0 | 514 | 1 | Array slower |
slice | 100 | 38 | 220 | 25 | 159 | Both faster |
some | 1 | 2 | 7 | 0 | 16 | List slower |
sort | 1 | 177 | 205 | 171 | 188 | List faster |
sortOn | 1 | 63 | 78 | 46 | 78 | Array faster |
splice | 10000 | 0 | 15 | 16 | 0 | Array slower, list faster |
unshift | 10000 | 523 | 3 | 484 | 15 | Array faster, list slower |
As you can see from the “Change” column, there are a lot of differences in Flash Player 10.2 related to arrays and my linked list implementation. Arrays are faster on nine functions and slower on six for a net performance gain. My linked list implementation was faster on seven functions and slower on nine for a net performance loss. Directly, this is a good result as `Array` usage is far more common than my custom linked list class. Indirectly, it shows a net loss for AS3 classes. Overall, the result here is yet again mixed.
Callback Strategies
1 and 10 Arguments
Function | Flash Player 10.1 | Flash Player 10.2 |
---|---|---|
Func (1 arg) | 28 | 26 |
Runnable (1 arg) | 5 | 5 |
Event (1 arg) | 783 | 789 |
Signal (1 arg) | 1020 | 1221 |
Func (10 args) | 215 | 221 |
Runnable (10 args) | 151 | 145 |
Event (10 args) | 2430 | 2106 |
Signal (10 args) | 1909 | 2043 |
Note that the implementation of as3signals here is the original version from the article, so as to maintain consistency. Later versions may alter the performance for better or worse. As for the results, as3signals seems to have gotten a little slower and events, at least with 10 args, seem to have gotten a little faster. It’s surprising that as3signals, which heavily uses `Function` objects, got slower while plain `Function` calls didn’t, so the slowdown must be related to some other part of the implementation in, at least, that version of as3signals. On the other hand, there’s a nice performance boost for anyone (everyone?) using `Event`.
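For readers who haven’t seen the original callback article, the difference between the “Func” and “Runnable” rows can be sketched roughly as follows. This is TypeScript rather than AS3, and the `Runnable` interface, the `Accumulator` class, and both driver functions are hypothetical names for illustration, not the code that was actually benchmarked.

```typescript
// "Runnable" strategy: the callee is an object with a known method.
interface Runnable {
  run(value: number): void;
}

// Hypothetical Runnable that just adds up what it receives.
class Accumulator implements Runnable {
  total = 0;
  run(value: number): void {
    this.total += value;
  }
}

// "Func" strategy: the callee is a plain function reference.
function callFunction(callback: (value: number) => void, iterations: number): void {
  for (let i = 0; i < iterations; i++) callback(i);
}

// Same loop, but dispatching through the interface method instead.
function callRunnable(target: Runnable, iterations: number): void {
  for (let i = 0; i < iterations; i++) target.run(i);
}
```

Events and signals add dispatcher and listener-list machinery on top of a call like these, which is one plausible reason they sit so much higher in the table.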
Sorting Vectors
Method | Flash Player 10.1 (Unsorted) | Flash Player 10.1 (Pre-Sorted) | Flash Player 10.2 (Unsorted) | Flash Player 10.2 (Pre-Sorted) | Change |
---|---|---|---|---|---|
Vector via sort() | 810 | 555 | 801 | 568 | |
Array via sortOn() | 448 | 298 | 424 | 374 | Mostly slower |
Vector via quickSort() | 2268 | 1803 | 2106 | 1685 | Faster |
Vector via shellSort() | 3982 | 2139 | 3526 | 1918 | Faster |
Vector via Array.sortOn() | 560 | 389 | 468 | 328 | Faster |
`Vector.sort` was unchanged and the fastest method, `Array.sortOn`, got mostly slower, but the manual methods, `quickSort` and `shellSort`, got faster. Overall, I’d call this a loss as the fastest method is slower in 10.2.
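As a reminder of what a “manual” sort looks like, here is a minimal shell sort sketch in TypeScript. The gap sequence (halving the length each pass) is an assumption, since the `shellSort` used in the benchmark isn’t shown here, and this version sorts plain numbers rather than objects by a field.

```typescript
// Shell sort sketch: gapped insertion sort with a halving gap sequence.
// Returns a sorted copy and leaves the input untouched.
function shellSort(values: number[]): number[] {
  const sorted = values.slice();
  for (let gap = sorted.length >> 1; gap > 0; gap >>= 1) {
    for (let i = gap; i < sorted.length; i++) {
      const current = sorted[i];
      let j = i;
      // Shift larger elements that sit `gap` slots back until `current` fits.
      while (j >= gap && sorted[j - gap] > current) {
        sorted[j] = sorted[j - gap];
        j -= gap;
      }
      sorted[j] = current;
    }
  }
  return sorted;
}
```

Every comparison and move here runs as script code rather than inside the player’s native sort, so it’s plausible that general VM speedups help this kind of sort in particular.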
Increment and Decrement
Environment | Post-Increment | Pre-increment | Add One | Post-Decrement | Pre-Decrement | Subtract One |
---|---|---|---|---|---|---|
Flash Player 10.1 | 220 | 215 | 216 | 216 | 215 | 207 |
Flash Player 10.2 | 215 | 211 | 212 | 211 | 218 | 210 |
These numbers are very close and, with the high iteration count, statistically identical. There’s really no difference between any of the methods of incrementing or decrementing.
Array vs. Vector: Part II
Read Performance
Environment | Array | Vector (int) | Vector (uint) | Vector (Number) | Vector (Boolean) | Vector (String) | Vector (Object) |
---|---|---|---|---|---|---|---|
Flash Player 10.1 | 36 | 29 | 30 | 31 | 35 | 34 | 34 |
Flash Player 10.2 | 35 | 31 | 32 | 31 | 33 | 32 | 33 |
Write Performance
Environment | Array | Vector (int) | Vector (uint) | Vector (Number) | Vector (Boolean) | Vector (String) | Vector (Object) |
---|---|---|---|---|---|---|---|
Flash Player 10.1 | 96 | 35 | 36 | 35 | 81 | 90 | 132 |
Flash Player 10.2 | 97 | 36 | 35 | 36 | 79 | 94 | 130 |
Delete Performance
Environment | Array | Vector (int) | Vector (uint) | Vector (Number) | Vector (Boolean) | Vector (String) | Vector (Object) |
---|---|---|---|---|---|---|---|
Flash Player 10.1 | 4450 | 4464 | 4551 | 4431 | 4423 | 4380 | 4422 |
Flash Player 10.2 | 3725 | 3598 | 3644 | 3637 | 3633 | 3665 | 3683 |
Reading and writing are essentially unchanged, but deleting is roughly 20% faster across the board.
Inline Math.ceil() Part II
Environment | Inline (positive) | Inline (positive and negative) | Math.ceil() |
---|---|---|---|
Flash Player 10.1 | 262 | 281 | 933 |
Flash Player 10.2 | 269 | 286 | 921 |
No change here: inlining is still massively faster than `Math.ceil`.
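The exact inline expressions from the original article aren’t repeated here, so the following is only a sketch of the idea in TypeScript. It assumes inputs that fit in a signed 32-bit integer, because it truncates with `| 0`, and the function names are hypothetical. In a real hot loop you would paste the expression in place rather than call a function.

```typescript
// Positive-only version: truncate, then bump up if there was a fraction.
// Assumes 0 <= n < 2^31.
function ceilPositive(n: number): number {
  const truncated = n | 0;
  return truncated === n ? truncated : truncated + 1;
}

// Positive-and-negative version: truncation rounds toward zero, so only
// values above their truncation need the +1. Assumes |n| < 2^31.
function ceilAny(n: number): number {
  const truncated = n | 0;
  return n > truncated ? truncated + 1 : truncated;
}
```

The extra comparison in the second version is in line with the “positive and negative” column being slightly slower than the positive-only one.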
String Conversion
Flash Player 10.1
Type | String() | Concat | toString() | toString(10) | join() | for loop |
---|---|---|---|---|---|---|
int | 3 | 2 | 192 | 192 | n/a | n/a |
uint | 2 | 2 | 185 | 181 | n/a | n/a |
Number | 0 | 0 | 172 | 171 | n/a | n/a |
Boolean | 0 | 0 | 16 | n/a | n/a | n/a |
Array | 2608 | 2618 | 2526 | n/a | 2531 | 1472 |
Vector (int) | 2288 | 2321 | 2138 | n/a | 2103 | 1345 |
Flash Player 10.2
Type | String() | Concat | toString() | toString(10) | join() | for loop |
---|---|---|---|---|---|---|
int | 3 | 2 | 180 | 182 | n/a | n/a |
uint | 2 | 2 | 179 | 172 | n/a | n/a |
Number | 2 | 2 | 156 | 158 | n/a | n/a |
Boolean | 2 | 2 | 8 | n/a | n/a | n/a |
Array | 2721 | 2691 | 2644 | n/a | 2536 | 1501 |
Vector (int) | 2246 | 2247 | 2075 | n/a | 2059 | 1326 |
There’s no real change here. For `int`, `uint`, and `Number`, concatenation and the `String()` conversion function are still far faster than `toString`. For `Array` and `Vector`, a manual `for` loop is still the fastest way.
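To make the column names concrete, here is a rough TypeScript sketch of the conversion styles being compared. The function names are hypothetical, and the comma separator in the loop version is an assumption chosen to match what `join` and `String()` produce for arrays.

```typescript
// The four scalar conversions from the table, applied to a number.
function viaStringFunction(n: number): string {
  return String(n);
}

function viaConcat(n: number): string {
  return "" + n;
}

function viaToString(n: number): string {
  return n.toString();
}

function viaToStringRadix(n: number): string {
  return n.toString(10);
}

// The "for loop" column: build the comma-separated string by hand.
function joinWithLoop(values: number[]): string {
  let out = "";
  for (let i = 0; i < values.length; i++) {
    if (i > 0) out += ",";
    out += values[i];
  }
  return out;
}
```

All of these produce the same text, so the choice between them comes down to speed and readability.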
Arguments Slowdown
Environment | Plain (1) | Var Args (1) | arguments (1) | Plain (25) | Var Args (25) | arguments (25) |
---|---|---|---|---|---|---|
Flash Player 10.1 | 16 | 193 | 404 | 20 | 670 | 1040 |
Flash Player 10.2 | 10 | 14 | 382 | 19 | 23 | 768 |
We saw it in part one of this series and we see it again: var args have been amazingly sped up! There’s virtually no penalty to using var args anymore, whereas in 10.1 they were roughly 12x slower than a plain call with one argument and over 30x slower with 25. The old penalty was probably due to an array allocation behind the scenes that has since been removed. As for the `arguments` keyword, it got somewhat faster with 25 arguments but is still far slower than either alternative. The recommendation is clear: use var args!
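For anyone unsure of the distinction, the two styles look roughly like this. The sketch is in TypeScript, where AS3’s `...rest` var args correspond to a rest parameter. The function names are hypothetical, and the overload signature on the second function exists only so the type checker accepts calls with arguments.

```typescript
// Var args style: extra arguments arrive as a real array parameter.
function sumVarArgs(...values: number[]): number {
  let total = 0;
  for (let i = 0; i < values.length; i++) total += values[i];
  return total;
}

// `arguments` style: no declared parameters, read the implicit object.
function sumArguments(...values: number[]): number;
function sumArguments(): number {
  let total = 0;
  for (let i = 0; i < arguments.length; i++) total += arguments[i];
  return total;
}
```

Both accept any number of arguments, so switching from `arguments` to var args is usually a mechanical change.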
Variable Ordering
Environment | testIFirst | testILast |
---|---|---|
Flash Player 10.1 | 9559 | 8451 |
Flash Player 10.2 | 9334 | 8034 |
Both kinds of local variable access are faster in 10.2, but access to local variables declared later has widened its lead over local variables declared first and is now about 15% faster.
More To Come
I’ll reserve any general conclusions until the series has concluded, but for now the performance is quite mixed. Stay tuned for the third and final article!