Conditionals Performance
Now that the Flash Player 10.1 testing is through, I can return to a comment asking about the performance difference between `if-else` chains and the ternary (`? :`) operator. Further, I’ll discuss `switch` statements to see if there is any difference in performance for these commonly-used methods of flow control.
All AS3 programmers make heavy use of `if-else` chains, and many also use `switch` statements and the ternary (`? :`) operator. Given their essential nature, it’s important to know the performance differences, if any, between them. Since none of these constructs exists at the bytecode level, it would seem that all of them should compile down to the same bytecode using conditional jumps/branches. Consider the following very simple functions designed to be easily read in bytecode:
    private function ifElse6(val:int): void
    {
        if (val == 0)
        {
            func0();
        }
        else if (val == 1)
        {
            func1();
        }
        else if (val == 2)
        {
            func2();
        }
        else if (val == 3)
        {
            func3();
        }
        else
        {
            func4();
        }
    }

    private function ternary6(val:int): void
    {
        val == 0 ? func0() : val == 1 ? func1() : val == 2 ? func2() : val == 3 ? func3() : func4();
    }

    private function switch6(val:int): void
    {
        switch (val)
        {
            case 0:
                func0();
                break;
            case 1:
                func1();
                break;
            case 2:
                func2();
                break;
            case 3:
                func3();
                break;
            default:
                func4();
        }
    }
Each of them simply calls a function based on the value of `val`, be it 0, 1, 2, 3, or something else. The `switch` version is arguably the cleanest and the ternary arguably the hardest to read, since it’s usually a bad idea to have such a deeply-nested ternary statement. The `if-else` version is somewhere in the middle as it is straightforward but verbose. Let’s start analyzing the generated bytecode with this `if-else` version:
    function private::ifElse6(int):void /* disp_id 0*/
    {
      // local_count=2 max_scope=1 max_stack=2 code_len=67
      0     getlocal0
      1     pushscope
      2     getlocal1
      3     pushbyte      0
      5     ifne          L1
      9     getlocal0
      10    callpropvoid  private::func0 (0)
      13    jump          L2
      L1:
      17    getlocal1
      18    pushbyte      1
      20    ifne          L3
      24    getlocal0
      25    callpropvoid  private::func1 (0)
      28    jump          L2
      L3:
      32    getlocal1
      33    pushbyte      2
      35    ifne          L4
      39    getlocal0
      40    callpropvoid  private::func2 (0)
      43    jump          L2
      L4:
      47    getlocal1
      48    pushbyte      3
      50    ifne          L5
      54    getlocal0
      55    callpropvoid  private::func3 (0)
      58    jump          L2
      L5:
      62    getlocal0
      63    callpropvoid  private::func4 (0)
      L2:
      66    returnvoid
    }
This bytecode is nearly as straightforward as the AS3 code it was compiled from. It’s simply a sequence of skipping over blocks of code, i.e., the function calls, where `val` doesn’t pass the equality test, and then skipping to the end of the function once `val` does match.
Since the ternary version should theoretically just be syntax sugar, let’s see how MXMLC compiles it:
    function private::ternary6(int):void /* disp_id 0*/
    {
      // local_count=2 max_scope=1 max_stack=2 code_len=71
      0     getlocal0
      1     pushscope
      2     getlocal1
      3     pushbyte      0
      5     equals
      6     iffalse       L1
      10    getlocal0
      11    callpropvoid  private::func0 (0)
      14    jump          L2
      L1:
      18    getlocal1
      19    pushbyte      1
      21    equals
      22    iffalse       L3
      26    getlocal0
      27    callpropvoid  private::func1 (0)
      30    jump          L2
      L3:
      34    getlocal1
      35    pushbyte      2
      37    equals
      38    iffalse       L4
      42    getlocal0
      43    callpropvoid  private::func2 (0)
      46    jump          L2
      L4:
      50    getlocal1
      51    pushbyte      3
      53    equals
      54    iffalse       L5
      58    getlocal0
      59    callpropvoid  private::func3 (0)
      62    jump          L2
      L5:
      66    getlocal0
      67    callpropvoid  private::func4 (0)
      L2:
      70    returnvoid
    }
This version is very similar to the `if-else` version, but unfortunately involves more stack access as it keeps using `equals` followed by `iffalse` rather than directly using `ifne`. This would be like writing `if ((val == 3) == true)` in AS3 rather than the much more common `if (val == 3)`. They both have the same effect, but one pointlessly uses more instructions.
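To put a number on "more instructions", the two listings above can be tallied by hand. The sketch below does that tally in TypeScript (chosen only because it runs outside Flash Player; it is not AS3). The helper name `instructionsExecuted` and the constants are illustrative, and the per-step counts are read straight off the listings. It counts bytecode instructions on the executed path only and says nothing about how the JIT turns them into machine code.

```typescript
// Counts read off the ifElse6 and ternary6 listings; names here are illustrative.
const PROLOGUE = 2;     // getlocal0, pushscope
const MATCHED_BODY = 3; // getlocal0, callpropvoid, jump L2
const DEFAULT_BODY = 2; // getlocal0, callpropvoid (falls through to L2)
const EPILOGUE = 1;     // returnvoid

const IF_ELSE_PER_TEST = 3; // getlocal1, pushbyte, ifne
const TERNARY_PER_TEST = 4; // getlocal1, pushbyte, equals, iffalse

// Number of bytecode instructions executed for a given val,
// given how many instructions each equality test costs.
function instructionsExecuted(val: number, perTest: number): number {
  const matched = Math.floor(val) === val && val >= 0 && val <= 3;
  // A match at val = k needs k + 1 tests; the default branch needs all 4.
  const tests = matched ? val + 1 : 4;
  const body = matched ? MATCHED_BODY : DEFAULT_BODY;
  return PROLOGUE + tests * perTest + body + EPILOGUE;
}

// Extra instructions the ternary version executes compared with if-else.
function ternaryOverhead(val: number): number {
  return (
    instructionsExecuted(val, TERNARY_PER_TEST) -
    instructionsExecuted(val, IF_ELSE_PER_TEST)
  );
}
```

Under these counts the ternary version executes one extra instruction per comparison: one extra when `val` is 0 and four extra when the default branch is taken.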
The `switch` statement is quite fancy in AS3 compared to, for example, C/C++ or Java, as it can work on non-integer types and, unlike C#, supports falling through even when there is code in a `case`. Let’s see how this translates into bytecode:
    function private::switch6(int):void /* disp_id 0*/
    {
      // local_count=3 max_scope=1 max_stack=2 code_len=140
      0     getlocal0
      1     pushscope
      2     jump          L1
      L2:
      6     label
      7     getlocal0
      8     callpropvoid  private::func0 (0)
      11    jump          L3
      L4:
      15    label
      16    getlocal0
      17    callpropvoid  private::func1 (0)
      20    jump          L3
      L5:
      24    label
      25    getlocal0
      26    callpropvoid  private::func2 (0)
      29    jump          L3
      L6:
      33    label
      34    getlocal0
      35    callpropvoid  private::func3 (0)
      38    jump          L3
      L7:
      42    label
      43    getlocal0
      44    callpropvoid  private::func4 (0)
      47    jump          L3
      L1:
      51    getlocal1
      52    setlocal2
      53    pushbyte      0
      55    getlocal2
      56    ifstrictne    L8
      60    pushbyte      0
      62    jump          L9
      L8:
      66    pushbyte      1
      68    getlocal2
      69    ifstrictne    L10
      73    pushbyte      1
      75    jump          L9
      L10:
      79    pushbyte      2
      81    getlocal2
      82    ifstrictne    L11
      86    pushbyte      2
      88    jump          L9
      L11:
      92    pushbyte      3
      94    getlocal2
      95    ifstrictne    L12
      99    pushbyte      3
      101   jump          L9
      L12:
      105   jump          L13
      109   pushbyte      4
      111   jump          L9
      L13:
      115   pushbyte      4
      L9:
      117   kill          2
      119   lookupswitch  default:L7 maxcase:4 L2 L4 L5 L6 L7
      L3:
      139   returnvoid
    }
Well, this version sure is different! Even though it has the same effect as the other two versions, MXMLC produces bytecode that’s about twice as long (`code_len=140` versus 67 and 71) and makes use of some special instructions like `lookupswitch`. It starts off by jumping past all of the `case` blocks and then uses yet another approach to jump/branch logic compared to the `if-else` and ternary versions. Here, similar to the `if-else` version, a direct conditional branch (`ifstrictne`) is used rather than the boolean test that the ternary version used. Still, it’s `ifstrictne` instead of the plain `ifne`, which is analogous to using the AS3 `!==` operator instead of `!=`. This isn’t necessary since the values being compared are simply of `int` type, but we’ll have to wait and see if it contributes to any performance degradation. Regardless, this jumping/branching doesn’t actually jump into the `case` blocks. Instead, it sets up the argument to the `lookupswitch` instruction, which does the actual work of deciding which `case` block should be executed.
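The two-phase shape of that bytecode can be sketched in source form. The following is a TypeScript analogue (not AS3, and not what MXMLC literally emits): `caseIndex` stands in for the chain of `ifstrictne` tests that pushes a case number, and `jumpTable` stands in for the `lookupswitch` target list. Both names are made up for illustration, and the handlers return strings only so the result can be inspected.

```typescript
type Handler = () => string;

// Stand-ins for func0..func4 from the article.
const func0: Handler = () => "func0";
const func1: Handler = () => "func1";
const func2: Handler = () => "func2";
const func3: Handler = () => "func3";
const func4: Handler = () => "func4";

// Phase 1: a linear chain of strict comparisons that maps val to a case
// index, mirroring the ifstrictne / pushbyte / jump sequence after L1.
function caseIndex(val: number): number {
  if (val === 0) return 0;
  if (val === 1) return 1;
  if (val === 2) return 2;
  if (val === 3) return 3;
  return 4; // default case
}

// Phase 2: table dispatch on the index, mirroring
// "lookupswitch default:L7 maxcase:4 L2 L4 L5 L6 L7".
const jumpTable: Handler[] = [func0, func1, func2, func3, func4];

function switch6(val: number): string {
  const index = caseIndex(val);
  // An index outside the table falls back to the default target.
  const target = jumpTable[index] ?? func4;
  return target();
}
```

Seen this way, the table dispatch in phase 2 is constant-time, but phase 1 still walks the cases one by one, so the total work grows with the position of the matching case just as it does for the `if-else` chain.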
So how do all of the above differences manifest themselves in actual performance? Well, let’s look at a quick performance test:
    package
    {
        import flash.text.*;
        import flash.utils.*;
        import flash.display.*;

        public class ConditionalsTest extends Sprite
        {
            public function ConditionalsTest()
            {
                stage.scaleMode = StageScaleMode.NO_SCALE;
                stage.align = StageAlign.TOP_LEFT;

                var logger:TextField = new TextField();
                logger.autoSize = TextFieldAutoSize.LEFT;
                addChild(logger);
                function log(msg:*): void { logger.appendText(msg+"\n"); }

                var beforeTime:int;
                var afterTime:int;
                var i:int;
                const ITERATIONS:int = 50000000;

                for each (var val:int in [0,1,2,3,4])
                {
                    log(val);

                    beforeTime = getTimer();
                    for (i = 0; i < ITERATIONS; ++i)
                    {
                        if (val == 0)
                        {
                            func0();
                        }
                        else if (val == 1)
                        {
                            func1();
                        }
                        else if (val == 2)
                        {
                            func2();
                        }
                        else if (val == 3)
                        {
                            func3();
                        }
                        else
                        {
                            func4();
                        }
                    }
                    afterTime = getTimer();
                    log("\tIf-else: " + (afterTime-beforeTime));

                    beforeTime = getTimer();
                    for (i = 0; i < ITERATIONS; ++i)
                    {
                        val == 0 ? func0() : val == 1 ? func1() : val == 2 ? func2() : val == 3 ? func3() : func4();
                    }
                    afterTime = getTimer();
                    log("\tTernary: " + (afterTime-beforeTime));

                    beforeTime = getTimer();
                    for (i = 0; i < ITERATIONS; ++i)
                    {
                        switch (val)
                        {
                            case 0:
                                func0();
                                break;
                            case 1:
                                func1();
                                break;
                            case 2:
                                func2();
                                break;
                            case 3:
                                func3();
                                break;
                            default:
                                func4();
                        }
                    }
                    afterTime = getTimer();
                    log("\tSwitch: " + (afterTime-beforeTime));
                }
            }

            private function func0(): void{}
            private function func1(): void{}
            private function func2(): void{}
            private function func3(): void{}
            private function func4(): void{}
        }
    }
This test is designed to show the differences between hitting on the first attempt (`val == 0`), the second, third, fourth, and default cases. Here are the results:
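Times are in milliseconds, as reported by `getTimer()`, for 50,000,000 iterations. The numbered columns are the value of `val`.

| Environment | Conditional | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|---|
| 3.0 GHz Intel Core 2 Duo, Windows XP | If-Else | 361 | 400 | 453 | 457 | 455 |
| 3.0 GHz Intel Core 2 Duo, Windows XP | Ternary | 360 | 405 | 437 | 457 | 456 |
| 3.0 GHz Intel Core 2 Duo, Windows XP | Switch | 443 | 468 | 504 | 526 | 502 |
| 2.0 GHz Intel Core 2 Duo, Mac OS X | If-Else | 702 | 737 | 750 | 892 | 858 |
| 2.0 GHz Intel Core 2 Duo, Mac OS X | Ternary | 702 | 737 | 753 | 892 | 859 |
| 2.0 GHz Intel Core 2 Duo, Mac OS X | Switch | 725 | 802 | 874 | 902 | 927 |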
Here are some observations:
- Aside from overall speed, there seem to be no performance differences between the operating systems involved.
- We do see a general slowdown as `val` increased and it took more comparisons to find its match, regardless of the type of conditional used. It would have been nice if `switch` could have improved this, but this wasn’t expected from the bytecode above.
- The `if-else` and ternary versions are nearly identical from a performance standpoint. It seems as though the boolean test we saw in the bytecode doesn’t have much of any impact, especially on Mac OS X.
- Averaged over the five cases, using `switch` is about 15% slower on Windows XP and roughly 7% slower on Mac OS X. Beware of switches in performance-critical code!
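
The `switch` slowdown can be checked against the results table by averaging each row and comparing the means. Here is a small TypeScript sketch of that arithmetic (TypeScript rather than AS3 so it runs outside Flash Player). The function names are illustrative, and the arrays are the `if-else` and `switch` timings copied from the table.

```typescript
// Arithmetic mean of a list of timings.
function mean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// How much slower candidate is than baseline, as a percentage of baseline.
function slowdownPercent(baseline: number[], candidate: number[]): number {
  return (mean(candidate) / mean(baseline) - 1) * 100;
}

// Timings in milliseconds for val = 0..4, copied from the results table.
const windowsIfElse = [361, 400, 453, 457, 455];
const windowsSwitch = [443, 468, 504, 526, 502];
const macIfElse = [702, 737, 750, 892, 858];
const macSwitch = [725, 802, 874, 902, 927];

const windowsSlowdown = slowdownPercent(windowsIfElse, windowsSwitch);
const macSlowdown = slowdownPercent(macIfElse, macSwitch);
```

With these inputs `windowsSlowdown` comes out at about 14.9 and `macSlowdown` at about 7.4. Averaging hides the per-case spread, so a different summary (for example, the worst single case) would give different figures.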
#1 by whitered on August 16th, 2010 ·
such a deeply-nested ternary statement can be pretty readable if written in this way:
val == 0 ? func0() :
val == 1 ? func1() :
val == 2 ? func2() :
val == 3 ? func3() :
func4();
considering that this is the only statement that returns a value, using it isn’t such a bad idea
#2 by jackson on August 16th, 2010 ·
Good point. I should also point out that readability is always subjective. To me, the `if-else` is still much more readable.

#3 by Alama on September 19th, 2010 ·
Lol; for me, it’s better switch readable, and the ternary syntax proposed by Whitered is very similar! Good idea for best performances switch ! ;-)
#4 by skyboy on October 20th, 2010 ·
Aside from taking an extra 100ms, the switch statement is actually faster between hits (20-40 ms instead of 40-60 ms).
I think this test should be redone with 20, 30 or more options to see which is faster in the long run, maybe even a test against Dictionary, Object or Class to see which is faster if all you’re doing is retrieving a value, or even running a function(s) in the case of a class, using getters.
#5 by jackson on October 21st, 2010 ·
I’m not sure I understand what you mean by “the switch statement is actually faster between hits”. I could certainly increase the number of cases beyond five, but it looks like there’s already a good demonstration that more cases means slower performance. Still, it may be interesting to see a graph of that performance as the number of cases increases. Perhaps I’ll do a followup article…
#6 by craig on August 10th, 2013 ·
very helpful. thanks for this!