An absolute fundamental of programming is the concept of logical operators like && and ||. In a recent comment, Chris H pointed out that MXMLC doesn’t do a particularly good job generating bytecode for these operators. Today I’ll look further into the subject and see just how much this impacts performance.

Let’s take a look at a very simple function using the “logical and” (&&) operator:

private function compound(a:int, b:int, c:int, d:int, e:int): void
{
	if (a == 1 && b == 2 && c == 3 && d == 4 && e == 5)
	{
	}
}

First note that this function doesn’t really do anything. MXMLC is perfectly free to delete the contents of this function as an optimization since they can have no effect on anything. However MXMLC (at least in version 4.1) doesn’t do this sort of optimization. While bad for performance, it’ll allow us to see the bytecode that would be generated if MXMLC were forced to keep the if statement around like it would in a real function. So, let’s examine the output bytecode:

  function private::compound(int,int,int,int,int):void	/* disp_id 0*/
  {
    // local_count=6 max_scope=1 max_stack=2 code_len=53
    0       getlocal0     	
    1       pushscope     	
    2       getlocal1     	
    3       pushbyte      	1
    5       equals        	
    6       dup           	
    7       iffalse       	L1
 
    11      pop           	
    12      getlocal2     	
    13      pushbyte      	2
    15      equals        	
 
    L1: 
    16      dup           	
    17      iffalse       	L2
 
    21      pop           	
    22      getlocal3     	
    23      pushbyte      	3
    25      equals        	
 
    L2: 
    26      dup           	
    27      iffalse       	L3
 
    31      pop           	
    32      getlocal      	4
    34      pushbyte      	4
    36      equals        	
 
    L3: 
    37      dup           	
    38      iffalse       	L4
 
    42      pop           	
    43      getlocal      	5
    45      pushbyte      	5
    47      equals        	
 
    L4: 
    48      iffalse       	L5
 
    L5: 
    52      returnvoid    	
  }

While a bit long, this bytecode is basically doing the same thing over and over as it checks a, b, c, d, and e against 1, 2, 3, 4, and 5. The way it does this is through the following series of operations:

  1. getlocal1 – Push the variable to compare to the operand stack
  2. pushbyte 1 – Push the literal value to compare to the operand stack
  3. equals – Pop the variable and literal value, compare them, and push the resulting boolean
  4. dup – Duplicate the resulting boolean
  5. iffalse L1 – Skip second comparison if the comparison failed

While not extremely complicated, this approach has the downside of performing multiple jumps via iffalse and quite a bit of seemingly-unnecessary stack operations like dup. Let’s look at an alternative approach in AS3 to see if we can trick the compiler into generating better bytecode whilst still accomplishing the exact same task:

private function chain(a:int, b:int, c:int, d:int, e:int): void
{
	if (a == 1)
	{
		if (b == 2)
		{
			if (c == 3)
			{
				if (d == 4)
				{
					if (e == 4)
					{
					}
				}
			}
		}
	}
}

This AS3 code is longer since we’ve eschewed the logical operator && in favor of equivalent, nested if blocks. Let’s see if this trick worked by looking at the generated bytecode:

  function private::chain(int,int,int,int,int):void	/* disp_id 0*/
  {
    // local_count=6 max_scope=1 max_stack=2 code_len=40
    0       getlocal0     	
    1       pushscope     	
    2       getlocal1     	
    3       pushbyte      	1
    5       ifne          	L1
 
    9       getlocal2     	
    10      pushbyte      	2
    12      ifne          	L1
 
    16      getlocal3     	
    17      pushbyte      	3
    19      ifne          	L1
 
    23      getlocal      	4
    25      pushbyte      	4
    27      ifne          	L1
 
    31      getlocal      	5
    33      pushbyte      	4
    35      ifne          	L1
 
    L1: 
    39      returnvoid    	
  }

Unlike the AS3, this bytecode is quite a bit shorter and therefore starts out with a natural advantage: smaller code SWF size. Let’s see how it accomplishes the task:

  1. getlocal1 – Push the variable to compare to the operand stack
  2. pushbyte 1 – Push the literal value to compare to the operand stack
  3. ifne L1 – If the variable doesn’t equal the literal, skip to the end

By using ifne instead of equals, dup, and iffalse to do comparisons, this bytecode is much more compact and taking much better advantage of the available instructions. Further, this version properly skips to the end of the if chain as soon as a comparison fails rather than simply skipping to a strange mid-way point in the process. This allows the Flash VM to “short circuit” the comparisons properly and hopefully save some execution time. Speaking of execution time, let’s see a performance test:

package
{
	import flash.display.*;
	import flash.text.*;
	import flash.utils.*;
 
	public class LogicalOperators extends Sprite
	{
		public function LogicalOperators()
		{
			var logger:TextField = new TextField();
			logger.autoSize = TextFieldAutoSize.LEFT;
			addChild(logger);
			function log(msg:*): void { logger.appendText(msg+"\n"); }
 
			var beforeTime:int;
			var afterTime:int;
			var i:int;
			var a:int, b:int, c:int, d:int, e:int;
			const REPS:int = 100000000;
 
			log("all true");
			a = 1; b = 2; c = 3; d = 4; e = 5;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				if (a == 1 && b == 2 && c == 3 && d == 4 && e == 5)
				{
				}
			}
			afterTime = getTimer();
			log("\tCompound: " + (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				if (a == 1)
				{
					if (b == 2)
					{
						if (c == 3)
						{
							if (d == 4)
							{
								if (e == 4)
								{
								}
							}
						}
					}
				}
			}
			afterTime = getTimer();
			log("\tChain: " + (afterTime-beforeTime));
 
			log("all false");
			a = 0; b = 0; c = 0; d = 0; e = 0;
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				if (a == 1 && b == 2 && c == 3 && d == 4 && e == 5)
				{
				}
			}
			afterTime = getTimer();
			log("\tCompound: " + (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				if (a == 1)
				{
					if (b == 2)
					{
						if (c == 3)
						{
							if (d == 4)
							{
								if (e == 4)
								{
								}
							}
						}
					}
				}
			}
			afterTime = getTimer();
			log("\tChain: " + (afterTime-beforeTime));
		}
	}
}

The performance test measures the time it takes to run through these five comparisons via the “compound” approach (using local operators) and via the “chain” approach (using a chain of if blocks). Here are the performance results I get:

Environment All True All False
Compound Chain Compound Chain
3.0 Ghz Intel Core 2 Duo, Windows XP 806 428 565 291
2.0 Ghz Intel Core 2 Duo, Mac OS X 1302 639 844 432

As you can see, the performance difference is striking: if chains are about twice as fast regardless of platform or even if any or all of the comparisons are true. If you don’t mind a little extra typing in AS3, you can save a little SWF size and a lot of execution time by using an if chain instead of logical operators like &&.