Using static variables and functions is slow. That was the conclusion of the previous article on statics, but the subject is actually more nuanced than that. Today we’ll explore static in more depth and find out just why it is so slow.

Based on some keen comments (particularly by skyboy), I’ve put together the following test:

// BaseClass.as
package
{
	import flash.display.Sprite;
	public class BaseClass extends Sprite
	{
		protected var superVal:Number = 44;
	}
}
 
// StaticTest2.as
package
{
	import flash.display.*;
	import flash.utils.*;
	import flash.text.*;
 
	public class StaticTest2 extends BaseClass
	{
		private var __logger:TextField = new TextField();
		private function row(...cols): void
		{
			__logger.appendText(cols.join(",")+"\n");
		}
 
		protected var val:Number = 33;
		protected static var staticVal:Number = 33;
 
		public function StaticTest2()
		{
			__logger.autoSize = TextFieldAutoSize.LEFT;
			addChild(__logger);
 
			var beforeTime:int;
			var afterTime:int;
			var REPS:int = 1000000;
			var i:int;
			var c:Class;
			var num:Number;
 
			row("Test", "Time");
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				c = Math;
			}
			afterTime = getTimer();
			row("Get class", (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				num = Math.PI;
			}
			afterTime = getTimer();
			row("Dot access static var", (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				num = Math["PI"];
			}
			afterTime = getTimer();
			row("Index access static var", (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				num = val;
			}
			afterTime = getTimer();
			row("num = val", (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				num = this.val;
			}
			afterTime = getTimer();
			row("num = this.val", (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				num = superVal;
			}
			afterTime = getTimer();
			row("num = superVal", (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				num = this.superVal;
			}
			afterTime = getTimer();
			row("num = this.superVal", (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				num = super.superVal;
			}
			afterTime = getTimer();
			row("num = super.superVal", (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				num = staticVal;
			}
			afterTime = getTimer();
			row("num = staticVal", (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				num = StaticTest2.staticVal;
			}
			afterTime = getTimer();
			row("num = StaticTest2.staticVal", (afterTime-beforeTime));
		}
	}
}

Let’s look at the bytecode generated for each of these in order (the annotations are mine). First, just getting a class:

getlex        	:Math  // get the Math class
coerce        	:Class // cast it to a Class
setlocal      	5      // assign it to the c variable

Next is getting a static variable (PI) of a class (Math), which was the focus of the last article:

getlex        	:Math // get the Math class
getproperty   	:PI   // get its PI property
convert_d     	      // convert PI to a Number
setlocal      	6     // assign PI to the num variable

Next we follow skyboy’s suggestion and access the PI property of Math by indexing like so: Math["PI"]

getlex        	:Math // get the Math class
pushstring    	"PI"  // push the "PI" string on the stack
getproperty   	private,StaticTest2,http://adobe.com/AS3/2006/builtin,,flash.text,flash.utils,private,,flash.display,StaticTest2,BaseClass,flash.display:Sprite,flash.display:DisplayObjectContainer,flash.display:InteractiveObject,flash.display:DisplayObject,flash.events:EventDispatcher:null // get the PI property
convert_d     	      // convert PI to a Number
setlocal      	6     // assign PI to the num variable

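Wow, that’s a lot of arguments to getproperty! Because the property name is only a string at this point, the compiler can’t bind it to a specific definition ahead of time; what it emits instead appears to be the list of namespaces to search at runtime. Later on in the article we’ll see what kind of performance impact that has. For now, let’s continue with some comparisons of non-static fields starting with num = val: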

getlocal0     	                // get the "this" object
getproperty   	StaticTest2:val // get val, which is defined in StaticTest2
convert_d     	                // convert val to a Number
setlocal      	6               // assign val to the num variable

That was pretty straightforward. If we type a little more (as is common) and use num = this.val, does the compiler generate different bytecode? Let’s see:

getlocal0     	                // get the "this" object
getproperty   	StaticTest2:val // get val, which is defined in StaticTest2
convert_d     	                // convert val to a Number
setlocal      	6               // assign val to the num variable

The answer is clear: no, the generated code is identical whether or not you use the this keyword. Now what about variables defined in our parent class? Let’s start with num = superVal:

getlex        	BaseClass:superVal // get superVal
convert_d     	                   // convert superVal to a Number
setlocal      	6                  // assign superVal to the num variable

This bytecode uses a getlex just like the static accesses did, but it avoids the getproperty instruction. We’ll find out in a little bit what kind of performance impact this has. In the meantime, let’s see the bytecode generated when we use the this keyword to write num = this.superVal:

getlocal0     	                   // get the "this" object
getproperty   	BaseClass:superVal // get superVal
convert_d     	                   // convert superVal to a Number
setlocal      	6                  // assign superVal to num

Using the this keyword totally transforms the bytecode! Rather than using getlex, the bytecode is now identical to the num = this.val bytecode except that getproperty references the parent class (BaseClass) instead of the derived class (StaticTest2). Does the compiler do this when we use the super keyword? Let’s look:

getlocal0     	                   // get the "this" object
getsuper      	BaseClass:superVal // get superVal from the base class
convert_d     	                   // convert superVal to a Number
setlocal      	6                  // assign superVal to num

Here we have a third set of bytecode generated for functionally equivalent code. This version gets the this object like the version using the this keyword, but it uses a new instruction—getsuper—to fetch the superVal variable. Now that we have a good set of non-statics to compare against, let’s turn our attention back to statics and check out num = staticVal:

getlex        	StaticTest2:staticVal // get staticVal
convert_d     	                      // convert staticVal to a Number
setlocal      	6                     // assign staticVal to num

This access ties num = superVal for the fewest instructions (3) of any field access so far, but is it faster given that it relies on getlex? We’ll see in just a bit. Let’s look at the final test with num = StaticTest2.staticVal:

getglobalscope	                      // get the global scope
getslot       	1                     // get slot 1 of the global scope (presumably the StaticTest2 class)
getproperty   	StaticTest2:staticVal // get the staticVal property of the object in that slot
convert_d     	                      // convert staticVal to a Number
setlocal      	6                     // assign staticVal to num

This has to be the most roundabout way of getting to staticVal. It does exactly the same thing as the previous version, which simply left off the redundant class name.

Now that we’ve inspected the bytecode, let’s look at the results. I ran these tests in the following environment:

  • Flex SDK (MXMLC) 4.5.1.21328, compiling in release mode (no debugging or verbose stack traces)
  • Release version of Flash Player 11.1.102.55
  • 2.4 GHz Intel Core i5
  • Mac OS X 10.7.2

And got these results:

Test                          Time (ms)
Get class                     3
Dot access static var         8
Index access static var       1360
num = val                     2
num = this.val                2
num = superVal                2
num = this.superVal           2
num = super.superVal          2
num = staticVal               2
num = StaticTest2.staticVal   2

[Chart: Field Performance]

The first three tests are slow enough that they forced the iteration count down to the point where the remaining tests all appear identical. So I made another test class that drops the first three tests and cranks up the iterations to examine the performance differences between the rest:

package
{
	import flash.display.*;
	import flash.utils.*;
	import flash.text.*;
 
	public class StaticTest3 extends BaseClass
	{
		private var __logger:TextField = new TextField();
		private function row(...cols): void
		{
			__logger.appendText(cols.join(",")+"\n");
		}
 
		protected var val:Number = 33;
		protected static var staticVal:Number = 33;
 
		public function StaticTest3()
		{
			__logger.autoSize = TextFieldAutoSize.LEFT;
			addChild(__logger);
 
			var beforeTime:int;
			var afterTime:int;
			var REPS:int = 100000000;
			var i:int;
			var c:Class;
			var num:Number;
 
			row("Test", "Time");
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				num = val;
			}
			afterTime = getTimer();
			row("num = val", (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				num = this.val;
			}
			afterTime = getTimer();
			row("num = this.val", (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				num = superVal;
			}
			afterTime = getTimer();
			row("num = superVal", (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				num = this.superVal;
			}
			afterTime = getTimer();
			row("num = this.superVal", (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				num = super.superVal;
			}
			afterTime = getTimer();
			row("num = super.superVal", (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				num = staticVal;
			}
			afterTime = getTimer();
			row("num = staticVal", (afterTime-beforeTime));
 
			beforeTime = getTimer();
			for (i = 0; i < REPS; ++i)
			{
				num = StaticTest3.staticVal;
			}
			afterTime = getTimer();
			row("num = StaticTest3.staticVal", (afterTime-beforeTime));
		}
	}
}

Here are the results for StaticTest3 in the same environment:

Test                          Time (ms)
num = val                     215
num = this.val                223
num = superVal                219
num = this.superVal           220
num = super.superVal          220
num = staticVal               222
num = StaticTest3.staticVal   293

[Chart: Field Performance (Limited Set)]

Now that we have the results data, let’s draw some conclusions:

  • Indexing a static variable (Math["PI"]) is far and away the slowest way you can access a field. Avoid it at all costs in performance-critical code.
  • Just getting a class (c = Math) takes about 50% longer (3 ms versus 2 ms) than accessing a field (static or not) in the same class or any superclass, and that is before spending any time accessing the class’s fields.
  • Accessing a static field of another class is about 4x slower than accessing any field in the same class or superclass.
  • Specifying even the same class (StaticTest3.staticVal) results in bytecode that is about a third slower (293 ms versus roughly 220 ms) than any other way of accessing a field of the same class or superclass, static or not.
  • Despite wildly different bytecode, all other ways of accessing fields of the same class or superclass are roughly the same speed. The JIT is probably producing identical machine code for all of these different sets of bytecode instructions.
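
Because the two test runs use different iteration counts, it can help to convert both to a per-iteration cost before comparing them. Here is a minimal sketch, assuming a hypothetical helper class named TimingMath (it is not part of the tests above):

```actionscript
package
{
	/**
	 * Hypothetical helper, for illustration only.
	 */
	public class TimingMath
	{
		/**
		 * Converts a total loop time in milliseconds (as reported by
		 * getTimer() differences) into nanoseconds per iteration.
		 */
		public static function nsPerIteration(totalMs:Number, reps:Number):Number
		{
			// 1 ms = 1,000,000 ns
			return totalMs * 1000000 / reps;
		}
	}
}
```

For example, TimingMath.nsPerIteration(8, 1000000) normalizes the "Dot access static var" result and TimingMath.nsPerIteration(215, 100000000) normalizes the "num = val" result from the second run. Keep in mind that both figures include the overhead of the loop itself, so they are upper bounds on the cost of the access.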

The main takeaway here is to remember that static is slow when you’re accessing through a class name. Minimize that (perhaps by caching) and you should be fine.
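
As a minimal sketch of the caching idea, assuming a hypothetical CircleMath utility class (not part of the tests above), a static that would otherwise be read through another class’s name on every iteration can be copied into a local variable before the loop:

```actionscript
package
{
	/**
	 * Hypothetical example class, for illustration only.
	 */
	public class CircleMath
	{
		/** Sums the areas of circles with the given radii. */
		public static function totalArea(radii:Vector.<Number>):Number
		{
			// Read the static once, outside the loop. Inside the loop,
			// "pi" is a plain local variable, so no class lookup is needed.
			var pi:Number = Math.PI;
			var len:int = radii.length;
			var total:Number = 0;
			for (var i:int = 0; i < len; ++i)
			{
				var r:Number = radii[i];
				total += pi * r * r;
			}
			return total;
		}
	}
}
```

Calling CircleMath.totalArea(new <Number>[1, 2, 3]) performs a single Math.PI lookup no matter how many radii are passed. Going by the timings above, the savings per access are tiny, so this is only likely to matter in loops that run a very large number of times.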

Spot a bug? Have a suggestion? Post a comment!