The site has had many articles about improving the performance of your app, but has never discussed the basic methodology on which all optimizations should be based. Today’s article goes over a scientific approach to optimizing that makes use of a tool known as a profiler and demonstrates, using an AS3 application, just why it’s so important to use such a tool.

A profiler is a tool that gathers statistics about the performance cost of each function in your app and presents them to you in a useful way. Usually, you’ll get a long list of functions, with the function taking the most time at the top and the one taking the least at the bottom. You can see at a glance which functions are worth your time to optimize and which are not. This information is often surprising, even to programmers with many years of experience optimizing for performance. As an experiment, take a look at the following simple AS3 app and see if you can guess the performance problem.

package
{
	import flash.display.Sprite;
	import flash.display.StageAlign;
	import flash.display.StageScaleMode;
	import flash.events.Event;
	import flash.text.TextField;
	import flash.text.TextFieldAutoSize;
	import flash.utils.getTimer;
 
	public class ProfileMe extends Sprite
	{
		private static const SIZE:int = 5000;
		private var logger:TextField = new TextField();
		private var vec:Vector.<Number> = new Vector.<Number>(SIZE);
 
		public function ProfileMe()
		{
			addEventListener(Event.ENTER_FRAME, onEnterFrame);
 
			stage.align = StageAlign.TOP_LEFT;
			stage.scaleMode = StageScaleMode.NO_SCALE;
 
			logger.text = "Running test...";
			logger.y = 100;
			logger.autoSize = TextFieldAutoSize.LEFT;
			addChild(logger);
		}
 
		private function onEnterFrame(ev:Event): void
		{
			logger.text = "";
 
			var beforeTime:int;
			var afterTime:int;
			var totalTime:int;
 
			row("Operation", "Time");
 
			beforeTime = getTimer();
			buildVector();
			afterTime = getTimer();
			totalTime += afterTime - beforeTime;
			row("buildVector", (afterTime-beforeTime));
 
			beforeTime = getTimer();
			vec.sort(vecCompare);
			afterTime = getTimer();
			totalTime += afterTime - beforeTime;
			row("sort", (afterTime-beforeTime));
 
			row("total", totalTime);
		}
 
		private function buildVector(): void
		{
			var SIZE:int = ProfileMe.SIZE;
			var vec:Vector.<Number> = this.vec;
			for (var i:int = 0; i < SIZE; ++i)
			{
				vec[i] = Math.abs(i)
					* Math.ceil(i)
					* Math.cos(i)
					* Math.exp(i)
					* Math.floor(i)
					* Math.round(i)
					* Math.sin(i)
					* Math.sqrt(i);
			}
		}
 
		private function vecCompare(a:Number, b:Number): int
		{
			if (a < b)
			{
				return -1;
			}
			else if (a > b)
			{
				return 1;
			}
			return 0;
		}
 
		private function row(...cols): void
		{
			logger.appendText(cols.join(",")+"\n");
		}
	}
}

In step one, the app builds a Vector of Numbers out of a lot of Math calls. In step two, the app calls Vector.sort to sort the list. I ran this test in the following environment:

  • Flex SDK (MXMLC) 4.5.1.21328, compiling in release mode (no debugging or verbose stack traces)
  • Release version of Flash Player 11.1.102.63
  • 2.4 GHz Intel Core i5
  • Mac OS X 10.7.3

And got these results (times in milliseconds):

Operation     Time (ms)
buildVector   1
sort          78
total         79

In a debug version of Flash Player, which is required to run the profiler, I got:

Operation     Time (ms)
buildVector   7
sort          620
total         627

So clearly the Math calls are faster than the Vector sorting. In this simple app it was easy to add getTimer calls around the only two functions. But what if your app consists of thousands or tens of thousands of lines of code? Clearly, it’s impractical to add so many getTimer calls, even if you limit yourself to what you guess are the expensive portions of your app.
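To make the manual approach concrete, here is a minimal sketch of the getTimer pattern factored into a reusable helper. The timeCall name is hypothetical and not part of the original app; it is only meant to show what hand-instrumenting looks like.

```actionscript
package
{
	import flash.utils.getTimer;

	/**
	 * Hypothetical helper (not in the original app): calls a function once
	 * and returns how many milliseconds the call took, using getTimer.
	 */
	public function timeCall(func:Function, ...args):int
	{
		var beforeTime:int = getTimer();
		func.apply(null, args); // forward any extra arguments to the function
		return getTimer() - beforeTime;
	}
}
```

Inside ProfileMe, something like row("buildVector", timeCall(buildVector)); could then stand in for the four lines around the buildVector call. Even so, every call site still has to be wrapped by hand, and that is the part that doesn’t scale.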

Enter the profiler. There are many available for AS3, usually as part of an IDE like Flash Builder, FlashDevelop, or FDT. For this article, though, we’ll be using TheMiner (formerly FlashPreloadProfiler), which is written in pure AS3 code rather than being an external tool. To set it up, let’s add a few lines of code to the above app:

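// Capabilities needs: import flash.system.Capabilities;
DEBUG::profile
{
	// Only add the profiler when running in a debug player
	if (Capabilities.isDebugger)
	{
		addChild(new TheMiner());
	}
}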

DEBUG::profile is simply a Boolean compile-time constant that lets us turn off the profiler with a compiler setting (with MXMLC, for example, by passing -define=DEBUG::profile,true or -define=DEBUG::profile,false). Even if it’s enabled, the profiler requires a debug version of the Flash Player to run, so we don’t try to run it if Capabilities.isDebugger is false.

Next, we simply download the TheMiner SWC and add it to the application. If you’re compiling with the command-line tools MXMLC or COMPC, your new command will look like this:

mxmlc --library-path+=TheMiner_en_v1_3_10.swc ProfileMe.as

Now when we run the app we see a UI for the profiler at the top:

[Image: Profiler UI]

Clicking on the “Performance Profiler” button, we see:

[Image: Profiler UI, expanded]

Here we immediately see the source of the problem in the top-listed function:

Function Name            %
ProfileMe/vecCompare     81.02
Vector.$/_sort           17.63
ProfileMe/buildVector    0.41
Math$/sqrt               0.14

Notice how the sorting functions (the first two) dwarf the building functions (the last two). Together, the sorting functions are taking over 98% of the total run time! It would be a waste of our time to worry about the building functions, so let’s optimize the sorting ones. To do that, we’ll use skyboy’s fastSort function instead of plain old Vector.sort. It’s a simple one-line change from:

vec.sort(vecCompare);

To:

fastSort(vec, vecCompare);

With this in place, I now get these results in a release player:

Operation     Time (ms)
buildVector   1
sort          23
total         24

And in a debug player:

Operation     Time (ms)
buildVector   7
sort          48
total         55

So in release we’ve optimized the total application from 79 milliseconds to 24, a bit more than a 3x improvement. If we had instead spent our time optimizing out all of the Math calls with something like a lookup table, we could have saved at most 1 millisecond, which would be only about 1% faster.
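The arithmetic behind that comparison can be sketched as follows, using only the release-player timings quoted above. The SpeedupMath class and its variable names are hypothetical and exist only to spell out the calculation.

```actionscript
package
{
	import flash.display.Sprite;

	// Hypothetical scratch class: reproduces the speedup arithmetic
	// from the release-player timings quoted in the article.
	public class SpeedupMath extends Sprite
	{
		public function SpeedupMath()
		{
			var buildMs:Number = 1;       // buildVector
			var sortBeforeMs:Number = 78; // Vector.sort
			var sortAfterMs:Number = 23;  // fastSort

			var totalBefore:Number = buildMs + sortBeforeMs; // 79
			var totalAfter:Number = buildMs + sortAfterMs;   // 24

			// Speedup actually achieved by replacing the sort
			trace((totalBefore / totalAfter).toFixed(2)); // 3.29

			// Best-case saving, in percent, if buildVector cost nothing at all
			trace((buildMs / totalBefore * 100).toFixed(2)); // 1.27
		}
	}
}
```

The second number is an upper bound: no amount of work on buildVector could save more than its own share of the total time, which is why the profiler’s percentage column is such a useful guide to where to spend effort.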

In conclusion, a profiler is definitely a tool that you want to use while performance tuning your app. It helps you quickly and easily identify the performance problems and, perhaps even more importantly, the performance problems you don’t have. Don’t waste time optimizing (and often uglifying) your code if you don’t have to. Instead, try out a profiler like TheMiner and speed up your app without taking shots in the dark.

Questions? Comments? Spot a bug or typo? Post a comment!