There’s no doubt that Flash 11’s new Stage3D API can produce some amazing results by giving us access to the power of the user’s video card/GPU. However, it’d be a mistake to blindly assume that it is always faster than the traditional Flash display list (i.e. Stage). Today’s article begins a series that discusses the topic of “draw calls” and how they heavily impact the performance of your application.

Simply put, a “draw call” (or simply a “draw”) is a call to Context3D.drawTriangles. To get anything other than a background color to show up, you’ll need to make at least one call to this function. As it turns out, a video card/GPU is happiest when you make as few calls to this function as possible, ideally exactly one. The reason is simple: GPUs prefer to be given a large data set and to chug away on it uninterrupted. Giving the GPU a few triangles at a time is the perfect way to interrupt this batch processing of your rendering data, and every call carries its own fixed overhead no matter how few triangles it draws. In fact, you may have noticed that a shader program in Flash 11 cannot contain any branching code such as if-else or switch. While these branching instructions are actually available on many GPUs, they interrupt the GPU’s batch processing because the sequence of shader opcodes to execute changes depending on the condition being tested.
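To make the idea of a draw call concrete, here is a minimal sketch of a wrapper that counts them. The DrawCallCounter class and its members are hypothetical names made up for illustration; only Context3D.drawTriangles and IndexBuffer3D come from the Flash Player API.

```actionscript
package
{
	import flash.display3D.Context3D;
	import flash.display3D.IndexBuffer3D;

	/**
	*   Hypothetical helper (not part of the article's code) that forwards
	*   draws to a Context3D and counts how many were issued
	*/
	public class DrawCallCounter
	{
		/** Draw calls issued since the last reset */
		public var numDraws:uint;

		/** 3D context to forward draws to */
		private var ctx:Context3D;

		public function DrawCallCounter(ctx:Context3D)
		{
			this.ctx = ctx;
		}

		/** Issue exactly one draw call and count it */
		public function draw(tris:IndexBuffer3D): void
		{
			ctx.drawTriangles(tris);
			numDraws++;
		}

		/** Reset the count, e.g. once per frame after present() */
		public function reset(): void
		{
			numDraws = 0;
		}
	}
}
```

If every sprite routed its drawing through draw(), the counter would read one draw per sprite per frame, which is exactly the situation the rest of this article measures.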

So, for today’s test I will pit the Stage3DSprite class from my Simple 2D With Stage3D article against the venerable Bitmap class from the Flash Player API. The test app allows you to change the mode from Stage3DSprite to Bitmap, enable/disable moving, rotating, and scaling the sprites, and increase/decrease the number of sprites being displayed. First off, here is the Stage3DSprite class with one method (dispose) added to it:

package
{
	import flash.geom.*;
	import flash.utils.*;
	import flash.display.*;
	import flash.display3D.*;
	import flash.display3D.textures.*;
 
	import com.adobe.utils.*;
 
	/**
	*   A Stage3D-based 2D sprite
	*   @author Jackson Dunstan, www.JacksonDunstan.com
	*/
	public class Stage3DSprite
	{
		/** Cached static lookup of Context3DVertexBufferFormat.FLOAT_2 */
		private static const FLOAT2_FORMAT:String = Context3DVertexBufferFormat.FLOAT_2;
 
		/** Cached static lookup of Context3DVertexBufferFormat.FLOAT_3 */
		private static const FLOAT3_FORMAT:String = Context3DVertexBufferFormat.FLOAT_3;
 
		/** Cached static lookup of Context3DProgramType.VERTEX */
		private static const VERTEX_PROGRAM:String = Context3DProgramType.VERTEX;
 
		/** Cached static lookup of Vector3D.Z_AXIS */
		private static const Z_AXIS:Vector3D = Vector3D.Z_AXIS;
 
		/** Temporary AGAL assembler to avoid allocation */
		private static const tempAssembler:AGALMiniAssembler = new AGALMiniAssembler();
 
		/** Temporary rectangle to avoid allocation */
		private static const tempRect:Rectangle = new Rectangle();
 
		/** Temporary point to avoid allocation */
		private static const tempPoint:Point = new Point();
 
		/** Temporary matrix to avoid allocation */
		private static const tempMatrix:Matrix = new Matrix();
 
		/** Temporary 3D matrix to avoid allocation */
		private static const tempMatrix3D:Matrix3D = new Matrix3D();
 
		/** Cache of shader Program3D per Context3D */
		private static const programsCache:Dictionary = new Dictionary(true);
 
		/** Cache of positions and texture coordinates VertexBuffer3D per Context3D */
		private static const posUVCache:Dictionary = new Dictionary(true);
 
		/** Cache of triangles IndexBuffer3D per Context3D */
		private static const trisCache:Dictionary = new Dictionary(true);
 
		/** Vertex shader program AGAL bytecode */
		private static var vertexProgram:ByteArray;
 
		/** Fragment shader program AGAL bytecode */
		private static var fragmentProgram:ByteArray;
 
		/** 3D context to use for drawing */
		public var ctx:Context3D;
 
		/** 3D texture to use for drawing */
		public var texture:Texture;
 
		/** Width of the created texture */
		public var textureWidth:uint;
 
		/** Height of the created texture */
		public var textureHeight:uint;
 
		/** X position of the sprite */
		public var x:Number = 0;
 
		/** Y position of the sprite */
		public var y:Number = 0;
 
		/** Rotation of the sprite in degrees */
		public var rotation:Number = 0;
 
		/** Scale in the X direction */
		public var scaleX:Number = 1;
 
		/** Scale in the Y direction */
		public var scaleY:Number = 1;
 
		/** UV scaling constants (uploaded as vertex constant vc4 despite the name): U scale, V scale, {unused}, {unused} */
		private var fragConsts:Vector.<Number> = new <Number>[1, 1, 1, 1];
 
		// Static initializer to create vertex and fragment programs
		{
			tempAssembler.assemble(
				Context3DProgramType.VERTEX,
				// Apply draw matrix (object -> clip space)
				"m44 op, va0, vc0\n" +
 
				// Scale texture coordinate and copy to varying
				"mov vt0, va1\n" +
				"div vt0.xy, vt0.xy, vc4.xy\n" +
				"mov v0, vt0\n"
			);
			vertexProgram = tempAssembler.agalcode;
 
			tempAssembler.assemble(
				Context3DProgramType.FRAGMENT,
				"tex oc, v0, fs0 <2d,linear,mipnone,clamp>"
			);
			fragmentProgram = tempAssembler.agalcode;
		}
 
		/**
		*   Make the sprite
		*   @param ctx 3D context to use for drawing
		*/
		public function Stage3DSprite(ctx:Context3D): void
		{
			this.ctx = ctx;
			if (!(ctx in trisCache))
			{
				// Create the shader program
				var program:Program3D = ctx.createProgram();
				program.upload(vertexProgram, fragmentProgram);
				programsCache[ctx] = program;
 
				// Create the positions and texture coordinates vertex buffer
				var posUV:VertexBuffer3D = ctx.createVertexBuffer(4, 5);
				posUV.uploadFromVector(
					new <Number>[
						// X,  Y,  Z, U, V
						-1,   -1, 0, 0, 1,
						-1,    1, 0, 0, 0,
						 1,    1, 0, 1, 0,
						 1,   -1, 0, 1, 1
					], 0, 4
				);
				posUVCache[ctx] = posUV;
 
				// Create the triangles index buffer
				var tris:IndexBuffer3D = ctx.createIndexBuffer(6);
				tris.uploadFromVector(
					new <uint>[
						0, 1, 2,
						2, 3, 0
					], 0, 6
				);
				trisCache[ctx] = tris;
			}
		}
 
		/**
		*   Set a BitmapData to use as a texture
		*   @param bmd BitmapData to use as a texture
		*/
		public function set bitmapData(bmd:BitmapData): void
		{
			var width:uint = bmd.width;
			var height:uint = bmd.height;
 
			// Create a new texture if the current one doesn't fit
			createTexture(width, height);
 
			// Pad whenever the texture doesn't match the BitmapData's dimensions,
			// even if the texture was reused rather than newly created
			if (width != textureWidth || height != textureHeight)
			{
				// Create a BitmapData with the required dimensions
				var powOfTwoBMD:BitmapData = new BitmapData(
					textureWidth,
					textureHeight,
					bmd.transparent
				);
 
				// Copy the given BitmapData to the newly-created BitmapData
				tempRect.width = width;
				tempRect.height = height;
				powOfTwoBMD.copyPixels(bmd, tempRect, tempPoint);
 
				// Upload the newly-created BitmapData instead
				bmd = powOfTwoBMD;
 
				// Scale the UV to the sub-texture
				fragConsts[0] = textureWidth / width;
				fragConsts[1] = textureHeight / height;
			}
			else
			{
				// Reset UV scaling
				fragConsts[0] = 1;
				fragConsts[1] = 1;
			}
 
			// Upload new BitmapData to the texture
			texture.uploadFromBitmapData(bmd);
		}
 
		/**
		*   Create the texture to fit the given dimensions
		*   @param width Width to fit
		*   @param height Height to fit
		*   @return If a new texture had to be created
		*/
		protected function createTexture(width:uint, height:uint): Boolean
		{
			width = nextPowerOfTwo(width);
			height = nextPowerOfTwo(height);
 
			if (!texture || textureWidth != width || textureHeight != height)
			{
				texture = ctx.createTexture(
					width,
					height,
					Context3DTextureFormat.BGRA,
					false
				);
				textureWidth = width;
				textureHeight = height;
				return true;
			}
			return false;
		}
 
		/**
		*   Render the sprite to the 3D context
		*/
		public function render(): void
		{
			tempMatrix3D.identity();
			tempMatrix3D.appendRotation(-rotation, Z_AXIS);
			tempMatrix3D.appendScale(scaleX, scaleY, 1);
			tempMatrix3D.appendTranslation(x, y, 0);
 
			ctx.setProgram(programsCache[ctx]);
			ctx.setTextureAt(0, texture);
			ctx.setProgramConstantsFromMatrix(VERTEX_PROGRAM, 0, tempMatrix3D, true);
			ctx.setProgramConstantsFromVector(VERTEX_PROGRAM, 4, fragConsts);
			ctx.setVertexBufferAt(0, posUVCache[ctx], 0, FLOAT3_FORMAT);
			ctx.setVertexBufferAt(1, posUVCache[ctx], 3, FLOAT2_FORMAT);
			ctx.drawTriangles(trisCache[ctx]);
		}
 
		/**
		*   Dispose of this sprite's resources
		*/
		public function dispose(): void
		{
			if (texture)
			{
				texture.dispose();
				texture = null;
			}
		}
 
		/**
		*   Get the next-highest power of two
		*   @param v Value to get the next-highest power of two from
		*   @return The next-highest power of two from the given value
		*/
		public static function nextPowerOfTwo(v:uint): uint
		{
			v--;
			v |= v >> 1;
			v |= v >> 2;
			v |= v >> 4;
			v |= v >> 8;
			v |= v >> 16;
			v++;
			return v;
		}
	}
}

Here is the test app:

package
{
	import flash.display3D.*;
	import flash.display.*;
	import flash.filters.*;
	import flash.events.*;
	import flash.text.*;
	import flash.geom.*;
	import flash.utils.*;
 
	public class Stage3DSpriteSpeed extends Sprite 
	{
		private static const MODE_3D:int = 1;
		private static const MODE_BITMAP:int = 2;
 
		[Embed(source="flash_logo_icon.jpg")]
		private static const TEXTURE:Class;
 
		private var context3D:Context3D;
		private var stats:TextField = new TextField();
		private var lastStatsUpdateTime:uint;
		private var lastFrameTime:uint;
		private var frameCount:uint;
		private var driver:TextField = new TextField();
 
		private var texture:BitmapData = (new TEXTURE() as Bitmap).bitmapData;
 
		private var sprites3D:Vector.<Stage3DSprite> = new <Stage3DSprite>[];
		private var spritesBitmap:Vector.<Bitmap> = new <Bitmap>[];
 
		private var mode:int = MODE_3D;
		private var moving:Boolean;
		private var rotating:Boolean;
		private var scaling:Boolean;
		private var numSprites:int = 2000;
		private var container:Sprite = new Sprite();
 
		public function Stage3DSpriteSpeed()
		{
			stage.align = StageAlign.TOP_LEFT;
			stage.scaleMode = StageScaleMode.NO_SCALE;
			stage.frameRate = 60;
 
			addChild(container);
 
			var stage3D:Stage3D = stage.stage3Ds[0];
			stage3D.addEventListener(Event.CONTEXT3D_CREATE, onContextCreated);
			stage3D.requestContext3D(Context3DRenderMode.AUTO);
		}
 
		protected function onContextCreated(ev:Event): void
		{
			// Setup context
			var stage3D:Stage3D = stage.stage3Ds[0];
			stage3D.removeEventListener(Event.CONTEXT3D_CREATE, onContextCreated);
			context3D = stage3D.context3D;			
			context3D.configureBackBuffer(
				stage.stageWidth,
				stage.stageHeight,
				0,
				true
			);
			context3D.enableErrorChecking = true;
 
			// Setup UI
			stats.background = true;
			stats.backgroundColor = 0xffffffff;
			stats.autoSize = TextFieldAutoSize.LEFT;
			stats.text = "Getting FPS...";
			addChild(stats);
 
			driver.background = true;
			driver.backgroundColor = 0xffffffff;
			driver.text = "Driver: " + context3D.driverInfo;
			driver.autoSize = TextFieldAutoSize.LEFT;
			driver.y = stats.height;
			addChild(driver);
 
			makeButtons(
				"Mode: Stage3DSprite", "Mode: Bitmap", null,
				"Add 100 Sprites", "Remove 100 Sprites", null,
				"Enable Moving", "Enable Rotating", "Enable Scaling"
			);
 
			// Start the simulation
			makeSprites();
			addEventListener(Event.ENTER_FRAME, onEnterFrame);
		}
 
		private function makeButtons(...labels): void
		{
			const PAD:Number = 5;
 
			var curX:Number = PAD;
			var curY:Number = stage.stageHeight - PAD;
			for each (var label:String in labels)
			{
				if (!label)
				{
					curX = PAD;
					curY -= button.height + PAD;
					continue;
				}
 
				var tf:TextField = new TextField();
				tf.mouseEnabled = false;
				tf.selectable = false;
				tf.defaultTextFormat = new TextFormat("_sans", 16, 0x0071BB);
				tf.autoSize = TextFieldAutoSize.LEFT;
				tf.text = label;
				tf.name = "lbl";
 
				var button:Sprite = new Sprite();
				button.buttonMode = true;
				button.graphics.beginFill(0xF5F5F5);
				button.graphics.drawRect(0, 0, tf.width+PAD, tf.height+PAD);
				button.graphics.endFill();
				button.graphics.lineStyle(1);
				button.graphics.drawRect(0, 0, tf.width+PAD, tf.height+PAD);
				button.addChild(tf);
				button.addEventListener(MouseEvent.CLICK, onButton);
				if (curX + button.width > stage.stageWidth - PAD)
				{
					curX = PAD;
					curY -= button.height + PAD;
				}
				button.x = curX;
				button.y = curY - button.height;
				addChild(button);
 
				curX += button.width + PAD;
			}
		}
 
		private function makeSprites(): void
		{
			// Clear old sprites
			context3D.clear(0.5, 0.5, 0.5);
			context3D.present();
			for each (var spr3D:Stage3DSprite in sprites3D)
			{
				spr3D.dispose();
			}
			sprites3D.length = 0;
			spritesBitmap.length = 0;
			container.removeChildren();
 
			// Make new sprites
			var i:int;
			switch (mode)
			{
				case MODE_3D:
					var scale:Number = texture.width / stage.stageWidth;
					for (; i < numSprites; ++i)
					{
						spr3D = new Stage3DSprite(context3D);
						spr3D.bitmapData = texture;
						spr3D.x = Math.random()*2-1;
						spr3D.y = Math.random()*2-1;
						spr3D.scaleX = spr3D.scaleY = scale;
						sprites3D[i] = spr3D;
					}
					break;
				case MODE_BITMAP:
					for (; i < numSprites; ++i)
					{
						var bm:Bitmap = new Bitmap(texture);
						bm.x = Math.random()*stage.stageWidth;
						bm.y = Math.random()*stage.stageHeight;
						spritesBitmap[i] = bm;
						container.addChild(bm);
					}
					break;
			}
 
			// Reset FPS
			frameCount = 0;
			lastFrameTime = 0;
			lastStatsUpdateTime = getTimer();
		}
 
		private function onButton(ev:MouseEvent): void
		{
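			// The listener is on the button, so currentTarget is always the button Sprite
			var tf:TextField = Sprite(ev.currentTarget).getChildByName("lbl") as TextField;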
			var lbl:String = tf.text;
			switch (lbl)
			{
				case "Mode: Stage3DSprite":
					mode = MODE_3D;
					makeSprites();
					break;
				case "Mode: Bitmap":
					mode = MODE_BITMAP;
					makeSprites();
					break;
				case "Add 100 Sprites":
					numSprites += 100;
					makeSprites();
					break;
				case "Remove 100 Sprites":
					if (numSprites)
					{
						numSprites -= 100;
						makeSprites();
					}
					break;
				case "Enable Moving":
					moving = true;
					tf.text = "Disable Moving";
					break;
				case "Disable Moving":
					moving = false;
					tf.text = "Enable Moving";
					break;
				case "Enable Rotating":
					rotating = true;
					tf.text = "Disable Rotating";
					break;
				case "Disable Rotating":
					rotating = false;
					tf.text = "Enable Rotating";
					break;
				case "Enable Scaling":
					scaling = true;
					tf.text = "Disable Scaling";
					break;
				case "Disable Scaling":
					scaling = false;
					tf.text = "Enable Scaling";
					break;
			}
		}
 
		private function onEnterFrame(ev:Event): void
		{
			// Render the scene
			switch (mode)
			{
				case MODE_3D:
					var spr3D:Stage3DSprite;
					context3D.clear(0.5, 0.5, 0.5);
					for each (spr3D in sprites3D)
					{
						spr3D.render();
					}
					if (moving)
					{
						for each (spr3D in sprites3D)
						{
							spr3D.x = Math.random()*2-1;
							spr3D.y = Math.random()*2-1;
						}
					}
					if (rotating)
					{
						for each (spr3D in sprites3D)
						{
							spr3D.rotation = 360*Math.random();
						}
					}
					if (scaling)
					{
						var baseScale:Number = texture.width / stage.stageWidth;
						for each (spr3D in sprites3D)
						{
							spr3D.scaleX = baseScale*Math.random();
							spr3D.scaleY = baseScale*Math.random();
						}
					}
					context3D.present();
					break;
				case MODE_BITMAP:
					var dispObj:DisplayObject;
					if (moving)
					{
						var stageWidth:Number = stage.stageWidth;
						var stageHeight:Number = stage.stageHeight;
						for each (dispObj in spritesBitmap)
						{
							dispObj.x = Math.random()*stageWidth;
							dispObj.y = Math.random()*stageHeight;
						}
					}
					if (rotating)
					{
						for each (dispObj in spritesBitmap)
						{
							dispObj.rotation = 360*Math.random();
						}
					}
					if (scaling)
					{
						for each (dispObj in spritesBitmap)
						{
							dispObj.scaleX = Math.random();
							dispObj.scaleY = Math.random();
						}
					}
					break;
			}
 
			// Update stats display
			frameCount++;
			var now:int = getTimer();
			var dTime:int = now - lastFrameTime;
			var elapsed:int = now - lastStatsUpdateTime;
			if (elapsed > 1000)
			{
				var framerateValue:Number = 1000 / (elapsed / frameCount);
				stats.text = "FPS: " + framerateValue.toFixed(4)
					+ ", Sprites: " + numSprites;
				lastStatsUpdateTime = now;
				frameCount = 0;
			}
			lastFrameTime = now;
		}
	}
}

And here is the texture image used.

Launch the test app

I ran the test app in the following environment:

  • Flex SDK (MXMLC) 4.5.1.21328, compiling in release mode (no debugging or verbose stack traces)
  • Release version of Flash Player 11.1.102.55
  • 2.4 GHz Intel Core i5
  • NVIDIA GeForce GT 330M 256 MB
  • Mac OS X 10.7.3

With these settings:

  • Sprites: 2000

Here are the results I got:

Mode            FPS
Stage3DSprite   26
Bitmap          48

How could Stage3D have lost!? Even on this modern GPU that has no trouble playing games like World of Warcraft, it’s still brought to its knees by a measly 2000 quads. At two triangles per quad, that’s only 4000 triangles! Even worse, conventional wisdom says, correctly, that you shouldn’t scale or rotate your Bitmap objects because that makes them much slower to draw. But even drawing 2000 of them per frame on a pretty fast machine is way faster than the Stage3D approach, which is supposed to be great at scaling and rotating 3D objects.

As you might have guessed, the answer lies in the number of draw calls being performed. Consider that each sprite that is drawn to the screen is its own draw call and you will see, given the introductory paragraph, that we are constantly interrupting the GPU. It really wants to chug along drawing all 4000 triangles in one go, but we’re feeding it only two at a time, telling it to stop and wait for our next two triangles, feeding it two more, and so on.
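To put rough numbers on that, here is a back-of-the-envelope sketch. The DrawCallBudget class and its report function are hypothetical names; only the sprite count and FPS figures come from the results above. The per-sprite figure is an upper bound on the average, since the frame time also includes everything else the player does.

```actionscript
package
{
	import flash.display.Sprite;

	/**
	*   Hypothetical document class that derives per-frame and per-sprite
	*   times from the measured results (2000 sprites at 26 and 48 FPS)
	*/
	public class DrawCallBudget extends Sprite
	{
		public function DrawCallBudget()
		{
			report("Stage3DSprite", 2000, 26);
			report("Bitmap", 2000, 48);
		}

		private function report(mode:String, numSprites:uint, fps:Number): void
		{
			// Average time spent on the whole frame, in milliseconds
			var frameMS:Number = 1000 / fps;

			// Average frame time per sprite, in microseconds
			var perSpriteUS:Number = frameMS * 1000 / numSprites;

			trace(
				mode + ": " + frameMS.toFixed(1) + " ms/frame, "
				+ perSpriteUS.toFixed(1) + " us/sprite, "
				+ (numSprites * fps) + " sprites/sec"
			);
		}
	}
}
```

With the measured numbers this works out to about 38.5 ms per frame (roughly 19.2 microseconds per sprite) for Stage3DSprite versus about 20.8 ms per frame (roughly 10.4 microseconds per sprite) for Bitmap. In Stage3DSprite mode each sprite is one draw call, so that is 2000 draws per frame, or 52,000 per second at 26 FPS.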

So, the above test shows you a clear-cut case of the classic 2D Stage beating the pants off of the supposedly high-performance Stage3D. The next article in this series will show you how to optimize to reduce draw calls and turn the tables on our old friend Stage. Stay tuned!

Spot a bug? Have a suggestion or a question? Post a comment!