To draw with Flash Player 11’s Stage3D API, you must set up the state of various GPU resources before finally calling drawTriangles. Inevitably, you’ll end up calling drawTriangles multiple times during a single frame to draw your characters, terrain, sky, and so forth. In between these calls you will change the GPU’s state by calling Context3D’s set* functions. This article will show you which of these functions can literally cut your app’s performance in half.

There are many set* functions in Context3D, but here I’m testing some of the most common ones:

  • Shader programs: setProgram
  • Vertex buffers: setVertexBufferAt
  • Textures: setTextureAt

The following test app started life as the Simple Stage3D Camera test app. I then augmented it in the following ways:

  • Added more cubes until my performance dropped below the 60 FPS cap
  • Removed all camera controls and pointed it so that all cubes were visible
  • Made duplicate GPU resources for the texture, shader program, and vertex buffers (positions and texture coordinates)
  • Added check boxes to allow for changing the state of these GPU resources between every draw
  • Added dummy code (an & and a function call) when state changing is disabled to avoid an unfair advantage
  • UPDATE: Added buttons to change the number of cubes. The below results are for 15x15x15.

When the GPU state (i.e. texture, shader program, or vertex buffer) is changed between every draw, there should be no difference in the visual output because the alternate GPU resources being switched to are identical.

Here’s how the test app ended up:

package
{
	import com.adobe.utils.*;
 
	import flash.display.*;
	import flash.display3D.*;
	import flash.display3D.textures.*;
	import flash.events.*;
	import flash.filters.*;
	import flash.geom.*;
	import flash.text.*;
	import flash.utils.*;
 
	public class Stage3DStateChanging extends Sprite 
	{
		private static const FLOAT_2:String = Context3DVertexBufferFormat.FLOAT_2;
		private static const FLOAT_3:String = Context3DVertexBufferFormat.FLOAT_3;
 
		/** Positions of all cubes' vertices */
		private static const POSITIONS:Vector.<Number> = new <Number>[
			// back face - bottom tri
			-0.5, -0.5, -0.5,
			-0.5, 0.5, -0.5,
			0.5, -0.5, -0.5,
			// back face - top tri
			-0.5, 0.5, -0.5,
			0.5, 0.5, -0.5,
			0.5, -0.5, -0.5,
 
			// front face - bottom tri
			-0.5, -0.5, 0.5,
			-0.5, 0.5, 0.5,
			0.5, -0.5, 0.5,
			// front face - top tri
			-0.5, 0.5, 0.5,
			0.5, 0.5, 0.5,
			0.5, -0.5, 0.5,
 
			// left face - bottom tri
			-0.5, -0.5, -0.5,
			-0.5, 0.5, -0.5,
			-0.5, -0.5, 0.5,
			// left face - top tri
			-0.5, 0.5, -0.5,
			-0.5, 0.5, 0.5,
			-0.5, -0.5, 0.5,
 
			// right face - bottom tri
			0.5, -0.5, -0.5,
			0.5, 0.5, -0.5,
			0.5, -0.5, 0.5,
			// right face - top tri
			0.5, 0.5, -0.5,
			0.5, 0.5, 0.5,
			0.5, -0.5, 0.5,
 
			// bottom face - bottom tri
			-0.5, -0.5, 0.5,
			-0.5, -0.5, -0.5,
			0.5, -0.5, 0.5,
			// bottom face - top tri
			-0.5, -0.5, -0.5,
			0.5, -0.5, -0.5,
			0.5, -0.5, 0.5,
 
			// top face - bottom tri
			-0.5, 0.5, 0.5,
			-0.5, 0.5, -0.5,
			0.5, 0.5, 0.5,
			// top face - top tri
			-0.5, 0.5, -0.5,
			0.5, 0.5, -0.5,
			0.5, 0.5, 0.5
		];
 
		/** Texture coordinates of all cubes' vertices */
		private static const TEX_COORDS:Vector.<Number> = new <Number>[
			// back face - bottom tri
			1, 1,
			1, 0,
			0, 1,
			// back face - top tri
			1, 0,
			0, 0,
			0, 1,
 
			// front face - bottom tri
			0, 1,
			0, 0,
			1, 1,
			// front face - top tri
			0, 0,
			1, 0,
			1, 1,
 
			// left face - bottom tri
			0, 1,
			0, 0,
			1, 1,
			// left face - top tri
			0, 0,
			1, 0,
			1, 1,
 
			// right face - bottom tri
			1, 1,
			1, 0,
			0, 1,
			// right face - top tri
			1, 0,
			0, 0,
			0, 1,
 
			// bottom face - bottom tri
			0, 0,
			0, 1,
			1, 0,
			// bottom face - top tri
			0, 1,
			1, 1,
			1, 0,
 
			// top face - bottom tri
			0, 1,
			0, 0,
			1, 1,
			// top face - top tri
			0, 0,
			1, 0,
			1, 1
		];
 
		/** Triangles of all cubes */
		private static const TRIS:Vector.<uint> = new <uint>[
			2, 1, 0,    // back face - bottom tri
			5, 4, 3,    // back face - top tri
			6, 7, 8,    // front face - bottom tri
			9, 10, 11,  // front face - top tri
			12, 13, 14, // left face - bottom tri
			15, 16, 17, // left face - top tri
			20, 19, 18, // right face - bottom tri
			23, 22, 21, // right face - top tri
			26, 25, 24, // bottom face - bottom tri
			29, 28, 27, // bottom face - top tri
			30, 31, 32, // top face - bottom tri
			33, 34, 35  // top face - top tri
		];
 
		[Embed(source="flash_logo.png")]
		private static const TEXTURE:Class;
 
		private static const TEMP_DRAW_MATRIX:Matrix3D = new Matrix3D();
 
		private var context3D:Context3D;
		private var positionsBuffer:VertexBuffer3D;
		private var positionsBuffer2:VertexBuffer3D;
		private var texCoordsBuffer:VertexBuffer3D;
		private var texCoordsBuffer2:VertexBuffer3D;
		private var indexBuffer:IndexBuffer3D;
		private var program:Program3D;
		private var program2:Program3D;
		private var texture:Texture;
		private var texture2:Texture;
		private var camera:Camera3D;
		private var cubes:Vector.<Cube> = new Vector.<Cube>();
 
		private var changeProgram:Boolean;
		private var changePositionBuffer:Boolean;
		private var changeTexCoordBuffer:Boolean;
		private var changeTexture:Boolean;
		private var numCubes:uint = 15;
 
		private var fps:TextField = new TextField();
		private var lastFPSUpdateTime:uint;
		private var lastFrameTime:uint;
		private var frameCount:uint;
		private var driver:TextField = new TextField();
 
		public function Stage3DStateChanging()
		{
			stage.align = StageAlign.TOP_LEFT;
			stage.scaleMode = StageScaleMode.NO_SCALE;
			stage.frameRate = 60;
 
			var stage3D:Stage3D = stage.stage3Ds[0];
			stage3D.addEventListener(Event.CONTEXT3D_CREATE, onContextCreated);
			stage3D.requestContext3D(Context3DRenderMode.AUTO);
		}
 
		protected function onContextCreated(ev:Event): void
		{
			// Setup context
			var stage3D:Stage3D = stage.stage3Ds[0];
			stage3D.removeEventListener(Event.CONTEXT3D_CREATE, onContextCreated);
			context3D = stage3D.context3D;            
			context3D.configureBackBuffer(
				stage.stageWidth,
				stage.stageHeight,
				0,
				true
			);
			context3D.enableErrorChecking = true;
 
			// Setup camera
			camera = new Camera3D(
				0.1, // near
				100, // far
				stage.stageWidth / stage.stageHeight, // aspect ratio
				40*(Math.PI/180), // vFOV
				-6, -6, 6, // position
				0, 0, 0, // target
				0, 1, 0 // up dir
			);
 
			// Setup cubes
			makeCubes();
 
			// Setup UI
			fps.background = true;
			fps.backgroundColor = 0xffffffff;
			fps.autoSize = TextFieldAutoSize.LEFT;
			fps.text = "Getting FPS...";
			addChild(fps);
 
			driver.background = true;
			driver.backgroundColor = 0xffffffff;
			driver.text = "Driver: " + context3D.driverInfo;
			driver.autoSize = TextFieldAutoSize.LEFT;
			driver.y = fps.height;
			addChild(driver);
 
			// Make checkboxes
			var checkBoxes:Sprite = new Sprite();
			var cb:Sprite;
			var thiz:Stage3DStateChanging = this;
			function makeCallback(field:String): Function
			{
				return function(checked:Boolean): void
				{
					thiz[field] = checked;
				};
			}
			for each (var option:Object in [
				{label:"Shader Program", field:"changeProgram"},
				{label:"Position Buffer", field:"changePositionBuffer"},
				{label:"Tex Coord Buffer", field:"changeTexCoordBuffer"},
				{label:"Texture", field:"changeTexture"}
			])
			{
				cb = makeCheckBox(option.label + ": ", false, makeCallback(option.field));
				cb.y = checkBoxes.height;
				checkBoxes.addChild(cb);
			}
			checkBoxes.y = stage.stageHeight - checkBoxes.height;
			addChild(checkBoxes);
 
			makeButtons("15x15x15 Cubes", "25x25x25 Cubes", "32x32x32 Cubes");
 
			var assembler:AGALMiniAssembler = new AGALMiniAssembler();
 
			// Vertex shader
			var vertSource:String = "m44 op, va0, vc0\nmov v0, va1\n";
			assembler.assemble(Context3DProgramType.VERTEX, vertSource);
			var vertexShaderAGAL:ByteArray = assembler.agalcode;
 
			// Fragment shader
			var fragSource:String = "tex oc, v0, fs0 <2d,linear,mipnone>";
			assembler.assemble(Context3DProgramType.FRAGMENT, fragSource);
			var fragmentShaderAGAL:ByteArray = assembler.agalcode;
 
			// Shader program
			program = context3D.createProgram();
			program.upload(vertexShaderAGAL, fragmentShaderAGAL);
			program2 = context3D.createProgram();
			program2.upload(vertexShaderAGAL, fragmentShaderAGAL);
 
			// Setup buffers
			positionsBuffer = context3D.createVertexBuffer(36, 3);
			positionsBuffer.uploadFromVector(POSITIONS, 0, 36);
			positionsBuffer2 = context3D.createVertexBuffer(36, 3);
			positionsBuffer2.uploadFromVector(POSITIONS, 0, 36);
			texCoordsBuffer = context3D.createVertexBuffer(36, 2);
			texCoordsBuffer.uploadFromVector(TEX_COORDS, 0, 36);
			texCoordsBuffer2 = context3D.createVertexBuffer(36, 2);
			texCoordsBuffer2.uploadFromVector(TEX_COORDS, 0, 36);
			indexBuffer = context3D.createIndexBuffer(36);
			indexBuffer.uploadFromVector(TRIS, 0, 36);
 
			// Setup texture
			var bmd:BitmapData = (new TEXTURE() as Bitmap).bitmapData;
			texture = context3D.createTexture(
				bmd.width,
				bmd.height,
				Context3DTextureFormat.BGRA,
				true
			);
			texture.uploadFromBitmapData(bmd);
			texture2 = context3D.createTexture(
				bmd.width,
				bmd.height,
				Context3DTextureFormat.BGRA,
				true
			);
			texture2.uploadFromBitmapData(bmd);
 
			// Start the simulation
			addEventListener(Event.ENTER_FRAME, onEnterFrame);
		}
 
		private function makeCubes(): void
		{
			cubes = new Vector.<Cube>();
			for (var i:int = 0; i < numCubes; ++i)
			{
				for (var j:int = 0; j < numCubes; ++j)
				{
					for (var k:int = 0; k < numCubes; ++k)
					{
						cubes.push(new Cube(i*2, j*2, -k*2));
					}
				}
			}
		}
 
		private function makeButtons(...labels): void
		{
			const PAD:Number = 5;
 
			var curX:Number = stage.stageWidth;
			var curY:Number = stage.stageHeight;
			for each (var label:String in labels)
			{
				var tf:TextField = new TextField();
				tf.mouseEnabled = false;
				tf.selectable = false;
				tf.defaultTextFormat = new TextFormat("_sans", 16, 0x0071BB);
				tf.autoSize = TextFieldAutoSize.LEFT;
				tf.text = label;
				tf.name = "lbl";
				tf.background = true;
				tf.backgroundColor = 0xffffff;
 
				var button:Sprite = new Sprite();
				button.buttonMode = true;
				button.graphics.beginFill(0xF5F5F5);
				button.graphics.drawRect(0, 0, tf.width+PAD, tf.height+PAD);
				button.graphics.endFill();
				button.graphics.lineStyle(1);
				button.graphics.drawRect(0, 0, tf.width+PAD, tf.height+PAD);
				button.addChild(tf);
				button.addEventListener(MouseEvent.CLICK, onButton);
				tf.x = PAD/2;
				tf.y = PAD/2;
 
				button.x = curX - button.width;
				button.y = curY - button.height;
				addChild(button);
 
				curY -= button.height;
			}
		}
 
		public static function makeCheckBox(
			label:String,
			checked:Boolean,
			callback:Function,
			labelFormat:TextFormat=null): Sprite
		{
			var sprite:Sprite = new Sprite();
 
			var tf:TextField = new TextField();
			tf.autoSize = TextFieldAutoSize.LEFT;
			tf.text = label;
			tf.background = true;
			tf.backgroundColor = 0xffffff;
			tf.selectable = false;
			tf.mouseEnabled = false;
			tf.setTextFormat(labelFormat || new TextFormat("_sans"));
			sprite.addChild(tf);
 
			var size:Number = tf.height;
 
			var background:Shape = new Shape();
			background.graphics.beginFill(0xffffff);
			background.graphics.drawRect(0, 0, size, size);
			background.x = tf.width;
			sprite.addChild(background);
 
			var border:Shape = new Shape();
			border.graphics.lineStyle(1, 0x000000);
			border.graphics.drawRect(0, 0, size, size);
			border.x = background.x;
			sprite.addChild(border);
 
			var check:Shape = new Shape();
			check.graphics.lineStyle(1, 0x000000);
			check.graphics.moveTo(0, 0);
			check.graphics.lineTo(size, size);
			check.graphics.moveTo(size, 0);
			check.graphics.lineTo(0, size);
			check.x = background.x;
			check.visible = checked;
			sprite.addChild(check);
 
			sprite.addEventListener(
				MouseEvent.CLICK,
				function(ev:MouseEvent): void
				{
					checked = !checked;
					check.visible = checked;
					callback(checked);
				}
			);
 
			return sprite;
		}
 
		private function onButton(ev:MouseEvent): void
		{
			var tf:TextField = ev.target.getChildByName("lbl");
			var lbl:String = tf.text;
			switch (lbl)
			{
				case "15x15x15 Cubes":
					numCubes = 15;
					makeCubes();
					break;
				case "25x25x25 Cubes":
					numCubes = 25;
					makeCubes();
					break;
				case "32x32x32 Cubes":
					numCubes = 32;
					makeCubes();
					break;
			}
		}
 
		private function onEnterFrame(ev:Event): void
		{
			// Render scene
			context3D.setProgram(program);
			context3D.setVertexBufferAt(0, positionsBuffer, 0, FLOAT_3);
			context3D.setVertexBufferAt(1, texCoordsBuffer, 0, FLOAT_2);
			context3D.setTextureAt(0, texture);
 
			context3D.clear(0.5, 0.5, 0.5);
 
			// Draw all cubes
			var worldToClip:Matrix3D = camera.worldToClipMatrix;
			var drawMatrix:Matrix3D = TEMP_DRAW_MATRIX;
			var temp:int;
			var cubes:Vector.<Cube> = this.cubes;
			var numCubes:uint = cubes.length;
			for (var i:int; i < numCubes; ++i)
			{
				var cube:Cube = cubes[i];
 
				if (changeProgram)
				{
					context3D.setProgram(i & 1 ? program : program2);
				}
				else
				{
					temp = i & 1 ? 1 : 0;
					cube.dummyFunction();
				}
				if (changePositionBuffer)
				{
					context3D.setVertexBufferAt(0, i & 1 ? positionsBuffer : positionsBuffer2, 0, FLOAT_3);
				}
				else
				{
					temp = i & 1 ? 1 : 0;
					cube.dummyFunction();
				}
				if (changeTexCoordBuffer)
				{
					context3D.setVertexBufferAt(1, i & 1 ? texCoordsBuffer : texCoordsBuffer2, 0, FLOAT_2);
				}
				else
				{
					temp = i & 1 ? 1 : 0;
					cube.dummyFunction();
				}
				if (changeTexture)
				{
					context3D.setTextureAt(0, i & 1 ? texture : texture2);
				}
				else
				{
					temp = i & 1 ? 1 : 0;
					cube.dummyFunction();
				}
 
				cube.mat.copyToMatrix3D(drawMatrix);
				drawMatrix.prepend(worldToClip);
				context3D.setProgramConstantsFromMatrix(
					Context3DProgramType.VERTEX,
					0,
					drawMatrix,
					false
				);
				context3D.drawTriangles(indexBuffer, 0, 12);
			}
 
			context3D.present();
 
			// Update frame rate display
			frameCount++;
			var now:int = getTimer();
			var dTime:int = now - lastFrameTime;
			var elapsed:int = now - lastFPSUpdateTime;
			if (elapsed > 1000)
			{
				var framerateValue:Number = 1000 / (elapsed / frameCount);
				fps.text = "FPS: " + framerateValue.toFixed(1);
				lastFPSUpdateTime = now;
				frameCount = 0;
			}
			lastFrameTime = now;
		}
	}
}
import flash.display.Shape;
import flash.display.Sprite;
import flash.events.MouseEvent;
import flash.geom.*;
import flash.text.TextField;
import flash.text.TextFieldAutoSize;
import flash.text.TextFormat;
 
class Cube
{
	public var mat:Matrix3D;
 
	public function Cube(x:Number, y:Number, z:Number)
	{
		mat = new Matrix3D(
			new <Number>[
				1, 0, 0, x,
				0, 1, 0, y,
				0, 0, 1, z,
				0, 0, 0, 1
			]
		);
	}
 
	public function dummyFunction(): void { }
}

Run the test app

I ran this test app in the following environment:

  • Flex SDK (MXMLC) 4.5.1.21328, compiling in release mode (no debugging or verbose stack traces)
  • Release version of Flash Player 11.2.202.229
  • 2.4 GHz Intel Core i5
  • NVIDIA GeForce GT 330M 256 MB
  • Mac OS X 10.7.3

It is very important to remember that this is just one possible testing environment. Other environments will vary a lot compared to this one. For example, consider Windows machines running DirectX 9, iOS and Android devices running OpenGL ES 2.0, and desktops of all sorts running software rendering and you’ll have some idea of just how vast the performance landscape is. Even so, the following results were gathered with a real-world machine (a Mid-2010 MacBook Pro), so they will very likely apply to your users if you are targeting the desktop.
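Because results depend so heavily on the renderer, it can help to record which renderer produced a given FPS figure. The test app already displays context3D.driverInfo; the following is a minimal sketch of classifying that string. The DriverCheck class and its isSoftware function are hypothetical names, and the check assumes that a software renderer reports the word "software" somewhere in driverInfo.

```actionscript
package
{
	/** Hypothetical helper, not part of the test app */
	public class DriverCheck
	{
		/**
		 * Guess whether a Context3D.driverInfo string describes software
		 * rendering. Assumption: software renderers include the word
		 * "software" in the string.
		 */
		public static function isSoftware(driverInfo:String): Boolean
		{
			return driverInfo != null
				&& driverInfo.toLowerCase().indexOf("software") != -1;
		}
	}
}
```

Calling DriverCheck.isSoftware(context3D.driverInfo) after the context is created would let you tag each set of results as hardware or software before comparing them.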

That said, here are the results I got:

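(x = changed between every draw, - = not changed)

Shader Program  Position Buffer  Tex Coord Buffer  Texture  FPS
-               -                -                 -        58.2
x               -                -                 -        26.6
-               x                -                 -        49.3
x               x                -                 -        23.6
-               -                x                 -        49.3
x               -                x                 -        23.6
-               x                x                 -        46.7
x               x                x                 -        23.3
-               -                -                 x        44.1
x               -                -                 x        25.3
-               x                -                 x        30.5
x               x                -                 x        22.3
-               -                x                 x        37.6
x               -                x                 x        22.4
-               x                x                 x        36.1
x               x                x                 x        22.1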

Here we see a clear order of performance impact:

  1. Shader Program (54% performance loss)
  2. Texture (24% performance loss)
  3. Vertex buffer (15% performance loss)
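As a worked check, the percentages above can be reproduced from the FPS figures: 58.2 with no state changes, 26.6 when only the shader program changes, 44.1 when only the texture changes, and 49.3 when only one vertex buffer changes. The sketch below does that arithmetic. The PerformanceLoss class and its function names are made up for illustration, and the per-draw estimate assumes the whole frame-time difference comes from the 15x15x15 = 3375 extra set* calls, which the test does not strictly prove.

```actionscript
package
{
	import flash.display.Sprite;

	/** Hypothetical helper for checking the arithmetic, not part of the test app */
	public class PerformanceLoss extends Sprite
	{
		private static const BASELINE_FPS:Number = 58.2;
		private static const NUM_DRAWS:int = 15 * 15 * 15; // 3375 cubes

		public function PerformanceLoss()
		{
			trace("Program: " + lossPercent(26.6).toFixed(1) + "%"); // 54.3%
			trace("Texture: " + lossPercent(44.1).toFixed(1) + "%"); // 24.2%
			trace("Buffer:  " + lossPercent(49.3).toFixed(1) + "%"); // 15.3%

			// Estimated extra milliseconds per draw call when changing programs
			trace("Program ms/draw: " + extraMsPerDraw(26.6));
		}

		/** Percentage of the baseline frame rate lost at the given frame rate */
		private static function lossPercent(fps:Number): Number
		{
			return 100 * (BASELINE_FPS - fps) / BASELINE_FPS;
		}

		/** Extra frame time divided evenly across all draw calls */
		private static function extraMsPerDraw(fps:Number): Number
		{
			return (1000 / fps - 1000 / BASELINE_FPS) / NUM_DRAWS;
		}
	}
}
```

Looking at frame time rather than frame rate, changing the shader program adds roughly 20 ms per frame here, or on the order of 6 microseconds per draw call on this particular machine.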

Further, we see that state changes are always cumulative. No state change subsumes another, so every state change you avoid is good for performance. Just because you’re already changing one part of the state does not mean that you can stop trying to avoid changing another part.

In conclusion, state changes play a major role in the performance of your Stage3D-based app. It is probably worth spending a sizable amount of CPU time to avoid changing the GPU state unnecessarily.
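One way to spend that CPU time is to remember what is currently bound and skip set* calls that would bind the same resource again. The following is a minimal sketch of that idea, not something the test app does. StateCache is a hypothetical class, and the article's test does not measure whether re-setting an identical resource costs as much as switching to a different one, so treat the benefit as an assumption to verify in your own app.

```actionscript
package
{
	import flash.display3D.Context3D;
	import flash.display3D.Program3D;
	import flash.display3D.textures.TextureBase;

	/**
	 * Hypothetical helper (not part of the test app): tracks the program
	 * and textures last bound through it and skips redundant set* calls.
	 */
	public class StateCache
	{
		/** Number of texture samplers tracked (fs0 through fs7) */
		private static const NUM_SAMPLERS:int = 8;

		private var context3D:Context3D;
		private var curProgram:Program3D;
		private var curTextures:Vector.<TextureBase> =
			new Vector.<TextureBase>(NUM_SAMPLERS, true);

		public function StateCache(context3D:Context3D)
		{
			this.context3D = context3D;
		}

		/** Bind the program only if it differs from the one last bound */
		public function setProgram(program:Program3D): void
		{
			if (program != curProgram)
			{
				context3D.setProgram(program);
				curProgram = program;
			}
		}

		/** Bind the texture only if it differs from the one last bound */
		public function setTextureAt(sampler:int, texture:TextureBase): void
		{
			if (curTextures[sampler] != texture)
			{
				context3D.setTextureAt(sampler, texture);
				curTextures[sampler] = texture;
			}
		}

		/** Forget everything, e.g. after the context is recreated */
		public function reset(): void
		{
			curProgram = null;
			for (var i:int = 0; i < NUM_SAMPLERS; ++i)
			{
				curTextures[i] = null;
			}
		}
	}
}
```

A cache like this only pays off when consecutive draws actually share state, so it works best after sorting your draw list by shader program first and texture second, following the order of impact measured above. All set* calls for the cached state must go through the cache, or it will fall out of sync with the GPU.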

Spot a bug? Have a suggestion or a question? Post a comment!