Stage3D Draw Calls: Part 2
Today’s article shows you how to get great performance with a ton of sprites by reducing your Stage3D draw calls. As we saw last time, Stage3D performance is not guaranteed to be good: it fell significantly below normal 2D Stage performance, even on tasks that are expensive for the 2D Stage, like scaling and rotating Bitmap objects and redraw regions covering the whole stage. Today we’ll show how to overcome those performance problems and beat the tar out of the 2D Stage.
As you recall from the previous article in this series, draw calls (i.e. Context3D.drawTriangles) interrupt the GPU’s ability to efficiently process huge batches of geometry. The Stage3DSprite class was simple to implement and use, but it required one draw call per sprite, which led to performance far inferior to regular Bitmap objects on the 2D Stage.
The new design centers around a class called Stage3DSprites which, as its name suggests, represents a bunch of sprites. You call Stage3DSprites.addSprite and are given a Stage3DSpriteData object back which has the familiar x, y, rotation, scaleX, and scaleY getters and setters, just like all DisplayObject derivatives (including Bitmap and Sprite). When you want to draw these sprites, simply call Stage3DSprites.render and all of them will be drawn in one giant call to Context3D.drawTriangles, thus (hopefully) giving huge performance gains. Here are the classes and resources involved:
- Stage3DSprite now caches textures on a per-context basis to more efficiently use VRAM
```actionscript
package {
    import com.adobe.utils.*;
    import flash.display.*;
    import flash.display3D.*;
    import flash.display3D.textures.*;
    import flash.geom.*;
    import flash.utils.*;

    /**
     * A Stage3D-based 2D sprite
     * @author Jackson Dunstan, www.JacksonDunstan.com
     */
    public class Stage3DSprite {
        /** Cached static lookup of Context3DVertexBufferFormat.FLOAT_2 */
        private static const FLOAT2_FORMAT:String = Context3DVertexBufferFormat.FLOAT_2;

        /** Cached static lookup of Context3DVertexBufferFormat.FLOAT_3 */
        private static const FLOAT3_FORMAT:String = Context3DVertexBufferFormat.FLOAT_3;

        /** Cached static lookup of Context3DProgramType.VERTEX */
        private static const VERTEX_PROGRAM:String = Context3DProgramType.VERTEX;

        /** Cached static lookup of Vector3D.Z_AXIS */
        private static const Z_AXIS:Vector3D = Vector3D.Z_AXIS;

        /** Temporary AGAL assembler to avoid allocation */
        private static const tempAssembler:AGALMiniAssembler = new AGALMiniAssembler();

        /** Temporary rectangle to avoid allocation */
        private static const tempRect:Rectangle = new Rectangle();

        /** Temporary point to avoid allocation */
        private static const tempPoint:Point = new Point();

        /** Temporary matrix to avoid allocation */
        private static const tempMatrix:Matrix = new Matrix();

        /** Temporary 3D matrix to avoid allocation */
        private static const tempMatrix3D:Matrix3D = new Matrix3D();

        /** Cache of Program3D per Context3D */
        private static const programsCache:Dictionary = new Dictionary(true);

        /** Cache of positions and texture coordinates VertexBuffer3D per Context3D */
        private static const posUVCache:Dictionary = new Dictionary(true);

        /** Cache of triangles IndexBuffer3D per Context3D */
        private static const trisCache:Dictionary = new Dictionary(true);

        /** Cache of texture Dictionary (BitmapData->Texture) per Context3D */
        private static const textureCache:Dictionary = new Dictionary(true);

        /** Vertex shader program AGAL bytecode */
        private static var vertexProgram:ByteArray;

        /** Fragment shader program AGAL bytecode */
        private static var fragmentProgram:ByteArray;

        /** 3D context to use for drawing */
        public var ctx:Context3D;

        /** Texture of the sprite */
        public var texture:Texture;

        /** Width of the created texture */
        public var textureWidth:uint;

        /** Height of the created texture */
        public var textureHeight:uint;

        /** X position of the sprite */
        public var x:Number = 0;

        /** Y position of the sprite */
        public var y:Number = 0;

        /** Rotation of the sprite in degrees */
        public var rotation:Number = 0;

        /** Scale in the X direction */
        public var scaleX:Number = 1;

        /** Scale in the Y direction */
        public var scaleY:Number = 1;

        /** UV scale constants (uploaded to vc4): U scale, V scale, {unused}, {unused} */
        private var fragConsts:Vector.<Number> = new <Number>[1, 1, 1, 1];

        // Static initializer to create vertex and fragment programs
        {
            tempAssembler.assemble(
                Context3DProgramType.VERTEX,
                // Apply draw matrix (object -> clip space)
                "m44 op, va0, vc0\n" +
                // Scale texture coordinate and copy to varying
                "mov vt0, va1\n" +
                "div vt0.xy, vt0.xy, vc4.xy\n" +
                "mov v0, vt0\n"
            );
            vertexProgram = tempAssembler.agalcode;

            tempAssembler.assemble(
                Context3DProgramType.FRAGMENT,
                "tex oc, v0, fs0 <2d,linear,mipnone,clamp>"
            );
            fragmentProgram = tempAssembler.agalcode;
        }

        /**
         * Make the sprite
         * @param ctx 3D context to use for drawing
         */
        public function Stage3DSprite(ctx:Context3D): void {
            this.ctx = ctx;
            if (!(ctx in trisCache)) {
                // Create the shader program
                var program:Program3D = ctx.createProgram();
                program.upload(vertexProgram, fragmentProgram);
                programsCache[ctx] = program;

                // Create the positions and texture coordinates vertex buffer
                var posUV:VertexBuffer3D = ctx.createVertexBuffer(4, 5);
                posUV.uploadFromVector(
                    new <Number>[
                        // X,  Y, Z, U, V
                        -1, -1, 0, 0, 1,
                        -1,  1, 0, 0, 0,
                         1,  1, 0, 1, 0,
                         1, -1, 0, 1, 1
                    ], 0, 4
                );
                posUVCache[ctx] = posUV;

                // Create the triangles index buffer
                var tris:IndexBuffer3D = ctx.createIndexBuffer(6);
                tris.uploadFromVector(
                    new <uint>[
                        0, 1, 2,
                        2, 3, 0
                    ], 0, 6
                );
                trisCache[ctx] = tris;
            }
        }

        /**
         * Clear cache of a context
         * @param ctx Context to clear cache for
         */
        public static function clearCache(ctx:Context3D): void {
            delete trisCache[ctx];
            delete programsCache[ctx];
            delete posUVCache[ctx];
            delete textureCache[ctx];
        }

        /**
         * Set a BitmapData to use as a texture
         * @param bmd BitmapData to use as a texture
         */
        public function set bitmapData(bmd:BitmapData): void {
            // Maybe it's already cached
            if (ctx in textureCache) {
                if (bmd in textureCache[ctx]) {
                    texture = textureCache[ctx][bmd];
                    return;
                }
            } else {
                textureCache[ctx] = new Dictionary();
            }

            var width:uint = bmd.width;
            var height:uint = bmd.height;

            // Create a new texture if we need to
            if (createTexture(width, height)) {
                // If the new texture doesn't match the BitmapData's dimensions
                if (width != textureWidth || height != textureHeight) {
                    // Create a BitmapData with the required dimensions
                    var powOfTwoBMD:BitmapData = new BitmapData(
                        textureWidth,
                        textureHeight,
                        bmd.transparent
                    );

                    // Copy the given BitmapData to the newly-created BitmapData
                    tempRect.width = width;
                    tempRect.height = height;
                    powOfTwoBMD.copyPixels(bmd, tempRect, tempPoint);

                    // Upload the newly-created BitmapData instead
                    bmd = powOfTwoBMD;

                    // Scale the UV to the sub-texture
                    fragConsts[0] = textureWidth / width;
                    fragConsts[1] = textureHeight / height;
                } else {
                    // Reset UV scaling
                    fragConsts[0] = 1;
                    fragConsts[1] = 1;
                }
            }

            // Upload new BitmapData to the texture
            texture.uploadFromBitmapData(bmd);
            textureCache[ctx][bmd] = texture;
        }

        /**
         * Create the texture to fit the given dimensions
         * @param width Width to fit
         * @param height Height to fit
         * @return If a new texture had to be created
         */
        protected function createTexture(width:uint, height:uint): Boolean {
            width = nextPowerOfTwo(width);
            height = nextPowerOfTwo(height);

            if (!texture || textureWidth != width || textureHeight != height) {
                texture = ctx.createTexture(
                    width,
                    height,
                    Context3DTextureFormat.BGRA,
                    false
                );
                textureWidth = width;
                textureHeight = height;
                return true;
            }
            return false;
        }

        /**
         * Render the sprite to the 3D context
         */
        public function render(): void {
            tempMatrix3D.identity();
            tempMatrix3D.appendRotation(-rotation, Z_AXIS);
            tempMatrix3D.appendScale(scaleX, scaleY, 1);
            tempMatrix3D.appendTranslation(x, y, 0);

            ctx.setProgram(programsCache[ctx]);
            ctx.setTextureAt(0, texture);
            ctx.setProgramConstantsFromMatrix(VERTEX_PROGRAM, 0, tempMatrix3D, true);
            ctx.setProgramConstantsFromVector(VERTEX_PROGRAM, 4, fragConsts);
            ctx.setVertexBufferAt(0, posUVCache[ctx], 0, FLOAT3_FORMAT);
            ctx.setVertexBufferAt(1, posUVCache[ctx], 3, FLOAT2_FORMAT);
            ctx.drawTriangles(trisCache[ctx]);
        }

        /**
         * Dispose of this sprite's resources
         */
        public function dispose(): void {
            if (texture) {
                texture.dispose();
                texture = null;
            }
        }

        /**
         * Get the next-highest power of two
         * @param v Value to get the next-highest power of two from
         * @return The next-highest power of two from the given value
         */
        public static function nextPowerOfTwo(v:uint): uint {
            v--;
            v |= v >> 1;
            v |= v >> 2;
            v |= v >> 4;
            v |= v >> 8;
            v |= v >> 16;
            v++;
            return v;
        }
    }
}
```
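The power-of-two padding done by the bitmapData setter above can be checked in isolation. The following is a minimal sketch that copies the nextPowerOfTwo bit trick and derives the value stored in fragConsts; the class name PowerOfTwoExample and the uvDivisor helper are hypothetical names introduced only for illustration and are not part of the article's classes.

```actionscript
package {
    /**
     * Hypothetical helper (not one of the article's classes) showing how a
     * non-power-of-two image is padded and how its UV divisor is derived.
     */
    public class PowerOfTwoExample {
        /** Same bit trick as Stage3DSprite.nextPowerOfTwo */
        public static function nextPowerOfTwo(v:uint): uint {
            v--;            // so exact powers of two map to themselves
            v |= v >> 1;    // smear the highest set bit into every lower bit
            v |= v >> 2;
            v |= v >> 4;
            v |= v >> 8;
            v |= v >> 16;
            v++;            // all-ones plus one is the next power of two
            return v;
        }

        /**
         * Value the setter stores in fragConsts for one axis. The vertex
         * shader divides U (or V) by it so that 0..1 covers only the
         * original image inside the padded texture.
         */
        public static function uvDivisor(imageSize:uint): Number {
            return nextPowerOfTwo(imageSize) / imageSize;
        }
    }
}
```

For a 100-pixel-wide image, nextPowerOfTwo(100) is 128 and uvDivisor(100) is 1.28, so a U coordinate of 1 becomes 1 / 1.28 = 0.78125, which is 100 / 128. The 16×16 and 256×256 test images are already powers of two, so their divisor stays at 1.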
- Stage3DSprites represents a collection of Stage3DSpriteData sprites and draws them all at once (i.e. one draw call)

```actionscript
package {
    import com.adobe.utils.*;
    import flash.display.*;
    import flash.display3D.*;
    import flash.display3D.textures.*;
    import flash.geom.*;
    import flash.utils.*;

    /**
     * Stage3D-based 2D sprites
     * @author Jackson Dunstan, www.JacksonDunstan.com
     */
    public class Stage3DSprites {
        /** Cached static lookup of Context3DVertexBufferFormat.FLOAT_3 */
        private static const FLOAT3_FORMAT:String = Context3DVertexBufferFormat.FLOAT_3;

        /** Cached static lookup of Context3DVertexBufferFormat.FLOAT_4 */
        private static const FLOAT4_FORMAT:String = Context3DVertexBufferFormat.FLOAT_4;

        /** Temporary AGAL assembler to avoid allocation */
        private static const tempAssembler:AGALMiniAssembler = new AGALMiniAssembler();

        /** Shader program for this collection's Context3D */
        private var program:Program3D;

        /** Vertex shader program AGAL bytecode */
        private static var vertexProgram:ByteArray;

        /** Fragment shader program AGAL bytecode */
        private static var fragmentProgram:ByteArray;

        /** 3D context to use for drawing */
        public var ctx:Context3D;

        /** 3D texture to use for drawing */
        public var texture:Texture;

        /** Width of the created texture */
        public var textureWidth:uint;

        /** Height of the created texture */
        public var textureHeight:uint;

        /** Fragment shader constants: U scale, V scale, {unused}, {unused} */
        private var fragConsts:Vector.<Number> = new <Number>[1, 1, 1, 1];

        /** Data about the contained sprites */
        private var spriteData:Vector.<Stage3DSpriteData> = new <Stage3DSpriteData>[];

        /** Number of sprites */
        private var numSprites:int;

        /** Triangle index data */
        private var indexData:Vector.<uint> = new <uint>[];

        /** Vertex data for all sprites */
        private var vertexData:Vector.<Number> = new <Number>[];

        /** Vertex buffer for all sprites */
        private var vertexBuffer:VertexBuffer3D;

        /** Index buffer for all sprites */
        private var indexBuffer:IndexBuffer3D;

        /** If the vertex and index buffers need to be uploaded */
        public var needUpload:Boolean;

        // Static initializer to create vertex and fragment programs
        {
            // VA
            //   0 - posX, posY, posZ, {unused}
            //   1 - U, V, translationX, translationY
            //   2 - scaleX, scaleY, cos(rotation), sin(rotation)
            // VC
            // FC
            // V
            //   0 - U, V, {unused}, {unused}
            tempAssembler.assemble(
                Context3DProgramType.VERTEX,
                // Initial position
                "mov vt0, va0\n" +
                // Rotate (about Z, like this...)
                //   x' = x*cos(rot) - y*sin(rot)
                //   y' = x*sin(rot) + y*cos(rot)
                // Both products are computed before vt0 is overwritten so
                // that y' is built from the original x rather than from x'
                "mul vt1.xy, vt0.xy, va2.zw\n" + // x*cos(rot), y*sin(rot)
                "mul vt2.xy, vt0.xy, va2.wz\n" + // x*sin(rot), y*cos(rot)
                "sub vt0.x, vt1.x, vt1.y\n" +    // x*cos(rot) - y*sin(rot)
                "add vt0.y, vt2.x, vt2.y\n" +    // x*sin(rot) + y*cos(rot)
                // Scale
                "mul vt0.xy, vt0.xy, va2.xy\n" +
                // Translate
                "add vt0.xy, vt0.xy, va1.zw\n" +
                // Output position
                "mov op, vt0\n" +
                // Copy texture coordinate to varying
                "mov v0, va1\n"
            );
            vertexProgram = tempAssembler.agalcode;

            tempAssembler.assemble(
                Context3DProgramType.FRAGMENT,
                "tex oc, v0, fs0 <2d,linear,mipnone,clamp>"
            );
            fragmentProgram = tempAssembler.agalcode;
        }

        /**
         * Make the sprites collection
         * @param ctx 3D context to use for drawing
         */
        public function Stage3DSprites(ctx:Context3D) {
            this.ctx = ctx;

            // Create the shader program
            program = ctx.createProgram();
            program.upload(vertexProgram, fragmentProgram);
        }

        /**
         * Set a BitmapData to use as a texture
         * @param bmd BitmapData to use as a texture
         */
        public function set bitmapData(bmd:BitmapData): void {
            // Create a new texture if we need to
            var width:int = bmd.width;
            var height:int = bmd.height;
            if (!texture || textureWidth != width || textureHeight != height) {
                texture = ctx.createTexture(
                    width,
                    height,
                    Context3DTextureFormat.BGRA,
                    false
                );
                textureWidth = width;
                textureHeight = height;
            }

            // Upload new BitmapData to the texture
            texture.uploadFromBitmapData(bmd);
        }

        /**
         * Add a sprite
         * @return The added sprite
         */
        public function addSprite(): Stage3DSpriteData {
            if (vertexBuffer) {
                vertexBuffer.dispose();
                indexBuffer.dispose();
            }

            // Add the triangle indices for the sprite
            indexData.length += 6;
            var index:int = numSprites*6;
            var base:int = numSprites*4;
            indexData[index++] = base;
            indexData[index++] = base+1;
            indexData[index++] = base+2;
            indexData[index++] = base+2;
            indexData[index++] = base+3;
            indexData[index++] = base;

            // Add the sprite (4 vertices of 11 floats each)
            vertexData.length += 44;
            var spr:Stage3DSpriteData = new Stage3DSpriteData(
                vertexData,
                numSprites*44,
                numSprites,
                this
            );
            spriteData.push(spr);

            numSprites++;

            vertexBuffer = ctx.createVertexBuffer(numSprites*4, 11);
            indexBuffer = ctx.createIndexBuffer(numSprites*6);
            needUpload = true;

            return spr;
        }

        /**
         * Remove all added sprites
         */
        public function removeAllSprites(): void {
            numSprites = 0;
            vertexData.length = 0;
            indexData.length = 0;
            spriteData.length = 0;
            if (vertexBuffer) {
                vertexBuffer.dispose();
                indexBuffer.dispose();
                // Clear the references so the buffers are not disposed twice
                vertexBuffer = null;
                indexBuffer = null;
            }
        }

        /**
         * Render the sprites to the 3D context
         */
        public function render(): void {
            if (numSprites) {
                for each (var data:Stage3DSpriteData in spriteData) {
                    data.update();
                }

                if (needUpload) {
                    vertexBuffer.uploadFromVector(vertexData, 0, numSprites*4);
                    indexBuffer.uploadFromVector(indexData, 0, numSprites*6);
                    needUpload = false;
                }

                // shader program for all sprites
                ctx.setProgram(program);

                // texture of all sprites
                ctx.setTextureAt(0, texture);

                // x, y, z, {unused}
                ctx.setVertexBufferAt(0, vertexBuffer, 0, FLOAT3_FORMAT);

                // u, v, translationX, translationY
                ctx.setVertexBufferAt(1, vertexBuffer, 3, FLOAT4_FORMAT);

                // scaleX, scaleY, cos(rotation), sin(rotation)
                ctx.setVertexBufferAt(2, vertexBuffer, 7, FLOAT4_FORMAT);

                // draw all sprites
                ctx.drawTriangles(indexBuffer, 0, numSprites*2);
            }
        }

        /**
         * Dispose of this collection's resources
         */
        public function dispose(): void {
            if (texture) {
                texture.dispose();
                texture = null;
            }
        }
    }
}
```
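To see what the Stage3DSprites vertex shader is meant to compute for each corner of a quad, the rotate, scale, translate sequence described in its comments can be written out on the CPU. This is only a sketch for checking values by hand: the class name SpriteTransformExample and the transformCorner function are hypothetical and not part of the article's classes, and rotation is assumed to be in radians because the sprite data passes it straight to Math.cos and Math.sin.

```actionscript
package {
    import flash.geom.Point;

    /**
     * Hypothetical CPU-side reference for the per-vertex math described in
     * the Stage3DSprites vertex shader comments.
     */
    public class SpriteTransformExample {
        /**
         * Transform one quad corner
         * @param x Corner X in object space (-1 or 1 for the unit quad)
         * @param y Corner Y in object space (-1 or 1 for the unit quad)
         * @param rotation Rotation about Z in radians (assumed unit)
         * @param scaleX Scale in the X direction
         * @param scaleY Scale in the Y direction
         * @param tx Translation in X (clip space)
         * @param ty Translation in Y (clip space)
         * @return The corner's clip-space position
         */
        public static function transformCorner(
            x:Number, y:Number,
            rotation:Number,
            scaleX:Number, scaleY:Number,
            tx:Number, ty:Number
        ): Point {
            var c:Number = Math.cos(rotation);
            var s:Number = Math.sin(rotation);

            // Rotate: both results use the original x and y
            var rx:Number = x*c - y*s;
            var ry:Number = x*s + y*c;

            // Scale, then translate
            return new Point(rx*scaleX + tx, ry*scaleY + ty);
        }
    }
}
```

With no rotation, transformCorner(1, 1, 0, 0.5, 0.5, 0.25, 0) gives the point (0.75, 0.5): the corner is halved by the scale and then shifted right by the translation. Because scaling is applied after rotation, a non-uniform scale stretches along the screen axes rather than along the sprite's own axes.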
- Stage3DSpriteData represents a single sprite in a Stage3DSprites collection of sprites. It has accessors familiar to users of DisplayObject

```actionscript
package {
    /**
     * A Stage3D-based sprite
     * @author Jackson Dunstan
     */
    public class Stage3DSpriteData {
        /** Vertex data for all sprites */
        private var __vertexData:Vector.<Number>;

        /** Index into the vertex data where the sprite's data is stored */
        private var __vertexDataIndex:int;

        /** Index of this sprite in the sprites list */
        private var __spriteIndex:int;

        /** Sprites collection this sprite is in */
        private var __sprites:Stage3DSprites;

        /** X position of the sprite */
        private var __x:Number = 0;

        /** Y position of the sprite */
        private var __y:Number = 0;

        /** Rotation of the sprite in radians (passed directly to Math.cos/Math.sin) */
        private var __rotation:Number = 0;

        /** Scale in the X direction */
        private var __scaleX:Number = 1;

        /** Scale in the Y direction */
        private var __scaleY:Number = 1;

        /** If the transform data needs updating */
        private var __needsUpdate:Boolean = true;

        /**
         * Make the sprite data
         * @param vertexData Vertex data for all sprites
         * @param vertexDataIndex Index into the vertex data where the
         *                        sprite's data is stored
         * @param spriteIndex Index of this sprite in the sprites list
         * @param sprites Sprites collection this sprite is in
         */
        public function Stage3DSpriteData(
            vertexData:Vector.<Number>,
            vertexDataIndex:int,
            spriteIndex:int,
            sprites:Stage3DSprites
        ) {
            __vertexData = vertexData;
            __vertexDataIndex = vertexDataIndex;
            __spriteIndex = spriteIndex;
            __sprites = sprites;

            // Add the position and UV for the first vertex
            vertexData[vertexDataIndex++] = -1; // x
            vertexData[vertexDataIndex++] = -1; // y
            vertexData[vertexDataIndex++] = 0;  // z
            vertexData[vertexDataIndex++] = 0;  // u
            vertexData[vertexDataIndex++] = 1;  // v
            vertexDataIndex += 6; // skip transform data

            // Add the position and UV for the second vertex
            vertexData[vertexDataIndex++] = -1; // x
            vertexData[vertexDataIndex++] = 1;  // y
            vertexData[vertexDataIndex++] = 0;  // z
            vertexData[vertexDataIndex++] = 0;  // u
            vertexData[vertexDataIndex++] = 0;  // v
            vertexDataIndex += 6; // skip transform data

            // Add the position and UV for the third vertex
            vertexData[vertexDataIndex++] = 1;  // x
            vertexData[vertexDataIndex++] = 1;  // y
            vertexData[vertexDataIndex++] = 0;  // z
            vertexData[vertexDataIndex++] = 1;  // u
            vertexData[vertexDataIndex++] = 0;  // v
            vertexDataIndex += 6; // skip transform data

            // Add the position and UV for the fourth vertex
            vertexData[vertexDataIndex++] = 1;  // x
            vertexData[vertexDataIndex++] = -1; // y
            vertexData[vertexDataIndex++] = 0;  // z
            vertexData[vertexDataIndex++] = 1;  // u
            vertexData[vertexDataIndex++] = 1;  // v
        }

        /**
         * X position of the sprite
         */
        public function get x(): Number {
            return __x;
        }
        public function set x(x:Number): void {
            __x = x;
            __needsUpdate = true;
        }

        /**
         * Y position of the sprite
         */
        public function get y(): Number {
            return __y;
        }
        public function set y(y:Number): void {
            __y = y;
            __needsUpdate = true;
        }

        /**
         * Rotation of the sprite in radians
         */
        public function get rotation(): Number {
            return __rotation;
        }
        public function set rotation(rotation:Number): void {
            __rotation = rotation;
            __needsUpdate = true;
        }

        /**
         * Scale in the X direction
         */
        public function get scaleX(): Number {
            return __scaleX;
        }
        public function set scaleX(scaleX:Number): void {
            __scaleX = scaleX;
            __needsUpdate = true;
        }

        /**
         * Scale in the Y direction
         */
        public function get scaleY(): Number {
            return __scaleY;
        }
        public function set scaleY(scaleY:Number): void {
            __scaleY = scaleY;
            __needsUpdate = true;
        }

        /**
         * Write the sprite's transform into the shared vertex data and tell
         * the sprites collection that it needs to be uploaded again
         */
        public function update(): void {
            if (__needsUpdate) {
                var cosRotation:Number = Math.cos(__rotation);
                var sinRotation:Number = Math.sin(__rotation);

                // Transform data for the first vertex
                var vertexDataIndex:int = __vertexDataIndex+5; // skip x, y, z, u, v
                __vertexData[vertexDataIndex++] = __x;
                __vertexData[vertexDataIndex++] = __y;
                __vertexData[vertexDataIndex++] = __scaleX;
                __vertexData[vertexDataIndex++] = __scaleY;
                __vertexData[vertexDataIndex++] = cosRotation;
                __vertexData[vertexDataIndex++] = sinRotation;

                // Transform data for the second vertex
                vertexDataIndex += 5; // skip x, y, z, u, v
                __vertexData[vertexDataIndex++] = __x;
                __vertexData[vertexDataIndex++] = __y;
                __vertexData[vertexDataIndex++] = __scaleX;
                __vertexData[vertexDataIndex++] = __scaleY;
                __vertexData[vertexDataIndex++] = cosRotation;
                __vertexData[vertexDataIndex++] = sinRotation;

                // Transform data for the third vertex
                vertexDataIndex += 5; // skip x, y, z, u, v
                __vertexData[vertexDataIndex++] = __x;
                __vertexData[vertexDataIndex++] = __y;
                __vertexData[vertexDataIndex++] = __scaleX;
                __vertexData[vertexDataIndex++] = __scaleY;
                __vertexData[vertexDataIndex++] = cosRotation;
                __vertexData[vertexDataIndex++] = sinRotation;

                // Transform data for the fourth vertex
                vertexDataIndex += 5; // skip x, y, z, u, v
                __vertexData[vertexDataIndex++] = __x;
                __vertexData[vertexDataIndex++] = __y;
                __vertexData[vertexDataIndex++] = __scaleX;
                __vertexData[vertexDataIndex++] = __scaleY;
                __vertexData[vertexDataIndex++] = cosRotation;
                __vertexData[vertexDataIndex++] = sinRotation;

                __sprites.needUpload = true;
                __needsUpdate = false;
            }
        }
    }
}
```
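The index arithmetic in Stage3DSpriteData follows from the interleaved layout that Stage3DSprites binds with setVertexBufferAt: 11 floats per vertex (x, y, z, u, v, translationX, translationY, scaleX, scaleY, cos, sin) and four vertices per sprite. A small sketch of that offset calculation, using a hypothetical VertexLayoutExample class that is not part of the article's code:

```actionscript
package {
    /**
     * Hypothetical helper spelling out the interleaved vertex layout shared
     * by Stage3DSprites and Stage3DSpriteData.
     */
    public class VertexLayoutExample {
        /** x, y, z, u, v, tx, ty, scaleX, scaleY, cos, sin */
        public static const FLOATS_PER_VERTEX:int = 11;

        /** One quad per sprite */
        public static const VERTICES_PER_SPRITE:int = 4;

        /** Floats before the transform data (x, y, z, u, v) */
        public static const TRANSFORM_OFFSET:int = 5;

        /** Index of the first float of a sprite in the shared vertex data */
        public static function spriteOffset(spriteIndex:int): int {
            return spriteIndex * VERTICES_PER_SPRITE * FLOATS_PER_VERTEX;
        }

        /** Index of translationX for one vertex (0-3) of a sprite */
        public static function transformOffset(spriteIndex:int, vertex:int): int {
            return spriteOffset(spriteIndex)
                + vertex * FLOATS_PER_VERTEX
                + TRANSFORM_OFFSET;
        }
    }
}
```

For example, the third sprite (index 2) starts at float 88, and the transform data of its second vertex (index 1) starts at float 104. The same transform values are repeated for all four vertices because each vertex is processed independently by the vertex shader, which is the cost this design pays for needing only one draw call.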
- Stage3DSpriteSpeed2 is the performance test app for Stage3DSprite, Bitmap, and Stage3DSprites’ ability to draw lots of sprites. It now allows you to switch between hardware and software rendering, use a 16×16 texture or a 256×256 texture, and draw the Stage3D-based sprites multiple times/iterations. Sprites are now capped at 4000 since Stage3DSprite will create too many vertex buffers beyond that.

```actionscript
package {
    import flash.display.*;
    import flash.display3D.*;
    import flash.events.*;
    import flash.filters.*;
    import flash.geom.*;
    import flash.text.*;
    import flash.utils.*;

    public class Stage3DSpriteSpeed2 extends Sprite {
        private static const TWO_PI:Number = 2*Math.PI;
        private static const MODE_STAGE3DSPRITE:int = 1;
        private static const MODE_BITMAP:int = 2;
        private static const MODE_STAGE3DSPRITES:int = 3;

        [Embed(source="flash_logo_icon.jpg")]
        private static const TEXTURE_ICON:Class;

        [Embed(source="flash_logo.jpg")]
        private static const TEXTURE_LARGE:Class;

        private var context3D:Context3D;

        private var stats:TextField = new TextField();
        private var lastStatsUpdateTime:uint;
        private var lastFrameTime:uint;
        private var frameCount:uint;
        private var driver:TextField = new TextField();
        private var modeText:TextField = new TextField();

        private var textureIcon:BitmapData = (new TEXTURE_ICON() as Bitmap).bitmapData;
        private var textureLarge:BitmapData = (new TEXTURE_LARGE() as Bitmap).bitmapData;

        private var sprites3D:Vector.<Stage3DSprite> = new <Stage3DSprite>[];
        private var spritesBitmap:Vector.<Bitmap> = new <Bitmap>[];
        private var sprites3DData:Vector.<Stage3DSpriteData> = new <Stage3DSpriteData>[];
        private var sprites3DBatch:Stage3DSprites;

        private var mode:int = MODE_STAGE3DSPRITE;
        private var texture:BitmapData = textureIcon;
        private var moving:Boolean = true;
        private var rotating:Boolean = true;
        private var scaling:Boolean = true;
        private var numSprites:int = 4000;
        private var container:Sprite = new Sprite();
        private var iterations:int = 10;

        [SWF(width="800", height="600", frameRate="60")]
        public function Stage3DSpriteSpeed2() {
            stage.align = StageAlign.TOP_LEFT;
            stage.scaleMode = StageScaleMode.NO_SCALE;
            stage.frameRate = 60;

            addChild(container);

            // Setup UI
            stats.background = true;
            stats.backgroundColor = 0xffffffff;
            stats.autoSize = TextFieldAutoSize.LEFT;
            stats.text = "Getting FPS...";
            addChild(stats);

            driver.background = true;
            driver.backgroundColor = 0xffffffff;
            driver.text = "Getting driver...";
            driver.autoSize = TextFieldAutoSize.LEFT;
            driver.y = stats.height;
            addChild(driver);

            modeText.background = true;
            modeText.backgroundColor = 0xffffffff;
            modeText.text = "Mode: Stage3DSprite";
            modeText.autoSize = TextFieldAutoSize.LEFT;
            modeText.y = driver.y + driver.height;
            addChild(modeText);

            makeButtons(
                "Mode: Stage3DSprite", "Mode: Bitmap", "Mode: Stage3DSprites", null,
                "Texture: 16x16", "Texture: 256x256", null,
                "Rendering: Hardware", "Rendering: Software", null,
                "Iterations: 1", "Iterations: 2", "Iterations: 5", "Iterations: 10", null,
                "Add 100 Sprites", "Remove 100 Sprites", null,
                "Disable Moving", "Disable Rotating", "Disable Scaling"
            );

            addEventListener(Event.ENTER_FRAME, onEnterFrame);

            getContext(Context3DRenderMode.AUTO);
        }

        private function getContext(mode:String): void {
            Stage3DSprite.clearCache(context3D);
            context3D = null;
            var stage3D:Stage3D = stage.stage3Ds[0];
            stage3D.addEventListener(Event.CONTEXT3D_CREATE, onContextCreated);
            stage3D.requestContext3D(mode);
        }

        private function onContextCreated(ev:Event): void {
            // Setup context
            var stage3D:Stage3D = stage.stage3Ds[0];
            stage3D.removeEventListener(Event.CONTEXT3D_CREATE, onContextCreated);
            context3D = stage3D.context3D;
            context3D.configureBackBuffer(
                stage.stageWidth,
                stage.stageHeight,
                0,
                true
            );

            driver.text = "Driver: " + context3D.driverInfo;

            sprites3DBatch = new Stage3DSprites(context3D);

            makeSprites();
        }

        private function makeButtons(...labels): void {
            const PAD:Number = 5;

            var curX:Number = PAD;
            var curY:Number = stage.stageHeight - PAD;
            for each (var label:String in labels) {
                // A null label starts a new row of buttons
                if (!label) {
                    curX = PAD;
                    curY -= button.height + PAD;
                    continue;
                }

                var tf:TextField = new TextField();
                tf.mouseEnabled = false;
                tf.selectable = false;
                tf.defaultTextFormat = new TextFormat("_sans", 16, 0x0071BB);
                tf.autoSize = TextFieldAutoSize.LEFT;
                tf.text = label;
                tf.name = "lbl";

                var button:Sprite = new Sprite();
                button.buttonMode = true;
                button.graphics.beginFill(0xF5F5F5);
                button.graphics.drawRect(0, 0, tf.width+PAD, tf.height+PAD);
                button.graphics.endFill();
                button.graphics.lineStyle(1);
                button.graphics.drawRect(0, 0, tf.width+PAD, tf.height+PAD);
                button.addChild(tf);
                button.addEventListener(MouseEvent.CLICK, onButton);
                if (curX + button.width > stage.stageWidth - PAD) {
                    curX = PAD;
                    curY -= button.height + PAD;
                }
                button.x = curX;
                button.y = curY - button.height;
                addChild(button);

                curX += button.width + PAD;
            }
        }

        private function makeSprites(): void {
            // Clear old sprites
            context3D.clear(0.5, 0.5, 0.5);
            context3D.present();
            Stage3DSprite.clearCache(context3D);
            sprites3D.length = 0;
            spritesBitmap.length = 0;
            container.removeChildren();
            sprites3DBatch.removeAllSprites();
            sprites3DData.length = 0;

            // Make new sprites
            var i:int;
            switch (mode) {
                case MODE_STAGE3DSPRITE:
                    var scale:Number = texture.width / stage.stageWidth;
                    for (; i < numSprites; ++i) {
                        var spr3D:Stage3DSprite = new Stage3DSprite(context3D);
                        spr3D.bitmapData = texture;
                        spr3D.x = Math.random()*2-1;
                        spr3D.y = Math.random()*2-1;
                        spr3D.scaleX = spr3D.scaleY = scale;
                        sprites3D[i] = spr3D;
                    }
                    break;
                case MODE_BITMAP:
                    for (; i < numSprites*iterations; ++i) {
                        var bm:Bitmap = new Bitmap(texture);
                        bm.x = Math.random()*stage.stageWidth;
                        bm.y = Math.random()*stage.stageHeight;
                        spritesBitmap[i] = bm;
                        container.addChild(bm);
                    }
                    break;
                case MODE_STAGE3DSPRITES:
                    sprites3DBatch.bitmapData = texture;
                    scale = texture.width / stage.stageWidth;
                    for (; i < numSprites; ++i) {
                        var sprData:Stage3DSpriteData = sprites3DBatch.addSprite();
                        sprData.x = Math.random()*2-1;
                        sprData.y = Math.random()*2-1;
                        sprData.scaleX = sprData.scaleY = scale;
                        sprites3DData[i] = sprData;
                    }
                    break;
            }

            // Reset FPS
            frameCount = 0;
            lastFrameTime = 0;
            lastStatsUpdateTime = getTimer();
        }

        private function onButton(ev:MouseEvent): void {
            var tf:TextField = ev.target.getChildByName("lbl");
            var lbl:String = tf.text;
            switch (lbl) {
                case "Mode: Stage3DSprite":
                    modeText.text = lbl;
                    context3D.setVertexBufferAt(2, null); // clear
                    mode = MODE_STAGE3DSPRITE;
                    makeSprites();
                    break;
                case "Mode: Bitmap":
                    modeText.text = lbl;
                    context3D.setVertexBufferAt(2, null); // clear
                    mode = MODE_BITMAP;
                    makeSprites();
                    break;
                case "Mode: Stage3DSprites":
                    modeText.text = lbl;
                    context3D.setVertexBufferAt(2, null); // clear
                    mode = MODE_STAGE3DSPRITES;
                    makeSprites();
                    break;
                case "Texture: 16x16":
                    texture = textureIcon;
                    makeSprites();
                    break;
                case "Texture: 256x256":
                    texture = textureLarge;
                    makeSprites();
                    break;
                case "Rendering: Hardware":
                    getContext(Context3DRenderMode.AUTO);
                    break;
                case "Rendering: Software":
                    getContext(Context3DRenderMode.SOFTWARE);
                    break;
                case "Iterations: 1":
                    iterations = 1;
                    break;
                case "Iterations: 2":
                    iterations = 2;
                    break;
                case "Iterations: 5":
                    iterations = 5;
                    break;
                case "Iterations: 10":
                    iterations = 10;
                    break;
                case "Add 100 Sprites":
                    if (numSprites < 4000) {
                        numSprites += 100;
                        makeSprites();
                    }
                    break;
                case "Remove 100 Sprites":
                    if (numSprites) {
                        numSprites -= 100;
                        makeSprites();
                    }
                    break;
                case "Enable Moving":
                    moving = true;
                    tf.text = "Disable Moving";
                    break;
                case "Disable Moving":
                    moving = false;
                    tf.text = "Enable Moving";
                    break;
                case "Enable Rotating":
                    rotating = true;
                    tf.text = "Disable Rotating";
                    break;
                case "Disable Rotating":
                    rotating = false;
                    tf.text = "Enable Rotating";
                    break;
                case "Enable Scaling":
                    scaling = true;
                    tf.text = "Disable Scaling";
                    break;
                case "Disable Scaling":
                    scaling = false;
                    tf.text = "Enable Scaling";
                    break;
            }
        }

        private function onEnterFrame(ev:Event): void {
            // Render the scene
            switch (mode) {
                case MODE_STAGE3DSPRITE:
                    if (context3D) {
                        for (var i:int; i < iterations; ++i) {
                            var spr3D:Stage3DSprite;
                            context3D.clear(0.5, 0.5, 0.5);
                            for each (spr3D in sprites3D) {
                                spr3D.render();
                            }
                            if (moving) {
                                for each (spr3D in sprites3D) {
                                    spr3D.x = Math.random()*2-1;
                                    spr3D.y = Math.random()*2-1;
                                }
                            }
                            if (rotating) {
                                for each (spr3D in sprites3D) {
                                    spr3D.rotation = 360*Math.random();
                                }
                            }
                            if (scaling) {
                                var baseScale:Number = texture.width / stage.stageWidth;
                                for each (spr3D in sprites3D) {
                                    spr3D.scaleX = baseScale*Math.random();
                                    spr3D.scaleY = baseScale*Math.random();
                                }
                            }
                            context3D.present();
                        }
                    }
                    break;
                case MODE_BITMAP:
                    var dispObj:DisplayObject;
                    if (moving) {
                        var stageWidth:Number = stage.stageWidth;
                        var stageHeight:Number = stage.stageHeight;
                        for each (dispObj in spritesBitmap) {
                            dispObj.x = Math.random()*stageWidth;
                            dispObj.y = Math.random()*stageHeight;
                        }
                    }
                    if (rotating) {
                        for each (dispObj in spritesBitmap) {
                            dispObj.rotation = 360*Math.random();
                        }
                    }
                    if (scaling) {
                        for each (dispObj in spritesBitmap) {
                            dispObj.scaleX = Math.random();
                            dispObj.scaleY = Math.random();
                        }
                    }
                    break;
                case MODE_STAGE3DSPRITES:
                    if (context3D) {
                        for (i = 0; i < iterations; ++i) {
                            var sprData:Stage3DSpriteData;
                            context3D.clear(0.5, 0.5, 0.5);
                            if (moving) {
                                for each (sprData in sprites3DData) {
                                    sprData.x = Math.random()*2-1;
                                    sprData.y = Math.random()*2-1;
                                }
                            }
                            if (rotating) {
                                for each (sprData in sprites3DData) {
                                    sprData.rotation = TWO_PI*Math.random();
                                }
                            }
                            if (scaling) {
                                baseScale = texture.width / stage.stageWidth;
                                for each (sprData in sprites3DData) {
                                    sprData.scaleX = baseScale*Math.random();
                                    sprData.scaleY = baseScale*Math.random();
                                }
                            }
                            sprites3DBatch.render();
                            context3D.present();
                        }
                    }
                    break;
            }

            // Update stats display
            frameCount++;
            var now:int = getTimer();
            var dTime:int = now - lastFrameTime;
            var elapsed:int = now - lastStatsUpdateTime;
            if (elapsed > 1000) {
                var framerateValue:Number = 1000 / (elapsed / frameCount);
                stats.text = "FPS: " + framerateValue.toFixed(4)
                    + ", Sprites: " + numSprites
                    + " x " + iterations + " iterations = "
                    + (numSprites*iterations) + " total";
                lastStatsUpdateTime = now;
                frameCount = 0;
            }
            lastFrameTime = now;
        }
    }
}
```
- Images Used: flash_logo_icon.jpg (16×16), flash_logo.jpg (256×256)
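Stripped of the test app's UI, using the batched API comes down to three steps: create a Stage3DSprites for a context, add sprites and keep the returned Stage3DSpriteData handles, then call render once per frame. The sketch below assumes the classes listed above are compiled alongside it and that a Context3D has already been created and configured; the BatchUsageExample class and its method names are hypothetical and exist only for this illustration.

```actionscript
package {
    import flash.display.BitmapData;
    import flash.display3D.Context3D;

    /**
     * Hypothetical minimal client of Stage3DSprites. Assumes the article's
     * Stage3DSprites and Stage3DSpriteData classes are available.
     */
    public class BatchUsageExample {
        /** The batch that owns the shared texture and buffers */
        private var batch:Stage3DSprites;

        /** Handles used to move, rotate, and scale individual sprites */
        private var sprites:Vector.<Stage3DSpriteData> = new <Stage3DSpriteData>[];

        /**
         * @param context3D An already-configured 3D context
         * @param texture Image shared by every sprite; its dimensions should
         *                be powers of two since Stage3DSprites does not pad
         * @param count Number of sprites to add
         */
        public function BatchUsageExample(
            context3D:Context3D,
            texture:BitmapData,
            count:int
        ) {
            batch = new Stage3DSprites(context3D);
            batch.bitmapData = texture;

            for (var i:int = 0; i < count; ++i) {
                var spr:Stage3DSpriteData = batch.addSprite();
                spr.x = Math.random()*2 - 1; // clip space: -1 to 1
                spr.y = Math.random()*2 - 1;
                spr.scaleX = spr.scaleY = 0.05;
                sprites.push(spr);
            }
        }

        /** Spin every sprite a little and draw them all with one draw call */
        public function drawFrame(): void {
            for each (var spr:Stage3DSpriteData in sprites) {
                spr.rotation += 0.1; // radians, as update() uses Math.cos/sin
            }

            batch.ctx.clear(0.5, 0.5, 0.5);
            batch.render();
            batch.ctx.present();
        }
    }
}
```

Calling drawFrame from an ENTER_FRAME handler is enough to animate the batch. Note that addSprite recreates the vertex and index buffers every time it is called, so it is best to add all sprites up front rather than during rendering.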
Launch the test app
I ran the test app in the following environment:
- Flex SDK (MXMLC) 4.5.1.21328, compiling in release mode (no debugging or verbose stack traces)
- Release version of Flash Player 11.1.102.62
- 2.8 GHz Intel Xeon W3530
- NVIDIA GeForce 9600 GT
- Windows 7
Here are the common settings for the tests:
- Sprites: 4000
- Iterations: 10
- Move: Yes
- Scale: Yes
- Rotate: Yes
And here are some of the results that I found:
Mode | Rendering | Texture | FPS |
---|---|---|---|
Stage3DSprite | Hardware | 16×16 | 10 |
Stage3DSprite | Hardware | 256×256 | 10 |
Stage3DSprite | Software | 16×16 | 0.6 |
Stage3DSprite | Software | 256×256 | 0.2 |
Bitmap | n/a | 16×16 | 4.5 |
Bitmap | n/a | 256×256 | 0.1 |
Stage3DSprites | Hardware | 16×16 | 37 |
Stage3DSprites | Hardware | 256×256 | 36 |
Stage3DSprites | Software | 16×16 | 15 |
Stage3DSprites | Software | 256×256 | 0.6 |
As you can see, the new Stage3DSprites class has a massive impact on performance by reducing the draw calls from 4,000 to 1 per iteration (40,000 to 10 per frame at 10 iterations). Hardware-accelerated performance goes from an unplayable 10 FPS to a smooth 36-37 FPS. Software-rendering performance goes from 0.2-0.6 FPS to 0.6-15 FPS, the latter of which is actually playable even with 40,000 moving, rotating, and scaling sprites. To refresh our memory of what performance we could expect with Flash Player 10.x, consider how Bitmap performs: a dismal 0.1-4.5 FPS. Truly, this is the real power of Stage3D.
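The draw call counts behind these numbers are simple to work out from the test settings. As a rough sketch (the DrawCallCountExample class is hypothetical and only restates the arithmetic of the test app's render loops):

```actionscript
package {
    /**
     * Hypothetical helper restating how many Context3D.drawTriangles calls
     * each approach issues per frame in the test app.
     */
    public class DrawCallCountExample {
        /** Stage3DSprite: one draw call per sprite, per iteration */
        public static function perSpriteDrawCalls(numSprites:int, iterations:int): int {
            return numSprites * iterations;
        }

        /** Stage3DSprites: one draw call per iteration, however many sprites */
        public static function batchedDrawCalls(iterations:int): int {
            return iterations;
        }
    }
}
```

With the common settings of 4000 sprites and 10 iterations, perSpriteDrawCalls(4000, 10) is 40000 while batchedDrawCalls(10) is 10, even though both approaches draw the same 40,000 quads per frame.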
If you read the above code closely you may have noticed a downside to the Stage3DSprites approach. Unlike with Bitmap or Stage3DSprite, you must use the same texture for all sprites. In the next article of this series we’ll tackle this problem to regain the lost flexibility while keeping performance high. Stay tuned!
Spot a bug? Have a suggestion? Post a comment!
#1 by AlexG on February 29th, 2012 ·
Great post!
I usually precache BitmapDatas: I rotate and scale them and put them in a cache. With BitmapData copyPixels() I achieve 50 FPS and more, which is a desirable range for games that also spend resources on things unrelated to rendering, like physics and so on. But I use no more than 1000 Sprites.
Stage3DSprites gives similar results even in software mode, but in hardware mode the processor is completely free. It’s a pity that Stage3D doesn’t support filters. Maybe shaders can achieve results similar to filters.
#2 by jackson on February 29th, 2012 ·
That sounds like it could be a fast way of rendering a lot of sprites, too. How many BitmapData objects do you cache? Since there’s an unlimited number of scales and rotations, your memory usage could balloon to huge proportions with a lot of rotations and scales, especially if you’re using multiple sprite textures/images. To compare, all three approaches in this article share a sprite texture/image so generating a cache of them is unnecessary.
As for shaders comparing to filters, I’d say it’s definitely possible to get a lot of the filter effects in shader code. Actually, it’s very similar to the existing PixelBender (2D)-based ShaderFilter.
#3 by AlexG on February 29th, 2012 ·
I am precaching fewer than 5000 BitmapDatas of around 100×100 pixels each and it uses a lot of memory, almost 1GB. I think for a modern PC that’s not so much, though I am looking for ways to avoid this use.
I use PixelBender, but even PixelBender doesn’t give very good performance. I made a blur filter in PB and it gives me around 35% more FPS, which is not so much in comparison with the native BlurFilter, and the quality is worse (better quality in PB gives even fewer frames than the native filter). So creating cool filters on the GPU would be very cool!
#4 by Matt Lockyer on March 6th, 2012 ·
See my comments on Part 3.
There’s no need to precache anything if you can pass the appropriate alpha and color multipliers into a vertexBuffer.
Email me if you need some source.