Loops With int and uint
AS3 has two integer types: `int` and `uint`. In my experience, most AS3 programmers just use `int` everywhere and ignore `uint`. This is usually acceptable, as the need for unsigned integers is rare compared to their signed counterparts. However, the choice can affect performance. Read on for the impact of `uint` on your loops.

The original version of this article’s performance test contained a small-but-critical error that led to a lot of incorrect analysis and results. This version of the article has been corrected.
One common `uint` you’ll run across in AS3 is the `length` property of `Array` and `Vector`. While you should almost always cache this expensive getter call as a local variable, should you cache it as an `int` or a `uint`? To preserve compatibility with lengths greater than `int.MAX_VALUE` (2,147,483,647, or about 2 billion), you should keep it as a `uint` to avoid using the sign bit. This case is extremely rare though, as an `Array` or `Vector` that large takes up at least 2 GB of RAM. The more important question is which will run faster.
With speed in mind, I’ve designed a small performance test to see the performance in various scenarios:
- A `for` loop that pre-increments (`++x`) an `int` iterator
- A `for` loop that post-increments (`x++`) an `int` iterator
- A `for` loop that pre-increments (`++x`) a `uint` iterator
- A `for` loop that post-increments (`x++`) a `uint` iterator
- A `for` loop that adds one (`x+=1`) to an `int` iterator
- A `for` loop that adds one (`x+=1`) to a `uint` iterator
- A `for` loop that compares an `int` iterator with a `uint` check (i.e. `int < uint`)
- A `for` loop that compares a `uint` iterator with an `int` check (i.e. `uint < int`)
- A `for` loop that post-decrements (`x--`) the length in the check part
- A `for` loop that pre-decrements (`--x`) the length in the counting part
```actionscript
package
{
    import flash.display.*;
    import flash.utils.*;
    import flash.text.*;

    public class IntUintLoops extends Sprite
    {
        public function IntUintLoops()
        {
            var logger:TextField = new TextField();
            logger.autoSize = TextFieldAutoSize.LEFT;
            addChild(logger);

            var beforeTime:int;
            var afterTime:int;
            var SIZEINT:int = 100000000;
            var SIZEUINT:uint = SIZEINT;
            var i:int;
            var u:uint;

            logger.text = "Method,Time\n";

            beforeTime = getTimer();
            for (i = 0; i < SIZEINT; ++i) { }
            afterTime = getTimer();
            logger.appendText("++int," + (afterTime-beforeTime) + "\n");

            beforeTime = getTimer();
            for (u = 0; u < SIZEUINT; ++u) { }
            afterTime = getTimer();
            logger.appendText("++uint," + (afterTime-beforeTime) + "\n");

            beforeTime = getTimer();
            for (i = 0; i < SIZEINT; i++) { }
            afterTime = getTimer();
            logger.appendText("int++," + (afterTime-beforeTime) + "\n");

            beforeTime = getTimer();
            for (u = 0; u < SIZEUINT; u++) { }
            afterTime = getTimer();
            logger.appendText("uint++," + (afterTime-beforeTime) + "\n");

            beforeTime = getTimer();
            for (i = 0; i < SIZEINT; i+=1) { }
            afterTime = getTimer();
            logger.appendText("int+=1," + (afterTime-beforeTime) + "\n");

            beforeTime = getTimer();
            for (u = 0; u < SIZEUINT; u+=1) { }
            afterTime = getTimer();
            logger.appendText("uint+=1," + (afterTime-beforeTime) + "\n");

            // Mixed types: int iterator compared against a uint limit
            beforeTime = getTimer();
            for (i = 0; i < SIZEUINT; i++) { }
            afterTime = getTimer();
            logger.appendText("int iterator w/ uint check," + (afterTime-beforeTime) + "\n");

            // Mixed types: uint iterator compared against an int limit
            beforeTime = getTimer();
            for (u = 0; u < SIZEINT; u++) { }
            afterTime = getTimer();
            logger.appendText("uint iterator w/ int check," + (afterTime-beforeTime) + "\n");

            // The decrementing loops consume SIZEINT, so it is restored afterward
            beforeTime = getTimer();
            for (i = 0; SIZEINT--; i++) { }
            afterTime = getTimer();
            SIZEINT = SIZEUINT;
            logger.appendText("length decrement in check," + (afterTime-beforeTime) + "\n");

            beforeTime = getTimer();
            for (i = 0; SIZEINT; i++, --SIZEINT) { }
            afterTime = getTimer();
            SIZEINT = SIZEUINT;
            logger.appendText("length decrement in post," + (afterTime-beforeTime) + "\n");
        }
    }
}
```
I ran this performance test with the following environment:
- Flex SDK (MXMLC) 4.1.0.16076, compiling in release mode (no debugging or verbose stack traces)
- Release version of Flash Player 10.3.181.26
- 2.4 GHz Intel Core i5
- Mac OS X 10.6.8
And got these results:
Method | Time (ms) |
---|---|
++int | 222 |
++uint | 211 |
int++ | 215 |
uint++ | 211 |
int+=1 | 212 |
uint+=1 | 216 |
int iterator w/ uint check | 296 |
uint iterator w/ int check | 276 |
length decrement in check | 208 |
length decrement in post | 222 |
Here are the same results in graph form:
Below is the bytecode generated by the compiler for each test. There are significant variations among the tests, but the performance results don't show much, if any, difference. It is likely that the JIT is stepping in to optimize some of these tests, but it's hard to tell which. Of note is how "int iterator w/ uint check" is deceptively similar to "++int" and "int++" (only the local read for the comparison differs), even though it consistently runs about one third slower!
```
++int
    pushbyte 0
    setlocal 6
    jump L1
L2:
    label
    inclocal_i 6
L1:
    getlocal 6
    getlocal 4
    iflt L2

++uint
    pushbyte 0
    convert_u
    setlocal 7
    jump L3
L4:
    label
    getlocal 7
    increment
    convert_u
    setlocal 7
L3:
    getlocal 7
    getlocal 5
    iflt L4

int++
    pushbyte 0
    setlocal 6
    jump L5
L6:
    label
    inclocal_i 6
L5:
    getlocal 6
    getlocal 4
    iflt L6

uint++
    pushbyte 0
    convert_u
    setlocal 7
    jump L7
L8:
    label
    getlocal 7
    increment
    convert_u
    setlocal 7
L7:
    getlocal 7
    getlocal 5
    iflt L8

int+=1
    pushbyte 0
    setlocal 6
    jump L9
L10:
    label
    getlocal 6
    pushbyte 1
    add
    convert_i
    setlocal 6
L9:
    getlocal 6
    getlocal 4
    iflt L10

uint+=1
    pushbyte 0
    convert_u
    setlocal 7
    jump L11
L12:
    label
    getlocal 7
    pushbyte 1
    add
    convert_u
    setlocal 7
L11:
    getlocal 7
    getlocal 5
    iflt L12

int iterator w/ uint check
    pushbyte 0
    setlocal 6
    jump L13
L14:
    label
    inclocal_i 6
L13:
    getlocal 6
    getlocal 5
    iflt L14

uint iterator w/ int check
    pushbyte 0
    convert_u
    setlocal 7
    jump L15
L16:
    label
    getlocal 7
    increment
    convert_u
    setlocal 7
L15:
    getlocal 7
    getlocal 4
    iflt L16

length decrement in check
    pushbyte 0
    setlocal 6
    jump L17
L18:
    label
    inclocal_i 6
L17:
    getlocal 4
    dup
    decrement_i
    setlocal 4
    iftrue L18

length decrement in post
    pushbyte 0
    setlocal 6
    jump L19
L20:
    label
    inclocal_i 6
    getlocal 4
    decrement_i
    setlocal 4
L19:
    getlocal 4
    iftrue L20
```
In any case, the mixed loops (`int` iterator with `uint` check or vice versa) perform noticeably worse than any of the other loops: 276 to 296 ms versus 208 to 222 ms, or roughly 25% to 40% slower. The bytecode does not explain the gap, since each mixed loop differs from its unmixed counterpart only in which local it reads for the `iflt` comparison. The extra cost presumably comes from how the VM handles a comparison between an `int` and a `uint` at runtime.
As for the others, there isn't much difference. Remember that this test runs 100 million iterations of each loop, so differences of a couple of milliseconds are likely just statistical noise.
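One way to judge whether a small gap is noise is to repeat each measurement and compare the minimum and average times. The following is only a sketch of that idea, not part of the original test: the class name and the trial count of 5 are assumptions, and only the `++int` loop is shown.

```actionscript
package
{
    import flash.display.Sprite;
    import flash.utils.getTimer;

    public class RepeatedTimingExample extends Sprite
    {
        public function RepeatedTimingExample()
        {
            var TRIALS:int = 5;         // hypothetical trial count
            var SIZE:int = 100000000;   // same iteration count as the test above
            var best:int = int.MAX_VALUE;
            var total:int = 0;

            for (var t:int = 0; t < TRIALS; ++t)
            {
                var before:int = getTimer();
                for (var i:int = 0; i < SIZE; ++i) { }
                var elapsed:int = getTimer() - before;

                total += elapsed;
                if (elapsed < best)
                {
                    best = elapsed; // keep the fastest trial
                }
            }

            trace("++int min=" + best + " ms, avg=" + (total / TRIALS) + " ms");
        }
    }
}
```

If the spread between trials of the same loop is as large as the gap between two different loops, that gap is probably not meaningful.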
In conclusion, feel free to use either `int` or `uint` in your loops, but be careful not to mix them by having an `int` iterator and a `uint` length check or vice versa. Happy looping!
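As a rough illustration of keeping the types matched, the sketch below loops over a hypothetical `values` vector two ways: once with an `int` iterator and an `int` length, and once with a `uint` iterator and a `uint` length. The class name and data are made up for the example.

```actionscript
package
{
    import flash.display.Sprite;

    public class MatchedTypesExample extends Sprite
    {
        public function MatchedTypesExample()
        {
            // Hypothetical data, for illustration only
            var values:Vector.<Number> = new <Number>[1.5, 2.5, 3.0];
            var sum:Number = 0;

            // Option 1: int iterator with an int length
            // (the uint length is converted once, outside the loop)
            var lenInt:int = int(values.length);
            for (var i:int = 0; i < lenInt; ++i)
            {
                sum += values[i];
            }

            // Option 2: uint iterator with a uint length
            var lenUint:uint = values.length;
            for (var u:uint = 0; u < lenUint; ++u)
            {
                sum += values[u];
            }

            trace(sum); // 14
        }
    }
}
```

Either option avoids the `int < uint` and `uint < int` comparisons that were slower in the test above.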
#1 by skyboy on June 20th, 2011 ·
There are also a couple other methods to loop:
`for (var i:int; len--; ++i);`
`for (var i:int; len; ++i, --len);`
They both use the same number of variables and will have different performances compared to using a .. comparison. I also suspect that performance will vary more dramatically between int/uint.
I’m not sure how you’re getting `inclocal_i` for the uint, as I’m relatively certain that method will convert a uint to an int internally, leading to bugs elsewhere. The bytecode I’ve gotten from the compiler was always getlocal/increment opcode(s)/setlocal. From a short performance test I performed in the debug player, I noticed the multiple-opcode method took less time than the single-opcode method; I’m relatively certain nJIT is getting in on it there, though.
#2 by jackson on June 20th, 2011 ·
The article had a major bug in it: the `uint` variable was declared as an `int`! I probably should have named it something that didn’t “read” like “uint”:

Anyhow, I’ve corrected the error and updated the article with all new results, analysis, and conclusions. I’ve also included your two test cases, though they didn’t make much difference. As for the bytecode, it makes much more sense now that `u` is a `uint`. :)

Thank you very much for pointing this out. It could have very well been one of those articles where no one tells me I have a critical error for an entire year…
#3 by skyboy on June 20th, 2011 ·
Wow, that’s surprising. Either nJIT is optimizing away the float increment to an int increment or the increment op function checks for float/int types and reacts accordingly.
Though, you did miss two cases: length being a uint while decrementing it. Based on my own tests, there should be a staggering difference with:
`for (var i:int, len:uint = ...; len--; ++i);`
compared to:
`for (var i:int, len:int = ...; len--; ++i);`
#4 by jackson on June 20th, 2011 ·
OK, I ran those too on the same test machine:
The “length decrement in check” version is apparently quite horrible when the length is a `uint`. Here’s the bytecode:

And there you have it: a conversion through float at the `convert_d` line. This never appears in any other version, so it’s strange that it appears here.

#5 by Azzatai on June 28th, 2011 ·
Running your test code on my machine I noticed that my times for the +=1 operations are in line with all the other loops. I think you must have mistaken the int+=1,214 to be 1214, but it is 214 because the 1 comes from your string. :)
#6 by jackson on June 28th, 2011 ·
I think you’re right that I mistook the “1” from “+=1” as a thousands digit. :O
I re-ran the test, updated the results table and graph, and removed the erroneous bit of analysis at the end warning to not use `+=1`.

Thanks for letting me know about this!
#7 by Epsz on June 30th, 2011 ·
If you want to know when the JIT comes into play, you should do incrementally larger tests and graph the resulting time (size of the tests on the Y axis, time on the X). There should be a visible leap in the curve when/if the JIT starts doing its thing.