Inline Math.ceil() Part II
I’ve been looking at a lot of AVM2 bytecode recently with the excellent Nemo440 AIR app. Some of the code was using my inline Math.ceil() function and I noticed that the int() cast is implemented like any other function call. Today’s article will show you how to optimize the inline Math.ceil() call even further by avoiding this function call.
I’ve shown before how painful function calls are in AS3, so it’s only natural that we would like to avoid them wherever possible in performance-critical code. Given that there is simply a convert_i opcode that is used when assigning a Number to an int but not when explicitly calling int(), we have the framework to do the conversion. Firstly, let’s look at the inline Math.ceil() functions from last time:
function ceilPositiveOnly(value:Number): Number { return value == int(value) ? value : int(value+1); } function ceil(value:Number): Number { return value == int(value) ? value : value >= 0 ? int(value+1) : int(value); }
Both versions have either one or two calls to int(). So let’s re-write with the help of a temporary int variable:
function ceilPositiveOnly(value:Number): Number { var valueInt:int = value; if (value == valueInt) { return value; } else { valueInt = value + 1; return valueInt; } } function ceil(value:Number): Number { var valueInt:int = value; if (value == valueInt) { return value; } else if (value >= 0) { valueInt = value + 1; return valueInt; } else { valueInt = value; return valueInt; } }
Yes, this version is a good deal longer due to using if-else chains instead of ternary (a ? b : c) operators. Let’s see if it’s worth it with some tests. Firstly, let’s test for correctness to make sure the new version works:
package { import flash.text.*; import flash.utils.*; import flash.display.*; public class InlineCeilCorrectness extends Sprite { private var __logger:TextField; public function InlineCeilCorrectness() { __logger = new TextField(); __logger.autoSize = TextFieldAutoSize.LEFT; addChild(__logger); log("Positive only:"); var value:Number; var valueInt:int; var inlineResult:Number; var ceilResult:Number; for each (value in [1.1, 2.0]) { valueInt = value; if (value == valueInt) { inlineResult = value; } else { valueInt = value + 1; inlineResult = valueInt; } ceilResult = Math.ceil(value); log("\t" + value + ": " + inlineResult + " == " + ceilResult + ": " + (inlineResult == ceilResult ? "PASS" : "FAIL")); } log("Positive and negative:"); for each (value in [1.1, 2.0, -1.1, -2.0]) { valueInt = value; if (value == valueInt) { inlineResult = value; } else if (value >= 0) { valueInt = value + 1; inlineResult = valueInt; } else { valueInt = value; inlineResult = valueInt; } ceilResult = Math.ceil(value); log("\t" + value + ": " + inlineResult + " == " + ceilResult + ": " + (inlineResult == ceilResult ? "PASS" : "FAIL")); } } private function log(msg:*): void { __logger.appendText(msg + "\n"); } } }
We seem to get passing marks here:
Positive only: 1.1: 2 == 2: PASS 2: 2 == 2: PASS Positive and negative: 1.1: 2 == 2: PASS 2: 2 == 2: PASS -1.1: -1 == -1: PASS -2: -2 == -2: PASS
Now let’s see if all this work has paid off. Here is a test to measure the performance:
package { import flash.text.*; import flash.utils.*; import flash.display.*; public class InlineCeilPerformance extends Sprite { private var __logger:TextField; public function InlineCeilPerformance() { __logger = new TextField(); __logger.autoSize = TextFieldAutoSize.LEFT; addChild(__logger); const ITERATIONS:int = 50000000; var i:int; var res:Number; var before:int = getTimer(); var value:Number = 0.0; var valueInt:int; for (i = 0; i < ITERATIONS; ++i) { valueInt = value; if (value == valueInt) { res = value; } else if (value >= 0) { valueInt = value + 1; res = valueInt; } else { valueInt = value; res = valueInt; } value += 1.0; } log("Inline (positive only): " + (getTimer()-before)); before = getTimer(); value = 0.0; for (i = 0; i < ITERATIONS; ++i) { valueInt = value; if (value == valueInt) { res = value; } else if (value >= 0) { valueInt = value + 1; res = valueInt; } else { valueInt = value; res = valueInt; } value += 1.0; } log("Inline (positive and negative): " + (getTimer()-before)); before = getTimer(); value = 0.0; for (i = 0; i < ITERATIONS; ++i) { res = Math.ceil(value); value += 1.0; } log("Math.ceil(): " + (getTimer()-before)); } private function log(msg:*): void { __logger.appendText(msg + "\n"); } } }
For your reference, here are the results from last time:
Environment | Inline (positive) | Inline (positive and negative) | Math.ceil() |
---|---|---|---|
2.2 Ghz Intel Core 2 Duo, Mac OS X 10.6 | 869 | 844 | 1520 |
And with the new and improved version that doesn’t use int(), we get these results:
Environment | Inline (positive) | Inline (positive and negative) | Math.ceil() |
---|---|---|---|
2.2 Ghz Intel Core 2 Duo, Mac OS X 10.6 | 428 (+103%) | 429 (+97%) | 1520 |
Hooray! It seems as though we can successfully work around the slowness of the int() function call to achieve better performance in this case. We seem to be have taken the very slow Math.ceil(), inlined it to double the speed, and then removed int() to double the speed again. At this point we are four times faster than the built-in version! There are probably many other cases where avoiding int() would be a good idea. If you’re looking to speed up any performance-critical code, try searching through for any calls to int().
#1 by jpauclair on April 2nd, 2010 ·
Nice one Jackson!
One thing you have to keep in mind is that on more complex algorithm you can’t always have that “temporary int var” to convert all your stuff to int because it is using one more object in the registers and that at some point having this var can result in slower computing.
Still… 400% is quite an improvement!
#2 by jackson on April 2nd, 2010 ·
Could you provide an example where adding one more variable increases register count and results in a substantial slowdown? I know how that can easily be the case in C/C++, so it’d be really interesting to know more about this in AS3.
#3 by mousman on August 12th, 2010 ·
hello !
you’re work on optimisation is really great,
and i’m just discovering it !!!
what about this line :
return int(value+0.5)
#4 by mousman on August 12th, 2010 ·
sorry for the last comment,
I was meaning
int(value+0.9) (for positive values)
#5 by mousman on August 12th, 2010 ·
please !!!
forget about what I said and delete my comments ….
#6 by mousman on August 12th, 2010 ·
the last..; because I feel embarrassed by my precedents comments…
so ,
I have think beacause writting this time.. and made some tests using this code :
And it seems that the winner is the last part of the code.
Hope this time I was usefull !
#7 by jackson on August 12th, 2010 ·
It looks like something happened to your post since your for loops don’t have three parts. I see them as just
for (i = 0; i = 0)
, which doesn’t compile. The last one especially seems to be missing anif
to correspond with theelse
.#8 by mousman on August 13th, 2010 ·
hello.
yes a part was cut (maybe due to the less operator )
in fact it was the same loop you used
for (i = 0; i “less than” ITERATIONS; ++i)
I try again with encoded caracteres
(forgive my english… i’m french)
#9 by jackson on August 13th, 2010 ·
Thanks for resubmitting; I cleaned up the post to be more readable. Then I ran your version through the correctness test from the start of the article:
The reason for this error is that your version omits the check for whole numbers and therefore inappropriately adds 1 to the whole number 2.
#10 by mousman on August 13th, 2010 ·
yes I should think more before posting…….
I’m becoming obsessed by this function (I used it so many times) , knowing now (thanks to you) that it’s possible to optimize it I would have been pleased to add my stone….
anyway…sorry for the lost of time.
You’re doing a great job !!
And I promess to think twice before posting next time :-)
#11 by Jozef Chutka on November 17th, 2010 ·
Hi jackson, thank you for your work, 400% boost is impressive. However I wonder why you define your return value as Number when its clear that int is always returned. I have modifed your code into:
#12 by Jozef Chutka on November 17th, 2010 ·
without a function
#13 by jackson on November 17th, 2010 ·
I return
Number
because that’s what Flash’s version returns. Usually the return is an integer, but in some cases it isn’t. For example, if you takeMath.ceil(NaN)
the return value isNaN
, which isn’t anint
.#14 by dimuMurray on August 17th, 2011 ·
Is using the ternary operator viable for the optimized ceilPositiveOnly()?
It would make the code a little more compact with the following:
Would there be any impact on performance taking this route?
#15 by jackson on August 18th, 2011 ·
Sure, and the performance shouldn’t change by much, if any.
#16 by christiaan on February 22nd, 2012 ·
Create posts Jackson, looking forward to your flash player 11 version. Have already found the Math.Ceil to be fastest
#17 by benjamin guihaire on July 9th, 2013 ·
its possible to be much faster by using alchemy and some bit twiddling tricks
http://guihaire.com/code/?p=161