IL2CPP Slowdown (Partially Solved)
The new IL2CPP scripting backend in Unity 4.6.2 and 5.0 is supposed to be much faster than the old Mono backend. I ran some benchmarks, but mostly found slowdowns compared to Mono. Today’s article shows the tests I ran, the results I got, and wonders why the IL2CPP version seems so slow. Perhaps one of you, dear readers, knows the reason why. Update: Part of the reason why has been discovered. Read on for updated results.
All of the benchmarks in today’s test were taken from The Computer Language Benchmarks Game. You can download the source code for C# and many other languages from their site. I went through each benchmark and took the latest of each of them that had a C# version that worked in Unity. Here’s the list of tests I ended up with:
- binarytrees
- chameneosredux
- fannkuchredux
- fasta-2
- fastaredux
- knucleotide-3
- mandelbrot-3
- nbody-3
- regexdna-6
- revcomp-3
- spectralnorm-2
I then modified each version as little as I could. This included making a public `Main` method for each so it could be called from my benchmark runner, and replacing any writes to `Console` with writes to a `MemoryStream`. Feel free to download the resulting tests.
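To make those two changes concrete, here is a minimal sketch of what an adapted benchmark could look like. `ExampleBenchmark` and its summing loop are hypothetical stand-ins, not code from the Benchmarks Game; only the shape (a public `Main` taking `string[]`, with output going to a `MemoryStream` rather than `Console`) mirrors the description above.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stand-in for one adapted benchmark (not from the Benchmarks Game).
public static class ExampleBenchmark
{
    // Public entry point so a runner can store it as an Action<string[]>.
    public static void Main(string[] args)
    {
        Run(args);
    }

    // Does the work and returns how many bytes would have gone to the console.
    public static long Run(string[] args)
    {
        int n = args.Length > 0 ? int.Parse(args[0]) : 10;
        using (var stream = new MemoryStream())
        {
            var output = new StreamWriter(stream, Encoding.ASCII);
            long sum = 0;
            for (int i = 1; i <= n; i++)
            {
                sum += i;
            }
            output.Write(sum); // was: Console.Write(sum)
            output.Flush();    // push buffered text into the MemoryStream
            return stream.Length;
        }
    }
}
```

Writing to memory keeps the formatting work in the measurement while taking the console (and device logging) out of it.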
I then created a benchmark runner script that runs the tests in a background thread so as not to block the main UI thread. The main thread simply shows the test results report. The tests are run over and over with the program parameters specified in the `benchmarksgame/nanobench/makefiles/u32.ini` configuration file. The script is simply attached to the main camera `GameObject` in an otherwise empty Unity scene.
Here’s the benchmark runner source code:

    using System;
    using System.Diagnostics;
    using System.Text;
    using System.Threading;
    using UnityEngine;

    public class TestScript : MonoBehaviour
    {
        // One benchmark: its entry point, its arguments, and its accumulated time
        private class Test
        {
            public string Name;
            public Action<string[]> Main;
            public string[] Args;
            public long TotalTime;

            public Test(string name, Action<string[]> main, string[] args)
            {
                Name = name;
                Main = main;
                Args = args;
            }
        }

        private static readonly Test[] tests = new Test[]{
            new Test("BinaryTrees", BinaryTrees.Main, new string[]{"12"}),
            new Test("chameneosredux", chameneosredux.Main, new string[]{"60000"}),
            new Test("FannkuchRedux", FannkuchRedux.Main, new string[]{"10"}),
            new Test("Fasta", Fasta.Main, new string[]{"250000"}),
            new Test("FastaRedux", FastaRedux.Main, new string[]{"250000"}),
            new Test("knucleotide", knucleotide.Main, new string[]{"250000"}),
            new Test("MandelBrot", MandelBrot.Main, new string[]{"1000"}),
            new Test("NBody", NBody.Main, new string[]{"500000"}),
            new Test("regexdna", regexdna.Main, new string[]{"50000"}),
            new Test("revcomp", revcomp.Main, new string[]{"250000"}),
            new Test("SpectralNorm", SpectralNorm.Main, new string[]{"500"})
        };

        private Rect reportDrawArea;
        private string report;
        private object reportMutex;
        private Thread reportThread;

        void Awake()
        {
            reportDrawArea = new Rect(0, 0, Screen.width, Screen.height);
            reportMutex = new object();
            reportThread = new Thread(TestThread);
            reportThread.Start();
        }

        // Runs every test in a loop and publishes a CSV report of average times
        void TestThread()
        {
            var reportBuilder = new StringBuilder();
            var numRuns = 0;
            var stopwatch = new Stopwatch();
            while (true)
            {
                numRuns++;
                reportBuilder.Length = 0;
                reportBuilder.Append("Num Runs,");
                reportBuilder.Append(numRuns);
                reportBuilder.Append('\n');
                reportBuilder.Append("Test,Time\n");
                foreach (var test in tests)
                {
                    stopwatch.Reset();
                    stopwatch.Start();
                    test.Main(test.Args);
                    var elapsedMillis = stopwatch.ElapsedMilliseconds;
                    test.TotalTime += elapsedMillis;
                    // Average over all runs so far, in milliseconds
                    var averageMillis = test.TotalTime / (float)numRuns;
                    reportBuilder.Append(test.Name);
                    reportBuilder.Append(',');
                    reportBuilder.Append(averageMillis);
                    reportBuilder.Append('\n');
                }
                lock (reportMutex)
                {
                    report = reportBuilder.ToString();
                    UnityEngine.Debug.Log(report);
                }
            }
        }

        // Tapping the screen stops the benchmark thread
        void Update()
        {
            if (Input.GetMouseButtonDown(0))
            {
                reportThread.Abort();
            }
        }

        void OnApplicationQuit()
        {
            reportThread.Abort();
        }

        void OnGUI()
        {
            lock (reportMutex)
            {
                if (report != null)
                {
                    GUI.TextArea(reportDrawArea, report);
                }
            }
        }
    }

I then built the app for iOS using Mono and IL2CPP targeting ARMv7 and not using a “Development Build”. Here are the results I got for an iPad 3 running iOS 8.2.
Test | Mono Time (ms) | IL2CPP Time (ms) | IL2CPP Time (% of Mono) |
---|---|---|---|
BinaryTrees | 439.3529 | 360.2 | 81.98% |
chameneosredux | 1786 | 206821.3 | 11580.14% |
FannkuchRedux | 4650.824 | 26470.4 | 569.16% |
Fasta | 711.2941 | 3508 | 493.19% |
FastaRedux | 573.5883 | 4176.9 | 728.21% |
knucleotide | 18.76471 | 23.4 | 124.70% |
MandelBrot | 88.11765 | 157.4 | 178.62% |
NBody | 2210.294 | 4245 | 192.06% |
regexdna | 11.35294 | 13.5 | 118.91% |
revcomp | 0.05882353 | 0 | 0.00% |
SpectralNorm | 480.4706 | 866.1 | 180.26% |
The last column is the IL2CPP time as a percentage of the Mono time. Any result greater than 100% means IL2CPP took longer than Mono to complete the test. Anything less than 100% means it was faster than Mono.
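For reference, the percentages in the table are consistent with a simple ratio of the two average times. The sketch below shows that arithmetic; `TimeChange` and its method names are made up for illustration and are not part of the benchmark runner.

```csharp
using System;

// Hypothetical helper showing how the table's percentage column can be derived.
public static class TimeChange
{
    // IL2CPP time as a percentage of Mono time (100 means the same speed).
    public static double PercentOfMono(double monoMillis, double il2cppMillis)
    {
        return il2cppMillis / monoMillis * 100.0;
    }

    // How much longer IL2CPP took than Mono, in percent (negative means faster).
    public static double PercentSlower(double monoMillis, double il2cppMillis)
    {
        return PercentOfMono(monoMillis, il2cppMillis) - 100.0;
    }
}
```

With the BinaryTrees row, `PercentOfMono(439.3529, 360.2)` comes out at about 81.98. With the chameneosredux row, `PercentSlower(1786, 206821.3)` is about 11480, which is why "percent of Mono time" and "percent slower" always differ by exactly 100.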
As you can see, all but two tests are slower in IL2CPP compared to Mono. Those two tests were `BinaryTrees`, which took about 18% less time, and `revcomp`, which was so fast as to be unmeasurable.
The other tests all show slowdowns. These range from the very minor to the extreme. On the very minor side, `knucleotide` and `regexdna` are only about 20-25% slower. `MandelBrot`, `NBody`, and `SpectralNorm` are next-slowest at about 80-90% slower. Then things get really bad. `Fasta` and `FannkuchRedux` are roughly 390-470% slower, and `FastaRedux` is 628% slower. The slowness crown, by far, goes to `chameneosredux` at a whopping 11,480% slower, or about 116 times the Mono time.
As you can read repeatedly on the “Benchmarks Game” site, these should be taken with a grain of salt. They’re obviously not real apps or games running in Unity and differences can easily creep in. Then again, these 11 tests should be fairly representative of common programming tasks.
I expected to do this test and write an article showing how awesomely fast the new IL2CPP scripting backend in Unity 4.6.2 and 5.0 is. Unfortunately, the results I’m seeing are quite the opposite. I don’t know why that is, either. If you do, please let me know in the comments.
UPDATE
Thanks to Ralph Hauwert in the comments, part of the reason for the slowdown has been discovered. It turns out that you need to explicitly set Xcode to make a release build, even when “Development Build” has not been checked in the Unity build settings. To do so, click the name of the project, choose “Edit Scheme…”, and change “Build Configuration” to “Release”. By the way, I built the test using Xcode 6.2 on Mac OS X 10.10.2.
Here are the updated results:
Test | Mono Time (ms) | IL2CPP Debug Time (ms) | IL2CPP Debug Time (% of Mono) | IL2CPP Release Time (ms) | IL2CPP Release Time (% of Mono) |
---|---|---|---|---|---|
BinaryTrees | 439.3529 | 360.2 | 81.98% | 266.875 | 60.74% |
chameneosredux | 1786 | 206821.3 | 11580.14% | 116754.1 | 6537.18% |
FannkuchRedux | 4650.824 | 26470.4 | 569.16% | 5289.5 | 113.73% |
Fasta | 711.2941 | 3508 | 493.19% | 2421.375 | 340.42% |
FastaRedux | 573.5883 | 4176.9 | 728.21% | 3252.625 | 567.07% |
knucleotide | 18.76471 | 23.4 | 124.70% | 19.375 | 103.25% |
MandelBrot | 88.11765 | 157.4 | 178.62% | 99.625 | 113.06% |
NBody | 2210.294 | 4245 | 192.06% | 1053.25 | 47.65% |
regexdna | 11.35294 | 13.5 | 118.91% | 13.125 | 115.61% |
revcomp | 0.05882353 | 0 | 0.00% | 0 | 0.00% |
SpectralNorm | 480.4706 | 866.1 | 180.26% | 250.5 | 52.14% |
IL2CPP in release mode is indeed much faster than in debug mode, as is to be expected. `chameneosredux`, `Fasta`, and `FastaRedux` remain ludicrously slow, taking roughly 3.4x to 65x as long as with Mono, but the rest are much quicker. `knucleotide` is almost exactly the same speed as Mono, at about 3% slower. `FannkuchRedux`, `MandelBrot`, and `regexdna` are only about 13-16% slower.
But there are some bright spots with release mode, too. `NBody` and `SpectralNorm` are about twice as fast with IL2CPP as with Mono, and `BinaryTrees` is about 1.6x as fast. `revcomp` remains unmeasurably fast, whereas with Mono it registered an occasional millisecond of time.
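One likely reason `revcomp` shows up as 0 is that the runner reads `Stopwatch.ElapsedMilliseconds`, which is a whole number of milliseconds, so any run shorter than 1 ms is recorded as zero. A minimal sketch of a finer-grained measurement, assuming a hypothetical `FineTimer` helper that is not part of the original runner, could use `Stopwatch.Elapsed.TotalMilliseconds` instead:

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper; not part of the original benchmark runner.
public static class FineTimer
{
    // Times a single call and returns the elapsed time in fractional milliseconds.
    public static double MeasureMillis(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        // TotalMilliseconds is a double, so sub-millisecond runs are not truncated to 0.
        return stopwatch.Elapsed.TotalMilliseconds;
    }
}
```

The actual resolution still depends on the platform's timer (see `Stopwatch.Frequency` and `Stopwatch.IsHighResolution`), so this only removes the rounding, not the underlying limit. I have not re-run the tests this way.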
Given that the performance can vary from wildly slower to wildly faster when switching to IL2CPP, the performance you ultimately see will depend highly on the kind of code your app has. Don’t simply assume that your app will see an across-the-board speedup though.
If you’ve seen a significant speedup or slowdown in your real-world code, especially particular kinds of code, I’d be really interested to hear about it in the comments.
#1 by Ralph Hauwert on March 30th, 2015 ·
Jackson, these results don’t come close to what we have seen internally with similar benchmarks. Right now the only thing I can imagine is that you are not setting Xcode to release explicitly.
Please see the screenshots here on how to switch Xcode to release when using IL2CPP: http://blogs.unity3d.com/2015/01/29/unity-4-6-2-ios-64-bit-support/
Doing the same under Mono makes no difference, but under IL2CPP it makes a huge difference.
Could you confirm that you set it to release when running these benchmarks?
#2 by jackson on March 30th, 2015 ·
Ralph, thanks for pointing this out, especially so quickly. I’ve updated the article with my findings after setting Xcode to release mode. It is indeed much quicker than debug mode, but still not an across-the-board speedup. In many cases IL2CPP is quite a bit slower than Mono. Perhaps the benchmarks I’ve posted can be useful in optimizing IL2CPP for future releases.
#3 by Roy van Doorn on June 9th, 2015 ·
Setting Xcode to release is indeed way faster, thanks for the tip!
#4 by tonydongyiqi on July 5th, 2015 ·
Jackson, my project ran into a performance problem after switching to IL2CPP.
It dropped from 30 fps to 10 fps.
You can check my sample project on Bitbucket:
https://tonydongyiqi@bitbucket.org/tonydongyiqi/il2cppperformance.git
I hope you can find what causes the performance drop.
You can reach me by email: 233062069@qq.com
#5 by Paiman Roointan on September 1st, 2019 ·
it could be interesting to know how a game built with latest versions of Unity performs
#6 by jackson on September 2nd, 2019 ·
The article includes a download link and the test source code, so it should be pretty straightforward to test it on any version of Unity on any target hardware.