The new IL2CPP scripting backend in Unity 4.6.2 and 5.0 is supposed to be much faster than the old Mono backend. I ran some benchmarks, but mostly found slowdowns compared to Mono. Today’s article shows the tests I ran and the results I got, and asks why the IL2CPP version seems so slow. Perhaps one of you, dear readers, knows the reason. Update: Part of the reason has been discovered. Read on for updated results.

All of the benchmarks in today’s test were taken from The Computer Language Benchmarks Game. You can download the source code for C# and many other languages from their site. For each benchmark, I took the latest C# version that worked in Unity. Here’s the list of tests I ended up with:

  • binarytrees
  • chameneosredux
  • fannkuchredux
  • fasta-2
  • fastaredux
  • knucleotide-3
  • mandelbrot-3
  • nbody-3
  • regexdna-6
  • revcomp-3
  • spectralnorm-2

I then modified each version as little as I could. This meant giving each one a public Main method so it could be called from my benchmark runner, and replacing any writes to Console with writes to a MemoryStream. Feel free to download the resulting tests.
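To give a sense of what those modifications look like, here is a minimal sketch rather than one of the actual tests. The class name ExampleBenchmark, its trivial summing workload, and the LastOutputLength field are all made up for illustration. Only the public Main signature and the switch from Console to a MemoryStream reflect the changes described above.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for an adapted benchmark. The real tests keep
// their original algorithms; this one just sums the numbers 1..n.
public static class ExampleBenchmark
{
    // Illustrative only: records how many bytes the last run wrote.
    public static long LastOutputLength;

    // Public so a runner can store it in an Action<string[]> and call it.
    public static void Main(string[] args)
    {
        int n = args.Length > 0 ? int.Parse(args[0]) : 10;

        // Output goes to memory instead of the console.
        using (var stream = new MemoryStream())
        using (var writer = new StreamWriter(stream))
        {
            long sum = 0;
            for (int i = 1; i <= n; i++)
            {
                sum += i;
            }

            writer.WriteLine(sum); // was: Console.WriteLine(sum)
            writer.Flush();
            LastOutputLength = stream.Length;
        }
    }
}
```

A runner can then hold a reference such as `Action<string[]> main = ExampleBenchmark.Main;` and call it with the benchmark's arguments.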

I then created a benchmark runner script that ran the tests in a background thread so as not to block the main UI thread. The main thread simply shows the test results report. The tests are run over and over with the program parameters specified in the benchmarksgame/nanobench/makefiles/u32.ini configuration file. The script is simply attached to the main camera GameObject in an otherwise empty Unity scene.

Here’s the benchmark runner source code:

using System;
using System.Diagnostics;
using System.Text;
using System.Threading;
 
using UnityEngine;
 
public class TestScript : MonoBehaviour
{
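	// One benchmark: its display name, entry point, arguments, and the
	// total elapsed milliseconds accumulated across all runs so far.
	private class Test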
	{
		public string Name;
		public Action<string[]> Main;
		public string[] Args;
		public long TotalTime;
 
		public Test(string name, Action<string[]> main, string[] args)
		{
			Name = name;
			Main = main;
			Args = args;
		}
	}
 
	private static readonly Test[] tests = new Test[]{
		new Test("BinaryTrees", BinaryTrees.Main, new string[]{"12"}),
		new Test("chameneosredux", chameneosredux.Main, new string[]{"60000"}),
		new Test("FannkuchRedux", FannkuchRedux.Main, new string[]{"10"}),
		new Test("Fasta", Fasta.Main, new string[]{"250000"}),
		new Test("FastaRedux", FastaRedux.Main, new string[]{"250000"}),
		new Test("knucleotide", knucleotide.Main, new string[]{"250000"}),
		new Test("MandelBrot", MandelBrot.Main, new string[]{"1000"}),
		new Test("NBody", NBody.Main, new string[]{"500000"}),
		new Test("regexdna", regexdna.Main, new string[]{"50000"}),
		new Test("revcomp", revcomp.Main, new string[]{"250000"}),
		new Test("SpectralNorm", SpectralNorm.Main, new string[]{"500"})
	};
 
	private Rect reportDrawArea;
	private string report;
	private object reportMutex;
	private Thread reportThread;
 
	void Awake()
	{
		reportDrawArea = new Rect(0, 0, Screen.width, Screen.height);
		reportMutex = new object();
 
		reportThread = new Thread(TestThread);
		reportThread.Start();
	}
 
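	// Runs on the background thread: repeats every test forever and
	// publishes a report of per-test averages after each pass.
	void TestThread()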
	{
		var reportBuilder = new StringBuilder();
		var numRuns = 0;
		var stopwatch = new Stopwatch();
 
		while (true)
		{
			numRuns++;
			reportBuilder.Length = 0;
			reportBuilder.Append("Num Runs,");
			reportBuilder.Append(numRuns);
			reportBuilder.Append('\n');
			reportBuilder.Append("Test,Time\n");
			foreach (var test in tests)
			{
				// Time a single run of this test.
				stopwatch.Reset();
				stopwatch.Start();
				test.Main(test.Args);
				var elapsedMillis = stopwatch.ElapsedMilliseconds;
				test.TotalTime += elapsedMillis;
 
				// Running average over all passes so far, in milliseconds.
				var averageMillis = test.TotalTime / (float)numRuns;
				reportBuilder.Append(test.Name);
				reportBuilder.Append(',');
				reportBuilder.Append(averageMillis);
				reportBuilder.Append('\n');
			}
 
			// Publish the finished report; OnGUI reads it on the main thread.
			lock (reportMutex)
			{
				report = reportBuilder.ToString();
				UnityEngine.Debug.Log(report);
			}
		}
	}
 
	void Update()
	{
		// A click or tap stops the benchmark thread, freezing the last report.
		if (Input.GetMouseButtonDown(0))
		{
			reportThread.Abort();
		}
	}
 
	void OnApplicationQuit()
	{
		reportThread.Abort();
	}
 
	void OnGUI()
	{
		lock (reportMutex)
		{
			if (report != null)
			{
				GUI.TextArea(reportDrawArea, report);
			}
		}
	}
}

I then built the app for iOS using Mono and IL2CPP, targeting ARMv7 with “Development Build” unchecked. Here are the results I got on an iPad 3 running iOS 8.2. Times are the average milliseconds per run reported by the benchmark runner.

Test             Mono Time (ms)   IL2CPP Time (ms)   IL2CPP Time Change (% of Mono Time)
BinaryTrees      439.3529         360.2              81.98%
chameneosredux   1786             206821.3           11580.14%
FannkuchRedux    4650.824         26470.4            569.16%
Fasta            711.2941         3508               493.19%
FastaRedux       573.5883         4176.9             728.21%
knucleotide      18.76471         23.4               124.70%
MandelBrot       88.11765         157.4              178.62%
NBody            2210.294         4245               192.06%
regexdna         11.35294         13.5               118.91%
revcomp          0.05882353       0                  0.00%
SpectralNorm     480.4706         866.1              180.26%

Mono vs. IL2CPP performance graph (all)

Mono vs. IL2CPP performance graph (fast)

Each percentage is the IL2CPP time divided by the Mono time. Any result greater than 100% means IL2CPP took longer than Mono to complete the test. Anything less than 100% means IL2CPP was faster than Mono.
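For anyone who wants to check the percentages, the arithmetic can be sketched as follows. The TimeChange class and its method names are hypothetical helpers written for this explanation, not part of the benchmark runner.

```csharp
using System;

// Hypothetical helpers for reading the results table.
public static class TimeChange
{
    // IL2CPP time as a percentage of Mono time (100 means equal times).
    public static double PercentOfMono(double monoMillis, double il2cppMillis)
    {
        return il2cppMillis / monoMillis * 100.0;
    }

    // How much longer IL2CPP took, in percent of the Mono time.
    // A negative value means IL2CPP took less time than Mono.
    public static double PercentSlower(double monoMillis, double il2cppMillis)
    {
        return PercentOfMono(monoMillis, il2cppMillis) - 100.0;
    }
}
```

For the BinaryTrees row, `TimeChange.PercentOfMono(439.3529, 360.2)` comes out at about 81.98, which matches the table. For FastaRedux, `TimeChange.PercentSlower(573.5883, 4176.9)` comes out at about 628.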

As you can see, all but two tests are slower in IL2CPP than in Mono. The exceptions are BinaryTrees, which took about 18% less time, and revcomp, which was so fast as to be unmeasurable.

The other tests all show slowdowns. These range from the very minor to the extreme. On the very minor side, knucleotide and regexdna are only about 20-25% slower. MandelBrot, NBody, and SpectralNorm are next-slowest at about 80-90% slower. Then things get really bad. Fasta and FannkuchRedux are roughly 400-470% slower, and FastaRedux is 628% slower. The slowness crown, by far, goes to chameneosredux at a whopping 11480% slower.

As you can read repeatedly on the “Benchmarks Game” site, these should be taken with a grain of salt. They’re obviously not real apps or games running in Unity and differences can easily creep in. Then again, these 11 tests should be fairly representative of common programming tasks.

I expected to do this test and write an article showing how awesomely fast the new IL2CPP scripting backend in Unity 4.6.2 and 5.0 is. Unfortunately, the results I’m seeing are quite the opposite. I don’t know why that is, either. If you do, please let me know in the comments.

UPDATE

Thanks to Ralph Hauwert in the comments, part of the reason for the slowdown has been discovered. It turns out that you need to explicitly set Xcode to make a release build, even when “Development Build” has not been checked in the Unity build settings. To do so, click the name of the project, choose “Edit Scheme…”, and change “Build Configuration” to “Release”. By the way, I built the test using Xcode 6.2 on Mac OS X 10.10.2.

Here are the updated results:

Test             Mono Time (ms)   IL2CPP Time (Debug, ms)   IL2CPP Time Change (Debug)   IL2CPP Time (Release, ms)   IL2CPP Time Change (Release)
BinaryTrees      439.3529         360.2                     81.98%                       266.875                     60.74%
chameneosredux   1786             206821.3                  11580.14%                    116754.1                    6537.18%
FannkuchRedux    4650.824         26470.4                   569.16%                      5289.5                      113.73%
Fasta            711.2941         3508                      493.19%                      2421.375                    340.42%
FastaRedux       573.5883         4176.9                    728.21%                      3252.625                    567.07%
knucleotide      18.76471         23.4                      124.70%                      19.375                      103.25%
MandelBrot       88.11765         157.4                     178.62%                      99.625                      113.06%
NBody            2210.294         4245                      192.06%                      1053.25                     47.65%
regexdna         11.35294         13.5                      118.91%                      13.125                      115.61%
revcomp          0.05882353       0                         0.00%                        0                           0.00%
SpectralNorm     480.4706         866.1                     180.26%                      250.5                       52.14%

Mono vs. IL2CPP Performance Graph (release, all)

Mono vs. IL2CPP Performance Graph (release, no chameneosredux)

Mono vs. IL2CPP Performance Graph (release, no chameneosredux or fasta)

IL2CPP in release mode is indeed much faster than in debug mode, as is to be expected. chameneosredux, Fasta, and FastaRedux remain ludicrously slow, taking roughly 3.4x to 65x as long as Mono, but the rest are much quicker. knucleotide is almost exactly the same speed as Mono. FannkuchRedux, MandelBrot, and regexdna are only about 13-16% slower.

But there are some bright spots with release mode, too. BinaryTrees, NBody, and SpectralNorm run about 1.6x to 2.1x as fast with IL2CPP as with Mono. revcomp remains unmeasurably fast, whereas with Mono it registered an occasional millisecond of time.
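A reading of 0 does not mean revcomp takes no time. The runner reads Stopwatch.ElapsedMilliseconds, which is a whole number of milliseconds, so any run shorter than a millisecond is recorded as 0. As a sketch of one way to see such short runs (this is not what the runner above does), the elapsed time can be read as a fractional value instead. FineTimer and MeasureMillis are hypothetical names.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: times an action with sub-millisecond resolution.
public static class FineTimer
{
    // Runs the action once and returns the elapsed time in milliseconds
    // as a double, so durations under 1 ms are not truncated to 0.
    public static double MeasureMillis(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed.TotalMilliseconds;
    }
}
```

The actual resolution still depends on the device's timer, so very short runs may be noisy. Averaging over many runs, as the runner already does, helps with that.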

Given that the performance can vary from wildly slower to wildly faster when switching to IL2CPP, the performance you ultimately see will depend highly on the kind of code your app has. Don’t simply assume that your app will see an across-the-board speedup though.

If you’ve seen a significant speedup or slowdown in your real-world code, especially particular kinds of code, I’d be really interested to hear about it in the comments.