Fastest Callbacks: Delegate, Event, or Interface?
How fast are C# delegates and events? The best answer is “compared to what?”. You probably use callbacks all the time, but what’s the fastest kind of callback? C# delegates and events are nice, built-in features of C#, but you could also implement your own callback interface. Would that gain you any speed? Read on for the performance test and results!
Let’s take a quick look at today’s contenders. First up is the lowly delegate:
    // Define the type of function that the delegate calls
    delegate void Delegate();

    // Create a delegate that can point to one or more functions
    Delegate del;

    // Point the delegate at a single callback function
    void Foo() {}
    del = Foo;

    // Point the delegate at multiple callback functions
    void Goo() {}
    del = Foo;
    del += Goo;

    // Call all the functions the delegate points to
    del();
C# events are built directly on top of delegates:
    // Create an event that can point to one or more functions
    event Delegate ev;

    // Point the event at a single callback function
    void Foo() {}
    ev = Foo;

    // Point the event at multiple callback functions
    void Goo() {}
    ev = Foo;
    ev += Goo;

    // Call all the functions the event points to
    ev();
The last one is an odd-ball character for C#, but extremely common in other languages. For example, Java’s ActionListener and other “listeners” use this strategy. Simply put, you define an interface instead of a delegate and then implement a class instead of using a callback function. Here’s how that looks:
    // Define an interface instead of a delegate
    interface ICallback
    {
        void Call();
    }

    // Define a class that implements the interface instead of a callback function
    class Callback : ICallback
    {
        public void Call()
        {
        }
    }

    // Define a callback for a single function
    ICallback callback;

    // Point the callback at a single function
    callback = new Callback();

    // Call a single callback
    callback.Call();

    // Define a callback for multiple functions
    ICallback[] callbacks;

    // Point the callback at multiple functions
    callbacks = new ICallback[]{ new Callback(), new Callback() };

    // Call multiple callbacks
    for (var i = 0; i < callbacks.Length; ++i)
    {
        callbacks[i].Call();
    }
The last one is much less convenient as you’re left to create the lists of callbacks, handle adding and removing during the callback process, and awkwardly define classes for each callback function. All without the help of Java’s anonymous classes:
    // This is Java code. You can't do this in C#.
    callback = new ICallback() {
        public void Call() {
        }
    };
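As for the bookkeeping that the interface approach leaves to you, here is a minimal sketch of one way to handle it. `CallbackList` and `OneShotCallback` are hypothetical names that do not appear in the test below, and the snapshot-per-dispatch strategy is just one assumption about how to tolerate adds and removes during a dispatch.

```csharp
using System.Collections.Generic;

// Same shape as the interface shown above.
public interface ICallback
{
    void Call();
}

// Hypothetical helper: a list of callbacks that tolerates Add/Remove
// calls made while CallAll is running.
public class CallbackList
{
    private readonly List<ICallback> callbacks = new List<ICallback>();

    public int Count { get { return callbacks.Count; } }

    public void Add(ICallback callback)
    {
        callbacks.Add(callback);
    }

    public bool Remove(ICallback callback)
    {
        return callbacks.Remove(callback);
    }

    public void CallAll()
    {
        // Iterate over a snapshot so that callbacks can add or remove
        // entries without disturbing this loop. The copy allocates on
        // every dispatch; reusing a buffer would avoid that.
        ICallback[] snapshot = callbacks.ToArray();
        for (var i = 0; i < snapshot.Length; ++i)
        {
            snapshot[i].Call();
        }
    }
}

// Hypothetical callback that counts its calls and removes itself
// from its owner the first time it is called.
public class OneShotCallback : ICallback
{
    private readonly CallbackList owner;
    public int CallCount;

    public OneShotCallback(CallbackList owner)
    {
        this.owner = owner;
    }

    public void Call()
    {
        ++CallCount;
        owner.Remove(this);
    }
}
```

Because the dispatch works on a snapshot, a callback removed partway through still runs in that dispatch. Multicast delegates behave similarly, since a delegate's invocation list is immutable. The extra copy is a cost the array-based test below does not pay.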
That said, all three options are valid ways to implement a callback scheme. So, which is fastest? The following test finds out with a single, small MonoBehaviour:
    using UnityEngine;

    public interface ICallback
    {
        void Call();
    }

    public class Callback : ICallback
    {
        public void Call()
        {
        }
    }

    public delegate void Delegate();

    public class TestScript : MonoBehaviour
    {
        private Delegate DelegateSingle;
        private Delegate DelegateMultiple;
        private event Delegate EventSingle;
        private event Delegate EventMultiple;
        private ICallback CallbackSingle;
        private ICallback[] CallbackMultiple;

        private const int NumIterations = 100000000;
        private const int MultipleCount = 10;

        private string report;

        void Start()
        {
            Setup();
            Test();
        }

        private void Setup()
        {
            DelegateSingle = CallbackFunction;
            DelegateMultiple = CallbackFunction;
            for (var i = 0; i < MultipleCount-1; ++i)
            {
                DelegateMultiple += CallbackFunction;
            }

            EventSingle = CallbackFunction;
            EventMultiple = CallbackFunction;
            for (var i = 0; i < MultipleCount-1; ++i)
            {
                EventMultiple += CallbackFunction;
            }

            var callback = new Callback();
            CallbackSingle = callback;
            CallbackMultiple = new ICallback[MultipleCount];
            for (var i = 0; i < MultipleCount; ++i)
            {
                CallbackMultiple[i] = callback;
            }
        }

        private void Test()
        {
            var stopwatch = new System.Diagnostics.Stopwatch();

            stopwatch.Start();
            for (var i = 0; i < NumIterations; ++i)
            {
                DelegateSingle();
            }
            var delegateSingleTime = stopwatch.ElapsedMilliseconds;

            stopwatch.Reset();
            stopwatch.Start();
            for (var i = 0; i < NumIterations; ++i)
            {
                DelegateMultiple();
            }
            var delegateMultipleTime = stopwatch.ElapsedMilliseconds / MultipleCount;

            stopwatch.Reset();
            stopwatch.Start();
            for (var i = 0; i < NumIterations; ++i)
            {
                EventSingle();
            }
            var eventSingleTime = stopwatch.ElapsedMilliseconds;

            stopwatch.Reset();
            stopwatch.Start();
            for (var i = 0; i < NumIterations; ++i)
            {
                EventMultiple();
            }
            var eventMultipleTime = stopwatch.ElapsedMilliseconds / MultipleCount;

            stopwatch.Reset();
            stopwatch.Start();
            for (var i = 0; i < NumIterations; ++i)
            {
                CallbackSingle.Call();
            }
            var callbackSingleTime = stopwatch.ElapsedMilliseconds;

            stopwatch.Reset();
            stopwatch.Start();
            for (var i = 0; i < NumIterations; ++i)
            {
                for (var j = 0; j < MultipleCount; ++j)
                {
                    CallbackMultiple[j].Call();
                }
            }
            var callbackMultipleTime = stopwatch.ElapsedMilliseconds / MultipleCount;

            report = "Test,Single Time,Multiple Time\n"
                + "Delegate," + delegateSingleTime + "," + delegateMultipleTime + "\n"
                + "Event," + eventSingleTime + "," + eventMultipleTime + "\n"
                + "ICallback," + callbackSingleTime + "," + callbackMultipleTime;
        }

        void OnGUI()
        {
            GUI.TextArea(new Rect(0, 0, Screen.width, Screen.height), report);
        }

        private void CallbackFunction()
        {
        }
    }
If you want to try out the test yourself, simply paste the above code into a TestScript.cs file in your Unity project’s Assets directory and attach it to the main camera game object in a new, empty project. Then build in non-development mode for 64-bit processors and run it windowed at 640×480 with fastest graphics. I ran it that way on this machine:
- 2.3 GHz Intel Core i7-3615QM
- Mac OS X 10.10.3
- Unity 5.0.1f1, Mac OS X Standalone, x86_64, non-development
- 640×480, Fastest, Windowed
And got these results:
Test | Single Time (ms) | Multiple Time (ms) |
---|---|---|
Delegate | 301 | 742 |
Event | 255 | 749 |
ICallback | 229 | 274 |
All three approaches have fairly similar times when only calling a single callback. This is probably the most common case, so it’s good to see that the easier-to-use approaches (delegates and events) are nearly as fast as the trickier one: interfaces. That said, interfaces were the fastest, followed by events (11% slower) and then delegates (31% slower).
The big difference is when multiple (10) functions were called by each. Interfaces barely got slower than the single version (274 vs. 229). On the other hand, delegates and events took more than twice as long as their single versions. Clearly there is some optimization for single-function delegates and events. Interfaces are the big winner here, as delegates are 171% slower (2.71x the time) and events are 173% slower (2.73x the time).
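For reference, the slowdown figures can be checked against the table. The sketch below measures slowdown relative to the interface time in the same column, which is the convention that yields 11% and 31% for the single-callback case. The symbols $t$ and $t_{\text{ICallback}}$ are introduced here only for illustration.

```latex
\[
\text{slowdown}(t) = \left(\frac{t}{t_{\text{ICallback}}} - 1\right) \times 100\%
\]
\begin{align*}
\text{Event, single:}      &\quad \tfrac{255}{229} - 1 \approx 0.11 \;(11\%) \\
\text{Delegate, single:}   &\quad \tfrac{301}{229} - 1 \approx 0.31 \;(31\%) \\
\text{Delegate, multiple:} &\quad \tfrac{742}{274} - 1 \approx 1.71 \;(171\%) \\
\text{Event, multiple:}    &\quad \tfrac{749}{274} - 1 \approx 1.73 \;(173\%)
\end{align*}
```

Put differently, in the multiple-callback case delegates take about 2.71 times as long as interfaces and events about 2.73 times as long.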
Now that we’ve established which is faster—interfaces—the question becomes: should you bother? Interfaces are definitely more work to set up, harder to read, more error-prone, and more difficult to integrate into your code. The percentages faster are impressive, but will you save a lot of total CPU time? The above test results are measured in milliseconds per 100 million callback calls. That’s clearly way more than you’ll have in any reasonable game or app.
Say your game had a thousand callbacks per frame. That’d be a lot of callbacks, but still 100,000 times less than this article’s test results show. Even in the slowest case—events with multiple callback functions—you’d only be using 0.00749 milliseconds to call all the callbacks. That’s a very small slice of time and probably won’t ever matter to your overall frame rate. This means you can safely skip the interfaces optimization unless you find yourself wanting to call at least hundreds of thousands of callbacks per frame. Since that’ll probably never happen, you can probably skip interfaces as a callback mechanism altogether.
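The per-frame estimate can be worked out directly from the slowest entry in the table: 749 ms per $10^{8}$ callback calls. The second line scales the same figure to 100,000 calls per frame, the low end of "hundreds of thousands".

```latex
\begin{align*}
\frac{749\ \text{ms}}{10^{8}\ \text{calls}} \times 10^{3}\ \text{calls} &= 0.00749\ \text{ms} \\
\frac{749\ \text{ms}}{10^{8}\ \text{calls}} \times 10^{5}\ \text{calls} &= 0.749\ \text{ms}
\end{align*}
```

Assuming a 60 FPS target (an assumption, not something measured here), a frame lasts about 16.7 ms, so even 100,000 callbacks per frame would use under 5% of it on call overhead.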
Got another way to do callbacks? Got a preference for one way over another? Sound off in the comments!
#1 by Anton on September 4th, 2017 ·
Please, delete my first comment. I failed to format it properly.
I think your tests have several deficiencies:
1. When testing multiple dispatch, you use the same instance of the interface and of the delegate, which may trigger optimisation by caching. I suggest that you measure performance with instances of different classes that implement the same interface and with different functions that correspond to the same delegate. It is fairly easy to roll some analogous classes and functions by means of copy-pasting.
2. Your delegates and interface callbacks have empty implementations, which too may result in some optimisations. I suggest they perform some trivial and very fast operation, such as incrementing a global counter variable, which the program will print in the console after all the tests. With a local variable the optimiser may detect that it is not used and remove the code altogether.
3. For multiple interface dispatch you manually construct an array of interface instances. I suggest that you do the same for multiple delegate dispatch. Multicast delegates seem ugly and slow. I never use them, employing instead a List of single delegates, through which my dispatcher loops.
#2 by jackson on September 5th, 2017 ·
1. That’s true. By using the same callback I’m not using as much of the CPU’s instruction cache. If there are cache misses then it may run much slower. I’ll try this out and see if the results change.
2. Both an empty function and a function with some trivial work may be inlined due to their small size. A large function won’t be inlined, but adds enough overhead to skew the test results. There’s no clear winner here, but I’ll try this out too.
3. I don’t think I understand what you’re suggesting. Could you provide a little sample code to clarify?
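One possible reading of point 3, as a minimal sketch rather than anyone's actual code: keep single-cast delegates in a `List` and loop over them instead of combining them with `+=`. `Dispatcher` and `Counter` are hypothetical names, and `System.Action` stands in for the article's `Delegate` type. The counter follows the suggestion in point 2 of doing a trivial amount of observable work.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical dispatcher: stores single-cast delegates in a List and
// calls them one by one instead of invoking one multicast delegate.
public class Dispatcher
{
    private readonly List<Action> handlers = new List<Action>();

    public void Add(Action handler)
    {
        handlers.Add(handler);
    }

    public bool Remove(Action handler)
    {
        // Delegates compare equal when they have the same target and
        // method, so a fresh delegate for the same method matches.
        return handlers.Remove(handler);
    }

    public void Dispatch()
    {
        for (var i = 0; i < handlers.Count; ++i)
        {
            handlers[i]();
        }
    }
}

// Hypothetical callbacks that touch a global counter so the work
// they do is observable after the run.
public static class Counter
{
    public static int Value;

    public static void Increment()
    {
        ++Value;
    }

    public static void IncrementTwice()
    {
        Value += 2;
    }
}
```

Registering `Counter.Increment` and `Counter.IncrementTwice` and calling `Dispatch()` once raises `Counter.Value` by 3. How this compares in speed to a multicast delegate would need to be measured.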
#3 by Anton on September 4th, 2017 ·
I am sorry for the duplicate post, but I can’t get the tag to work…
#4 by Raoul on September 8th, 2017 ·
I think 3. means something like :
#5 by jackson on September 8th, 2017 ·
Ah, it seems like that time would be the “Single Time” for “Delegate” times MultipleCount: 301 * 10 = 3010. So roughly 4x slower than the “Multiple Time” for “Delegate” and 11x slower than ICallback.
#6 by Kamyker on June 4th, 2019 ·
Interesting, but I’ve rerun your test in the newest Unity with the newest C# and it’s no longer true:
Test,Single Time,Multiple Time
Delegate,611,407
Event,581,415
ICallback,527,679
Also, I don’t think this test makes sense in the first place. You would have to use a List instead of an array to have the flexibility of events/delegates, and in that case it’s twice as slow:
Delegate,667,435
Event,545,438
ICallback,537,1150
Anyway thanks for the code for testing.
#7 by Kamyker on June 4th, 2019 ·
Ok I take it back, after building the project results are completely different:
Mono:
Test,Single Time,Multiple Time
Delegate,168,228
Event,168,233
ICallback,127,156
IL2CPP:
Delegate,624,526
Event,671,530
ICallback,190,364
I was using List
#8 by Saeed on September 1st, 2024 ·
I tested with the following code and the result (in both debug mode and release mode) was that all 3 performed *exactly* equal, with the differences being within the margin of error (a thousandth of the result number) and inconsistent.
    using System;
    using System.Diagnostics;
    using System.Linq;
    using UnityEngine;
    using Debug = UnityEngine.Debug;

    namespace Test
    {
        public interface ICallback
        {
            void Call();
        }

        public class Callback : ICallback
        {
            public void Call() => TestScript.CallbackFunction();
        }

        public delegate void Delegate();

        public class TestScript : MonoBehaviour
        {
            private Delegate sample_delegate;
            private event Delegate sample_event;
            private ICallback[] sample_interface;

            public int repeat = 10;
            public int callbackCount = 10;
            public static int s_result;

            private static Stopwatch s_stopwatch = new();

            [ContextMenu("Start Test")]
            private void StartTest()
            {
                Setup();
                Test();
            }

            private void Setup()
            {
                sample_delegate = null;
                sample_event = null;
                sample_interface = new ICallback[callbackCount];
                for (var i = 0; i < callbackCount; ++i)
                {
                    sample_delegate += CallbackFunction;
                    sample_event += CallbackFunction;
                    sample_interface[i] = new Callback();
                }
            }

            private void Test()
            {
                double[] results_event = new double[repeat];
                double[] results_delegate = new double[repeat];
                double[] results_interface = new double[repeat];
                for (int r = 0; r < repeat; r++)
                {
    #if UNITY_EDITOR
                    bool cancel = UnityEditor.EditorUtility.DisplayCancelableProgressBar("Benchmark Running…",
                        $"iteration {r}/{repeat}", (float)r / repeat);
                    if (cancel)
                    {
                        return;
                    }
    #endif
                    results_event[r] = Test_Event();
                    results_delegate[r] = Test_Delegate();
                    results_interface[r] = Test_Interface();
                }
    #if UNITY_EDITOR
                // Editor-only API, so it is guarded like the progress bar call above.
                UnityEditor.EditorUtility.ClearProgressBar();
    #endif
                Array.Sort(results_event);
                Array.Sort(results_delegate);
                Array.Sort(results_interface);
                Debug.Log(
                    $"\nevent:\tmin:{results_event[0]:0.000}\tmax:{results_event[^1]:0.000}\tavg:{results_event.Average():0.000}\tmedian:{results_event[results_event.Length / 2]:0.000}" +
                    $"\ndelegate:\tmin:{results_delegate[0]:0.000}\tmax:{results_delegate[^1]:0.000}\tavg:{results_delegate.Average():0.000}\tmedian:{results_delegate[results_delegate.Length / 2]:0.000}" +
                    $"\ninterface:\tmin:{results_interface[0]:0.000}\tmax:{results_interface[^1]:0.000}\tavg:{results_interface.Average():0.000}\tmedian:{results_interface[results_interface.Length / 2]:0.000}"
                );
            }

            // Each test returns the elapsed time in microseconds.
            private double Test_Event()
            {
                s_stopwatch.Restart();
                sample_event?.Invoke();
                return s_stopwatch.Elapsed.TotalMilliseconds * 1000;
            }

            private double Test_Delegate()
            {
                s_stopwatch.Restart();
                sample_delegate?.Invoke();
                return s_stopwatch.Elapsed.TotalMilliseconds * 1000;
            }

            private double Test_Interface()
            {
                s_stopwatch.Restart();
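                // Assumption: the text between "i" and "s_result++" was lost from
                // the posted listing; the loop, return, and method header below
                // are reconstructed from the other two tests and Callback.Call().
                for (int i = 0; i < sample_interface.Length; i++)
                {
                    sample_interface[i].Call();
                }
                return s_stopwatch.Elapsed.TotalMilliseconds * 1000;
            }

            public static void CallbackFunction()
            {
                s_result++;
            }
        }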