Async & Await vs. Unity Jobs
Unity 2018.1 brought us two asynchronous code systems. First, there’s C# 5’s `async` and `await` keywords in conjunction with the `Task` and `Task<T>` types. Second, there’s Unity’s own C# jobs system. There are many differences, but which is faster? Today’s article puts them to the test to find out!
The goal of today’s test is to stress both asynchronous systems. As such, we’ll minimize the amount of work performed in each unit of work so that the system overhead is maximized.
To this end, we’ll use this super simple function with C#’s `async` and `await` keywords:
    public static Task MyTask(NativeArray<int> result)
    {
        return Task.Run(() => result[0]++);
    }
This function calls `Task.Run`, which creates a `Task` and immediately queues it to run on the thread pool. When the task runs, it invokes the lambda delegate we provided. All the delegate does is increment the first element of a given `NativeArray<int>`. That’s about as little work as we can encapsulate into a C# `Task` and will allow us to compare directly to Unity’s jobs system.
Next, we need a way to run many of these tasks. This is where the `async` and `await` keywords come in:
    public static async Task RunTasks(NativeArray<int> result, int num)
    {
        for (int i = 0; i < num; ++i)
        {
            await MyTask(result);
        }
    }
This function calls `MyTask`, which queues the task for execution, and then uses `await` to suspend `RunTasks` until that task is finished. As an `async` function, `RunTasks` itself returns a `Task` that can be run asynchronously.
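To make the scheduling behavior concrete, here’s a minimal sketch that runs outside of Unity. It assumes a plain `int[]` as a stand-in for `NativeArray<int>`, and the class name `HotTaskSketch` is made up for illustration. It shows that a task from `Task.Run` is already scheduled by the time it’s returned, while `await` only suspends the caller until the task completes:

```csharp
using System;
using System.Threading.Tasks;

public static class HotTaskSketch
{
    // Assumption: int[] stands in for the article's NativeArray<int>
    // so the sketch can run in a plain console app.
    public static Task MyTask(int[] result)
    {
        // Task.Run queues the delegate to the thread pool right away.
        return Task.Run(() => { result[0]++; });
    }

    public static async Task RunTasks(int[] result, int num)
    {
        for (int i = 0; i < num; ++i)
        {
            // The task is already queued when MyTask returns. The await
            // suspends RunTasks until that one task has completed.
            await MyTask(result);
        }
    }

    public static void Main()
    {
        // A task built with the constructor is not scheduled until Start().
        Task cold = new Task(() => { });
        Console.WriteLine(cold.Status); // prints "Created"

        // A task from Task.Run is never in the Created state.
        Task hot = Task.Run(() => { });
        Console.WriteLine(hot.Status != TaskStatus.Created); // prints "True"
        hot.Wait();

        int[] result = new int[1];
        RunTasks(result, 1000).Wait();
        Console.WriteLine(result[0]); // prints "1000"
    }
}
```

Because each task is awaited before the next one is created, the increments never overlap and no locking is needed in this sequential version.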
Now let’s look at how to test this. First, we need to set up the test:
    const int numRuns = 1000;
    NativeArray<int> result = new NativeArray<int>(1, Allocator.Temp);
Then we can actually run the test. We’ll again use `Task.Run` but this time call `Wait` on the resulting `Task`. This blocks execution of the thread until the task is finished.
    var sw = System.Diagnostics.Stopwatch.StartNew();
    Task task = Task.Run(() => RunTasks(result, numRuns));
    task.Wait();
    long asyncTicks = sw.ElapsedTicks;
We now have the amount of time to sequentially run a bunch of tasks with `async` and `await`.
Next, let’s define the equivalent of a `Task` in Unity’s C# job system:
    public struct MyJob : IJob
    {
        public NativeArray<int> Result;

        public void Execute()
        {
            Result[0]++;
        }
    }
Just as with `MyTask`, `MyJob` simply takes a `NativeArray<int>` and increments its first element. Now let’s make the equivalent of `RunTasks` to run these jobs:
    public static void RunJobs(NativeArray<int> result, int num)
    {
        MyJob job = new MyJob { Result = result };
        for (int i = 0; i < num; ++i)
        {
            JobHandle handle = job.Schedule();
            JobHandle.ScheduleBatchedJobs();
            handle.Complete();
        }
    }
This function creates the job just once as each job is the same. Unity’s job system will make copies of the job struct for each execution. Then we call `Schedule` and `JobHandle.ScheduleBatchedJobs` to schedule the job for execution on another thread. Finally, we call `JobHandle.Complete` to wait for the job to finish. The result is the same sequential execution that we got with `async` and `await`.
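For comparison, sequential execution can also be expressed with job dependencies rather than by blocking after every job. The following is only a sketch of that alternative, not the code that was measured in this article, and the names `ChainedJobsSketch`, `IncrementJob`, and `RunJobsChained` are made up for illustration. It assumes it’s compiled inside a Unity project with the jobs system available:

```csharp
using Unity.Collections;
using Unity.Jobs;

public static class ChainedJobsSketch
{
    // Same work as the job in the article: increment the first element.
    public struct IncrementJob : IJob
    {
        public NativeArray<int> Result;

        public void Execute()
        {
            Result[0]++;
        }
    }

    // Hypothetical variant: each job depends on the previous one, so the
    // jobs still run one at a time, but the main thread blocks only once.
    public static void RunJobsChained(NativeArray<int> result, int num)
    {
        IncrementJob job = new IncrementJob { Result = result };
        JobHandle handle = default(JobHandle);
        for (int i = 0; i < num; ++i)
        {
            // Passing the previous handle makes this job wait for it.
            handle = job.Schedule(handle);
        }

        // Start the queued jobs, then wait for the last one in the chain.
        JobHandle.ScheduleBatchedJobs();
        handle.Complete();
    }
}
```

Since the main thread waits just once at the end, this would measure something different from a per-job `Complete` call, so any timings from it shouldn’t be compared directly with the numbers in this article.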
To test this, we just call `RunJobs`:
    result[0] = 0;
    sw.Reset();
    sw.Start();
    RunJobs(result, numRuns);
    long jobTicks = sw.ElapsedTicks;
    CheckResult(result, numRuns);
Finally, we tear down the test and print the results:
    result.Dispose();
    Debug.Log(
        "System,Ticks\n" +
        "Async+Await," + asyncTicks + "\n" +
        "Unity Jobs," + jobTicks);
I ran the performance test in this environment:
- 2.7 GHz Intel Core i7-6820HQ
- macOS 10.13.6
- Unity 2018.2.0f2
- macOS Standalone
- .NET 4.x scripting runtime version and API compatibility level
- IL2CPP
- Non-development
- 640×480, Fastest, Windowed
Here are the results I got:
| System | Ticks |
|---|---|
| Async+Await | 525900 |
| Unity Jobs | 128710 |
This test shows that Unity’s jobs system is about 4x faster than C#’s `async`, `await`, and `Task` system.
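As a quick check on that ratio, here’s a small sketch of the arithmetic. The class `TickMath` and its helper `TicksToMilliseconds` are hypothetical names, not part of the test code. The helper relies on the fact that `Stopwatch` ticks are counted in units of `1 / Stopwatch.Frequency` seconds, which varies by machine, so the tick counts above only convert to milliseconds on the machine that produced them:

```csharp
using System;
using System.Diagnostics;
using System.Globalization;

public static class TickMath
{
    // Hypothetical helper: converts Stopwatch ticks to milliseconds
    // using the timer frequency of the machine this runs on.
    public static double TicksToMilliseconds(long stopwatchTicks)
    {
        return stopwatchTicks * 1000.0 / Stopwatch.Frequency;
    }

    public static void Main()
    {
        // Tick counts taken from the results table in the article.
        long asyncTicks = 525900;
        long jobTicks = 128710;

        // The ratio is unitless, so it holds whatever the tick length is.
        double ratio = (double)asyncTicks / jobTicks;
        Console.WriteLine(ratio.ToString("F2", CultureInfo.InvariantCulture)); // prints "4.09"
    }
}
```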
Keep in mind that this won’t always be the case as the work being done in each task, complexity of dependencies, and variability in hardware will have a substantial effect. There are also a lot more restrictions when using Unity jobs, such as the inability to use reference types.
It’s also worth noting that the testing methodology used here executes all tasks one-at-a-time until completion. Many real-world uses will involve multiple tasks executing in parallel, which may change the results. As usual, it’s best to measure your own specific application or game, such as with Unity’s profiler. That said, the overhead required by `async` and `await` does seem quite a bit more substantial than Unity’s job system.
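For a sense of what the parallel case might look like on the `Task` side, here’s a minimal sketch that runs outside of Unity. It wasn’t part of the test above. The names `ParallelTasksSketch` and `RunTasksInParallel` are made up, and a plain `int[]` stands in for `NativeArray<int>`. Since the tasks may now run at the same time, the increment has to be atomic:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelTasksSketch
{
    // Hypothetical parallel variant: start every task first, then await
    // them all at once instead of awaiting each one in turn.
    public static async Task RunTasksInParallel(int[] result, int num)
    {
        Task[] tasks = new Task[num];
        for (int i = 0; i < num; ++i)
        {
            // Tasks can overlap here, so use an atomic increment.
            tasks[i] = Task.Run(() => { Interlocked.Increment(ref result[0]); });
        }

        // Completes once every task in the array has finished.
        await Task.WhenAll(tasks);
    }

    public static void Main()
    {
        int[] result = new int[1];
        RunTasksInParallel(result, 1000).Wait();
        Console.WriteLine(result[0]); // prints "1000"
    }
}
```

A matching parallel job test would need its own design on the Unity side, so this sketch alone says nothing about which system would win in that scenario.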
#1 by Sergey on September 10th, 2018 ·
“such as the inability to use reference types” – It’s not completely true :) In jobs that aren’t Burst-compiled you can use reference and non-blittable types in `ISharedComponentData`, and arrays by using `IBufferElementData`. Of course, that all comes bundled with ECS :)
#2 by jackson on September 10th, 2018 ·
This article is just discussing the current state of Unity where the ECS is still in “preview,” but yes you are correct that reference types can be used in some cases.
#3 by Stephen Hodgson on September 10th, 2018 ·
It’s probably because you’re using that lambda to run it. Why not just call the method directly from an async method?
#4 by jackson on September 10th, 2018 ·
There’s one lambda in `MyTask` and another in the code that calls `RunTasks`. Which one are you referring to?
#5 by Stephen Hodgson on September 10th, 2018 ·
Probably both. I’m wondering if you’ll see a difference if you’re not just calling Task.Run but instead calling the async method directly and awaiting on that.
#6 by Stephen Hodgson on September 10th, 2018 ·
Something like:
You might have to write a custom awaiter for your iterator though.
#7 by jackson on September 10th, 2018 ·
In the case of `MyTask`, your comment is basically my thought. A `Task` needs to be created to encapsulate the `result[0]++` work which is not `async` so it can’t be `await`ed. This lambda delegate is created and invoked 1000 times, which I agree is probably part of the performance issue.
For the call to `RunTasks`, the function calling it is a `MonoBehaviour` `Awake` function which isn’t `async`. This means it can’t use `await` and instead uses `Task.Run` and `Wait`. This lambda delegate is only created and invoked once, so it’s probably irrelevant to the overall time.
#8 by Stephen Hodgson on September 10th, 2018 ·
You can make `Awake` async by adding the keyword. `private async void Awake()` should be enough.
#9 by jackson on September 10th, 2018 ·
True, but `Awake` also calls `RunJobs` and I’d prefer to keep the tests isolated so they don’t influence each other’s times.
#10 by sebastian on September 10th, 2018 ·
`async` is not part of the type-system. You can make any function async as long as it returns `void` or one of the `Task` variants. Also, you never `await` a function — you `await` something that has an awaiter (there is no one interface for that; it just calls magic methods on the type). This means that you can `await` the results of non-`async` functions (which makes sense, given that there are no `async`-functions in the typesystem). For example:
What you could think might be right to do is something like this:
But this is not what you want to measure, because just `await`ing something does *not* mean that any asynchronicity is actually used. In this case, it will notice that the returned `Task` that it should `await` is already completed and just resume synchronously.
I actually think that your initial example is quite good. You can save some time on the closure, but you need to be careful: There is no overload for `Task.Run` that works like the `Thread.Start` (don’t quote me on the function name here) and takes another parameter that just gets passed to whatever function you’re running. So you have to capture the `result` somewhere and have to create *a* closure — but not a new one for each step of the iteration.
This works:
#11 by jackson on September 10th, 2018 ·
You make several good points and you’re right that a new delegate isn’t required for each task. Using your version of `RunTasks` in the same environment, I got about a 10% speedup compared to the version in the article. It’s still roughly 4x slower than Unity’s job system though, but it’s good to note that a unique delegate isn’t strictly required.
#12 by JamesM on September 10th, 2018 ·
Using await means your function will yield control back to Unity, so it’s going to be much slower than your native method, which does not yield control back to Unity. The equivalent for the await would be (not using await)
Or if you really wanted it to be parallel
If you’re going to try comparing the two, it is best to compare them on equal ground (which isn’t going to be easy as they are different things and ideally should be able to be used together)
It would be good if Unity supported something like the following (it might already)
This would allow you to use jobs and async/await together.
#13 by jackson on September 10th, 2018 ·
I’m definitely meaning to test `async` and `await` on level ground with Unity’s job system. I think the way it’s written in the article is the most fair comparison.
As you point out, your first version of `RunTasks` doesn’t use either `async` or `await`, so it’s not applicable to a test meant to test `async` and `await`. It does use `Task` and the underlying asynchronous system, but I can’t use it in an article titled Async & Await vs. Unity Jobs when it doesn’t have `async` or `await`.
Your second version of `RunTasks` runs all the tasks in parallel, but the Unity jobs version runs them all sequentially off the main thread. That wouldn’t be level ground to test the two systems on since `async` and `await` could theoretically run on as many cores as the CPU has while Unity jobs would run on only one core.
Your third version of `RunTasks` is a mix of `async` and `await` with Unity jobs, which means there’s no way to attribute the results to one system or the other. Since that’s the goal of this test, that’s not a viable option.
As sebastian points out, there may be a more level ground on which the systems could be compared. Perhaps a custom “awaiter” could be written that yields better performance. There’s a danger in going too far outside of the canonical usage of `async` and `await` though, as the custom “awaiter” code’s performance becomes more and more influential in the test time, which may misrepresent the performance of `async` and `await`.
#14 by JamesM on September 11th, 2018 ·
What you have, I feel, does go outside of benchmarking await/async: you’re forcing code to be executed on another thread and creating a bunch of lambdas, then yielding back to Unity each time, waiting for its next tick of the tasks in its main thread.
Even this would be more fair as you’re not waiting on another thread pool to have tasks added/removed.
Again, this highly depends on how frequently Unity handles tasks sent to its SynchronizationContext. If it does not do an immediate loop of them and instead waits until its next ‘Update’ cycle then it will lose, because it is actually yielding control back, unlike your Jobs RunTasks that does not yield control back to Unity, essentially freezing it until completion.
#15 by jackson on September 11th, 2018 ·
This seems analogous to the Unity jobs test, which forces code to be executed on another thread by scheduling a job and yields back to Unity by calling `JobHandle.ScheduleBatchedJobs`. The job system doesn’t use lambdas since that’s not part of its design, but creating a bunch of them only resulted in about a 10% difference.
The new version of `RunTasks` you posted doesn’t encapsulate the work in a task like the Unity jobs test encapsulates the work in a job, so it doesn’t seem like an even comparison.
Couldn’t the same inefficiency issues also be true of the jobs system? It’s possible that Unity’s implementation of one system or the other is especially inefficient, but finding that out is the point of the test. For this article, it’s less important why one system is slower than the other and more important how much slower it is.
As for “yield control back to Unity,” could you point out where this is happening? It seems to me like both tests schedule work off the main thread and then block the main thread until it’s completed, never yielding back as would be done in a coroutine or `Update` function.
#16 by JamesM on September 11th, 2018 ·
The Job system more closely matches the Task system within C#, not the async/await system as they both use thread pools to run tasks/jobs, whereas async/await is used to suspend a function and resume it later.
If you want to compare async/await you should compare it with Unity coroutines as this has the same goal of suspending and resuming functions.
C# Tasks = Unity Jobs
C# async/await = Unity Coroutines
With your tests for the job system you are not suspending/resuming the RunJobs method but instead only blocking on that method, waiting for it to complete with JobHandle.Complete. With the await/async implementation, however, you are suspending that function and resuming it later. Often within an engine (or synchronisation context implementation) this resume will not happen until the next tick.
The async/await approach does not block the main thread. Instead, when you use the keyword ‘await’ it will suspend that function and continue running everything else (in this case Unity). You could change your example to use a much higher ‘num’ and you will noticeably see that the await/async approach does not impact your gameplay, only the job system does.
It should also be noted that even when comparing C# tasks and the Unity Job system, they are there for two different purposes: the Unity Job system is designed for short tasks which can be completed within a tick (hence no callback, only block), whereas C# tasks have no expectation of lifetime. However, for implementation both should be comparable and I feel this article might be better suited for comparing the two. (Then a second article comparing Unity coroutines with async/await)
#17 by jackson on September 12th, 2018 ·
I think I see what you’re getting at here: `async` and `await` can be used without `Task` and `Task<T>`, and `Task` and `Task<T>` can be used without `async` and `await`. It just happens that the two are usually used together, especially when writing asynchronous code. From that perspective, the article can be seen as conflating the two in its performance test. I agree with this, so the article should technically be titled Async, Await, and Tasks vs. Unity Jobs.
Unity jobs can easily be run for longer than one frame. There’s no requirement to call `JobHandle.Complete` each frame, so long-running jobs can continue off the main thread indefinitely.
In any case, I’ll write a followup article to explore these topics further. Thanks for pointing out these technicalities! I’ll definitely incorporate your feedback into that article.
#18 by sebastian on September 11th, 2018 ·
I think that your first piece of code is also pretty much spot-on. The point really is what you want to measure: If you want to measure the overhead of the `async/await`-compiler transformations and the asynchronicity they entail *plus* the cost of scheduling and executing the task, then you (obviously) want to use `await` as, say, in Jackson’s initial example.
If, however, you merely want to test the performance of the underlying system, then using `MyTask(result).Wait();` is *exactly* what you want.
My takeaway from your comment then is: The comparison is inherently unfair because for Unity’s scheduling system, we are only measuring the performance of the underlying system. The `job.Complete()` call is blocking and there is no actual asynchronous code involved. We never *actually* yield from a coroutine (in the general sense, not the Unity-sense), so to speak, and jump back via the synchronization context. The `async/await` example is then conflating both issues: It’s scheduling tasks (using `Task.Run`) and doing asynchronicity (using `async/await`). In a sense, this is comparing apples (the completely synchronous, yet multithreaded, Unity code) and oranges (the asynchronous and multithreaded async/await code).
I therefore completely agree that a fair comparison *either* uses your first piece of code or adds actual asynchronicity to the Unity code (which is what your last piece of code is all about).
This makes me wonder whether the example is just too simplistic to get the point across.
#19 by jackson on September 11th, 2018 ·
This is what the test is meant to measure. It therefore uses `async`, `await`, and the underlying `Task` system.
I think you’re right that it is a very simple test and because of that potentially too far from real-world usage to be especially useful. I’ll think about some other ways to compare the two systems, but any suggestions you (or other readers of this comment) have would be great to hear.
#20 by Jim B. on November 13th, 2018 ·
Thanks for sharing! Could you clarify, does Async & Await take advantage of multiple cores on the cpu?
#21 by jackson on November 13th, 2018 ·
Yes, they use multiple threads to spread the work across CPU cores.
#22 by Aditya Kresna on January 15th, 2019 ·
It will use another thread only if you specify `ConfigureAwait(false)`, like `await MyTask(result)`. Please CMIIW.
#23 by Aditya Kresna on January 15th, 2019 ·
I mean `await MyTask(result).ConfigureAwait(false);`
#24 by Sam H. on June 11th, 2019 ·
Worst implementation of async/await I’ve seen, you pretty much used everything that should be avoided when using async/await.
#25 by jackson on June 11th, 2019 ·
Can you please point me to a specific issue that would help to improve the test in the article?
#26 by Sam H. on June 11th, 2019 ·
Not to mention that you are also measuring the time the Task engine takes to initialize in your benchmark