We’ve seen that Temp allocations are the fastest kind of allocations, but is this always the case? When the fixed-size block of memory they draw from runs out, are the overflow allocations just as fast? Today we’ll test to find out!

Test Design

Today’s test is quite straightforward. It works like this:

  1. Perform 819 four-byte allocations to use up the fixed-size block of memory suspected to be behind the Temp allocator
  2. Perform 819 more four-byte allocations, presumably from the alternative allocator that handles overflow for Temp
  3. Perform 819 four-byte allocations with TempJob for comparison
  4. Deallocate the TempJob allocations

We’ll do this every frame and accumulate the time required for each set of 819 allocations. Once we’ve counted up 1000 frames’ worth of allocations, we’ll print a report of the average time each set took per frame. Note that deallocation time is not counted.

Here’s the test source code:

using System.Diagnostics;
using Unity.Collections;
using Unity.Collections.LowLevel.Unsafe;
using UnityEngine;
 
unsafe class TestScript : MonoBehaviour
{
   const int count = 819;
 
   private Stopwatch stopwatch;
   private void*[] allocs;
   private long blockTicks;
   private long overflowTicks;
   private long tempJobTicks;
   private int numFrames;
 
   private void Start()
   {
      stopwatch = new Stopwatch();
      allocs = new void*[count];
   }
 
   void Update()
   {
      // First set: expected to be served from Temp's fixed-size block
      stopwatch.Restart();
      for (int i = 0; i < count; ++i)
      {
         allocs[i] = UnsafeUtility.Malloc(4, 4, Allocator.Temp);
      }
      blockTicks += stopwatch.ElapsedTicks;
 
      // Second set: the block should now be full, so these should overflow
      stopwatch.Restart();
      for (int i = 0; i < count; ++i)
      {
         allocs[i] = UnsafeUtility.Malloc(4, 4, Allocator.Temp);
      }
      overflowTicks += stopwatch.ElapsedTicks;
 
      // Third set: TempJob allocations for comparison
      stopwatch.Restart();
      for (int i = 0; i < count; ++i)
      {
         allocs[i] = UnsafeUtility.Malloc(4, 4, Allocator.TempJob);
      }
      tempJobTicks += stopwatch.ElapsedTicks;
 
      // Free only the TempJob allocations; this is not timed
      for (int i = 0; i < count; ++i)
      {
         UnsafeUtility.Free(allocs[i], Allocator.TempJob);
      }
 
      numFrames++;
      if (numFrames == 1000)
      {
         print(
            "Allocation,Ticks\n" +
            "Block," + blockTicks/numFrames + "\n" +
            "Overflow," + overflowTicks/numFrames + "\n" +
            "TempJob," + tempJobTicks/numFrames);
      }
   }
}

Test Results

I ran the test in this environment:

  • 2.7 GHz Intel Core i7-6820HQ
  • macOS 10.15.2
  • Unity 2019.2.15f1
  • macOS Standalone
  • .NET 4.x scripting runtime version and API compatibility level
  • IL2CPP
  • Non-development
  • 640×480, Fastest, Windowed

And here are the results I got:

Allocation   Ticks
Block        171
Overflow     491
TempJob      373

[Figure: Temp Allocation Times Graph]

Allocation from the fixed-size block is more than twice as fast as either of the other kinds of allocation. Overflow allocations are not only slower than allocations from the fixed-size block, they’re also slower than TempJob allocations! Overflow takes about 2.9x as long as the block (491 vs. 171 ticks), whereas TempJob takes about 2.2x as long (373 vs. 171 ticks).
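To make those ratios easy to check, here is a minimal sketch in plain C# (no Unity required) that recomputes them from the reported tick counts. The class and method names (TickReport, Slowdown, NanosPerAlloc) are made up for illustration. Ticks are not a fixed unit of time, so the sketch also shows how a per-frame total could be converted to nanoseconds per allocation given a tick frequency; the frequency of the test machine isn't stated above, so any value passed in is an assumption.

```csharp
using System;
using System.Globalization;

static class TickReport
{
    // Average ticks per frame from the results above.
    // Each figure covers one set of 819 allocations.
    public const long BlockTicks = 171;
    public const long OverflowTicks = 491;
    public const long TempJobTicks = 373;
    public const int AllocsPerSet = 819;

    // How many times as long `ticks` is compared to `baseline`.
    public static double Slowdown(long ticks, long baseline)
    {
        return (double)ticks / baseline;
    }

    // Converts a tick total for `allocs` allocations into nanoseconds
    // per allocation. `frequency` is ticks per second; on the machine
    // that ran the test it would be Stopwatch.Frequency, which is not
    // reported in the article.
    public static double NanosPerAlloc(long ticks, int allocs, long frequency)
    {
        return ticks * 1e9 / frequency / allocs;
    }

    static string Fmt(double value)
    {
        return value.ToString("F2", CultureInfo.InvariantCulture);
    }

    // Prints the three ratios discussed in the text.
    public static void PrintReport()
    {
        Console.WriteLine("Overflow vs. block:   " + Fmt(Slowdown(OverflowTicks, BlockTicks)) + "x");
        Console.WriteLine("TempJob vs. block:    " + Fmt(Slowdown(TempJobTicks, BlockTicks)) + "x");
        Console.WriteLine("Overflow vs. TempJob: " + Fmt(Slowdown(OverflowTicks, TempJobTicks)) + "x");
    }
}
```

Calling TickReport.PrintReport() prints the three ratios. To get absolute times, log Stopwatch.Frequency alongside the tick counts when running the test and pass it to NanosPerAlloc.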

Conclusions

Temp allocations are not always fast. When the fixed-size block runs out, allocations take about 3x as long, which makes them slower than using the TempJob allocator directly. The block appears to run out at about 16 KB: the 820th four-byte allocation would bring the total to 820*(4+16)=16400 bytes, just over 16384. Because there’s apparently no way to dispose of Temp memory or to configure the size of the block, Temp should only be used for very small allocations if allocator performance is important to the game.
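The 16 KB estimate can be turned into a rough budget check. The sketch below is plain C# and rests on two assumptions drawn from the arithmetic above rather than from any Unity documentation: that the block is 16384 bytes, and that the 16 in the formula is a fixed per-allocation overhead. The names (TempBlockEstimate, MaxAllocations, FitsInBlock) are hypothetical.

```csharp
using System;

static class TempBlockEstimate
{
    // ASSUMPTION: block size inferred from the test, not documented.
    public const int AssumedBlockBytes = 16 * 1024;

    // ASSUMPTION: per-allocation overhead, read off the (4+16) term.
    public const int AssumedOverheadBytes = 16;

    // Estimated number of allocations of `allocBytes` each that fit
    // in the block before Temp starts to overflow.
    public static int MaxAllocations(int allocBytes)
    {
        if (allocBytes < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(allocBytes));
        }
        return AssumedBlockBytes / (allocBytes + AssumedOverheadBytes);
    }

    // True if `allocCount` allocations of `allocBytes` each are
    // estimated to stay within the block.
    public static bool FitsInBlock(int allocCount, int allocBytes)
    {
        if (allocCount < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(allocCount));
        }
        if (allocBytes < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(allocBytes));
        }
        long total = (long)allocCount * (allocBytes + AssumedOverheadBytes);
        return total <= AssumedBlockBytes;
    }
}
```

Under these assumptions, MaxAllocations(4) gives 819, which matches the count used in the test, and FitsInBlock(820, 4) is false. A game could use a check like this to decide when to switch to TempJob, but the constants should be re-measured for each Unity version since they come from observation only.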