Today we’ll add two new types to the Native Collections suite: NativeIntPtr and NativeLongPtr. We’ll make them usable with both IJob and IJobParallelFor and explore some new features of Unity’s native container system along the way.

What and why

NativeIntPtr and NativeLongPtr are types that point to a single int or long in native/unmanaged memory. It’s natural to immediately ask “why not use NativeArray<int> and NativeArray<long> with Length == 1?” There are three reasons for this.

The first reason is a technical one: they provide the ability to safely write to the value from an IJobParallelFor. NativeArray cannot do this safely.

The second reason is a readability one: NativeIntPtr expresses the programmer’s intent to store exactly one value. NativeArray implies that there may be multiple values and therefore makes reading, understanding, and using the type harder because the programmer must first analyze the code to determine that the NativeArray is being used in a special way with only one value.

The third reason is a related readability one: NativeIntPtr provides syntax that aligns with its single-value nature. For example, we can write counter.Value++ rather than counter[0]++. This makes the code even easier to understand and makes errors such as using a non-zero index impossible.

IJob Implementation

Let’s step through a slightly simplified and uncommented version of the implementation of NativeIntPtr one chunk at a time. NativeLongPtr is identical except that it uses long instead of int, so it’s not shown in this article.

Let’s start with the declaration:

[NativeContainer]
[NativeContainerSupportsDeallocateOnJobCompletion]
[StructLayout(LayoutKind.Sequential)]
public unsafe struct NativeIntPtr : IDisposable
{

Here we’ve declared the type just like other native containers. It’s an unsafe struct that’s IDisposable. Like NativeArray, it can be auto-disposed after a job completes to avoid the need to manually call Dispose or use a using block.

Now let’s look at the fields:

[NativeDisableUnsafePtrRestriction]
internal unsafe int* m_Buffer;
 
internal Allocator m_AllocatorLabel;
 
#if ENABLE_UNITY_COLLECTIONS_CHECKS
    private AtomicSafetyHandle m_Safety;
 
    [NativeSetClassTypeToNullOnSchedule]
    private DisposeSentinel m_DisposeSentinel;
#endif

m_Buffer is a pointer to native/unmanaged memory that holds the int value. It has the name required for auto-disposal to work.

m_AllocatorLabel stores the Allocator that was used to allocate the native memory and will be used to free it in Dispose.

m_Safety and m_DisposeSentinel are only present when safety checks are enabled. They’re used to guarantee that the memory holding the int is accessed appropriately and only disposed once.

Now let’s look at the constructor:

public NativeIntPtr(Allocator allocator, int initialValue = 0)
{
    if (allocator <= Allocator.None)
    {
        throw new ArgumentException(
            "Allocator must be Temp, TempJob or Persistent",
            "allocator");
    }
 
    m_Buffer = (int*)UnsafeUtility.Malloc(
        sizeof(int),
        UnsafeUtility.AlignOf<int>(),
        allocator);
 
    m_AllocatorLabel = allocator;
 
#if ENABLE_UNITY_COLLECTIONS_CHECKS
#if UNITY_2018_3_OR_NEWER
    DisposeSentinel.Create(out m_Safety, out m_DisposeSentinel, 0, allocator);
#else
    DisposeSentinel.Create(out m_Safety, out m_DisposeSentinel, 0);
#endif
#endif
 
    *m_Buffer = initialValue;
}

The constructor error-checks the given Allocator then stores it and uses it to allocate native memory to hold the int. It creates the DisposeSentinel and stores the initial value, which defaults to 0 just like in managed memory.
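Outside Unity, the same allocate-and-initialize pattern can be sketched with plain .NET interop. This is only an illustration of what the constructor does, with Marshal.AllocHGlobal standing in for UnsafeUtility.Malloc:

```csharp
using System;
using System.Runtime.InteropServices;

public static class SingleIntDemo
{
    public static void Main()
    {
        // Allocate unmanaged memory big enough for one int,
        // mirroring UnsafeUtility.Malloc(sizeof(int), ...).
        IntPtr buffer = Marshal.AllocHGlobal(sizeof(int));

        // Store the initial value, like *m_Buffer = initialValue.
        Marshal.WriteInt32(buffer, 42);

        // Read it back, like the Value getter's *m_Buffer.
        Console.WriteLine(Marshal.ReadInt32(buffer)); // prints "42"

        // Free it, like UnsafeUtility.Free in Dispose.
        Marshal.FreeHGlobal(buffer);
    }
}
```

The difference is that UnsafeUtility.Malloc also takes an alignment and an Allocator, which is why NativeIntPtr must remember the Allocator for the matching Free.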

public int Value
{
    get
    {
        RequireReadAccess();
        return *m_Buffer;
    }
 
    [WriteAccessRequired]
    set
    {
        RequireWriteAccess();
        *m_Buffer = value;
    }
}

The Value property simply dereferences the pointer to read or write the int. Let’s look at the error-checking functions:

[Conditional("ENABLE_UNITY_COLLECTIONS_CHECKS")]
[BurstDiscard]
private void RequireReadAccess()
{
#if ENABLE_UNITY_COLLECTIONS_CHECKS
    AtomicSafetyHandle.CheckReadAndThrow(m_Safety);
#endif
}
 
[Conditional("ENABLE_UNITY_COLLECTIONS_CHECKS")]
[BurstDiscard]
private void RequireWriteAccess()
{
#if ENABLE_UNITY_COLLECTIONS_CHECKS
    AtomicSafetyHandle.CheckWriteAndThrow(m_Safety);
#endif
}

[Conditional] removes all calls to these functions when safety checks are disabled, and the #if removes their bodies. [BurstDiscard] additionally removes them from Burst-compiled code. The work of each function is simply to use the AtomicSafetyHandle to check the caller’s permission to either read or write the integer.
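The call-stripping behavior of [Conditional] is plain C#, not Unity magic. Here’s a minimal sketch using a hypothetical SAFETY_CHECKS symbol in place of ENABLE_UNITY_COLLECTIONS_CHECKS; since the symbol isn’t defined, the compiler removes the call site entirely:

```csharp
using System;
using System.Diagnostics;

public static class ConditionalDemo
{
    public static int CallCount;

    // Calls to this method are compiled out entirely unless
    // SAFETY_CHECKS is defined at the call site, just like
    // RequireReadAccess when ENABLE_UNITY_COLLECTIONS_CHECKS is off.
    [Conditional("SAFETY_CHECKS")]
    public static void RequireAccess()
    {
        CallCount++;
    }

    public static void Main()
    {
        RequireAccess(); // stripped: SAFETY_CHECKS isn't defined
        Console.WriteLine(CallCount); // prints "0"
    }
}
```

Note that whether the call is stripped depends on the symbols defined where the *caller* is compiled, which is why Unity can toggle safety checks without recompiling user code paths individually.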

[WriteAccessRequired]
public unsafe void Dispose()
{
    RequireWriteAccess();
 
#if ENABLE_UNITY_COLLECTIONS_CHECKS
#if UNITY_2018_3_OR_NEWER
    DisposeSentinel.Dispose(ref m_Safety, ref m_DisposeSentinel);
#else
    DisposeSentinel.Dispose(m_Safety, ref m_DisposeSentinel);
#endif
#endif
 
    UnsafeUtility.Free(m_Buffer, m_AllocatorLabel);
    m_Buffer = null;
}
}

Disposing, either manually or automatically, requires write access via the [WriteAccessRequired] attribute and the RequireWriteAccess check. The DisposeSentinel is used to protect against double-disposing. Its signature changed in Unity 2018.3, so there’s an additional #if to handle the upgrade gracefully.

After the checks, the memory is freed and the buffer set to null. That allows the next function to work:

public bool IsCreated
{
    get
    {
        return m_Buffer != null;
    }
}

This property allows the caller to check that the backing memory was allocated by the constructor (i.e. the constructor wasn’t bypassed via default(NativeIntPtr)) and that Dispose hasn’t been called. It’s important to remember that Dispose might have been called on a copy of the struct, so IsCreated being true doesn’t mean the struct is safe to use.
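The “copy” caveat comes from value semantics: assigning a struct copies its pointer field, and disposing one copy doesn’t null out the other’s. A minimal sketch of the same hazard with plain .NET (hypothetical names, Marshal.AllocHGlobal in place of UnsafeUtility.Malloc):

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical mini version of NativeIntPtr's buffer field and IsCreated.
public struct IntPtrBox : IDisposable
{
    public IntPtr Buffer;

    public static IntPtrBox Create()
    {
        return new IntPtrBox { Buffer = Marshal.AllocHGlobal(sizeof(int)) };
    }

    public bool IsCreated { get { return Buffer != IntPtr.Zero; } }

    public void Dispose()
    {
        Marshal.FreeHGlobal(Buffer);
        Buffer = IntPtr.Zero;
    }
}

public static class CopyDemo
{
    public static void Main()
    {
        IntPtrBox original = IntPtrBox.Create();
        IntPtrBox copy = original; // struct assignment copies the pointer

        copy.Dispose(); // frees the shared memory, nulls only the copy

        // The original still reports IsCreated even though its memory
        // was just freed; using it now would be an error.
        Console.WriteLine(original.IsCreated); // prints "True"
        Console.WriteLine(copy.IsCreated);     // prints "False"
    }
}
```

This is exactly why the real type keeps the DisposeSentinel and AtomicSafetyHandle: they detect this misuse when safety checks are enabled.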

With that, we have a basic one-int native collection that we can use with the job system:

struct SumJob : IJob
{
    public NativeArray<int> Values;
    public NativeIntPtr Sum;
 
    public void Execute()
    {
        for (int i = 0; i < Values.Length; ++i)
        {
            Sum.Value += Values[i];
        }
    }
}
 
void RunJob()
{
    NativeArray<int> values = new NativeArray<int>(3, Allocator.Temp);
    values[0] = 10;
    values[1] = 20;
    values[2] = 30;
 
    NativeIntPtr sum = new NativeIntPtr(Allocator.Temp);
 
    SumJob job = new SumJob { Values = values, Sum = sum };
    job.Run();
 
    Debug.Log("Sum: " + sum.Value); // prints "Sum: 60"
 
    values.Dispose();
    sum.Dispose();
}
No support for IJobParallelFor

At this point we have a basic drop-in replacement for NativeArray<int> with Length == 1 that satisfies the two readability reasons for creating such a type. Now let’s go further and add support for IJobParallelFor.

But first, let’s examine why we can’t use the type as-is with IJobParallelFor. Here’s an attempt:

struct ParallelSumJob : IJobParallelFor
{
    public NativeArray<int> Values;
    public NativeIntPtr Sum;
 
    public void Execute(int index)
    {
        Sum.Value += Values[index];
    }
}

Unity rightly gives an error when attempting to run this job. Its reason is that NativeIntPtr isn’t marked with either [NativeContainerSupportsMinMaxWriteRestriction] or [NativeContainerIsAtomicWriteOnly]. That means NativeIntPtr is neither willing nor able to restrict access to the range that the job’s Execute is working on, and it’s also not willing to perform exclusively atomic write operations.

If Unity let us proceed with running the job then there could easily be a situation where the job is running on multiple threads and each of them is accessing the same memory pointed to by the NativeIntPtr. The add operation here isn’t atomic since it actually has three steps:

// Read
int oldValue = Sum.Value;
 
// Modify
int newValue = oldValue + Values[index];
 
// Write
Sum.Value = newValue;

Now imagine that every element of Values is 1, Sum starts at 1, and the following order of execution occurs:

  1. Thread A executes the “Read” step so its oldValue is 1
  2. Thread B executes the “Read” step so its oldValue is 1
  3. Thread B executes the “Modify” step so its newValue is 2
  4. Thread B executes the “Write” step so Sum.Value is 2
  5. Thread A executes the “Modify” step so its newValue is 2
  6. Thread A executes the “Write” step so Sum.Value is 2

In this case the result of adding two 1 elements to an initial value of 1 is 2 when it should be 3. Thread B’s write was lost because Thread A read before it.
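This lost-update problem, and the Interlocked fix this article adopts below, can be demonstrated in plain C# without Unity. The sketch below sums many 1s from several threads using the same Interlocked.Add call that the atomic-write approach relies on; every addition is counted, so the result is exact:

```csharp
using System;
using System.Threading;

public static class AtomicSumDemo
{
    public static int Sum(int threads, int addsPerThread)
    {
        int sum = 0;
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; ++i)
        {
            workers[i] = new Thread(() =>
            {
                for (int j = 0; j < addsPerThread; ++j)
                {
                    // Atomic read-modify-write: no updates are lost,
                    // unlike the non-atomic three-step sum += 1.
                    Interlocked.Add(ref sum, 1);
                }
            });
            workers[i].Start();
        }
        foreach (Thread t in workers)
        {
            t.Join();
        }
        return sum;
    }

    public static void Main()
    {
        // All 4 * 100000 additions are counted.
        Console.WriteLine(Sum(4, 100000)); // prints "400000"
    }
}
```

Replacing Interlocked.Add with a plain `sum += 1` makes the result nondeterministic and usually less than 400000, which is the interleaving shown in the numbered steps above.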

IJobParallelFor implementation

Now let's go ahead and add support for IJobParallelFor. We'll need to add either [NativeContainerSupportsMinMaxWriteRestriction] or [NativeContainerIsAtomicWriteOnly] to do this. Since we have only one value, [NativeContainerSupportsMinMaxWriteRestriction] doesn't make sense, as all jobs would have to operate on the same value. If we restricted the type to only atomic writes then we could use [NativeContainerIsAtomicWriteOnly], but that would mean removing the getter for Value, which is clearly unacceptable.

Since neither of these options work, we'll add a nested type: NativeIntPtr.Parallel.

[NativeContainer]
[NativeContainerIsAtomicWriteOnly]
public struct Parallel
{

This type is also a native container but additionally marked with [NativeContainerIsAtomicWriteOnly] since we'll be restricting it to only atomic write operations. This attribute makes it usable in an IJobParallelFor.

Now let's look at the fields:

[NativeDisableUnsafePtrRestriction]
internal int* m_Buffer;
 
#if ENABLE_UNITY_COLLECTIONS_CHECKS
    internal AtomicSafetyHandle m_Safety;
#endif

This struct holds the same pointer to the integer as in the outer struct type. It's also got the same AtomicSafetyHandle so it can check that writing is allowed when safety checks are enabled.

Now let's look at the constructors:

#if ENABLE_UNITY_COLLECTIONS_CHECKS
    internal Parallel(int* value, AtomicSafetyHandle safety)
    {
        m_Buffer = value;
        m_Safety = safety;
    }
#else
    internal Parallel(int* value)
    {
        m_Buffer = value;
    }
#endif

These just take the one or two fields as parameters and set them. It's a simple way to avoid using the default constructor or default(Parallel).

Everything's been internal so far, so let's look at the actual public atomic write operations that user code can call:

[WriteAccessRequired]
public void Increment()
{
    RequireWriteAccess();
    Interlocked.Increment(ref *m_Buffer);
}
 
[WriteAccessRequired]
public void Decrement()
{
    RequireWriteAccess();
    Interlocked.Decrement(ref *m_Buffer);
}
 
[WriteAccessRequired]
public void Add(int value)
{
    RequireWriteAccess();
    Interlocked.Add(ref *m_Buffer, value);
}

Each of these has the usual combination of [WriteAccessRequired] and RequireWriteAccess since they all write to the stored integer. Each also uses static methods of the Interlocked class to perform the atomic operations. The strange ref *m_Buffer syntax converts the int* pointer into a ref int reference.

The final step is to allow creation of a NativeIntPtr.Parallel from a NativeIntPtr, so let's add NativeIntPtr.GetParallel:

public Parallel GetParallel()
{
#if ENABLE_UNITY_COLLECTIONS_CHECKS
    Parallel parallel = new Parallel(m_Buffer, m_Safety);
    AtomicSafetyHandle.UseSecondaryVersion(ref parallel.m_Safety);
#else
    Parallel parallel = new Parallel(m_Buffer);
#endif
    return parallel;
}

We call the appropriate constructor depending on whether safety checks are enabled. If they are, we also mark the Parallel copy’s AtomicSafetyHandle as using its secondary version so Unity can track it separately from the original NativeIntPtr.

With this in place we can go ahead and make an IJobParallelFor:

struct ParallelSumJob : IJobParallelFor
{
    public NativeArray<int> Values;
    public NativeIntPtr.Parallel Sum;
 
    public void Execute(int index)
    {
        Sum.Add(Values[index]);
    }
}
 
void RunJob()
{
    NativeArray<int> values = new NativeArray<int>(3, Allocator.Temp);
    values[0] = 10;
    values[1] = 20;
    values[2] = 30;
 
    NativeIntPtr sum = new NativeIntPtr(Allocator.Temp);
 
    ParallelSumJob job = new ParallelSumJob { Values = values, Sum = sum.GetParallel() };
    job.Run(values.Length);
 
    Debug.Log("Sum: " + sum.Value); // prints "Sum: 60"
 
    values.Dispose();
    sum.Dispose();
}

The only changes to the job are to use a NativeIntPtr.Parallel field instead of a NativeIntPtr field and to use Add instead of +=. The code that runs the job just needs to set Sum to sum.GetParallel() instead of sum.

Unity allows us to run this job because NativeIntPtr.Parallel is marked with [NativeContainerIsAtomicWriteOnly]. Returning to the multi-threading example above, we now have this sequence:

  1. Thread A executes the atomic Add so Sum.Value is 2
  2. Thread B executes the atomic Add so Sum.Value is 3

Each addition is a little more expensive, but we now get the correct result.

Conclusion

That's all there is to implementing NativeIntPtr and NativeLongPtr, which is identical except for the use of long instead of int. We've achieved all three goals: safe parallel writes plus both readability improvements. These types may be useful for all sorts of purposes, but a main one is to count occurrences or sum values as in the above examples. For the full source code, check out the Native Collections GitHub project.