Adding Unions to C#
C and C++ have a great feature called the “union”. It’s like a struct except that it holds only one of its fields at a time. C# lacks this feature, but with some trickery it can be added. Today’s article shows how to do that!
Let’s start with an empty struct and use the System.Runtime.InteropServices.Marshal.SizeOf function to find out its size:
    struct MyStruct { }
    Marshal.SizeOf(typeof(MyStruct)); // 0
That makes sense since the struct is empty. Now let’s add a single 4-byte int and hope to see a 4:
    struct MyStruct { int I; }
    Marshal.SizeOf(typeof(MyStruct)); // 4
Everything’s going as expected so far. Let’s do one last obvious step and try a 4-byte int and a 4-byte float:
    struct MyStruct { int I; float F; }
    Marshal.SizeOf(typeof(MyStruct)); // 8
Good. Now let’s convert this struct into a union. As mentioned above, C# doesn’t have built-in unions, unlike C and C++. So let’s look and see what this struct would look like in C:
    struct MyStruct { int I; float F; };
    sizeof(struct MyStruct); // 8
Now if we wanted it to be a union so that it represented either an int or a float, we’d just change struct to union:
    union MyUnion { int I; float F; };
    sizeof(union MyUnion); // 4
The size of the union indicates that the same memory is being used for either a 4-byte int or a 4-byte float. This becomes pretty obvious when you set both of them:
    union MyUnion { int I; float F; };
    union MyUnion u;
    u.I = 123;
    printf("%i\n", u.I); // prints: 123
    u.F = 3.14f;
    printf("%i\n", u.I); // prints: 1078523331
When we set F we overwrote the memory that was holding I with a floating-point value. This is good because it means that the union is representing just one of the two values, not both, and using only enough memory to store one of them, not both.
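To see the shared storage from another angle, here is a minimal C sketch that stores a float through the union and reads the same bytes back as an int, then does the same thing with a plain memcpy. The helper names bits_via_union and bits_via_memcpy are made up for illustration, and the sketch assumes int and float are both 4 bytes, as in the examples above.

```c
#include <string.h>

/* Same union as above: I and F occupy the same bytes. */
union MyUnion { int I; float F; };

/* Hypothetical helper: write the float member, then read the int member. */
int bits_via_union(float value)
{
    union MyUnion u;
    u.F = value;
    return u.I;
}

/* Hypothetical helper: copy the float's bytes into an int directly. */
int bits_via_memcpy(float value)
{
    int bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Under those assumptions both helpers return 1078523331 for 3.14f, the value printed above. Reading a member other than the one last written is allowed in C, but C++ treats it as undefined behavior, so the memcpy form is the safer choice there.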
Now how can we achieve this with C# given that it doesn’t have a union keyword? The key lies in some goodies found in the System.Runtime.InteropServices namespace. Don’t worry: it’s available in Unity’s old Mono-based .NET implementation, even with the “.NET 2.0 Subset”.
First, we need to take manual control over how the fields in the struct are laid out in memory. We do this with the [StructLayout] attribute like so:
    [StructLayout(LayoutKind.Explicit)]
    struct MyUnion { int I; float F; }
We’ve just told the compiler that we want to explicitly state where to put the I and F fields of the struct in memory, but haven’t yet stated where to put them. To do that, we use the [FieldOffset] attribute:
    [StructLayout(LayoutKind.Explicit)]
    struct MyUnion
    {
        // Fields are public so that code outside the struct can access them
        [FieldOffset(0)] public int I;
        [FieldOffset(0)] public float F;
    }
By setting the offset of both fields to 0, we’re saying that they should both use the same memory space. If this works, we should see the same overwriting behavior that we saw in C, so let’s try:
    MyUnion u;
    u.I = 123;
    Debug.Log(u.I); // prints: 123
    u.F = 3.14f;
    Debug.Log(u.I); // prints: 1078523331
It worked! The last thing to check is that it’s no longer using 8 bytes:
Marshal.SizeOf(typeof(MyUnion)); // 4
At this point we essentially have created a union in C#!
Finally for today, you often want a way to know which value has been stored in the union so you don’t accidentally access it the wrong way. A simple bool will suffice for that. Here’s how it’d look in C:
    #include <stdbool.h>
    struct TaggedUnion
    {
        bool IsInt;
        union { int I; float F; };
    };
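As a rough illustration of how the tag gets used, the sketch below wraps the struct in a few helper functions. The names make_int, make_float and as_float are hypothetical, and converting a stored int to a float in as_float is just one possible policy for a mismatched access.

```c
#include <stdbool.h>

/* The tagged union from above (anonymous unions need C11). */
struct TaggedUnion
{
    bool IsInt;
    union { int I; float F; };
};

/* Hypothetical constructors that keep the tag and the value in sync. */
struct TaggedUnion make_int(int value)
{
    struct TaggedUnion t;
    t.IsInt = true;
    t.I = value;
    return t;
}

struct TaggedUnion make_float(float value)
{
    struct TaggedUnion t;
    t.IsInt = false;
    t.F = value;
    return t;
}

/* Check the tag before reading so the bytes are never misinterpreted. */
float as_float(struct TaggedUnion t)
{
    return t.IsInt ? (float)t.I : t.F;
}
```

With these, as_float(make_int(123)) yields 123.0f rather than a reinterpreted bit pattern, because the tag routes the read to the member that was actually written.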
In C# there are, of course, no anonymous unions so we need to pack everything into the same struct. But where do we set the offset of IsInt? We definitely don’t want it using the same memory as the other values, so it needs to either come before or after them. Let’s try putting it before and moving the other fields forward one byte:
    [StructLayout(LayoutKind.Explicit)]
    struct MyUnion
    {
        [FieldOffset(0)] public bool IsInt;
        [FieldOffset(1)] public int I;
        [FieldOffset(1)] public float F;
    }

    MyUnion u;
    u.I = 123;
    Debug.Log(u.I); // prints: 123
    u.F = 3.14f;
    Debug.Log(u.I); // prints: 1078523331
    Marshal.SizeOf(typeof(MyUnion)); // 8
Having the bool before the other fields still lets them use the same memory, but resulted in using 8 bytes instead of just the 5 we’d expect. This is because the struct’s total size gets rounded up to a multiple of its largest field’s alignment, which is 4 bytes here. We need to override this in the [StructLayout] attribute with Pack=1 to set the packing to 1-byte boundaries. That’ll get rid of the three bytes of padding:
    [StructLayout(LayoutKind.Explicit, Pack=1)]
    struct MyUnion
    {
        [FieldOffset(0)] public bool IsInt;
        [FieldOffset(1)] public int I;
        [FieldOffset(1)] public float F;
    }

    MyUnion u;
    u.I = 123;
    Debug.Log(u.I); // prints: 123
    u.F = 3.14f;
    Debug.Log(u.I); // prints: 1078523331
    Marshal.SizeOf(typeof(MyUnion)); // 5
That’s more like it! Now let’s see what happens if we put the bool afterward with no Pack=X setting:
    [StructLayout(LayoutKind.Explicit)]
    struct MyUnion
    {
        [FieldOffset(0)] public int I;
        [FieldOffset(0)] public float F;
        [FieldOffset(4)] public bool IsInt;
    }

    MyUnion u;
    u.I = 123;
    Debug.Log(u.I); // prints: 123
    u.F = 3.14f;
    Debug.Log(u.I); // prints: 1078523331
    Marshal.SizeOf(typeof(MyUnion)); // 8
Here we get 4 bytes for the int or float and 4 bytes for the bool. Let’s try setting Pack=1 to see if we can reduce that bool down to one byte:
    [StructLayout(LayoutKind.Explicit, Pack=1)]
    struct MyUnion
    {
        [FieldOffset(0)] public int I;
        [FieldOffset(0)] public float F;
        [FieldOffset(4)] public bool IsInt;
    }

    MyUnion u;
    u.I = 123;
    Debug.Log(u.I); // prints: 123
    u.F = 3.14f;
    Debug.Log(u.I); // prints: 1078523331
    Marshal.SizeOf(typeof(MyUnion)); // 8
The bool is still 4 bytes! That’s because Marshal.SizeOf reports the marshaled size, and a bool field is marshaled as a 4-byte Win32 BOOL by default. So we need one more trick on top of Pack=1 to slim it down to a single byte: the [MarshalAs] attribute. With this we can specify the underlying unmanaged type we want it to be represented by. The best candidate here is UnmanagedType.I1, which is a 1-byte integer.
    [StructLayout(LayoutKind.Explicit, Pack=1)]
    struct MyUnion
    {
        [FieldOffset(0)] public int I;
        [FieldOffset(0)] public float F;
        [FieldOffset(4)] [MarshalAs(UnmanagedType.I1)] public bool IsInt;
    }

    MyUnion u;
    u.I = 123;
    Debug.Log(u.I); // prints: 123
    u.F = 3.14f;
    Debug.Log(u.I); // prints: 1078523331
    Marshal.SizeOf(typeof(MyUnion)); // 5
That’s two ways of laying out the struct so that it acts like a union on the int and float fields but has an extra field and still takes up no unnecessary memory. Neither is really better than the other and both have their own quirks.
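For comparison with the C version we started from, the same size trade-off shows up there too. The sketch below assumes a compiler that supports #pragma pack (GCC, Clang and MSVC all do) and a platform where int is 4 bytes with 4-byte alignment and bool is 1 byte. The struct and helper names are made up for illustration. In C the three padding bytes sit between the tag and the union.

```c
#include <stdbool.h>
#include <stddef.h>

/* Default layout: the union is aligned to 4 bytes, so padding follows IsInt. */
struct PaddedUnion
{
    bool IsInt;
    union { int I; float F; };
};

/* 1-byte packing removes the padding, much like Pack=1 in C#. */
#pragma pack(push, 1)
struct PackedUnion
{
    bool IsInt;
    union { int I; float F; };
};
#pragma pack(pop)

/* Hypothetical helpers so the layout can be inspected at run time. */
size_t padded_size(void) { return sizeof(struct PaddedUnion); }
size_t packed_size(void) { return sizeof(struct PackedUnion); }
size_t packed_value_offset(void) { return offsetof(struct PackedUnion, I); }
```

On such a platform padded_size() is 8, packed_size() is 5 and the shared value starts at offset 1 in the packed version.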
That wraps things up for today. If you’ve got any tips or tricks for working with structs, feel free to leave a comment!
#1 by Peter on May 20th, 2017 ·
nice explanation, thanks!
Is there a way to use arrays in such structs?
Like to have byte[100] bA, UInt16[50] wA and UInt32[25] dA all using the same memory, in order to have byte, word and dword representation of the same data?
#2 by jackson on May 20th, 2017 ·
I haven’t tried it, but it seems like you might be able to do this with a fixed array field.
#3 by Christian on May 23rd, 2017 ·
Hello,
Great explanation. I have tried using it in the following way but Unity throws compilation errors:
Would you happen to know what I am doing wrong?
#4 by jackson on May 23rd, 2017 ·
Try taking out the string. Unlike the others, it’s a managed type: a class instance. As such, C# is very protective when it comes to you knowing details like its size, contents, and location.
#5 by Evans on July 31st, 2019 ·
A small nitpick – In the example following “Let’s try setting Pack=1 to see if we can reduce that bool down to one byte:”, you forgot to specify Pack=1
#6 by jackson on August 1st, 2019 ·
Good catch! I’ve updated the article with a fix.