JacksonDunstan.com

C and C++ have a great feature called the “union”. It’s like a struct except it only holds one of its fields at a time. C# lacks this feature, but with some trickery it can be added in. Today’s article shows how to do that!

Let’s start with an empty struct and use the System.Runtime.InteropServices.Marshal.SizeOf function to find out its size:

struct MyStruct
{
}
 
Marshal.SizeOf(typeof(MyStruct)); // 1

Even an empty struct takes up one byte, since every value needs a non-zero size. Now let’s add a single 4-byte int and hope to see a 4:

struct MyStruct
{
	int I;
}
 
Marshal.SizeOf(typeof(MyStruct)); // 4

Everything’s going as expected so far. Let’s do one last obvious step and try a 4-byte int and a 4-byte float:

struct MyStruct
{
	int I;
	float F;
}
 
Marshal.SizeOf(typeof(MyStruct)); // 8

Good. Now let’s convert this struct into a union. As mentioned above, C# doesn’t have built-in unions, unlike C and C++. So let’s look and see what this struct would look like in C:

struct MyStruct
{
	int I;
	float F;
};
 
sizeof(struct MyStruct); // 8

Now if we wanted it to be a union so that it represented either an int or a float we’d just change struct to union:

union MyUnion
{
	int I;
	float F;
};
 
sizeof(union MyUnion); // 4

The size of the union indicates that the same memory is being used for either a 4-byte int or a 4-byte float. This becomes pretty obvious when you set both of them:

union MyUnion
{
	int I;
	float F;
};
 
union MyUnion u;
u.I = 123;
printf("%i\n", u.I); // prints: 123
u.F = 3.14f;
printf("%i\n", u.I); // prints: 1078523331

When we set F we overwrote the memory that was holding I with a floating-point value. This is good because it means that the union is representing just one of the two values, not both, and using only enough memory to store one of them, not both.
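
The odd-looking number is simply the IEEE 754 bit pattern of 3.14f read back as an integer. As a quick sanity check, here is a minimal C# sketch that reinterprets the float's bytes without any union at all. The names BitsDemo and FloatBitsAsInt are made up for this illustration, and Console.WriteLine stands in for Unity's Debug.Log.

```csharp
using System;

public static class BitsDemo
{
    // Reinterpret the four bytes of a float as a 32-bit int.
    // (Hypothetical helper, not part of the article's code.)
    public static int FloatBitsAsInt(float value)
    {
        byte[] bytes = BitConverter.GetBytes(value);
        return BitConverter.ToInt32(bytes, 0);
    }

    public static void Main()
    {
        int bits = FloatBitsAsInt(3.14f);
        Console.WriteLine(bits);                // prints: 1078523331
        Console.WriteLine(bits.ToString("X8")); // prints: 4048F5C3
    }
}
```

This allocates a temporary byte array on every call, which is one reason a union is attractive for this kind of reinterpretation.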

Now how can we achieve this with C# given that it doesn’t have a union keyword? The key lies in some goodies found in the System.Runtime.InteropServices namespace. Don’t worry: it’s available in Unity’s old Mono-based .NET implementation, even with the “.NET 2.0 Subset”.

First, we need to take manual control over how the fields in the struct are laid out in memory. We do this with the [StructLayout] attribute like so:

[StructLayout(LayoutKind.Explicit)]
struct MyUnion
{
	int I;
	float F;
}

We’ve just told the compiler that we want to explicitly state where to put the I and F fields of the struct in memory, but haven’t yet stated where to put them. To do that, we use the [FieldOffset] attribute:

[StructLayout(LayoutKind.Explicit)]
struct MyUnion
{
	[FieldOffset(0)] int I;
	[FieldOffset(0)] float F;
}

By setting the offset of both fields to 0, we’re saying that they should both use the same memory space. If this works, we should see the same overwriting behavior that we saw in C, so let’s try:

MyUnion u;
u.I = 123;
Debug.Log(u.I); // prints: 123
u.F = 3.14f;
Debug.Log(u.I); // prints: 1078523331

It worked! The last thing to check is to see that it’s not still using 8 bytes:

Marshal.SizeOf(typeof(MyUnion)); // 4

At this point we essentially have created a union in C#!
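
The snippets above leave out the using directive and access modifiers for brevity. Here is a minimal, self-contained sketch of the same union as a complete program. It assumes a plain .NET console app rather than Unity, so it uses Console.WriteLine instead of Debug.Log, and the names IntOrFloat and UnionDemo are invented for the example. The fields are marked public so that code outside the struct can reach them.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
public struct IntOrFloat
{
    // Both fields start at byte 0, so they share the same four bytes.
    [FieldOffset(0)] public int I;
    [FieldOffset(0)] public float F;
}

public static class UnionDemo
{
    public static void Main()
    {
        IntOrFloat u = new IntOrFloat();
        u.I = 123;
        Console.WriteLine(u.I); // prints: 123

        u.F = 3.14f; // overwrites the bytes that held I
        Console.WriteLine(u.I); // prints: 1078523331

        // Cross-check against BitConverter's view of the same float.
        int expected = BitConverter.ToInt32(BitConverter.GetBytes(3.14f), 0);
        Console.WriteLine(u.I == expected); // prints: True

        Console.WriteLine(Marshal.SizeOf(typeof(IntOrFloat))); // prints: 4
    }
}
```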

Finally for today, you often want a way to know which value has been stored in the union so you don’t accidentally access it the wrong way. A simple bool will suffice for that. Here’s how it’d look in C:

struct TaggedUnion
{
	bool IsInt;
	union
	{
		int I;
		float F;
	};
};

In C# there are, of course, no anonymous unions so we need to pack everything into the same struct. But where do we set the offset of IsInt? We definitely don’t want it using the same memory as the other values, so it needs to either come before or after them. Let’s try putting it before and moving the other fields forward one byte:

[StructLayout(LayoutKind.Explicit)]
struct MyUnion
{
	[FieldOffset(0)] bool IsInt;
	[FieldOffset(1)] int I;
	[FieldOffset(1)] float F;
}
 
MyUnion u;
u.I = 123;
Debug.Log(u.I); // prints: 123
u.F = 3.14f;
Debug.Log(u.I); // prints: 1078523331
Marshal.SizeOf(typeof(MyUnion)); // 8

Having the bool before the other fields still lets them use the same memory, but resulted in using 8 bytes instead of just the 5 we’d expect. This is because the struct’s total size is padded up to a multiple of 4 bytes, the alignment of its largest field. We need to override this in the [StructLayout] attribute with Pack=1 to set the packing to 1-byte boundaries. That’ll get rid of the three bytes of padding:

[StructLayout(LayoutKind.Explicit, Pack=1)]
struct MyUnion
{
	[FieldOffset(0)] bool IsInt;
	[FieldOffset(1)] int I;
	[FieldOffset(1)] float F;
}
 
MyUnion u;
u.I = 123;
Debug.Log(u.I); // prints: 123
u.F = 3.14f;
Debug.Log(u.I); // prints: 1078523331
Marshal.SizeOf(typeof(MyUnion)); // 5

That’s more like it! Now let’s see what happens if we put the bool afterward with no Pack=X setting:

[StructLayout(LayoutKind.Explicit)]
struct MyUnion
{
	[FieldOffset(0)] int I;
	[FieldOffset(0)] float F;
	[FieldOffset(4)] bool IsInt;
}
 
MyUnion u;
u.I = 123;
Debug.Log(u.I); // prints: 123
u.F = 3.14f;
Debug.Log(u.I); // prints: 1078523331
Marshal.SizeOf(typeof(MyUnion)); // 8

Here we get 4 bytes for the int or float and 4 bytes for the bool. Let’s try setting Pack=1 to see if we can reduce that bool down to one byte:

[StructLayout(LayoutKind.Explicit, Pack=1)]
struct MyUnion
{
	[FieldOffset(0)] int I;
	[FieldOffset(0)] float F;
	[FieldOffset(4)] bool IsInt;
}
 
MyUnion u;
u.I = 123;
Debug.Log(u.I); // prints: 123
u.F = 3.14f;
Debug.Log(u.I); // prints: 1078523331
Marshal.SizeOf(typeof(MyUnion)); // 8

The bool is still 4 bytes! That’s because Marshal.SizeOf reports the marshaled size, and a bool is marshaled as a 4-byte integer by default. So we need one more trick to slim it down to a single byte: the [MarshalAs] attribute. With this we can specify the unmanaged type we want it to be represented by. The best candidate here is UnmanagedType.I1, which is a 1-byte integer.

[StructLayout(LayoutKind.Explicit, Pack=1)]
struct MyUnion
{
	[FieldOffset(0)] int I;
	[FieldOffset(0)] float F;
	[FieldOffset(4)] [MarshalAs(UnmanagedType.I1)] bool IsInt;
}
 
MyUnion u;
u.I = 123;
Debug.Log(u.I); // prints: 123
u.F = 3.14f;
Debug.Log(u.I); // prints: 1078523331
Marshal.SizeOf(typeof(MyUnion)); // 5
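
To see the bool sizing in isolation, here is a small sketch that measures a bool on its own. DefaultBool and OneByteBool are hypothetical single-field structs introduced only for this measurement, and the program assumes a plain .NET console app rather than Unity.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical single-field structs, used only to measure a bool's marshaled size.
[StructLayout(LayoutKind.Sequential)]
public struct DefaultBool
{
    public bool Value; // marshaled as a 4-byte integer by default
}

[StructLayout(LayoutKind.Sequential)]
public struct OneByteBool
{
    [MarshalAs(UnmanagedType.I1)] public bool Value; // marshaled as 1 byte
}

public static class BoolSizeDemo
{
    public static void Main()
    {
        // The managed size of a bool is a single byte...
        Console.WriteLine(sizeof(bool)); // prints: 1

        // ...but Marshal.SizeOf reports the marshaled (unmanaged) size.
        Console.WriteLine(Marshal.SizeOf(typeof(DefaultBool))); // prints: 4
        Console.WriteLine(Marshal.SizeOf(typeof(OneByteBool))); // prints: 1
    }
}
```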

That’s two ways of laying out the struct so that it acts like a union on the int and float fields but has extra fields and still takes up no unnecessary memory. Neither is really better than the other and both have their own quirks.
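
The tag only helps if something actually checks it. One possible way to do that, sketched here under my own assumptions rather than taken from the article, is to make the overlapping fields private and expose them through properties that consult the tag first. The type name IntOrFloatTagged, the FromInt and FromFloat factory methods, and the choice to throw InvalidOperationException are all invented for this illustration.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit, Pack = 1)]
public struct IntOrFloatTagged
{
    // The int and float overlap at byte 0; the tag sits right after them.
    [FieldOffset(0)] private int i;
    [FieldOffset(0)] private float f;
    [FieldOffset(4)] [MarshalAs(UnmanagedType.I1)] private bool isInt;

    public static IntOrFloatTagged FromInt(int value)
    {
        IntOrFloatTagged u = new IntOrFloatTagged();
        u.i = value;
        u.isInt = true;
        return u;
    }

    public static IntOrFloatTagged FromFloat(float value)
    {
        IntOrFloatTagged u = new IntOrFloatTagged();
        u.f = value;
        u.isInt = false;
        return u;
    }

    public bool IsInt { get { return isInt; } }

    // Each getter refuses to reinterpret the bytes as the wrong type.
    public int I
    {
        get
        {
            if (!isInt) throw new InvalidOperationException("Union holds a float");
            return i;
        }
    }

    public float F
    {
        get
        {
            if (isInt) throw new InvalidOperationException("Union holds an int");
            return f;
        }
    }
}

public static class TaggedDemo
{
    public static void Main()
    {
        IntOrFloatTagged a = IntOrFloatTagged.FromInt(123);
        Console.WriteLine(a.I); // prints: 123

        IntOrFloatTagged b = IntOrFloatTagged.FromFloat(3.14f);
        Console.WriteLine(b.IsInt); // prints: False

        try
        {
            Console.WriteLine(b.I);
        }
        catch (InvalidOperationException e)
        {
            Console.WriteLine(e.Message); // prints: Union holds a float
        }
    }
}
```

The trade-off is a branch on every read. Code that deliberately wants the raw reinterpretation, like the 1078523331 example earlier, would keep the fields public instead.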

That wraps things up for today. If you’ve got any tips or tricks for working with structs, feel free to leave a comment!

#1 by Peter on May 20th, 2017 · Reply

nice explanation, thanks!

Is there a way to use arrays in such structs?
Like to have
byte[100] bA, UInt16[50] wA and UInt32[25] dA all using the same memory, in order to have byte, word and dword representation of the same data?

#2 by jackson on May 20th, 2017 · Reply

I haven’t tried it, but it seems like you might be able to do this with a fixed array field.

#3 by Christian on May 23rd, 2017 · Reply

Hello,

Great explanation. I have tried using it in the following way but Unity throws compilation errors:

  [StructLayout(LayoutKind.Explicit)]
  public struct Union
  {
    public enum Types
    {
      Integer, Boolean, Float, String, Vector3
    }
 
    [FieldOffset(0)]
    private Types Type;
    [FieldOffset(4)]
    private int Integer;
    [FieldOffset(4)]
    private float Float;
    [FieldOffset(4)]
    private bool Boolean;
    [FieldOffset(4)]
    private string String;
    [FieldOffset(4)]
    private Vector3 Vector3;
  }

Would you happen to know what I am doing wrong?

#4 by jackson on May 23rd, 2017 · Reply

Try taking out the string. Unlike the others, it’s a managed type: a class instance. As such, C# is very protective when it comes to you knowing details like its size, contents, and location.

#5 by Evans on July 31st, 2019 · Reply

A small nitpick – In the example following “Let’s try setting Pack=1 to see if we can reduce that bool down to one byte:”, you forgot to specify Pack=1

#6 by jackson on August 1st, 2019 · Reply

Good catch! I’ve updated the article with a fix.

Adding Unions to C#
