JacksonDunstan.com

Bit fields give us the ability to use less than one byte to store an integer. This can be really useful to fit more data into a CPU cache or to reduce memory usage in general. Unfortunately, bit fields aren’t built into C#. Today we’ll work around that by creating our own!

The Goal

We want to store integers in less than one byte, but the smallest type in C# is byte so we seemingly can’t do that. Bit fields allow us to break that rule and treat runs of bits in a type like byte as a nested integer. For example, we could create a bit field like this to store information about a player:

Bit Index	Meaning
0	Is alive?
1	Lives
2	Lives
3	Lives
4	Team
5	Team
6	Weapon ID
7	Weapon ID

Here we’ve packed four variables into one byte! The downside is that these variables now need to be very small. “Is alive?” is only one bit, but that’s OK because it’s a boolean. “Lives” takes up three bits, so it can be 0-7. Maybe that’s OK for our particular game. “Team” and “Weapon ID” are both two bits and can be 0-3, so we can have only four teams and four types of weapons. That could be OK for some games, but a ushort, uint, or ulong may be needed for others.

Now consider the more straightforward approach of defining a struct with individual fields:

public struct Player
{
    public bool IsAlive;
    public byte Lives;
    public byte Team;
    public byte WeaponID;
}

sizeof(Player) tells us this takes up 4 bytes: 4x larger than our byte-based bit field!

Now imagine we could define a struct with just a byte field:

public struct Player
{
   public byte BitField;
 
   public bool IsAlive
   {
      get { return BitManip.Get0(BitField) != 0; }
      set { BitField = BitManip.Set0(BitField, (byte)(value ? 1 : 0)); }
   }
 
   public byte Lives
   {
      get { return BitManip.Get1to3(BitField); }
      set { BitField = BitManip.Set1to3(BitField, value); }
   }
 
   public byte Team
   {
      get { return BitManip.Get4to5(BitField); }
      set { BitField = BitManip.Set4to5(BitField, value); }
   }
 
   public byte WeaponID
   {
      get { return BitManip.Get6to7(BitField); }
      set { BitField = BitManip.Set6to7(BitField, value); }
   }
}

BitField takes the place of all the fields but properties allow just as convenient access to them. All the code using Player is the same as before since it’s API is essentially the same.

Bit Runs

All we need now is the BitManip static class full of functions to get runs of bits. Writing these functions is quite simple. For example, here are the functions for the 4-5 run:

public static byte Get4to5(byte src)
{
    return (byte)((src & 0b00110000) >> 4);
}
 
public static byte Set4to5(byte src, byte val)
{
    return (byte)((src & 0b11001111) | ((val & 0b00000011) << 4));
}

For the “get” function, we mask out just the bits we want and then shift them to the least-significant bits so they make sense. For the “set” function, we mask out all the bits to not set and take the binary OR of that with the bits to set, which are masked for safety and shifted into place.

Code Generation

Writing these functions quickly becomes tedious. Considering we have to write every possible run for byte, ushort, uint, and ulong, we’re looking at more functions than is feasible to write by hand. So let’s turn to code generation to generate all the runs of all the bit lengths for all the types. Here is a small JavaScript program that outputs the class:

// Code generator for the following class
console.log(`/// <summary>`);
console.log(`/// Functions for reading and writing runs of bits from a byte.`);
console.log(`/// This class is auto-generated. Don't modify by hand.`);
console.log(`/// </summary>`);
console.log(`///`);
console.log(`/// <author>`);
console.log(`/// Jackson Dunstan, https://jacksondunstan.com/articles/5393`);
console.log(`/// </author>`);
console.log(`///`);
console.log(`/// <license>`);
console.log(`/// MIT`);
console.log(`/// </license>`);
console.log(`public static class BitManip`);
console.log(`{`);
function getTypeName(size) {
    if (size <= 8) {
        return 'byte';
    }
    if (size <= 16) {
        return 'ushort';
    }
    if (size <= 32) {
        return 'uint';
    }
    return 'ulong';
}
for (let size = 8; size <= 64; size *= 2) {
    const name = getTypeName(size);
    for (let numBits = 1; numBits <= (size - 1); ++numBits) {
        const getRetType = getTypeName(numBits);
        console.log(`    // ${numBits} bit${numBits > 1 ? "s" : ""} of a ${name}`);
        console.log(``);
        const remainBits = size - numBits;
        for (let firstBitIndex = 0; firstBitIndex <= remainBits; ++firstBitIndex) {
            const lastBitIndex = firstBitIndex + numBits;
            let keepBitsMask = '';
            let getMask = '';
            for (let i = 0; i < firstBitIndex; ++i) {
                keepBitsMask += '1';
                getMask += '0';
            }
            for (let i = firstBitIndex; i < lastBitIndex; ++i) {
                keepBitsMask += '0';
                getMask += '1';
            }
            for (let i = lastBitIndex; i < size; ++i) {
                keepBitsMask += '1';
                getMask += '0';
            }
            keepBitsMask = '0b' + keepBitsMask.split('').reverse().join('');
            getMask = '0b' + getMask.split('').reverse().join('');
 
            let takeBitsMask = '0b';
            for (let i = 0; i < remainBits; ++i) {
                takeBitsMask += '0';
            }
            for (let i = 0; i < numBits; ++i) {
                takeBitsMask += '1';
            }
 
            const shift = firstBitIndex;
 
            let getSetName = firstBitIndex;
            if (lastBitIndex > firstBitIndex + 1) {
                getSetName += 'to' + (lastBitIndex-1);
            }
 
            console.log(`    public static ${getRetType} Get${getSetName}(${name} src)`);
            console.log(`    {`);
            console.log(`       return (${getRetType})((src & ${getMask}) >> ${shift});`);
            console.log(`    }`);
            console.log(`    `);
            console.log(`    public static ${name} Set${getSetName}(${name} src, ${name} val)`);
            console.log(`    {`);
            console.log(`       return (${name})((src & ${keepBitsMask}) | ((val & ${takeBitsMask}) << ${shift}));`);
            console.log(`    }`);
            console.log(`    `);
        }
    }
}
console.log(`}`);

This can be pasted into a web browser or, much more conveniently, run on the command line with node generate-bitmanip-cs.js. The result is a 28008 line C# file that we can add to our Unity project.

Conclusion

By combining a little bit manipulation with some code generation, we now have the ability to create bit fields in C#! By shrinking our data, we can use less of a CPU cache line and less memory or file size in general. We need to keep our values small, pay a little bit of CPU to perform these bit operations, need to generate a large C# file, and need to write some properties to access the bits, but overall the costs are pretty low. Consider bit fields as another tool in our toolbox, ready to be employed when the benefits outweigh the costs.

#1 by Liam on October 21st, 2019 · Reply

I remember doing similar in AS3 many moons ago :)

#2 by Joshy on October 23rd, 2019 · Reply

Hi Jackson,

Can’t we do the same using byte fields and bitwise operation in raw C# using some lines of code than using 28008 lines of code?

#3 by jackson on October 23rd, 2019 · Reply

You don’t have to use any of the functions in the generated BitManip.cs file, but they make implementing bit fields easier and less prone to errors. If you’d rather hand type in the bit manipulation for each of the properties of your bit field struct, it’ll work just as well.

#4 by CodeMolester on December 2nd, 2019 · Reply

I would like to warn anyone reading this article against bit stuffing unless you are facing real, justified memory constraints.

You will be able to fit more data into cache, however, due to how x86 and x64 CPUs operate low level, manipulating bitfields requires more instructions, inflating the instruction cache.

The company I’m at right now has experimented with bit stuffing some time ago and it quickly became clear that its’ benefits are somewhat obsolete. Memory is huge these days, even on mobile devices, so you rarely actually need to care about it, and all of our tests indicate that performance suffers if you work with these stuffed bitfields.

The performance of one read is ofcourse insignificant, but then again you are making your code more complicated to work with and not getting much value in return. I could only think of one scenario where a platform that supports C# can benefit from this approach and it is networking.

Hopefully this doesn’t come across as me being a smartass.

#5 by jackson on December 2nd, 2019 · Reply

You’re definitely right that it requires more instructions to read from and write to the fields of a bit field than a normal struct. Anyone considering using one should weigh that trade-off as part of their decision. I mentioned as this in the conclusion of the article.

As for memory usage, it’s true that RAM and disk have both grown a lot. However, if you’re looking to push the limits then there’s often a reward to better data packing. Bit fields might help those efforts. Bandwidth, on the other hand, is where they really shine these days. You can get more bit fields into the same amount of throughput compared to regular structs. That’s great for RAM, networking, and disk loads.

In the real world we have very specific problems that may or may not be helped by bit fields. They’re definitely not a general-purpose solution to be used for everything, but they can be a handy tool in some cases.
#6 by Laireon Games on April 10th, 2020 · Reply

I am working on a voxel based project which are currently really popular hobby projects and memory is a HUGE concern for everything we do.

Our chunks are 16*128 *16
We have 16*16 chunks in a region and I run a region per job.

This is 8 million bytes for Each piece of information we need (Block type, scale, rotation, state etc etc).

We can easily have 9 regions loaded at time in a scene (thankfully we have a cached system so only important chunks are in memory otherwise it would fall over!).

A saving like this very quickly turns eating GB of RAM into more sane amounts.

Also don’t forget Unity runs on C#…

Adding Bit Fields to C#

The Goal

Bit Runs

Code Generation

Conclusion

Comments