The Easy Power of Code Generation
C# generics are weak. The where clause allows you to know a little about the generic (“T”) types you’re given, but that’s just scratching the surface of what you can do with code generation. Today’s article will show you how easy it is to add a little code generation to a project and the power that brings.
Consider the ubiquitous List<T> class. It uses the generic T type parameter to allow you to make a strongly-typed list of anything. Strong typing is great, but the class is hobbled by the C# generics system. How so? Consider a simple operation like the bool Contains(T val) function. It might be implemented like this:
bool Contains(T val)
{
    for (int i = 0; i < this.count; ++i)
    {
        if (this.array[i].Equals(val))
        {
            return true;
        }
    }
    return false;
}
The problem comes with this line:
if (this.array[i].Equals(val))
That Equals call is to this function:
namespace System
{
    public class Object
    {
        public virtual bool Equals(object other)
        {
            // ...
        }
    }
}
Two things should jump out at you at this point. First, this is a virtual function call even though the types are probably known. If you’d written a Vector3List you certainly wouldn’t be making a virtual function call and taking a performance hit. This is a minor point though.
The second issue is that value types need to be “boxed” so that they can be represented as an object. So if you have a List<Vector3> then Contains will need to box every element of the list because Vector3 is a struct. Every time the Vector3 is boxed there is a 28-byte allocation of garbage. So a list with 1024 elements creates 28 KB of garbage in tiny little chunks which may cause fragmentation of managed memory.
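To make the boxing concrete, here is a minimal sketch of the same loop written against a concrete struct. Point3 and BoxingSketch are hypothetical names introduced only for illustration (Point3 stands in for Vector3 so the code compiles without Unity), and the exact allocation size will depend on the runtime.

```csharp
using System;

// Hypothetical stand-in for Vector3 so the sketch compiles without Unity.
public struct Point3
{
    public float X, Y, Z;

    public Point3(float x, float y, float z)
    {
        X = x;
        Y = y;
        Z = z;
    }
}

public static class BoxingSketch
{
    // Same loop as Contains above, written against a concrete struct.
    public static bool ContainsViaObjectEquals(Point3[] array, int count, Point3 val)
    {
        for (int i = 0; i < count; ++i)
        {
            // Point3 declares no Equals of its own, so this binds to the
            // inherited Equals(object). Passing val converts it to object,
            // which is the boxing allocation described above.
            if (array[i].Equals(val))
            {
                return true;
            }
        }
        return false;
    }

    // The same conversion written out explicitly.
    public static bool EqualsBoxed(Point3 a, Point3 b)
    {
        object boxed = b; // heap allocation: a copy of b wrapped in an object
        return a.Equals(boxed);
    }
}
```

The result is correct either way. The cost is the hidden conversion to object on every comparison.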
At this point it’s worth noting that the implementors of List<T> or Contains didn’t do this because they’re bad programmers. I’m not picking on them in particular. The point is that they were forced to call object.Equals because the C# generics system gave them no other choice. Imagine they wrote this code instead:
if (this.array[i] == val)
This would produce a compiler error because the compiler can’t guarantee that all possible types that could be used as T have an == operator available. Likewise, the compiler also doesn’t know that Vector3 happens to have a bool Equals(Vector3) function because the rules of the C# generics system prohibit it from knowing that.
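A small sketch of that compiler behavior follows. Vec2 and OperatorSketch are hypothetical names used only for this illustration; Vec2 stands in for a struct like Vector3 that defines its own == operator.

```csharp
using System;

// Hypothetical struct with its own == operator, standing in for Vector3.
public struct Vec2
{
    public float X, Y;

    public Vec2(float x, float y) { X = x; Y = y; }

    public static bool operator ==(Vec2 a, Vec2 b) { return a.X == b.X && a.Y == b.Y; }
    public static bool operator !=(Vec2 a, Vec2 b) { return !(a == b); }

    // Overridden so the compiler does not warn about == without Equals.
    public override bool Equals(object other) { return other is Vec2 && this == (Vec2)other; }
    public override int GetHashCode() { return X.GetHashCode() ^ Y.GetHashCode(); }
}

public static class OperatorSketch
{
    // Uncommenting this generic version fails to compile with
    // error CS0019: Operator '==' cannot be applied to operands of type 'T' and 'T'
    //
    // public static bool AreEqual<T>(T a, T b) { return a == b; }

    // The non-generic version compiles because the compiler can see Vec2's operator.
    public static bool AreEqual(Vec2 a, Vec2 b)
    {
        return a == b;
    }
}
```

The only difference between the two versions is whether the compiler knows the concrete type, which is exactly the information a code generator can supply.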
Our only real tool to get some insight into the types our generic code is operating on is the where T : ... syntax. That allows us six ways to constrain what kinds of types we’ll accept for T and therefore what kinds of features we can use on a T variable. Unfortunately, none of these allows us to say “T has an == operator that takes another T”.
The best we could do would be to say where T : IEquatable<T> so that we could call bool Equals(T) to prevent the boxing. But this doesn’t solve the problem, not by a long shot. We’d exclude all kinds of valid types like Vector3 that don’t implement IEquatable<T>. In the case of Vector3, we can’t even modify it to make it implement IEquatable<T> since it’s not our source code. And even if we could, we’d have an unnecessary virtual function call for every element of the array.
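For reference, the constrained version might look like the sketch below. ContainsEquatable and EquatableListExtensions are hypothetical names, not part of any library, and the sketch assumes the list holds no null elements.

```csharp
using System;
using System.Collections.Generic;

public static class EquatableListExtensions
{
    // Only types that implement IEquatable<T> are accepted, so the
    // strongly-typed Equals(T) overload is available and no boxing occurs.
    // Assumption: for reference types, the list contains no null elements.
    public static bool ContainsEquatable<T>(this List<T> list, T element)
        where T : IEquatable<T>
    {
        for (int i = 0, count = list.Count; i < count; ++i)
        {
            if (list[i].Equals(element))
            {
                return true;
            }
        }
        return false;
    }
}
```

This works for types such as int and string, which implement IEquatable<T>. Any type that does not implement the interface is rejected at compile time, which is the limitation described above.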
This is just one of many examples of generic code that’s either forced to be watered down or is impossible to write because of the weakness of the C# generics system. Fortunately, our best tool is one that works in all languages, is way more powerful, and is extremely easy to add to a project: code generation.
Code generation is simply code that generates other code. Behind the scenes this is what IL2CPP does with our generics anyhow. It simply makes a ListVector3, ListInt, ListFloat, and so forth for each type we use. The names it actually uses are different, but you can think of generics in IL2CPP as a code generator that you’re already using. To gain more power, simply implement your own code generator!
Let’s look at a simple example of a code generator to solve the above problem. It took me under half an hour to write. That’s a drop in the ocean of a typical project and certainly shouldn’t pose any scheduling problems for any team. All this code generator does is replace some tokens in a “template” C# file. Here’s my template for an extension function version of Contains called ContainsNoAlloc that uses the == operator to avoid creating garbage.
////////////////////////////////////////////////////////////////////////////////
// This file is auto-generated.
// Do not hand modify this file.
// It will be overwritten next time the generator is run.
////////////////////////////////////////////////////////////////////////////////

using System.Collections.Generic;

using TYPENAMESPACE;

namespace INTONAMESPACE
{
    public static class INTOTYPE
    {
        public static bool ContainsNoAlloc(this List<TYPE> list, TYPE element)
        {
            for (int i = 0, count = list.Count; i < count; ++i)
            {
                if (list[i] == element)
                {
                    return true;
                }
            }
            return false;
        }
    }
}
I put the template here: /path/to/my/unity/project/Templates/TYPEListExtensions.cs.
There are four variables that will be replaced:
– TYPE: Name of the T type (e.g. “Vector3”)
– INTOTYPE: Name of the type to generate (e.g. “Vector3ListExtensions”)
– TYPENAMESPACE: Namespace the type is in (e.g. “UnityEngine”)
– INTONAMESPACE: Namespace to generate the file in (e.g. “Extensions”)
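One detail worth noticing about these four names is that TYPE is a substring of both INTOTYPE and TYPENAMESPACE, so the order of replacement matters. The sketch below shows the substitution on its own, without any Unity dependency. TemplateFiller, Fill, and FillNaive are hypothetical names used only for illustration.

```csharp
using System;

public static class TemplateFiller
{
    // Replaces the longer tokens first so that the short TYPE token
    // cannot clobber part of TYPENAMESPACE or INTOTYPE.
    public static string Fill(
        string template,
        string typeName,
        string typeNamespace,
        string intoNamespace,
        string intoType)
    {
        return template
            .Replace("TYPENAMESPACE", typeNamespace)
            .Replace("INTONAMESPACE", intoNamespace)
            .Replace("INTOTYPE", intoType)
            .Replace("TYPE", typeName);
    }

    // Wrong order, shown for contrast: replacing TYPE first turns
    // "TYPENAMESPACE" into e.g. "Vector3NAMESPACE" and "INTOTYPE" into
    // "INTOVector3", so the later replacements no longer find their tokens.
    public static string FillNaive(
        string template,
        string typeName,
        string typeNamespace,
        string intoNamespace,
        string intoType)
    {
        return template
            .Replace("TYPE", typeName)
            .Replace("TYPENAMESPACE", typeNamespace)
            .Replace("INTONAMESPACE", intoNamespace)
            .Replace("INTOTYPE", intoType);
    }
}
```

Calling Fill with “Vector3”, “UnityEngine”, “Extensions”, and “Vector3ListExtensions” turns `using TYPENAMESPACE;` into `using UnityEngine;`, while FillNaive would produce `using Vector3NAMESPACE;`.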
So how does the code generator work? It’s extremely straightforward, almost boring, code to write. You just load the template, replace some strings, and write out the file. I just made a custom “My Project/Generate Code” editor script that I can either click in the GUI or run from the command line during an automated build. It uses the template to make files for Vector2, Vector3, and Vector4. Look how simple it is. It’s mostly comments and string literals!
using System;
using System.IO;

using UnityEditor;
using UnityEngine;

static class GenerateCode
{
    // Generate all code for the project
    [MenuItem("My Project/Generate Code")]
    static void Generate()
    {
        // Generate List<T> extensions
        GenerateSimpleTemplate(
            "TYPEListExtensions",
            typeof(Vector2),
            "Extensions",
            "Vector2ListExtensions"
        );
        GenerateSimpleTemplate(
            "TYPEListExtensions",
            typeof(Vector3),
            "Extensions",
            "Vector3ListExtensions"
        );
        GenerateSimpleTemplate(
            "TYPEListExtensions",
            typeof(Vector4),
            "Extensions",
            "Vector4ListExtensions"
        );
    }

    /// <summary>
    /// Generate for a simple template. This is a template that has one type variable.
    /// </summary>
    ///
    /// <param name="templateName">
    /// Name of the template file without the ".cs" extension.
    /// </param>
    ///
    /// <param name="type">
    /// Type to replace the type variable with: TYPE and TYPENAMESPACE.
    /// </param>
    ///
    /// <param name="intoNamespace">
    /// Namespace to generate the output file into (INTONAMESPACE).
    /// Directories are created as needed.
    /// </param>
    ///
    /// <param name="intoType">
    /// Type to generate (INTOTYPE).
    /// </param>
    static void GenerateSimpleTemplate(
        string templateName,
        Type type,
        string intoNamespace,
        string intoType
    )
    {
        // Read the template from ProjectDir/Templates/{templateName}.cs
        string assetsDirPath = Application.dataPath;
        string projectDirPath = Directory.GetParent(assetsDirPath).FullName;
        string templatesDirPath = Path.Combine(projectDirPath, "Templates");
        string templatePath = Path.Combine(templatesDirPath, templateName) + ".cs";
        string template = File.ReadAllText(templatePath);

        // Replace variables in the template.
        // TYPE goes last because it is a substring of TYPENAMESPACE and INTOTYPE.
        string result = template
            .Replace("TYPENAMESPACE", type.Namespace)
            .Replace("INTONAMESPACE", intoNamespace)
            .Replace("INTOTYPE", intoType)
            .Replace("TYPE", type.Name);

        // Output the result and create directories as necessary
        string[] intoNamespaceParts = intoNamespace.Split('.');
        string intoPath = assetsDirPath;
        for (int i = 0, len = intoNamespaceParts.Length; i < len; ++i)
        {
            string part = intoNamespaceParts[i];
            intoPath = Path.Combine(intoPath, part);
            if (!Directory.Exists(intoPath))
            {
                Directory.CreateDirectory(intoPath);
            }
        }
        intoPath = Path.Combine(intoPath, intoType);
        intoPath += ".cs";
        File.WriteAllText(intoPath, result);

        // Refresh the asset database to show the (potentially) new file
        AssetDatabase.Refresh();
    }
}
When you run it, you’ll get three new (or overwritten) C# files in /path/to/my/unity/project/Assets/Extensions that look like this:
////////////////////////////////////////////////////////////////////////////////
// This file is auto-generated.
// Do not hand modify this file.
// It will be overwritten next time the generator is run.
////////////////////////////////////////////////////////////////////////////////

using System.Collections.Generic;

using UnityEngine;

namespace Extensions
{
    public static class Vector3ListExtensions
    {
        public static bool ContainsNoAlloc(this List<Vector3> list, Vector3 element)
        {
            for (int i = 0, count = list.Count; i < count; ++i)
            {
                if (list[i] == element)
                {
                    return true;
                }
            }
            return false;
        }
    }
}
And now we have a ContainsNoAlloc that creates no garbage and makes no virtual function calls. If we want to use this for more types, simply add some entries to the code generator and run it again.
This is just one example of a code generator. It serves the simple purpose of replacing generics that have one type in them. It’d be trivial to add support for multiple types so you could make a Dictionary<TKey, TValue> replacement. Likewise, you could add support for inserting the generated code into a #region of an existing file so that some templates could generate individual functions. Really, as long as the code you generate compiles, you can generate whatever C# source text you want.
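As a rough sketch of the multi-type idea, the fixed chain of Replace calls could be swapped for a token table. MultiTokenTemplate and the token names KEYTYPE and VALUETYPE are hypothetical, chosen only for this example. Replacing the longest tokens first is one assumed way to keep a short token from clobbering a longer one that contains it.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MultiTokenTemplate
{
    // Replaces every token in the table, longest token name first, so a
    // token that is a substring of another is handled after the longer one.
    // Limitation: a replacement value that itself contains a token name
    // may be replaced again by a later pass.
    public static string Fill(string template, IDictionary<string, string> tokens)
    {
        string result = template;
        foreach (KeyValuePair<string, string> pair in tokens.OrderByDescending(p => p.Key.Length))
        {
            result = result.Replace(pair.Key, pair.Value);
        }
        return result;
    }
}
```

A two-type template would then pass entries such as KEYTYPE and VALUETYPE alongside INTOTYPE, and the rest of the generator could stay as it is.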
Code generators are a great tool to free you from the weaknesses of C#, especially the generics system. You can use them to eliminate code duplication while not limiting yourself with the highly-restrictive set of rules imposed by C#. You can add a simple code generator to your project in next to no time or you can use a sophisticated template engine with just a bit more work. Whichever you choose, you’ll have a powerful new tool that you can use to write generic code.
#1 by Todd Ogrin on June 5th, 2017 ·
Great article! It’s also worth mentioning that, much like your generics use case, code generation can overcome a lot of the problems posed by using reflection (i.e. speed).
#2 by jackson on June 5th, 2017 ·
Yes, that’s a great use for a code generator. The more you think about it the more applications you come up with. One popular use is to pre-compute a bunch of data and generate a big lookup table.
#3 by Pierre Korsowski on June 21st, 2017 ·
Simple text-replace-templates are a good start, though I ran relatively early into problems, where replacing things is too simple for my requirements.
The next step would then be either
– the template engines you linked
– Visual Studio T4 template engine (https://docs.microsoft.com/en-us/visualstudio/modeling/code-generation-and-t4-text-templates, if you have access to VS) or
– using CodeDOM (https://docs.microsoft.com/en-us/dotnet/framework/reflection-and-codedom/using-the-codedom)
I prefer CodeDOM, since you can write type-safe code generators which have access to the Unity runtime and all other .NET libraries.
#4 by jackson on June 21st, 2017 ·
Thanks for the links! CodeDOM does indeed look like an interesting alternative to template engines and simple text replacement. It’s definitely a different model though, but it may suit some developers (like you) more than a template language.
#5 by misha on May 8th, 2018 ·
crazy, i actually do this exact technique but with the asset post processor. when a c# file is created in unity, my script will fill in the unity c# template with a template i wrote (i.e., inject namespacing and the like).
#6 by Stephen Hodgson on November 12th, 2018 ·
I like to edit Unity’s code templates and insert my own, lol.
I do this a lot with many different projects.
The asset post processor did the rest for me.