JacksonDunstan.com

Global variables are bad for a variety of reasons. Chief among them is that you can’t just look at one part of the code in isolation because it may be affected by a global variable that’s being used elsewhere. The problem actually exists as a spectrum where global variables are the worst and local variables are the best. In between are all kinds of variables that make up the program’s “shared state”. Today’s article discusses that part and shows just how easy it is to inadvertently introduce it!

Consider the best-case scenario for a variable: a local. It only exists during one call of the function and can’t be read or written by anyone else unless we pass it to them. Even with local variables it’s a good idea to limit their scope to just the part of the function where they’re needed. Here are a few ways to do that in C#:

void Foo()
{
	// Extra curly braces. The 'a' variable is only visible inside them.
	{
		int a = 2;
		Debug.Log("a is " + a);
	}
 
	// A 'using' block with an 'IDisposable' object ('File' and 'FileMode' live in System.IO)
	using (var f = File.Open("/path/to/file", FileMode.Open))
	{
		// 'f' can only be used here
	}
 
	// 'for' and 'foreach' loops
	for (var i = 0; i < 10; ++i)
	{
		Debug.Log("i is " + i);
	}
}

All of these language features exist for a good reason. Limiting the scope of even a local variable helps us reason about its usage. When we’re looking at i we only need to look at the for loop to figure out how it’s read and written. Likewise, we only need to look at the using block to figure out f and the extra curly braces to figure out a. There’s no need to look at the whole function.

Moving up one level we arrive at classes and structs. A field or property of a class or struct is a variable that’s scoped to all the functions in that class in the best case. That’s only true when the variable is private and its scope grows even further with other access specifiers like protected and public. Even as a private variable, you now need to look at all the functions in the class to figure out how the variable is used. Since the variable sticks around between function calls, you also need to think about the object’s lifespan and the order in which the functions are called.
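To make the "order in which the functions are called" point concrete, here's a minimal sketch in plain C#. The Accumulator class and the FieldScopeDemo helper are hypothetical names invented for this illustration, not code from a real project.

```csharp
using System;

// Hypothetical class for illustration only.
public class Accumulator
{
	// Private, yet every method below can read or write it,
	// and its value survives between calls.
	private int total;

	public void Add(int amount) { total += amount; }
	public void Reset() { total = 0; }
	public int Report() { return total; }
}

public static class FieldScopeDemo
{
	// Add, then Reset: the field is cleared before it is reported.
	public static int AddThenReset()
	{
		var acc = new Accumulator();
		acc.Add(5);
		acc.Reset();
		return acc.Report(); // returns 0
	}

	// Reset, then Add: the same calls in a different order.
	public static int ResetThenAdd()
	{
		var acc = new Accumulator();
		acc.Reset();
		acc.Add(5);
		return acc.Report(); // returns 5
	}
}
```

Neither method is wrong on its own. You just can't predict what Report returns without knowing every call that touched total before it, which was never a concern with a local.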

Let’s pause and think about the magnitude of raising a variable up to this level. We’ve just taken a huge jump from a local variable to a field variable, but programmers make this jump all the time. There’s often not enough thought put into this transition from a relatively-benign local variable to a variable that is now shared among many functions, many invocations of those functions and, with non-private variables, across classes and even files. When you think about it, the complexity being introduced by this simple change is quite large!

This isn’t just theoretical, either. Almost every class has at least one field and usually many. Even worse is that these variables are usually not simple types like int or string. Instead, they’re references to instances of other classes. Here’s where things get really interesting.

When a class has a field that’s a reference to an instance of another class, it effectively has all of that class’ fields inside it. After all, the class holds the reference for a reason: to use the other object, and using it usually means changing its state. The only difference is that it changes those fields indirectly. Let’s look at a simple weapon to see this in practice:

public class Player
{
	public Player(int health)
	{
		Health = health;
	}
 
	public int Health { get; private set; }
 
	public void TakeDamage(int amount)
	{
		Health -= amount;
	}
}
 
public class LaserCannon
{
	public Player Target { get; set; }
 
	private float startChargingTime;
 
	public void StartCharging()
	{
		startChargingTime = Time.time;
	}
 
	public void Fire()
	{
		var timeCharged = Time.time - startChargingTime;
		// Cast to int because TakeDamage takes an int (any fraction is truncated)
		var damage = (int)(timeCharged * 100);
		Target.TakeDamage(damage);
	}
}

Simple! Now you see code like this:

laserCannon.Fire();

And you wonder “what did that do?” and “who did that hit?”. There’s no Player passed to Fire, so you need to go searching for the code that set the Target property. But you don’t even know that you should do that until you read the Fire function to figure out how it figures out what Player to damage.

This is a silly, small example where you can easily guess what’s going to happen. You’ve also just read the code, so it’s fresh in your mind. Imagine if it were much more complex and you hadn’t seen it in a few months. You might not realize that laserCannon.Fire() changes Player.Health. You might not realize that the LaserCannon class is essentially an “owner” of Player.
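For contrast, here's a rough sketch of what the call site could look like if the Player were passed to Fire directly. This is just one possible variation for illustration, not a recommendation from this article. ExplicitLaserCannon, ExplicitTargetDemo, and the now parameter (standing in for Unity's Time.time so the sketch runs outside Unity) are all assumptions I've made up here.

```csharp
using System;

public class Player
{
	public Player(int health) { Health = health; }
	public int Health { get; private set; }
	public void TakeDamage(int amount) { Health -= amount; }
}

// Hypothetical variant of LaserCannon with no Target property.
public class ExplicitLaserCannon
{
	private float startChargingTime;

	// 'now' is an assumed stand-in for Unity's Time.time.
	public void StartCharging(float now) { startChargingTime = now; }

	// The Player being damaged is named at every call site.
	public void Fire(Player target, float now)
	{
		var timeCharged = now - startChargingTime;
		var damage = (int)(timeCharged * 100);
		target.TakeDamage(damage);
	}
}

public static class ExplicitTargetDemo
{
	public static int Run()
	{
		var player = new Player(1000);
		var cannon = new ExplicitLaserCannon();
		cannon.StartCharging(0.0f);
		cannon.Fire(player, 2.0f); // charged for 2 seconds: 200 damage
		return player.Health; // returns 800
	}
}
```

Reading cannon.Fire(player, 2.0f) answers "who did that hit?" without opening the class. Note that startChargingTime is still a field, so the call-order concern from earlier hasn't gone anywhere.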

This relationship where LaserCannon “owns” Player isn’t very obvious to many programmers, even those skilled in object-oriented design. It’s true that Player ultimately decides how its Health changes, but the variable is still being changed by an outside source. If you imagine the classes of your code and their references to instances of other classes, you might see something like a tree:

Original Class Dependency Graph

Notice that each class is only referenced by one other class. D, E, F, and G own their variables directly. B owns D and E, C owns F and G, and A in turn owns B and C. So A is essentially the owner of the whole tree as it can change any class indirectly via B and C.

Let’s say you introduce a new class S and you have D reference it:

Class Dependency Graph with No Problem

This follows the “one reference” rule because only D references S. I’ve highlighted the chain of ownership from S up to A to illustrate the effects of adding S in this way.

Now let’s break the rule and add S in a way that it’s referenced by D and E:

Class Dependency Graph with Medium Problem

S now has two owners: D and E. They, in turn, are owned by B and B is owned by A. The problem here is with the dual ownership. D and E can now fight over the state of S. You need to read D in order to understand E and vice versa. It’s just like the global variable problem, except that it’s limited in scope to D and E primarily and B and A indirectly.
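Here's a small sketch of that fight in plain C#. The class names mirror the diagram, but every member (Value, Prepare, Compute, Clear) and the DualOwnershipDemo helper are invented purely for illustration.

```csharp
using System;

// S holds the shared state.
public class S
{
	public int Value { get; set; }
}

public class D
{
	private readonly S s;
	public D(S s) { this.s = s; }

	public void Prepare() { s.Value = 10; }
	public int Compute() { return s.Value * 2; }
}

public class E
{
	private readonly S s;
	public E(S s) { this.s = s; }

	// E has its own reasons to change S and knows nothing about D.
	public void Clear() { s.Value = 0; }
}

public static class DualOwnershipDemo
{
	// Returns what D computes, optionally letting E run in between.
	public static int Run(bool callE)
	{
		var shared = new S();
		var d = new D(shared);
		var e = new E(shared);

		d.Prepare();
		if (callE)
		{
			e.Clear();
		}
		return d.Compute();
	}
}
```

Run(false) returns 20 and Run(true) returns 0. Nothing inside D changed between the two cases, so the only way to explain the difference is to go read E.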

What if we introduced S so that E and F had the references to it?

Class Dependency Graph with Big Problem

The difference here is that E and F also have different owners: B and C. This means that the problem isn’t contained by B anymore. It spreads the problem so that now B, C, E, and F all affect each other. A mere two references to one class was enough to make it so you need to read four classes to understand any one of them. Without graphing it out like I’ve done you might not even realize which other classes you need to read!

Now for the worst-case scenario, short of a true global (public static) variable. Let’s have all the bottom classes reference S:

Class Dependency Graph with Global

This is less likely to happen, but it shows the extreme case that you might run into with a widely used class such as a Logger. In this case the variables in S are effectively global variables. They’re shared with every class in the system, so you need to read every class to understand how they’re used.
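As a rough sketch of the Logger case, assume a logger with a single piece of mutable state that gets handed to every class. The Logger below is hypothetical (it is not Unity's or .NET's logger), and the Physics, Audio, and SharedLoggerDemo names are likewise made up for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical logger with one mutable setting.
public class Logger
{
	public bool Enabled { get; set; } = true;
	public List<string> Lines { get; } = new List<string>();

	public void Log(string message)
	{
		if (Enabled)
		{
			Lines.Add(message);
		}
	}
}

public class Physics
{
	private readonly Logger logger;
	public Physics(Logger logger) { this.logger = logger; }

	public void Step() { logger.Log("physics step"); }
}

public class Audio
{
	private readonly Logger logger;
	public Audio(Logger logger) { this.logger = logger; }

	// Turns logging off for its own reasons, which turns it off for everyone.
	public void Play() { logger.Enabled = false; }
}

public static class SharedLoggerDemo
{
	public static int Run()
	{
		var logger = new Logger();
		var physics = new Physics(logger);
		var audio = new Audio(logger);

		physics.Step(); // logged
		audio.Play();   // disables the shared logger
		physics.Step(); // silently dropped
		return logger.Lines.Count; // returns 1
	}
}
```

No static keyword appears anywhere, yet Enabled behaves just like a global: to find out why Physics stopped logging you'd have to search every class that holds the Logger.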

Of course real apps are much more complex than this simple system. You’ll have hundreds of classes, probably with references at various levels of the hierarchy. Most apps won’t even have a strict hierarchy. They’re usually more like a graph where there’s no real “up” or “down” and arrows point in no strictly-ordered direction:

Class Dependency Graph

They’re not technically global variables, but in most object-oriented designs they effectively are.

Inadvertent “Global” Variables
