Handling Internal Errors
Some errors can be handled and some cannot. Nevertheless, it’s extremely common to see codebases chock-full of ineffective error handling for these unrecoverable issues. The result is a lot of extra code to write, maintain, and test that often serves to make debugging harder. Today’s article shows you how to make debugging internal errors so much easier by effectively writing code to handle them.
Scope
First things first, let’s be clear about what we’re discussing and not discussing in this article.
We’re not talking about handling errors like “file not found,” “socket closed,” or “incorrect password.” These are expected situations that occur due to external forces such as corrupt hard drives, unplugged network cables, and typos. These kinds of errors are well-defined and we can easily write code that handles them.
What we are talking about in this article are errors like “index out of bounds,” “null reference,” or “division by zero.” These scenarios occur due to internal factors within the code itself and writing code to handle them is nigh impossible.
The Problem
The problem is that the caller of a function called it with incorrect parameters. Normally these are formal parameters, but they may also be implicit parameters such as the values of static variables. Functions are frequently written to detect a subset of the incorrect parameters. For example, it’s extremely common to see checks for null references:
int CalculateScore(Player p)
{
    if (p == null)
    {
        // p is not allowed to be null, but it is
        // Do something here in response to this
    }
}
These are typically the easy checks, such as comparing with null. The harder checks are often omitted entirely. These include checking whether the reference is the logically correct one, whether an IDisposable object has been disposed, or whether a pointer references freed memory. Adding these kinds of checks takes too much code to write, reason about, and execute, so they’re usually left out.
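To give a feel for what one of the harder checks costs, here is a minimal sketch of a disposed check. The ScoreTable type and its members are hypothetical and exist only for illustration; the point is that every public method would need the same guard, which is part of why such checks tend to be skipped.

```csharp
using System;

// Hypothetical resource type, used only to illustrate a "disposed" check.
public sealed class ScoreTable : IDisposable
{
    private readonly int[] scores = { 10, 20, 30 };
    private bool disposed;

    public void Dispose()
    {
        disposed = true;
    }

    public int GetScore(int index)
    {
        // This guard has to be repeated in every public method of the type.
        if (disposed)
        {
            throw new ObjectDisposedException(nameof(ScoreTable));
        }
        return scores[index];
    }
}
```

Note that this only covers one kind of misuse. Checking whether the caller passed the logically correct table would need even more bookkeeping.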
Handling These Errors
What can you do as the writer of the function to handle these internal errors? The usual error-handling strategies are all available:
- Return an error code such as -1 or null
- Return a Nullable<T>/T? or Maybe<T>
- Throw an exception
- Terminate the app
- Break the debugger
- Write to a log file
- Upload to a data analytics service (e.g. Splunk)
- Store a crash dump
- Upload the crash dump to a server (e.g. Apteligent)
These are all signals that something has gone wrong. It's important to remember that none of these strategies can actually fix the problem. All they can do is signal that an error has been made. Even if the calling code receives and processes the signal, it's highly unlikely that it'll be able to fix the problem. If the caller knew how to properly call the function, they would have done so the first time. Calling again with different parameters in a sort of guess-and-check programming is not a valid strategy.
Solution Requirements
So if these signals don't fix the problem and don't cause the caller to fix the problem, what good are they? They actually serve an important purpose: signalling the programmer to let them know that their code has a bug and needs to be fixed. With this audience in mind, let's think about what they (we!) want so we can choose a good signalling strategy.
There are two main desires that programmers have when debugging an error. First, we'd like for the debug report to come with a lot of specific information. We want to know the values of variables, the call stack, how many threads were running, how much memory was in use, and so forth.
Second, we want this information to be presented to us at the time of the problem. We don't want an error to occur, the program to continue along for a while, and then for a debug report to be generated, because by then all of that specific information describes a different program state.
Sometimes there is a third desire. We'd like for the error-handling code to be removed in release builds of the game so that we don't incur the CPU overhead. This will allow errors to go uncaught, but that is a worthy tradeoff in many cases.
Strategy Analysis
With the solution requirements in mind, let's revisit our error-handling strategies and think how well they satisfy the requirements.
If we return a Nullable<T>, Maybe<T>, or error code, we run the risk of the caller ignoring it. With error codes, the value may even be used as if it were a valid result, which would likely cause further errors. These strategies therefore don't satisfy the need for timely information.
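As a small illustration of an error code being used as a regular value, consider the following sketch. All of the names here (GetBonus, TotalScore, and their parameters) are hypothetical and chosen just for this example.

```csharp
using System;

public static class ErrorCodeExample
{
    // Hypothetical lookup: returns -1 as an error code when the name is unknown.
    public static int GetBonus(string[] names, int[] bonuses, string name)
    {
        int index = Array.IndexOf(names, name);
        return index < 0 ? -1 : bonuses[index];
    }

    public static int TotalScore(int baseScore, string[] names, int[] bonuses, string name)
    {
        // Bug: the caller never checks for -1, so the error code is
        // silently added to the score as if it were a real bonus.
        return baseScore + GetBonus(names, bonuses, name);
    }
}
```

Nothing fails at the point of the mistake. The score is simply off by one, and whoever notices it later has no call stack or variable values pointing back to the unknown name.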
Throwing an exception might also result in a delay, even indefinitely, in reporting the error if the exception is caught. If the code never has any catch blocks and has proper Unity configuration (e.g. fast but no exceptions for iOS), then exceptions may be useful as they will either crash the program or cause the debugger to break. Both are full of rich information, so this is a good option as long as catch is avoided.
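To show how a catch block can delay the report indefinitely, here is a minimal sketch. The SafeScore wrapper and the array-based CalculateScore signature are hypothetical stand-ins, not code from the game.

```csharp
using System;

public static class SwallowedExceptionExample
{
    public static int CalculateScore(int[] scores, int index)
    {
        if (scores == null)
        {
            // The internal error is signalled here...
            throw new ArgumentNullException(nameof(scores));
        }
        return scores[index];
    }

    public static int SafeScore(int[] scores, int index)
    {
        try
        {
            return CalculateScore(scores, index);
        }
        catch (Exception)
        {
            // ...and lost here. The caller sees 0 and carries on.
            return 0;
        }
    }
}
```

With the broad catch in place there is no crash, no debugger break, and no record that the bug happened.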
Terminating the app with Application.Quit or EditorApplication.Exit will definitely make the problem apparent right away, but won't provide the specific information we need as there will be no crash dump or debugger breaking.
Explicitly breaking the debugger (e.g. with Debugger.Break) works well, but only if a debugger is attached. That may not always be the case during development and definitely isn't the case for production builds. If the debugger doesn't break then this is essentially a no-op, so execution will continue and we won't get any information to analyze.
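One way to make that limitation visible in code is to check Debugger.IsAttached before breaking, so the caller knows whether it needs a fallback signal. This is just a sketch; the BreakExample class and TryBreak method are hypothetical names.

```csharp
using System.Diagnostics;

public static class BreakExample
{
    // Returns true if a debugger was attached and a break was requested.
    // Returns false when no debugger is attached, so the caller can fall
    // back to another signal such as logging or throwing.
    public static bool TryBreak()
    {
        if (Debugger.IsAttached)
        {
            Debugger.Break();
            return true;
        }
        return false;
    }
}
```

A false return value is the case where breaking alone would have told us nothing.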
Writing to a log file and uploading data to an analytics service are common and can provide some information for later analysis. However, this information is often limited by the burden to explicitly type out and transmit all of the data necessary for later analysis. Compared to the treasure trove available in a debugger or crash dump, this option typically includes just the values of a few variables plus a hand-written message. Still, it is available in all builds and may provide helpful clues during debugging.
Finally, crashing the program to generate and possibly upload a crash dump produces quite a lot of information at exactly the time of the problem. It is, however, harder to analyze it during development than simply breaking into the debugger where a uniform view of the code, call stack, and memory is all presented.
Crafting a Solution
It's clear from the above analysis that breaking into the debugger when attached provides the most specific, timely, user-friendly information. When a debugger isn't attached, a combination of error logging and producing a crash dump is preferable. So what would it take to signal errors in this way? Not much!
int CalculateScore(Player p)
{
    if (p == null)
    {
        // Break the debugger
        // Good for when a programmer is running the game
        Debugger.Break();

        // Write a log message
        // Good for human-readable hints in debug and release builds
        Debug.LogError("Parameter can't be null");

        // Crash the app (hopefully there are no 'catch' blocks)
        // Good for generating crash dumps in debug and release builds
        throw new Exception("Parameter can't be null");
    }
}
This is a lot to type, but it's easy to encapsulate into an assertion-style function:
[Conditional("SUPER_ASSERTIONS_ENABLED")]
void SuperAssert(bool truth, string message)
{
#if SUPER_ASSERTIONS_ENABLED
    if (!truth)
    {
        Debugger.Break();
        Debug.LogError(message);
        throw new Exception(message);
    }
#endif
}
Then this can be used like so:
int CalculateScore(Player p)
{
    SuperAssert(p != null, "Parameter can't be null");
}
To signal errors even when assertions aren't enabled, simply make a version of the function that doesn't have the [Conditional] attribute or #if test:
void SuperAssertRelease(bool truth, string message)
{
    if (!truth)
    {
        Debugger.Break();
        Debug.LogError(message);
        throw new Exception(message);
    }
}
Then call it like this:
int CalculateScore(Player p)
{
    SuperAssertRelease(p != null, "Parameter can't be null");
}
Feel free to adjust these functions to add uploading to crash or analytics services or utilize a different logging library.
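As one possible adjustment, the reporting step could go through a replaceable delegate so that a different logger or an upload call can be plugged in. The sketch below is an assumption-laden variation, not the functions above: ReportingAssertions, Report, and SuperAssertWithReport are hypothetical names, it writes to Console.Error instead of Unity's log so that it runs outside the engine, and it only breaks when a debugger is attached.

```csharp
using System;
using System.Diagnostics;

public static class ReportingAssertions
{
    // Hypothetical hook: replace this with a delegate that writes to your
    // logging library or queues an upload to a crash or analytics service.
    public static Action<string> Report = message => Console.Error.WriteLine(message);

    public static void SuperAssertWithReport(bool truth, string message)
    {
        if (!truth)
        {
            // Break only when a debugger is actually attached.
            if (Debugger.IsAttached)
            {
                Debugger.Break();
            }

            // Send the message wherever the hook points.
            Report(message);

            // Crash the app (assuming no 'catch' blocks intercept this).
            throw new Exception(message);
        }
    }
}
```

Swapping the destination then becomes a one-line assignment at startup, such as ReportingAssertions.Report = message => myLogger.Error(message); where myLogger is whatever logging object your project already uses.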
Conclusion
Today we've seen how to handle internal errors in our code. Approaches like breaking the debugger are extremely useful while others like returning error codes are extremely unhelpful. With a trivial amount of work, we can write a function or two that gives us the exact behavior we want when we detect these kinds of errors and have it stripped out of our code automatically in release builds.
#1 by PZB on December 28th, 2018 ·
There is a problem bothering me: maintaining both the SuperAssert and SuperAssertRelease functions means I need to decide whether to call one or both of them in my code, which adds a lot of complexity.
I tried an approach like this. What do you think?
But I can’t figure out a good way for release builds to report their bugs.
#2 by jackson on December 29th, 2018 ·
You can certainly have a combined function like that if you want all your assertions to either be enabled or disabled depending on the build type. If you want only some of them enabled or disabled, then the approach in the article of having two functions is a good way to go. The choice really depends on the behavior you’re looking to achieve.