Say you need to keep track of things you’ve already done, perhaps to avoid doing them again. What’s the fastest way to do that? HashSet<T> seems like a natural fit, so you might choose that without a second thought. But is it faster than similar collections like Hashtable and Dictionary<TKey, TValue>? Today’s article puts all three to the test to see which one can insert elements, check for containment, and remove elements the quickest. Read on for the surprising results!

HashSet<T> is, like other sets, a collection of unique values of type T. Internally it uses a hash to quickly add values to the set, check if values are contained, and remove elements from the set. It should be the go-to data structure when you need to do containment checks only. That’s different than the next two classes.

Hashtable is a one-to-many map of any type of key to any type of value. There’s no generic T type here, just plain object keys and values. Internally it also uses a hash for add, contains, and remove operations. If you have varying types of keys and values, it seems like the obvious choice for a key-value map.

Dictionary<TKey, TValue> is a one-to-many map of TKey keys to TValue values. It’s the strongly-typed generic version of Hashtable. Since all the keys and all the values in a map are almost always the same type as each other, this class is much more common than Hashtable in C# code. You get strong typing and support for value types (e.g. int) without the need for boxing.

Today’s test doesn’t need a map, just containment checks. Maps can do that too, so the test adds on a dummy value to the Hashtable and Dictionary. That means the three classes are used like this:

  • HashSet<T>T is a MyObject class
  • Hashtable — map of MyObject keys to a single dummy MyObject value
  • Dictionary<TKey, TValue>TKey is a MyObject, TValue is a dummy bool

Now let’s add, check containment, and remove to each collection a million times:

using System;
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
 
class TestScript : MonoBehaviour
{
	class MyObject
	{
	}
 
	string report = "";
 
	void Start()
	{
		// Setup
 
		System.Diagnostics.Stopwatch stopwatch = new System.Diagnostics.Stopwatch();
		Hashtable hashtable = new Hashtable();
		HashSet<MyObject> hashSet = new HashSet<MyObject>();
		Dictionary<MyObject, bool> dictionary = new Dictionary<MyObject, bool>();
		MyObject dummyValue = new MyObject();
		const int size = 1000000;
		MyObject[] myObjects = new MyObject[size];
		for (int i = 0; i < size; ++i)
		{
			myObjects[i] = new MyObject();
		}
 
		// Add
 
		stopwatch.Reset();
		stopwatch.Start();
		for (int i = 0; i < size; ++i)
		{
			hashtable.Add(myObjects[i], dummyValue);
		}
		long hashtableAddTime = stopwatch.ElapsedMilliseconds;
 
		stopwatch.Reset();
		stopwatch.Start();
		for (int i = 0; i < size; ++i)
		{
			hashSet.Add(myObjects[i]);
		}
		long hashSetAddTime = stopwatch.ElapsedMilliseconds;
 
		stopwatch.Reset();
		stopwatch.Start();
		for (int i = 0; i < size; ++i)
		{
			dictionary.Add(myObjects[i], true);
		}
		long dictionaryAddTime = stopwatch.ElapsedMilliseconds;
 
		// Contains
 
		stopwatch.Reset();
		stopwatch.Start();
		for (int i = 0; i < size; ++i)
		{
			hashtable.Contains(myObjects[i]);
		}
		long hashtableContainsTime = stopwatch.ElapsedMilliseconds;
 
		stopwatch.Reset();
		stopwatch.Start();
		for (int i = 0; i < size; ++i)
		{
			hashSet.Contains(myObjects[i]);
		}
		long hashSetContainsTime = stopwatch.ElapsedMilliseconds;
 
		stopwatch.Reset();
		stopwatch.Start();
		for (int i = 0; i < size; ++i)
		{
			dictionary.ContainsKey(myObjects[i]);
		}
		long dictionaryContainsTime = stopwatch.ElapsedMilliseconds;
 
		// Remove
 
		stopwatch.Reset();
		stopwatch.Start();
		for (int i = 0; i < size; ++i)
		{
			hashtable.Remove(myObjects[i]);
		}
		long hashtableRemoveTime = stopwatch.ElapsedMilliseconds;
 
		stopwatch.Reset();
		stopwatch.Start();
		for (int i = 0; i < size; ++i)
		{
			hashSet.Remove(myObjects[i]);
		}
		long hashSetRemoveTime = stopwatch.ElapsedMilliseconds;
 
		stopwatch.Reset();
		stopwatch.Start();
		for (int i = 0; i < size; ++i)
		{
			dictionary.Remove(myObjects[i]);
		}
		long dictionaryRemoveTime = stopwatch.ElapsedMilliseconds;
 
		// Report
 
		report = string.Format(
			"Container,Add Time,Contains Time,Remove Time\n"
			+ "Hashtable,{0},{1},{2}\n"
			+ "HashSet,{3},{4},{5}\n"
			+ "Dictionary,{6},{7},{8}",
			hashtableAddTime, hashtableContainsTime, hashtableRemoveTime,
			hashSetAddTime, hashSetContainsTime, hashSetRemoveTime,
			dictionaryAddTime, dictionaryContainsTime, dictionaryRemoveTime
		);
		Debug.Log(report);
	}
 
	void OnGUI()
	{
		GUI.TextField(new Rect(0, 0, Screen.width, Screen.height), report);
	}
}

If you want to try out the test yourself, simply paste the above code into a TestScript.cs file in your Unity project’s Assets directory and attach it to the main camera game object in a new, empty project. Then build in non-development mode, ideally with IL2CPP. I ran it that way on this machine:

  • LG Nexus 5X
  • Android 7.1.1
  • Unity 5.5.1f1, IL2CPP

And here are the results I got:

Container Add Time Contains Time Remove Time
Hashtable 2184 714 794
HashSet 2635 697 659
Dictionary 2247 622 720

Hash set performance graph

The surprising result here is that each of these container classes is best at one of the three operations! Hashtable is best at adding, HashSet is best at removing, and Dictionary is best at containment checks. That means it’s important to know which type of operation your code is likely to do the most of. For example, if you usually clear instead of removing individual elements then you probably shouldn’t pick HashSet. If you check a lot more than you add then you should probably choose Dictionary.

Unfortunately this means that the container class that makes the most sense as a set is not always the best pick. You may need to go with a Dictionary or Hashtable to get the most performance in some situations, even though you never use their mapping capability.