JacksonDunstan.com

The C++ Standard Library’s algorithms are culmination of a lot of C++ language and library features. They’re like a much more featureful, much faster version of LINQ in C#. This powerful combination makes most “raw” loops unnecessary as they can be replaced by named function calls that are well-tested and often compile to the same machine code as a “raw” loop. Read on to learn about them!

Table of Contents

Iterator

We’ve seen that containers each have an iterator class member that’s used like IEnumerator in C#: to keep track of iterating over the collection. The <iterator> header is the support library for these types. Until C++17 deprecated it, there was a std::iterator base class of all the containers’ iterator classes. This is now unnecessary with language features like concepts. Specifically, the <iterator> header provides a lot of concepts to define different kinds of iterators. Here’s how they relate to each other:

Iterator Concepts

“Tag” classes are available for most of these concepts. These are empty structs that iterators and other tag classes derive from to form an inheritance hierarchy that matches the concepts:

Iterator Tags

There are also concepts such as std::sortable, std::mergeable, and std::permutable that are used as requirements for various algorithms we’ll see shortly.

C# doesn’t have concepts and its IEnumerator interface doesn’t have “tag” interfaces, so there’s really no equivalent of this.

Besides concepts, <iterator> provides a lot of iterator-related utilities. This includes a lot of “adapter” classes that change the behavior of iterators. For example, std::reverse_iterator has an overloaded ++ operator that moves backward instead of forward:

#include <iterator>
#include <vector>
 
std::vector<int> v{ 1, 2, 3 };
 
// Adapt iterators to go backward instead of forward when using ++
std::reverse_iterator<std::vector<int>::iterator> it{ v.end() };
std::reverse_iterator<std::vector<int>::iterator> end{ v.begin() };
while (it != end)
{
    DebugLog(*it); // 3 then 2 then 1
    ++it;
}
 
// Less verbose version using class template argument deduction
for (std::reverse_iterator it{ v.end() };
    it != std::reverse_iterator{ v.begin() };
    ++it)
{
    DebugLog(*it); // 3 then 2 then 1
}
 
// Even less verbose version using rebgin() and rend()
for (auto it{ v.rbegin() }; it != v.rend(); ++it)
{
    DebugLog(*it); // 3 then 2 then 1
}

There’s also a std::back_insert_iterator that overloades the assignment (=) operator to call push_back on a collection:

#include <iterator>
#include <vector>
 
// Empty collection
std::vector<int> v{};
 
// Create an iterator to insert into the std::vector
std::back_insert_iterator<std::vector<int>> it{ v };
 
// Insert three elements
it = 1;
it = 2;
it = 3;
 
for (int x : v)
{
    DebugLog(x); // 1 then 2 then 3
}

Some functions are also available for common operations on iterators:

#include <iterator>
#include <vector>
 
std::vector<int> v{ 10, 20, 30, 40, 50 };
 
// Get the distance (how many iterations) between two iterators
DebugLog(std::distance(v.begin(), v.end())); // 5
 
// Advance an iterator by a certain number of iterations
std::vector<int>::iterator it{ v.begin() };
std::advance(it, 2);
DebugLog(*it); // 30
 
// Get the next iterator
DebugLog(*std::next(it)); // 40
DebugLog(*std::prev(it)); // 20

There are also some utility functions for containers:

#include <iterator>
#include <vector>
 
std::vector<int> v{ 10, 20, 30, 40, 50 };
 
// Check if a container is empty
DebugLog(std::empty(v)); // false
 
// Get a pointer to a container's data
int* d{ std::data(v) };
DebugLog(*d); // 10

A bunch of overloaded operators are provided outside of any particular iterator class to perform binary operations on iterators:

#include <iterator>
#include <vector>
 
std::vector<int> v{ 10, 20, 30, 40, 50 };
std::vector<int>::iterator itA{ v.begin() };
std::vector<int>::iterator itB{ v.end() };
 
// Subtraction is a synonym for std::distance
DebugLog(itB - itA); // 5
 
// Inequality and inequality operators compare iteration position
DebugLog(itA == itB); // false
DebugLog(itA < itB); // true

Some of this iterator-agnostic functionality is provided by the System.Linq.Enumerable class in C#. For example, Skip is very similar to advance in that it advances an IEnumerator forward by a given number of iterations.

Algorithm

The <algorithm> header provides a ton of functions that operate on iterators, just as LINQ in C# provides generic algorithms that operate on IEnumerable and IEnumerator. For example, std::all_of is the equivalent of Enumerable.All:

#include <algorithm>
#include <vector>
 
std::vector<int> v{ 10, 20, 30, 40, 50 };
 
bool allEven = std::all_of(
    v.begin(), // Iterator to start at
    v.end(), // Iterator to stop at
    [](int x) { return (x % 2) == 0; }); // Predicate to call with elements
DebugLog(allEven); // true

Right away we see a difference from LINQ: two iterators are passed in instead of an IEnumerable with its single GetEnumerator method. This makes it easy to operate on a subset of the container such as the middle three elements of an array:

#include <algorithm>
#include <vector>
 
int v[]{ 10, 20, 30, 40, 50 };
 
bool allSmall = std::all_of(
    std::begin(v) + 1,
    std::begin(v) + 4,
    [](int x) { return x >= 20 && x <= 40; });
DebugLog(allSmall); // true

The same can be done in C#, but only by allocating new managed class objects that implement the IEnumerable interface:

using System;
using System.Collections.Generic;
using System.Linq;
 
public class Program
{
    public static void Main()
    {
        int[] a = { 10, 20, 30, 40, 50 };
 
        // Skip allocates a class instance
        IEnumerable<int> skipped = a.Skip(1);
 
        // Take allocates a class instance
        IEnumerable<int> taken = skipped.Take(3);
 
        // Operates on only the middle three elements
        bool allSmall = taken.All(x => x >= 20 && x <= 40);
        Console.WriteLine(allSmall); // true
    }
}

LINQ is often avoided, such as in Unity games, due to the high amount of managed allocation and eventual garbage collection involved in creating so many IEnumerable class objects. The function calls to these, and the IEnumerator objects they return, are all virtual functions which run slower than non-virtual function calls. Additional GC pressure may even be generated by “boxing” of IEnumerator structs into managed objects.

C++ algorithms, in contrast, only create lightweight iterators that aren’t garbage-collected, are never boxed, and don’t use virtual functions. The result is that compilers often totally wipe out the iterators and algorithms resulting in a “raw” loop. That “raw” loop may even be optimized away if the contents of the collection are known at compile time. For example, a command line program version of the above returns allSmall as the exit code:

#include <algorithm>
#include <iterator>
 
int main()
{
    int v[]{ 10, 20, 30, 40, 50 };
 
    bool allSmall = std::all_of(
        std::begin(v) + 1,
        std::begin(v) + 4,
        [](int x) { return x >= 20 && x <= 40; });
    return allSmall;
}

GCC 10.3 with basic optimization (-O1) enabled compiles this to the constant 1 (for true) on x86-64:

main:
        mov     eax, 1
        ret

As with LINQ, C++ provides dozens of algorithm functions for many common operations. Here’s a sampling of the non-modifying algorithms:

#include <algorithm>
#include <vector>
 
std::vector<int> v1{ 10, 20, 30, 40, 50 };
std::vector<int> v2{ 10, 20, 35, 45, 55 };
 
auto isEven = [](int x) { return (x % 2) == 0; };
 
// Query the contents of a vector
DebugLog(std::any_of(v1.begin(), v1.begin(), isEven)); // true
DebugLog(std::none_of(v1.begin(), v1.begin(), isEven)); // false
 
// Get a pair of iterators where the vectors diverge
auto mm{ std::mismatch(v1.begin(), v1.end(), v2.begin()) };
DebugLog(*mm.first, *mm.second); // 30, 35
 
// Get an iterator to the first matching element
auto firstEven{ std::find_if(v2.begin(), v2.end(), isEven) };
DebugLog(*firstEven); // 10
 
// Get an iterator to the first matching element that doesn't match
auto firstOdd{ std::find_if_not(v2.begin(), v2.end(), isEven) };
DebugLog(*firstOdd); // 35
 
// Get an iterator to the first element of an element sequence
auto seq{ std::search(v1.begin(), v1.end(), v2.begin(), v2.begin() + 1) };
DebugLog(*seq); // 10

Here are some modifying algorithms:

#include <algorithm>
#include <vector>
#include <random>
 
std::vector<int> v{ 10, 20, 35, 45, 55 };
 
auto isEven = [](int x) { return (x % 2) == 0; };
auto print = [&]() {
    for (int x : v)
    {
        DebugLog(x);
    }
};
 
// Remove matching elements by shifting toward the front
// Returns an iterator just after the new end
auto end{ std::remove_if(v.begin(), v.end(), isEven) };
DebugLog(*end); // 45
print(); // 35, 45, 55
 
// Replace matching elements with a new value
std::replace_if(v.begin(), v.end(), [](int x) { return x < 50; }, 50);
print(); // 50, 50, 55, 50, 55
 
// Rotate left by two elements
std::rotate(v.begin(), v.begin() + 2, v.end());
print(); // 55, 50, 55, 50, 50
 
// Randomly shuffle elements
// Note: random_shuffle() isn't thread-safe and is deprecated since C++17
std::random_device rd{};
std::mt19937 gen{ rd() };
std::shuffle(v.begin(), v.end(), gen);
print(); // Some permutation of 55, 50, 55, 50, 50
 
// Assign a value to every element
std::fill(v.begin(), v.end(), 10);
print(); // 10, 10, 10, 10, 10

There are also algorithms related to sorting sequences:

#include <algorithm>
#include <vector>
 
std::vector<int> v{ 35, 45, 10, 20, 55 };
 
auto isEven = [](int x) { return (x % 2) == 0; };
auto print = [](auto& c) {
    for (int x : c)
    {
        DebugLog(x);
    }
};
 
// Check if a sequence is sorted
DebugLog(std::is_sorted(v.begin(), v.end())); // false
 
// Sort elements until an iterator is reached
std::partial_sort(v.begin(), v.begin() + 2, v.end());
print(v); // 10, 20, 45, 35, 55
DebugLog(std::is_sorted(v.begin(), v.end())); // false
 
// Sort the whole sequence
std::sort(v.begin(), v.end());
print(v); // 10, 20, 35, 45, 55
DebugLog(std::is_sorted(v.begin(), v.end())); // true
 
// Binary search a sorted sequence
DebugLog(std::binary_search(v.begin(), v.end(), 45)); // true
 
// Merge two sorted sequences into a sorted sequence
std::vector<int> v2{ 15, 25, 40, 50 };
std::vector<int> v3{};
std::merge(
    v.begin(), v.end(), // First sequence
    v2.begin(), v2.end(), // Second sequence
    std::back_insert_iterator<std::vector<int>>{ v3 }); // Output iterator
print(v3); // 10, 15, 20, 25, 35, 40, 45, 55
 
// Check if a sorted sequence includes another sorted sequence
// Inclusion doesn't need to be contiguous
bool inc{ std::includes(v3.begin(), v3.end(), v2.begin(), v2.end()) };
DebugLog(inc); // true

And finally there are some query operations:

#include <algorithm>
#include <vector>
 
std::vector<int> v1{ 35, 45, 10, 20, 55 };
std::vector<int> v2{ 35, 45, 10, 15, 30 };
 
// Check if two sequences' elements are equal
DebugLog(std::equal(v1.begin(), v1.end(), v2.begin(), v2.end())); // false
DebugLog(
    std::equal(
        v1.begin(), v1.begin() + 3,
        v2.begin(), v2.begin() + 3)); // true
 
// Find the min, max, and both of elements in a sequence
DebugLog(*std::min_element(v1.begin(), v1.end())); // 10
DebugLog(*std::max_element(v1.begin(), v1.end())); // 55
auto [minIt, maxIt] = std::minmax_element(v1.begin(), v1.end());
DebugLog(*minIt, *maxIt); // 10, 55
 
// Single value versions don't operate on sequences
int a = 10;
int b = 20;
DebugLog(std::min(a, b)); // 10
DebugLog(std::max(a, b)); // 20
auto [minVal, maxVal] = std::minmax(a, b);
DebugLog(minVal, maxVal); // 10, 20
 
// Other single value functions
DebugLog(std::clamp(1000, 0, 100)); // 100
std::swap(a, b);
DebugLog(a, b); // 20, 10

All of these examples have used std::vector<int>, but it’s important to know that these algorithms apply to any container type with any element type as long as the requirements of the algorithm are satisfied. This includes container and element types implemented in the C++ Standard Library as well as container and element types we create in our own code:

#include <algorithm>
 
// A custom enum and a function to get enumerator string names
enum class Element { Earth, Water, Wind, Fire };
const char* GetName(Element e)
{
    switch (e)
    {
        case Element::Earth: return "Earth";
        case Element::Water: return "Water";
        case Element::Wind: return "Wind";
        case Element::Fire: return "Fire";
        default: return "";
    }
}
 
// A custom struct
struct PrimalElement
{
    Element Element;
    int Power;
};
 
// Forward-declare a class that holds an array of the custom struct
class PrimalElementsArray;
 
// An iterator type for the custom struct
class PrimalElementIterator
{
    // Keep track of the current iteration position
    PrimalElement* Array;
    int Index;
 
public:
 
    PrimalElementIterator(PrimalElement* array, int index)
        : Array(array)
        , Index(index)
    {
    }
 
    // Advance the iterator
    PrimalElementIterator& operator++()
    {
        Index++;
        return *this;
    }
 
    // Compare with another iterator
    bool operator==(const PrimalElementIterator& other)
    {
        return Array == other.Array && Index == other.Index;
    }
 
    // Dereference to get the current element
    PrimalElement& operator*()
    {
        return Array[Index];
    }
};
 
// A class that holds an array of the custom struct
class PrimalElementsArray
{
    PrimalElement Elements[4];
 
public:
 
    PrimalElementsArray()
    {
        Elements[0] = PrimalElement{ Element::Earth, 50 };
        Elements[1] = PrimalElement{ Element::Water, 20 };
        Elements[2] = PrimalElement{ Element::Wind, 10 };
        Elements[3] = PrimalElement{ Element::Fire, 75 };
    }
 
    // Get an iterator to the first element
    PrimalElementIterator begin()
    {
        return PrimalElementIterator{ Elements, 0 };
    }
 
    // Get an iterator to one past the last element
    PrimalElementIterator end()
    {
        return PrimalElementIterator{ Elements, 4 };
    }
};
 
// Create our custom array type
PrimalElementsArray pea{};
 
// Use std::find_if to find the PrimalElement with more than 50 power
PrimalElementIterator found{
    std::find_if(
        pea.begin(),
        pea.end(),
        [](const PrimalElement& pe) { return pe.Power > 50; }) };
DebugLog(GetName((*found).Element), (*found).Power); // Fire, 75

Numeric

Finally for today, we’ll revisit the numbers library by looking at <numeric>. It turns out that it has some number-specific generic algorithms. Here’s a few of them:

#include <numeric>
#include <vector>
 
std::vector<int> v{};
v.resize(5);
 
auto print = [](auto& c) { for (int x : c) DebugLog(x); };
 
// Initialize with sequential values starting at 10
std::iota(v.begin(), v.end(), 10);
print(v); // 10, 11, 12, 13, 14
 
// Sum the range starting at 100
DebugLog(std::accumulate(v.begin(), v.end(), 100)); // 160
 
// Sum in an arbitrary order
DebugLog(std::reduce(v.begin(), v.end(), 100)); // 160
 
// C++17: transform pairs of elements and then reduce in an arbitrary order
// Equivalent to 1000000 + 10*10 + 11*10 + 12*10 + 13*10 + 14*10
DebugLog(
    std::transform_reduce(
        v.begin(), v.end(), // Sequence
        1000000, // Initial value
        [](int a, int b) { return a + b; }, // Reduce function
        [](int x) { return x*10; }) // Transform function
); // 1000600
 
// Output sums up to current iteration
std::vector<int> sums{};
std::partial_sum(
    v.begin(), v.begin() + 3, // Sequence
    std::back_insert_iterator{ sums }); // Output iterator
print(sums); // 10, 21, 33

Conclusion

Both languages have a wide variety of generic algorithms but they differ quite a bit in implementation. That ranges from the trivial naming differences of enumerators and iterators to the giant performance gulf between LINQ and C++ algorithm functions in <algorithm> and <numeric>.

It’s hard to overstate just how many generic algorithms are available in the C++ Standard Library. This is especially true when looking at the huge number of permutations of each of these functions. It’s common to see five or even ten overloads of these to customize for a wide variety of parameters running the gamut from simple versions to extremely generic and flexible versions. That’s another difference with C# where LINQ functions typically have just one or two overloads.

The design of the language, especially the very powerful support for compile-time generic programming via templates, combines with the iterator paradigm to enable all of this functionality on all of the many container types but also all of the container types we might implement in our own code to suit our own needs. We inherit the same high level of optimization that C++ Standard Library types receive, which gives us little excuse for writing a lot of “raw” loops.

C++ For C# Developers: Part 48 – Algorithms Library

Iterator

Algorithm

Numeric

Conclusion

Comments