C++ For C# Developers: Part 38 – C Standard Library
Today we’ll begin exploring the C++ Standard Library. As C++ is mostly a superset of C, the C++ Standard Library is mostly a superset of the C Standard Library. So we’ll begin there!
Background
First, a word of caution: the C Standard Library is very old. Most of it dates back at least 30 years and even the newer parts are about 10 years old and built to fit in with the original design. The C language itself is also very simple. Its lack of features impacts the library design.
For example, there are “families” of functions that all do the same thing but on different data types. To take an absolute value of a floating point value we call fabs for double, fabsf for float, and fabsl for long double. In C++, we’d just overload abs with different parameter types and the compiler would choose the right one to call.
The C++ Standard Library includes many more modern designs that rely on C++ language features. It has that overloaded abs function, for example. The C Standard Library is included in the C++ Standard Library largely as part of C++’s broad goal to maintain a high degree of compatibility with C code. There are a few parts of it that are genuinely useful on their own, but these are few and far between.
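To make the contrast concrete, here is a minimal sketch of both styles side by side. The helper names (AbsFloatC, AbsDoubleC, AbsLongDoubleC, AbsCpp) are made up for illustration; only fabs, fabsf, fabsl, and std::abs come from the standard libraries.

```cpp
#include <math.h>   // C-style names in the global namespace: fabs, fabsf, fabsl
#include <cmath>    // C++ overloads: std::abs for floating point types
#include <cstdlib>  // C++ overloads: std::abs for integer types
#include <type_traits>

// C style: the caller has to pick the function that matches the type
// (these three wrapper names are hypothetical)
float AbsFloatC(float x) { return fabsf(x); }
double AbsDoubleC(double x) { return fabs(x); }
long double AbsLongDoubleC(long double x) { return fabsl(x); }

// C++ style: one name, and the compiler picks the overload from the argument
// (AbsCpp is a hypothetical wrapper around std::abs)
template <typename T>
T AbsCpp(T x) { return std::abs(x); }

// The return type follows the argument type, with no suffix to remember
static_assert(std::is_same<decltype(std::abs(-1.5f)), float>::value, "float overload");
static_assert(std::is_same<decltype(std::abs(-1.5)), double>::value, "double overload");
static_assert(std::is_same<decltype(std::abs(-1.5L)), long double>::value, "long double overload");
static_assert(std::is_same<decltype(std::abs(-1)), int>::value, "int overload");
```

Calling AbsCpp(-2.5f) and AbsFloatC(-2.5f) both produce 2.5f, but only the C++ version keeps working unchanged if the variable’s type later changes to double.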
Still, 30+ years of momentum is a powerful force and it’s extremely common to see the C Standard Library in use even when more modern alternatives are available. That makes it important for us to understand as many C++ codebases will include some C Standard Library usage.
We’re not going to go in depth and cover every little corner of the C Standard Library today, but we’ll survey its highlights.
General Purpose
As for composition, the C++ Standard Library is made up of header files. As of C++20, modules are also available. The C Standard Library is available only as header files. C Standard Library header files are named with a .h extension: math.h. These can be included directly into C++ files: #include <math.h>. They are also wrapped by the C++ Standard Library. The wrapped versions begin with a c and drop the .h extension, so we can #include <cmath>. These wrapped header files place everything in the std namespace and may also place everything in the global namespace so both std::fabs and ::fabs work.
There’s one truly general purpose header file in the C Standard Library: stdlib.h/cstdlib. Unlike a more focused header file like math.h/cmath that obviously focuses on mathematics, a variety of utilities are provided by this header. Some of the basics include size_t, the type that the sizeof operator evaluates to, and NULL, a null pointer constant widely used before the advent of nullptr in C++11. The broad nature of this header file makes it hard to compare to C#, but it can roughly be thought of as the System namespace:
#include <stdlib.h>

// sizeof() evaluates to size_t
size_t intSize = sizeof(int);
DebugLog(intSize); // Maybe 4

// NULL can be used as a pointer to indicate "null"
int* ptr = NULL;

// It's vulnerable to accidental misuse in arithmetic
int sum = NULL + NULL;

// nullptr isn't: this is a compiler error
int sum2 = nullptr + nullptr;
Before C++ introduced the new and delete operators for dynamic memory allocation, C code would use the malloc, calloc, realloc, and free functions. The C# equivalent of malloc is Marshal.AllocHGlobal, realloc is Marshal.ReAllocHGlobal, and free is Marshal.FreeHGlobal:
// Allocate 1 KB of uninitialized memory
// Returns null upon failure
// Memory is untyped, so casting is required to read or write
void* memory = malloc(1024);

// Reading it before initialization is undefined behavior
int firstInt = ((int*)memory)[0];

// Release the memory. Failing to do so is a memory leak.
free(memory);

// Allocate and initialize to all zeroes 1 KB of memory: 256 x 4 bytes
memory = calloc(256, 4);

// Re-allocate previously-allocated memory to get more or less
// Old memory is not freed if allocation fails
memory = realloc(memory, 2048);

// Also need to release memory from calloc and realloc
free(memory);
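Because realloc leaves the old block alive when it fails, assigning its result straight back to the only pointer we have would lose that block. One common way around this is to hold the result in a temporary first. The following is only a sketch: Resize is a hypothetical helper name, and it assumes newSize is greater than zero.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: resize a buffer that came from malloc, calloc, or realloc.
// On success, 'buffer' points to the resized block and true is returned.
// On failure, 'buffer' is left untouched so the caller can still use or free it.
// Assumes newSize > 0.
bool Resize(void*& buffer, std::size_t newSize)
{
    // Keep the result in a temporary so a null return doesn't overwrite 'buffer'
    void* resized = std::realloc(buffer, newSize);
    if (resized == nullptr)
    {
        return false;
    }
    buffer = resized;
    return true;
}
```

The contents of the old block, up to the smaller of the old and new sizes, are preserved by realloc, so data written before the call is still there afterward.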
There are some functions to parse numbers from strings, similar to int.Parse, float.Parse, etc.:
// Parse a double
double d = atof("3.14");
DebugLog(d); // 3.14

// Parse an int
int i = atoi("123");
DebugLog(i); // 123

// Parse a float and get a pointer to its end in a string
const char* floatStr = "2.2 123.456";
char* pEnd;
float f = strtof(floatStr, &pEnd);
DebugLog(f); // 2.2

// Use the end pointer to parse more
f = strtof(pEnd, &pEnd);
DebugLog(f); // 123.456
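The end pointer is also how we detect that nothing was parsed: when no conversion happens, the strto* functions set it back to the start of the input. atoi and atof just return zero in that case, which is indistinguishable from parsing "0". Here is a small sketch that uses this to read every number in a string. ParseAll is a hypothetical helper, not part of any library.

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical helper: parse every whitespace-separated number in 'text'.
// Stops at the first position where strtod can't convert anything.
std::vector<double> ParseAll(const char* text)
{
    std::vector<double> values;
    const char* cursor = text;
    while (true)
    {
        char* end = nullptr;
        double value = std::strtod(cursor, &end);

        // No characters consumed means there was no number here
        if (end == cursor)
        {
            break;
        }

        values.push_back(value);

        // Continue parsing right after the number we just read
        cursor = end;
    }
    return values;
}
```

With this sketch, ParseAll("2.5 10 -0.25") yields three values while ParseAll("abc") yields none, rather than a misleading zero.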
Some generic algorithms are provided, similar to the C# Array class as well as Random and Math:
// Seed the global randomizer
// This is not thread-safe
srand(123);

// Use the global randomizer to generate a random number
int r = rand();
DebugLog(r); // Maybe 440

// Compare pointers to ints
auto compare = [](const void* a, const void* b) {
    return *(int*)a - *(int*)b;
};

// Sort an array
int a[] = { 4, 2, 1, 3 };
qsort(a, 4, sizeof(int), compare);
DebugLog(a[0], a[1], a[2], a[3]); // 1, 2, 3, 4

// Binary search the array for 2
int valToFind = 2;
int* pVal = (int*)bsearch(&valToFind, a, 4, sizeof(int), compare);
int index = pVal - a;
DebugLog(index); // 1

// Take an absolute value
DebugLog(abs(-10)); // 10

// Divide and also get the remainder
// stdlib.h/cstdlib also provides the div_t struct type
div_t d = div(11, 3);
DebugLog(d.quot, d.rem); // 3, 2
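One caveat with the subtraction-based comparison above is that it can overflow when the two values are far apart, such as a very large positive and a very large negative int. A possible alternative, sketched here with the hypothetical names CompareInts and SortInts, compares instead of subtracting:

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical comparison function for qsort and bsearch.
// Returns negative, zero, or positive without ever subtracting the values,
// so it can't overflow.
int CompareInts(const void* a, const void* b)
{
    int lhs = *static_cast<const int*>(a);
    int rhs = *static_cast<const int*>(b);
    return (lhs > rhs) - (lhs < rhs);
}

// Hypothetical wrapper that sorts 'count' ints in ascending order
void SortInts(int* values, std::size_t count)
{
    std::qsort(values, count, sizeof(int), CompareInts);
}
```

Each comparison yields 0 or 1, so their difference is always -1, 0, or 1 regardless of how extreme the inputs are.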
Finally, there’s some OS-related functionality:
// Run a system command
int exitCode = system("ping example.com");
DebugLog(exitCode); // 0 if successful

// Get an environment variable
char* path = getenv("PATH");
DebugLog(path); // Path to executables

// Register a function (lambda in this case) to be called when the program exits
atexit([]{ DebugLog("Exiting..."); });

// Explicitly exit the program with an exit code
exit(1); // Exiting...
Math and Numbers
The next category of header in the C Standard Library relates to mathematics. One we’ve seen throughout the series is stdint.h/cstdint, which provides integer types via typedef. Basic types like int have guaranteed sizes in C#, but this header file goes above and beyond to also define types that fulfill particular requirements:
#include <stdint.h>

int32_t i32; // Always signed 32-bit
int_fast32_t if32; // Fastest signed integer type with at least 32 bits
intptr_t ip; // Signed integer that can hold a pointer
int_least32_t il; // Smallest signed integer with at least 32 bits
intmax_t imax; // Biggest available signed integer

// Range of 32-bit integer values
DebugLog(INT32_MIN, INT32_MAX); // -2147483648, 2147483647

// Biggest size_t
DebugLog(SIZE_MAX); // Maybe 18446744073709551615
There are also some types in stddef.h/cstddef. Some of these are more types that satisfy particular requirements. Unusually, there are also types that are C++-specific in the cstddef version of this header:
#include <cstddef>

// C and C++ types
std::max_align_t ma; // Type with the biggest alignment
std::ptrdiff_t pd; // Big enough to hold the subtraction of two pointers

// C++-specific types
std::nullptr_t np = nullptr; // The type of nullptr
std::byte b; // An "enum class" version of a single byte
limits.h/climits also has some maximum and minimum macros, equivalent to int.MaxValue and similar in C#:
#include <limits.h>

// Range of int values
DebugLog(INT_MIN, INT_MAX); // Maybe -2147483648, 2147483647

// Range of char values
DebugLog(CHAR_MIN, CHAR_MAX); // Maybe -128, 127
The inttypes.h/cinttypes header also has integer-related utilities. These are needed because conversions to and from strings aren’t built into the language as they are in C# with functions like int.Parse:
#include <inttypes.h>

// Parse a hexadecimal string to an int
// The nullptr means we don't want to get a pointer to the end
intmax_t i = strtoimax("f0a2", nullptr, 16);
DebugLog(i); // 61602
Similarly, float.h/cfloat provides a bunch of floating point macros similar to what C# provides via constants like float.MaxValue:
#include <float.h>

// Biggest float
float f = FLT_MAX;
DebugLog(f); // 3.40282e+38

// Difference between 1.0 and the next larger float
float ep = FLT_EPSILON;
DebugLog(ep); // 1.19209e-07
fenv.h/cfenv gives us fine-grained control over how the CPU deals with floating point numbers. There’s no real equivalent to this in C#:
#include <fenv.h>

// Clear CPU float exceptions. Different than C++ exceptions.
feclearexcept(FE_ALL_EXCEPT);

// Divide by zero
// Use volatile to prevent the compiler from removing this
volatile float n = 1.0f;
volatile float d = 0.0f;
volatile float q = n / d;

// Check float exceptions to see if this was a divide by zero or produced
// an inexact result
int divByZero = fetestexcept(FE_DIVBYZERO);
int inexact = fetestexcept(FE_INEXACT);
DebugLog(divByZero != 0); // true
DebugLog(inexact != 0); // false

// Clear float exceptions
feclearexcept(FE_ALL_EXCEPT);

// Perform a division whose quotient can't be represented exactly
d = 10.0f;
q = n / d;

// Check float exceptions
divByZero = fetestexcept(FE_DIVBYZERO);
inexact = fetestexcept(FE_INEXACT);
DebugLog(divByZero != 0); // false
DebugLog(inexact != 0); // true
Strings and Arrays
The next category of headers deals with strings and arrays. Let’s start with string.h/cstring which has a lot of operations that are built into the string class, managed arrays, and Buffer in C#:
#include <string.h>

// Compare strings: 0 for equality, negative for less than, positive for greater than
DebugLog(strcmp("hello", "hello")); // 0
DebugLog(strcmp("goodbye", "hello")); // Negative, maybe -1

// Copy a string
char buf[32];
strcpy(buf, "hello");
DebugLog(buf); // hello

// Concatenate strings
strcat(buf + 5, " world");
DebugLog(buf); // hello world

// Count characters in a string (its length)
// This iterates until NUL is found
DebugLog(strlen(buf)); // 11

// Get a pointer to the first occurrence of a character in a string
DebugLog(strchr(buf, 'o')); // o world

// Get a pointer to the first occurrence of a string in a string
DebugLog(strstr(buf, "ll")); // llo world

// Get a pointer to the next "token" in a string, separated by a delimiter
// Stores state globally: not thread-safe
char* next = strtok(buf, " ");
DebugLog(next); // hello
next = strtok(nullptr, ""); // null means to continue the global state
DebugLog(next); // world

// Copy the first three bytes of buf ("hel") to later in the buffer
memcpy(buf + 3, buf, 3);
DebugLog(buf); // helhelworld

// Set all bytes in buf to 65 and put a NUL at the end
memset(buf, 65, 31);
buf[31] = 0;
DebugLog(buf); // AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
wchar.h/cwchar is the equivalent for “wide” characters. Support for various character types in C# is provided by the System.Text namespace, which has similar functionality to what’s in this header:
#include <wchar.h>

wchar_t input[] = L"foo,bar,baz";

// Get the first token, using 'state' to hold the tokenization state
wchar_t* state;
wchar_t* token = wcstok(input, L",", &state);
DebugLog(token); // foo

// Get the second token
token = wcstok(nullptr, L",", &state);
DebugLog(token); // bar

// Get the third token
token = wcstok(nullptr, L",", &state);
DebugLog(token); // baz
ctype.h/cctype has functions related to just characters. C’s lack of a bool type means 0 is used instead of false and non-0 is used instead of true. C# doesn’t use ASCII natively, so this is approximated by ASCIIEncoding there:
#include <ctype.h>

// Check for alphabetical characters
DebugLog(isalpha('a') != 0); // true
DebugLog(isalpha('9') != 0); // false

// Check for digit characters
DebugLog(isdigit('a') != 0); // false
DebugLog(isdigit('9') != 0); // true

// Change to uppercase
DebugLog(toupper('a')); // A
wctype.h/cwctype is the equivalent for “wide” characters. A lot of this is built into the char type in C#:
#include <wctype.h>

// Check for alphabetical characters
DebugLog(iswalpha(L'a') != 0); // true
DebugLog(iswalpha(L'9') != 0); // false

// Check for digit characters
DebugLog(iswdigit(L'a') != 0); // false
DebugLog(iswdigit(L'9') != 0); // true

// Change to uppercase
DebugLog(towupper(L'a')); // A
uchar.h/cuchar has character conversion functions. The “encoding” classes in C#’s System.Text namespace provide for these conversions in .NET:
#include <uchar.h>

// Convert to UTF-16
char input[] = "A";
char16_t output;
mbstate_t state{};
size_t len = mbrtoc16(&output, input, MB_CUR_MAX, &state);
DebugLog(len); // 1

// Look at the individual bytes of the UTF-16 code unit
uint8_t* outputBytes = (uint8_t*)&output;
DebugLog(outputBytes[0], outputBytes[1]); // 65, 0 on a little-endian CPU
Language Tools
This category of header files includes a range of tools that aren’t part of the C or C++ language, but are closely tied to it or would be built into other languages.
First up is stdarg.h/cstdarg. This header contains the required types and macros to implement variadic functions. These are uncommonly used in C++ since variadic templates are available, easier to use, and type-safe. In C#, we’d use the params keyword to have the compiler generate a managed array of arguments at the call site. Here’s how to use the va_ macros to implement a variadic function:
#include <stdarg.h>

// The "..." indicates a variadic function
void PrintLogs(int count, ...)
{
    // A va_list holds the state
    va_list args;

    // Use the "va_start" macro to start getting args
    va_start(args, count);

    for (int i = 0; i < count; ++i)
    {
        // Use the "va_arg" macro to get the next arg
        const char* log = va_arg(args, const char*);
        DebugLog(log);
    }

    // Use the "va_end" macro to stop getting args
    va_end(args);
}

// Call the variadic function
PrintLogs(3, "foo", "bar", "baz"); // foo, bar, baz
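For comparison, here is roughly what the variadic template alternative could look like. This is a sketch rather than a drop-in replacement: JoinLogs is a hypothetical function that collects the arguments into one string instead of logging them, and it relies on a C++17 fold expression.

```cpp
#include <string>

// Hypothetical variadic template version: no count parameter and no va_ macros.
// Each argument keeps its own type, so nothing has to be guessed with va_arg.
template <typename... Args>
std::string JoinLogs(const Args&... args)
{
    std::string result;

    // Fold over the comma operator: append each argument followed by a newline
    ((result += args, result += '\n'), ...);

    return result;
}
```

Calling JoinLogs("foo", "bar", "baz") produces "foo\nbar\nbaz\n". The compiler works out the number of arguments itself, so the caller can’t pass a count that disagrees with the arguments as is possible with PrintLogs.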
Next there is assert.h/cassert containing the assert macro. If the NDEBUG preprocessor symbol isn’t defined, this checks if a condition is false and calls std::abort to end the program, possibly with additional debugging steps such as breaking into an interactive debugger. If the condition is true, nothing happens. If NDEBUG is defined, the condition itself is stripped out of the program and never evaluated. In C#, we’d use the [Conditional] attribute to build an assert or make use of the existing Debug.Assert:
#include <assert.h>

assert(2 + 2 == 4); // OK
assert(2 + 2 == 5); // Calls std::abort and maybe more
Then we have setjmp.h/csetjmp, used to implement a high-powered version of goto. This can jump outside of a function, but because it breaks the normal language rules it also skips the destructor calls that are normally used to clean up local objects. None of this is available in C#:
#include <setjmp.h>

// Saved execution state
jmp_buf buf;

// Use volatile to prevent the compiler from optimizing this away
volatile int count = 0;

void Goo()
{
    count++;
    DebugLog("Goo calling longjmp with", count);

    // Go to the saved execution state and pass 'count' as the 'status'
    longjmp(buf, count);
}

void Foo()
{
    DebugLog("Foo");

    // Save the execution state
    // When longjmp is called, execution goes here
    // The passed 'status' is "returned" from setjmp
    int status = setjmp(buf);
    DebugLog("Foo got status", status);
    if (status >= 3)
    {
        return;
    }

    DebugLog("Foo calling Goo");
    Goo();
}
This prints the following:
Foo
Foo got status, 0
Foo calling Goo
Goo calling longjmp with, 1
Foo got status, 1
Foo calling Goo
Goo calling longjmp with, 2
Foo got status, 2
Foo calling Goo
Goo calling longjmp with, 3
Foo got status, 3
Lastly, there’s errno.h/cerrno. This header provides the errno macro that holds an error flag set by many C Standard Library functions. Since C++11 each thread has its own errno, but this is still generally considered to be a poor way of handling errors as any later library call can overwrite the value and the caller needs to know to check something that isn’t part of the function signature. It’s never used in C#, so there’s really no equivalent. It is widely used in the C Standard Library though, so let’s see how it works:
#include <errno.h>
#include <math.h>

// Pass an invalid argument to sqrt (from math.h)
float root = sqrt(-1.0f);

// It returns NaN
DebugLog(root); // NaN

// It signals this error by setting errno to EDOM (out of domain)
DebugLog(errno); // Maybe 33

// Check that this is what was set
DebugLog(errno == EDOM); // true if the implementation reports math errors via errno
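Since errno is only written when something goes wrong, a stale value from an earlier call can be mistaken for a new error. The usual pattern is to clear it right before the call and check it right after. Here’s a sketch using strtol, which sets errno to ERANGE when the number doesn’t fit in a long. TryParseLong is a hypothetical helper named after the C# TryParse convention.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: parse a base-10 long and report failure via the return value.
// 'result' is only written on success.
bool TryParseLong(const char* text, long& result)
{
    // Clear any stale error left behind by an earlier library call
    errno = 0;

    char* end = nullptr;
    long value = std::strtol(text, &end, 10);

    // No characters consumed: the text didn't start with a number
    if (end == text)
    {
        return false;
    }

    // The number was too big or too small to fit in a long
    if (errno == ERANGE)
    {
        return false;
    }

    result = value;
    return true;
}
```

Wrapping the check this way puts the error back into the function signature, so callers don’t have to know that errno is involved at all.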
System Integration
The last category of header files deals with the system on which we run our programs. Let’s start with time.h/ctime which is like a basic version of DateTime in C#:
#include <time.h>
#include <math.h>

// Get the time in the return value and in the pointer we pass
time_t t1{};
time_t t2 = time(&t1);
DebugLog(t1, t2); // Maybe 1612052060, 1612052060

// Get the amount of CPU time the program has used
// Not in relation to any particular time (like the UNIX epoch)
clock_t c1 = clock();

// Do something expensive we want to benchmark
volatile float f = 123456;
for (int i = 0; i < 1000000; ++i)
{
    f = sqrtf(f);
}

// Check the clock again
clock_t c2 = clock();
double secs = ((double)(c2) - c1) / CLOCKS_PER_SEC;
DebugLog("Took", secs, "seconds"); // Maybe: Took 0.011 seconds
We also have signal.h/csignal to deal with OS signals. This allows us to deal with signals such as being terminated by the OS and to raise such signals ourselves. This isn’t normally done with C# as the .NET environment our program is running in handles such signals:
#include <signal.h>

signal(SIGTERM, [](int val){ DebugLog("terminated with", val); });
raise(SIGTERM); // Maybe: terminated with 15
Many C Standard Library functions use a global “locale” setting to determine how they work. The locale.h/clocale header file has functions to change this setting. It’s similar to the thread-specific CultureInfo in C#:
#include <locale.h>

// Set the locale for everything to Japanese
// This is global: not thread-safe
setlocale(LC_ALL, "ja_JP.UTF-8");

// Get the global locale
lconv* lc = localeconv();
DebugLog(lc->currency_symbol); // ¥
And finally, we’ll end with the header that enables “Hello, world!” in C: stdio.h/cstdio. This is like Console in C#. There’s also file system access, similar to the methods of File in C#:
#include <stdio.h>

// Output a formatted string to stdout
// The first string is the "format string" with value placeholders: %s %d
// Subsequent values must match the placeholders' types
// This is a variadic function
printf("%s %d\n", "Hello, world!", 123); // Hello, world! 123

// Read a value from stdin
// The same "format string" is used to accept different types
int val;
int numValsRead = scanf("%d", &val);
DebugLog(numValsRead); // {1 if the user entered a number, else 0}
if (numValsRead == 1)
{
    DebugLog(val); // {Number the user typed}
}

// Open a file, seek to its end, get the position, and close it
FILE* file = fopen("/path/to/myfile.dat", "r");
fseek(file, 0, SEEK_END);
long len = ftell(file);
fclose(file);
DebugLog(len); // {Number of bytes in the file}

// Delete a file
int deleted = remove("/path/to/deleteme.dat");
DebugLog(deleted == 0); // True if the file was deleted

// Rename a file
int renamed = rename("/path/to/oldname.dat", "/path/to/newname.dat");
DebugLog(renamed == 0); // True if the file was renamed
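The file example above assumes that fopen succeeds. It returns a null pointer when the file can’t be opened, and passing that to fseek or fclose is an error. A more defensive version might look like the following sketch, where FileSize is a hypothetical helper that returns -1 on any failure:

```cpp
#include <cstdio>

// Hypothetical helper: get the size of a file in bytes, or -1 on failure
long FileSize(const char* path)
{
    // Open in binary mode so the position is a plain byte count
    std::FILE* file = std::fopen(path, "rb");
    if (file == nullptr)
    {
        return -1; // Couldn't open, e.g. the file doesn't exist
    }

    long size = -1;

    // fseek returns 0 on success; ftell returns -1 on failure
    if (std::fseek(file, 0, SEEK_END) == 0)
    {
        size = std::ftell(file);
    }

    // Always close the file once it has been opened
    std::fclose(file);
    return size;
}
```

This is the same special-int-value style of error reporting used throughout the C Standard Library, so the caller has to remember to check for -1.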
Conclusion
The C Standard Library is very old, but still very commonly used. Being so old and based on the much less powerful C, its design leaves a lot to be desired. The global state used by functions like rand and strtok isn’t thread-safe, and macros like errno are difficult to use correctly. Returning special int values, often inconsistently, instead of more structured outputs like exceptions and enumerations makes many functions similarly difficult to use.
Regardless of any complaints we may have about the C Standard Library’s design, we still need to know how to use it. The C++ Standard Library offers alternatives to much of what we’ve seen here today, but not all of it. Sure, we can swap in <random> for rand, <chrono> for time, and <filesystem> for remove, but assert and stdint.h remain the most modern standardized ways of getting that functionality.
From here on we’ll be covering the C++ part of the C++ Standard Library. We’ll see a lot more modern designs for areas like containers, algorithms, I/O, strings, math, and threading!