C++ For C# Developers: Part 10 – Struct Basics
Let’s continue the series today by starting to look at structs. These are far more powerful in C++ than in C#, so today we’ll start with basics like defining and initializing them. Read on to get started!
Table of Contents
- Part 1: Introduction
- Part 2: Primitive Types and Literals
- Part 3: Variables and Initialization
- Part 4: Functions
- Part 5: Build Model
- Part 6: Control Flow
- Part 7: Pointers, Arrays, and Strings
- Part 8: References
- Part 9: Enumerations
- Part 10: Struct Basics
- Part 11: Struct Functions
- Part 12: Constructors and Destructors
- Part 13: Initialization
- Part 14: Inheritance
- Part 15: Struct and Class Permissions
- Part 16: Struct and Class Wrapup
- Part 17: Namespaces
- Part 18: Exceptions
- Part 19: Dynamic Allocation
- Part 20: Implicit Type Conversion
- Part 21: Casting and RTTI
- Part 22: Lambdas
- Part 23: Compile-Time Programming
- Part 24: Preprocessor
- Part 25: Intro to Templates
- Part 26: Template Parameters
- Part 27: Template Deduction and Specialization
- Part 28: Variadic Templates
- Part 29: Template Constraints
- Part 30: Type Aliases
- Part 31: Deconstructing and Attributes
- Part 32: Thread-Local Storage and Volatile
- Part 33: Alignment, Assembly, and Language Linkage
- Part 34: Fold Expressions and Elaborated Type Specifiers
- Part 35: Modules, The New Build Model
- Part 36: Coroutines
- Part 37: Missing Language Features
- Part 38: C Standard Library
- Part 39: Language Support Library
- Part 40: Utilities Library
- Part 41: System Integration Library
- Part 42: Numbers Library
- Part 43: Threading Library
- Part 44: Strings Library
- Part 45: Array Containers Library
- Part 46: Other Containers Library
- Part 47: Containers Library Wrapup
- Part 48: Algorithms Library
- Part 49: Ranges and Parallel Algorithms
- Part 50: I/O Library
- Part 51: Missing Library Features
- Part 52: Idioms and Best Practices
- Part 53: Conclusion
Declaration and Definition
Just like with functions and enumerations, structs may be declared and defined separately:
// Declaration
struct Vec3;

// Definition
struct Vec3
{
    float x;
    float y;
    float z;
};
Notice how struct declarations and definitions look pretty similar to enumeration declarations and definitions. We use the struct keyword, give it a name, add curly braces to hold its contents, then finish up with a semicolon.
We can create struct variables just like we create primitive and enumeration variables:
Vec3 vec;
As with primitives and enumerations, this variable is uninitialized. Initialization of structs is a surprisingly complex topic compared to C# and we’ll cover it in depth later on in the series. For now, let’s just initialize by individually setting each of the struct’s data members. That’s the C++ term for the equivalent of fields in C#. They’re also commonly called “member variables.” To do this, we use the . operator just like in C#:
Vec3 vec;
vec.x = 1;
vec.y = 2;
vec.z = 3;
DebugLog(vec.x, vec.y, vec.z); // 1, 2, 3
We can also initialize the data members in the struct definition with either =x or {x}:
struct Vec3
{
    float x = 1;
    float y{2};
    float z = 3;
};

Vec3 vec;
DebugLog(vec.x, vec.y, vec.z); // 1, 2, 3
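As a small sketch of how these defaults behave, here’s the same Vec3 with two helper functions. MakeDefault and MakeWithY are hypothetical names made up for this illustration; they aren’t part of the series’ code.

```cpp
// Same shape as the Vec3 above, with default values for each data member
struct Vec3
{
    float x = 1;
    float y{2};
    float z = 3;
};

// Hypothetical helper: returns a Vec3 that only holds its default values
Vec3 MakeDefault()
{
    Vec3 vec; // x, y, and z take their defaults here
    return vec;
}

// Hypothetical helper: the defaults are just starting values
// and can be overwritten like any other data member
Vec3 MakeWithY(float y)
{
    Vec3 vec;
    vec.y = y;
    return vec;
}
```

The defaults apply every time a Vec3 is created, so each call starts from the same 1, 2, 3 values before any data member is overwritten.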
As with enumerations, we can also declare variables between the closing curly brace and the semicolon of a definition:
struct Vec3
{
    float x;
    float y;
    float z;
} v1, v2, v3;
This is sometimes used when omitting the name of the struct. This anonymous struct has no name we can type out, but it can be used all the same in a similar way to C# tuples ((string Name, int Year) t = ("Apollo 11", 1969);):
// Anonymous struct with immediate variable
struct
{
    const char16_t* Name;
    int32_t Year;
} moonMission;

// Variables of this type can be used just like named struct types
moonMission.Name = u"Apollo 11";
moonMission.Year = 1969;
DebugLog(moonMission.Name, moonMission.Year);
Because an anonymous struct can only be used via immediate variables, declaring one without any immediate variables isn’t allowed:
// Compiler error: anonymous struct requires at least one immediate variable
struct
{
    float x;
};
Like with enumerations whose underlying type isn’t in the declaration, the compiler doesn’t know the size of a struct after it’s declared. The definition is required to know its size, so a declared struct can’t be used to create a variable or define a function using the struct type as an argument or return value:
// Declaration
struct Vec3;

// Compiler error: can't create a variable before definition
Vec3 v;

// Compiler error: can't take a function argument before definition
float GetMagnitudeSquared(Vec3 vec)
{
    return 0;
}

// Compiler error: can't return a function return value before definition
Vec3 MakeVec(float x, float y, float z)
{
    // Compiler error: can't create a variable before definition
    Vec3 v;

    // Compiler error: can't return a struct before definition
    return v;
}
This also means that we can’t declare immediate variables after a struct declaration:
// Compiler error: can't create a variable before definition
struct Vec3 v1, v2, v3;
We can, however, use pointers and references to the struct since they don’t depend on its size:
// Declaration
struct Vec3;

// Pointer
Vec3* p = nullptr;

// lvalue reference
float GetMagnitudeSquared(Vec3& vec)
{
    return 0;
}

// rvalue reference
float GetMagnitudeSquared(Vec3&& vec)
{
    return 0;
}
To access the data members of a struct through a pointer, we can either dereference with *p and then use .x or use the shorthand p->x. Both work the same way as they do with struct pointers in C#. With lvalue or rvalue references, we just use . because they are essentially just aliases to a variable, not a pointer.
// Variable
Vec3 vec;
vec.x = 1;
vec.y = 2;
vec.z = 3;

// Pointer
Vec3* p = &vec;
p->x = 10;
p->y = 20;
(*p).z = 30; // Alternate version of p->z

// lvalue reference
float GetMagnitudeSquared(Vec3& vec)
{
    return vec.x*vec.x + vec.y*vec.y + vec.z*vec.z;
}

// rvalue reference
float GetMagnitudeSquared(Vec3&& vec)
{
    return vec.x*vec.x + vec.y*vec.y + vec.z*vec.z;
}
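To tie the declaration and definition rules together, here’s a sketch of one file that declares Vec3, declares a function taking a reference to it, and only then defines the struct. The pointer overload is an extra added for this illustration to show -> next to the . operator.

```cpp
// Declaration only: the size of Vec3 isn't known yet
struct Vec3;

// OK: a reference parameter doesn't depend on the size of Vec3
float GetMagnitudeSquared(Vec3& vec);

// Definition: from here on the data members are known
struct Vec3
{
    float x;
    float y;
    float z;
};

// Now that Vec3 is defined, the function body can use its data members
float GetMagnitudeSquared(Vec3& vec)
{
    return vec.x*vec.x + vec.y*vec.y + vec.z*vec.z;
}

// Same computation through a pointer, using -> instead of .
float GetMagnitudeSquared(Vec3* p)
{
    return p->x*p->x + p->y*p->y + p->z*p->z;
}
```

Only the function bodies need the full definition. The declaration of GetMagnitudeSquared compiles fine while Vec3 is still just a name.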
Layout
Like in C#, the data members of a struct are grouped together in memory. Exactly how they’re laid out in memory isn’t defined by the C++ Standard though. Each compiler will lay out the data members as appropriate for factors such as the CPU architecture being compiled for.
This is similar to the default struct layout in C#, which behaves as though [StructLayout(LayoutKind.Auto)]
were explicitly added. There is no [StructLayout]
attribute in C++, but compiler-specific preprocessor directives are available to gain similar levels of control.
That said, compilers virtually always lay out the data members in a predictable pattern. Each is placed sequentially in the same order as written in the source code. Padding is placed between the data members according to the alignment requirements of the data types, which vary by CPU architecture. For example:
struct Padded // Takes up 8 bytes
{
    int8_t a; // Takes up 1 byte
    // Padding of 3 bytes
    int32_t b; // Takes up 4 bytes
};
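Since the exact numbers depend on the compiler and CPU architecture, it can help to ask the compiler directly. This sketch uses sizeof, alignof, and the offsetof macro from the Standard Library; the constant names are made up for illustration.

```cpp
#include <cstddef> // offsetof, std::size_t
#include <cstdint> // int8_t, int32_t

struct Padded
{
    int8_t a;
    int32_t b;
};

// Values chosen by the compiler for the target being compiled for
constexpr std::size_t PaddedSize = sizeof(Padded);
constexpr std::size_t OffsetOfB = offsetof(Padded, b);
constexpr std::size_t AlignOfInt32 = alignof(int32_t);

// b has to start at an offset that satisfies its alignment requirement
static_assert(OffsetOfB % AlignOfInt32 == 0, "b should be aligned");

// The struct is at least as large as its data members put together
static_assert(PaddedSize >= sizeof(int8_t) + sizeof(int32_t),
              "data members shouldn't overlap");
```

On x86-64 with the mainstream compilers we’d expect PaddedSize to be 8 and OffsetOfB to be 4, matching the comments in the example above, but the static_assert checks only rely on properties that hold regardless of the padding chosen.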
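The C++ Standard does make one guarantee though, for what it calls “standard-layout” types. If two such structs start with the same sequence of data types, then those data members will be laid out the same in both. There are complex exceptions to this, but it’ll hold for most normal use cases like these. Strictly speaking, the Standard only guarantees access to the shared data members when the two structs are members of the same union. Going through a cast pointer like the following relies on how compilers behave in practice. With that caveat in mind, we can reinterpret some common struct types: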
struct Vec3
{
    float x;
    float y;
    float z;
};

struct Quat
{
    // Starts with the same three floats as Vec3
    float x;
    float y;
    float z;

    // Not in common. May be placed anywhere later in memory.
    float w;
};

// Reinterpret Vec3 as Quat
Vec3 vec;
Vec3* pVec = &vec;
Quat* pQuat = (Quat*)pVec;

// Safe to use the three starting data members because types match
pQuat->x = 1;
pQuat->y = 2;
pQuat->z = 3;

// Definitely not safe to use the last data member
// Vec3 doesn't have a fourth float
// This is undefined behavior and probably corrupts memory
pQuat->w = 4;

DebugLog(pQuat->x, pQuat->y, pQuat->z); // 1, 2, 3
DebugLog(pVec->x, pVec->y, pVec->z); // 1, 2, 3
DebugLog(vec.x, vec.y, vec.z); // 1, 2, 3
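For a more conservative take on the same idea, this sketch checks the layout assumptions at compile time and then reads the shared data members through a union, which is the form of access the Standard spells out for standard-layout structs with a common starting sequence. Unions haven’t been covered in the series yet, and VecOrQuat and ReadZThroughQuat are hypothetical names used only here.

```cpp
#include <cstddef>     // offsetof
#include <type_traits> // std::is_standard_layout

struct Vec3
{
    float x;
    float y;
    float z;
};

struct Quat
{
    float x;
    float y;
    float z;
    float w;
};

// Both structs qualify as standard-layout types
static_assert(std::is_standard_layout<Vec3>::value, "Vec3 is standard-layout");
static_assert(std::is_standard_layout<Quat>::value, "Quat is standard-layout");

// The three shared data members sit at the same offsets in both structs
static_assert(offsetof(Vec3, x) == offsetof(Quat, x), "x offsets match");
static_assert(offsetof(Vec3, y) == offsetof(Quat, y), "y offsets match");
static_assert(offsetof(Vec3, z) == offsetof(Quat, z), "z offsets match");

// A union stores either a Vec3 or a Quat in the same memory
union VecOrQuat
{
    Vec3 vec;
    Quat quat;
};

// Hypothetical helper: writes through vec, reads the shared z through quat
float ReadZThroughQuat(float x, float y, float z)
{
    VecOrQuat u{}; // vec is the member in use
    u.vec.x = x;
    u.vec.y = y;
    u.vec.z = z;
    return u.quat.z; // OK: z is part of the common starting sequence
}
```

Reading u.quat.w here would still be off limits, for the same reason as in the pointer version: it isn’t part of what the two structs have in common.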
Bit Fields
In C#, we can manually create bit fields but C++ supports them natively for all integer data members including bool. This allows us to specify how many bits of memory a data member occupies:
struct Player
{
    bool IsAlive : 1;
    uint8_t Lives : 3;
    uint8_t Team : 2;
    uint8_t WeaponID : 2;
};
This struct takes up just one byte of memory because the sum of its bit fields’ sizes is 8. Normally it would have taken up 4 bytes since each data member would take up a whole byte of its own.
We can access these data members just like normal:
Player p;
p.IsAlive = true;
p.Lives = 5;
p.Team = 2;
p.WeaponID = 1;
DebugLog(p.IsAlive, p.Lives, p.Team, p.WeaponID); // true, 5, 2, 1
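One thing to keep in mind is that each bit field can only hold as many values as its bits allow. Here’s a minimal sketch, with StoreLives being a hypothetical helper written for this illustration:

```cpp
#include <cstdint> // uint8_t

struct Player
{
    bool IsAlive : 1;
    uint8_t Lives : 3;    // Three bits: holds 0 through 7
    uint8_t Team : 2;     // Two bits: holds 0 through 3
    uint8_t WeaponID : 2; // Two bits: holds 0 through 3
};

// Hypothetical helper: stores a value in the 3-bit Lives field
// and returns whatever the field ended up holding
int StoreLives(int lives)
{
    Player p{};
    p.Lives = lives; // Implicit narrowing conversion; compilers may warn here
    return p.Lives;
}
```

Values from 0 to 7 come back unchanged. A larger value such as 9 can’t fit, and on the mainstream compilers the extra high bits are simply dropped, so it’s worth sizing bit fields for the largest value they’ll need to hold.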
The compiler will, as always, generate CPU instructions specific to the architecture being compiled for and depending on settings such as optimization level. Generally though, the instructions will read one or more bytes containing the desired bits, use a bit mask to remove the other bits that were read, and shift the desired bits to the least-significant part of the data member’s type. Writing to a bit field is a similar process.
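To make that concrete, here’s roughly what the manual version looks like, similar to what we’d write by hand in C#. This is only a sketch: the position of Lives within the byte is an assumption made up for the example, since the real bit positions are chosen by the compiler, and GetLives and SetLives are hypothetical names.

```cpp
#include <cstdint> // uint8_t

// Assumed layout for illustration: Lives occupies bits 1 through 3 of a byte
constexpr uint8_t LivesShift = 1;
constexpr uint8_t LivesMask = 0x7; // Three bits: 0b111

// Read: shift the desired bits down to the least-significant end,
// then mask away everything else
constexpr uint8_t GetLives(uint8_t packed)
{
    return static_cast<uint8_t>((packed >> LivesShift) & LivesMask);
}

// Write: clear the old bits, then OR in the new bits at the right position
constexpr uint8_t SetLives(uint8_t packed, uint8_t lives)
{
    return static_cast<uint8_t>(
        (packed & ~(LivesMask << LivesShift)) |
        ((lives & LivesMask) << LivesShift));
}
```

This version shifts before masking rather than masking before shifting, but the result is the same. With native bit fields, all of this collapses to p.Lives and p.Lives = 5.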
As of C++20, bit fields may be initialized in the struct definition just like other data members:
struct Player
{
    bool IsAlive : 1 = true;
    uint8_t Lives : 3 {5};
    uint8_t Team : 2 {2};
    uint8_t WeaponID : 2 = 1;
};

Player p;
DebugLog(p.IsAlive, p.Lives, p.Team, p.WeaponID); // true, 5, 2, 1
Note that the size of a bit field may be larger than the stated type:
struct SixtyFourKilobits
{
    uint8_t Val : 64*1024;
};
The size of Val and the struct itself is 64 kilobits, but Val is still used just like an 8-bit integer.
Bit fields may also be unnamed:
struct FirstLast
{
    uint8_t First : 1; // First bit of the byte
    uint8_t : 6; // Skip six bits
    uint8_t Last : 1; // Last bit of the byte
};
Unnamed bit fields can also have zero size, which tells the compiler to put the next data member on the next byte it aligns to:
struct FirstBitOfTwoBytes
{
    uint8_t Byte1 : 1; // First bit of the first byte
    uint8_t : 0; // Skip to the next byte
    uint8_t Byte2 : 1; // First bit of the second byte
};
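We can see the effect of the zero-size bit field by comparing struct sizes. This is a sketch with made-up constant names, and the exact sizes are up to the compiler:

```cpp
#include <cstddef> // std::size_t
#include <cstdint> // uint8_t

struct FirstLast
{
    uint8_t First : 1;
    uint8_t : 6; // Skip six bits within the same byte
    uint8_t Last : 1;
};

struct FirstBitOfTwoBytes
{
    uint8_t Byte1 : 1;
    uint8_t : 0; // Skip to the next byte
    uint8_t Byte2 : 1;
};

// Sizes as chosen by the compiler for the current target
constexpr std::size_t FirstLastSize = sizeof(FirstLast);
constexpr std::size_t FirstBitOfTwoBytesSize = sizeof(FirstBitOfTwoBytes);

// Byte1 and Byte2 can't share a byte, so one byte isn't enough
static_assert(FirstBitOfTwoBytesSize >= 2, "Byte2 should start a new byte");
```

With the mainstream compilers we’d expect FirstLast to fit in one byte and FirstBitOfTwoBytes to take two, even though both structs only name two bits.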
Finally, since bit fields don’t necessarily start at the beginning of a byte we can’t take their memory address:
FirstBitOfTwoBytes x;

// Compiler error: can't take the address of a bit field
uint8_t* p = &x.Byte1;
Static Data Members
Like static fields in C#, data members may be static in C++:
struct Player
{
    int32_t Score;
    static int32_t HighScore;
};
The meaning is the same as in C#. Each Player object doesn’t have a HighScore but rather there is one HighScore for all Player objects. Because it’s bound to the struct type, not an instance of the struct, we use the scope resolution operator (::) as we did with scoped enumerations to access the data member:
Player::HighScore = 0;
What we put inside the struct definition is actually just a declaration of a variable, so we still need to define it outside the struct:
struct Player
{
    int32_t Score;
    static int32_t HighScore; // Declaration
};

// Definition
int32_t Player::HighScore;

// Incorrect definition
// This just creates a new HighScore variable
// We need the "Player::" part to refer to the declaration
int32_t HighScore;
This also gives us an opportunity to initialize the variable:
int32_t Player::HighScore = 0;
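Putting the declaration, definition, and initialization together, here’s a sketch of how the one shared HighScore behaves across several Player objects. SubmitScore is a hypothetical helper invented for this example.

```cpp
#include <cstdint> // int32_t

struct Player
{
    int32_t Score;
    static int32_t HighScore; // Declaration
};

// Definition with initialization
int32_t Player::HighScore = 0;

// Hypothetical helper: raises the shared high score if this player beat it
void SubmitScore(Player& player)
{
    if (player.Score > Player::HighScore)
    {
        Player::HighScore = player.Score;
    }
}
```

If one player submits 10 and another then submits 5, Player::HighScore stays at 10 because both calls read and write the same variable.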
Because the static data member inside the struct definition is just a declaration, it can use other types that haven’t yet been defined as long as they’re defined by the time we define the static data member:
// Declaration
struct Vec3;

struct Player
{
    int32_t Health;

    // Declaration
    static Vec3 Fastest;
};

// Definition
struct Vec3
{
    float x;
    float y;
    float z;
};

// Definition
Vec3 Player::Fastest;
If the static data member is const and has an integer or enumeration type, we can initialize it inline. We’ll go over const later in the series, but for now it’s similar to readonly in C#.
struct Player
{
    int32_t Health;
    const static int32_t MaxHealth = 100;
};
We’re still allowed to put the definition outside the struct, but it’s optional to do so. If we do, we can only put the initialization in one of the two places:
// Option 1: initialize in the struct definition
struct Player
{
    int32_t Health;
    const static int32_t MaxHealth = 100;
};
const int32_t Player::MaxHealth;

// Option 2: initialize outside the struct definition
struct Player
{
    int32_t Health;
    const static int32_t MaxHealth;
};
const int32_t Player::MaxHealth = 100;

// Compiler error if initializing in both places
struct Player
{
    int32_t Health;
    const static int32_t MaxHealth = 100;
};
const int32_t Player::MaxHealth = 100;
As of C++17, static data members may also be inline, much like with global variables:
struct Player
{
    int32_t Health;
    inline static int32_t MaxHealth = 100;
};
In this case, we can’t put a definition outside of the struct:
struct Player
{
    int32_t Health;
    inline static int32_t MaxHealth = 100;
};

// Compiler error: can't define outside the struct
int32_t Player::MaxHealth;
Lastly, static data members can’t be bit fields. This would make no sense since they’re not part of instances of the struct and aren’t even necessarily located together in memory with other static data members of the struct:
struct Flags
{
    // All of these are compiler errors
    // Static data members can't be bit fields
    static bool IsStarted : 1;
    static bool WonGame : 1;
    static bool GotHighScore : 1;
    static bool FoundSecret : 1;
    static bool PlayedMultiplayer : 1;
    static bool IsLoggedIn : 1;
    static bool RatedGame : 1;
    static bool RanBenchmark : 1;
};
To work around this, make a struct with non-static bit fields and another struct with a static instance of the first struct:
struct FlagBits
{
    bool IsStarted : 1;
    bool WonGame : 1;
    bool GotHighScore : 1;
    bool FoundSecret : 1;
    bool PlayedMultiplayer : 1;
    bool IsLoggedIn : 1;
    bool RatedGame : 1;
    bool RanBenchmark : 1;
};

struct Flags
{
    static FlagBits Bits;
};

FlagBits Flags::Bits;

Flags::Bits.WonGame = true;
Disallowed Data Members
C++ forbids using some kinds of data members in structs. First, auto is not allowed for the data type:
struct Bad
{
    // Compiler error: auto isn't allowed even if we initialize it inline
    auto Val = 123;
};
An exception to this rule is when the data member is both static and const:
struct Good
{
    // OK since data member is static and const
    static const auto Val = 123;
};
Next, while register was only deprecated for other kinds of variables (and then removed from the language in C++17), it has always been illegal for data members:
struct Bad
{
    // Compiler error: data members can't be register variables
    register int Val = 123;
};
This is also true for other storage class specifiers like extern:
struct Bad
{
    // Compiler error: data members can't be extern variables
    extern int Val = 123;
};
The entire struct can be declared with either storage class specifier instead:
struct Good
{
    uint8_t Val;
};

// register only applies to local variables and parameters, and only before C++17
register Good r;
extern Good e;
While we saw above that declared types that aren’t yet defined can be used for static data members, this is not the case for non-static data members:
struct Vec3;

struct Bad
{
    // Compiler error: Vec3 isn't defined yet
    Vec3 Pos;
};
As with other variables of types that are declared but not yet defined, we are allowed to have pointers and references:
struct Vec3;

struct Good
{
    // OK to have a pointer to a type that's declared but not yet defined
    Vec3* PosPointer;

    // OK to have an lvalue reference to a type that's declared but not yet defined
    Vec3& PosLvalueReference;

    // OK to have an rvalue reference to a type that's declared but not yet defined
    Vec3&& PosRvalueReference;
};
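Pointer data members to not-yet-defined types are what make linked structures possible. Here’s a sketch where a hypothetical Waypoint struct points at a Vec3 that’s defined later and also at another Waypoint, whose own definition isn’t finished at that point. Waypoint and CountWaypoints are names invented for this example.

```cpp
// Declaration only
struct Vec3;

struct Waypoint
{
    // Pointer to a type that's declared but not yet defined
    Vec3* Pos;

    // Pointer to the struct's own type, which is still being defined here
    Waypoint* Next;
};

// Definition
struct Vec3
{
    float x;
    float y;
    float z;
};

// Hypothetical helper: counts waypoints by following the Next pointers
int CountWaypoints(Waypoint* first)
{
    int count = 0;
    for (Waypoint* cur = first; cur != nullptr; cur = cur->Next)
    {
        count++;
    }
    return count;
}
```

A Waypoint couldn’t contain another Waypoint directly since its size would be unbounded, but a pointer has a fixed size no matter what it points to.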
Nested Types
C++ allows us to nest types within structs just like we can in C#. Let’s start with a scoped enumeration:
struct Character
{
    enum struct Type
    {
        Player,
        NonPlayer
    };

    Type Type;
};

Character c;
c.Type = Character::Type::Player;
Note how we use Character::Type to refer to Type within Character and then ::Player to refer to an enumerator within Type.
Also note how we can have both a Type enumeration and a Type data member. The two are disambiguated by the operator used to access the content of the struct:
Character c;
Character* p = &c;
Character& r = c;

// . operator means "access data member"
auto t = c.Type;
t = r.Type;

// -> operator means "dereference pointer then access data member"
t = p->Type;

// :: operator means "get something scoped to the type"
Character::Type t2;
Ambiguity arises if the data member is static and has the same name as a nested type:
struct Character
{
    enum struct Type
    {
        Player,
        NonPlayer
    };

    static Type Type;
};

// Compiler error: Character::Type is ambiguous
// It could be either the scoped enumeration or the static data member
Character::Type Character::Type = Character::Type::Player;
We can also nest unscoped enumerations:
struct Character
{
    enum Type
    {
        Player,
        NonPlayer
    };

    Type Type;
} c;

// Optionally specify the unscoped enumeration type name
c.Type = Character::Type::Player;

// Or don't specify it
// Enumerators are added to the surrounding scope: the struct
c.Type = Character::Player;
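Here’s a sketch of using a nested enumeration from code outside the struct. To sidestep any name clashes, this version gives the enumeration and the data member different names. Kind, Health, and IsPlayer are all made up for the example.

```cpp
struct Character
{
    // Nested scoped enumeration
    enum struct Kind
    {
        Player,
        NonPlayer
    };

    Kind Type;
    int Health;
};

// Hypothetical helper: outside the struct we need the full Character::Kind name
bool IsPlayer(Character& c)
{
    return c.Type == Character::Kind::Player;
}
```

The . operator reaches the data member of a particular Character while :: reaches the enumeration and its enumerators, which belong to the struct type itself.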
Finally, we can nest structs within structs. As with enumerations, this can be used to contextualize them such as to clean up our Flags example above:
struct Flags
{
    struct FlagBits
    {
        bool IsStarted : 1;
        bool WonGame : 1;
        bool GotHighScore : 1;
        bool FoundSecret : 1;
        bool PlayedMultiplayer : 1;
        bool IsLoggedIn : 1;
        bool RatedGame : 1;
        bool RanBenchmark : 1;
    };

    static FlagBits Bits;
};

Flags::FlagBits Flags::Bits;
We can combine this with anonymous structs to eliminate some of the verbosity. If we do, we’ll need to use decltype in order to state the type of the static variable when we define it outside the struct since we didn’t give it an explicit name:
struct Flags
{
    // Unnamed struct with bit fields
    // The data member Bits is static
    static struct
    {
        bool IsStarted : 1;
        bool WonGame : 1;
        bool GotHighScore : 1;
        bool FoundSecret : 1;
        bool PlayedMultiplayer : 1;
        bool IsLoggedIn : 1;
        bool RatedGame : 1;
        bool RanBenchmark : 1;
    } Bits;
};

// The unnamed struct has no name we can just type
// Use decltype to refer to its type
decltype(Flags::Bits) Flags::Bits;

Flags::Bits.WonGame = true;
Of course we can continue to nest structs infinitely within other structs, but it’s generally a good idea to keep it to two or three levels and avoid resorting to anything like this:
struct S1
{
    struct S2
    {
        struct S3
        {
            struct S4
            {
                struct S5
                {
                    uint8_t Val;
                };
            };
        };
    };
};

S1::S2::S3::S4::S5 s;
s.Val = 123;
Conclusion
We’re only just scratching the surface of C++ structs and already they have quite a few more advanced features than their C# counterparts:
Feature | Example |
---|---|
Split declaration and definition | struct S; struct S{}; |
Inline data member initializers | struct S {int X=1; int Y=2;}; |
Bit fields | struct S {bool a:1; bool :6; bool b:1;}; |
Immediate variables | struct S {} s; |
Anonymous structs | struct {float X; float Y;} pos2; |
References to structs | struct S {} s; S& lr = s; S&& rr = S(); |
Automatic data member typing | struct S {static const auto X=1;}; |
Shared nested type and data member name | struct S {enum E{}; E E;}; |
Stay tuned for so, so many more features of C++ structs!
#1 by Domen on September 13th, 2020 ·
Hey, thanks for another article. :)
There is just a typo here: to keep it to two are three levels
#2 by jackson on September 14th, 2020 ·
You’re welcome! Thanks for pointing out the typo; I’ve updated the article with a fix.
#3 by Domen on September 21st, 2020 ·
Maybe here it would just be worth mentioning that C++ structs and classes are fundamentally the same, whereas C# struct and class fundamentally behave differently.
Maybe this is more of a C# topic, but this could also be a separate article, where you could explain the differences between C++ and C# struct/class.
#4 by jackson on September 21st, 2020 ·
The differences between structs and classes are covered in Part 15 along with access specifiers.
#5 by Alexander on January 23rd, 2021 ·
Hi, Jackson. Thank you for the series, I’m enjoying it quite a bit so far :)
I think there’s a mistake in the Bit Fields section. The following example states that the size of the field “Val” is 64KB while it’s 64Kb, or 8KB. It’s not important to the example, but one may get confused whether the size is specified in bit or bytes.
#6 by jackson on January 24th, 2021 ·
Hi Alexander. I’m glad you’re enjoying the series. Thanks for pointing out the issue in this article. I’ve updated it with a fix. Please let me know if you spot any other issues.
#7 by Mathias Sønderskov on January 1st, 2023 ·
In the nameless you write
struct Vec3
{
float x;
float y;
float z;
} v1, v2, v3;
Shouldn’t you omit the “Vec3” in first line here?
#8 by jackson on January 10th, 2023 ·
No, this gives the struct type a name which you can use later like this:
Vec3 v4;