C++ For C# Developers: Part 2 – Primitive Types and Literals
The series continues today with our first actual C++ code! Today we’ll start with the absolute fundamentals—primitive types and literals—on which we’ll build through the rest of the series. As basic as this topic sounds, some of it can be pretty shocking when coming from a language like C#.
Table of Contents
- Part 1: Introduction
- Part 2: Primitive Types and Literals
- Part 3: Variables and Initialization
- Part 4: Functions
- Part 5: Build Model
- Part 6: Control Flow
- Part 7: Pointers, Arrays, and Strings
- Part 8: References
- Part 9: Enumerations
- Part 10: Struct Basics
- Part 11: Struct Functions
- Part 12: Constructors and Destructors
- Part 13: Initialization
- Part 14: Inheritance
- Part 15: Struct and Class Permissions
- Part 16: Struct and Class Wrapup
- Part 17: Namespaces
- Part 18: Exceptions
- Part 19: Dynamic Allocation
- Part 20: Implicit Type Conversion
- Part 21: Casting and RTTI
- Part 22: Lambdas
- Part 23: Compile-Time Programming
- Part 24: Preprocessor
- Part 25: Intro to Templates
- Part 26: Template Parameters
- Part 27: Template Deduction and Specialization
- Part 28: Variadic Templates
- Part 29: Template Constraints
- Part 30: Type Aliases
- Part 31: Deconstructing and Attributes
- Part 32: Thread-Local Storage and Volatile
- Part 33: Alignment, Assembly, and Language Linkage
- Part 34: Fold Expressions and Elaborated Type Specifiers
- Part 35: Modules, The New Build Model
- Part 36: Coroutines
- Part 37: Missing Language Features
- Part 38: C Standard Library
- Part 39: Language Support Library
- Part 40: Utilities Library
- Part 41: System Integration Library
- Part 42: Numbers Library
- Part 43: Threading Library
- Part 44: Strings Library
- Part 45: Array Containers Library
- Part 46: Other Containers Library
- Part 47: Containers Library Wrapup
- Part 48: Algorithms Library
- Part 49: Ranges and Parallel Algorithms
- Part 50: I/O Library
- Part 51: Missing Library Features
- Part 52: Idioms and Best Practices
- Part 53: Conclusion
Types
Let’s start with integers, which are surprising in two ways: how loosely defined they are and how many types there are. The type name itself is made up of one or more parts:
Part | Meaning |
---|---|
signed, unsigned, or none | Whether the type is signed. None means signed. |
short, long, long long, or none | Size classification of the integer. Not an exact size! None means int. |
int or none | Explicitly state that this is an integer. None states this implicitly. |
Here are all 23 valid combinations (three choices times four times two, minus the empty one), including the sizes in bits on common platforms:
C# Type | C++ Type | Windows Size | Unix Size |
---|---|---|---|
short | short | 16 | 16 |
short | short int | 16 | 16 |
short | signed short | 16 | 16 |
short | signed short int | 16 | 16 |
ushort | unsigned short | 16 | 16 |
ushort | unsigned short int | 16 | 16 |
int | int | 32 | 32 |
int | signed | 32 | 32 |
int | signed int | 32 | 32 |
uint | unsigned | 32 | 32 |
uint | unsigned int | 32 | 32 |
N/A | long | 32 | 64 |
N/A | long int | 32 | 64 |
N/A | signed long | 32 | 64 |
N/A | signed long int | 32 | 64 |
N/A | unsigned long | 32 | 64 |
N/A | unsigned long int | 32 | 64 |
long | long long | 64 | 64 |
long | long long int | 64 | 64 |
long | signed long long | 64 | 64 |
long | signed long long int | 64 | 64 |
ulong | unsigned long long | 64 | 64 |
ulong | unsigned long long int | 64 | 64 |
There is also a type called size_t, which is either a 32-bit or 64-bit unsigned integer, depending on the CPU being compiled for.
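To see how loosely these sizes are pinned down, here is a minimal sketch that checks only what every conforming compiler promises. The helper name BitsOf is made up for this example and is not part of the Standard Library.

```cpp
#include <climits>      // CHAR_BIT
#include <cstddef>      // std::size_t
#include <type_traits>  // std::is_same, std::is_unsigned

// Hypothetical helper (not a standard function): width of T in bits
// for the compiler and target this file is built with.
template <typename T>
constexpr std::size_t BitsOf()
{
    return sizeof(T) * CHAR_BIT;
}

// Different spellings from the table name the very same type.
static_assert(std::is_same<short, signed short int>::value, "same type");
static_assert(std::is_same<unsigned, unsigned int>::value, "same type");
static_assert(std::is_same<long long, signed long long int>::value, "same type");

// The language only guarantees minimum widths...
static_assert(BitsOf<short>() >= 16, "short is at least 16 bits");
static_assert(BitsOf<int>() >= 16, "int is at least 16 bits");
static_assert(BitsOf<long>() >= 32, "long is at least 32 bits");
static_assert(BitsOf<long long>() >= 64, "long long is at least 64 bits");

// ...and that a "bigger" classification is never smaller.
static_assert(sizeof(short) <= sizeof(int), "short is no bigger than int");
static_assert(sizeof(int) <= sizeof(long), "int is no bigger than long");
static_assert(sizeof(long) <= sizeof(long long), "long is no bigger than long long");

// size_t is always unsigned. Its width depends on the target.
static_assert(std::is_unsigned<std::size_t>::value, "size_t is unsigned");
```

Printing BitsOf<long>() at runtime is a quick way to confirm which column of the table a given compiler falls into.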
There are four 8-bit types. For the char types, signedness depends on the CPU:
C# Type | C++ Type | Signedness on x86 and x64 | Signedness on ARM |
---|---|---|---|
bool | bool | N/A | N/A |
sbyte | char | Signed | Unsigned |
sbyte | signed char | Signed | Signed |
byte | unsigned char | Unsigned | Unsigned |
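Because plain char changes signedness between CPUs, it can help to ask the compiler rather than assume. This is a small sketch using only standard type traits. The constant name kPlainCharIsSigned is invented for illustration.

```cpp
#include <limits>       // std::numeric_limits
#include <type_traits>  // std::is_same, std::is_signed, std::is_unsigned

// True if plain char is signed for this compiler and target.
// Typically true on x86 and x64 and false on ARM, but it is a compiler choice.
constexpr bool kPlainCharIsSigned = std::numeric_limits<char>::is_signed;

// char, signed char, and unsigned char are three distinct types,
// even though char always behaves like one of the other two.
static_assert(!std::is_same<char, signed char>::value, "distinct types");
static_assert(!std::is_same<char, unsigned char>::value, "distinct types");

// The explicitly-signed spellings are the same everywhere.
static_assert(std::is_signed<signed char>::value, "always signed");
static_assert(std::is_unsigned<unsigned char>::value, "always unsigned");

// All three occupy exactly one byte by definition.
static_assert(sizeof(char) == 1, "one byte");
static_assert(sizeof(signed char) == 1, "one byte");
static_assert(sizeof(unsigned char) == 1, "one byte");
```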
The types named with char are due to their original usage for characters in ASCII strings. There are also larger character types:
C# Type | C++ Type | Windows Size | Unix Size |
---|---|---|---|
N/A | char8_t | 8 | 8 |
N/A | char16_t | 16 | 16 |
N/A | char32_t | 32 | 32 |
N/A | wchar_t | 16 | 32 |
Next we have floating-point types, including a super high precision long double type:
C# Type | C++ Type | x86 Size | ARM Size |
---|---|---|---|
float | float | 32 | 32 |
double | double | 64 | 64 |
N/A | long double | 80 | 128 |
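The sizes in this table are typical rather than guaranteed. As a rough sketch, std::numeric_limits reports how many mantissa digits each type really has on the current compiler. The constant names below are made up for this example.

```cpp
#include <limits>  // std::numeric_limits

// Number of mantissa digits (in the type's radix) for each type.
// On IEEE 754 platforms float reports 24 and double reports 53.
// long double varies by compiler and CPU.
constexpr int kFloatDigits = std::numeric_limits<float>::digits;
constexpr int kDoubleDigits = std::numeric_limits<double>::digits;
constexpr int kLongDoubleDigits = std::numeric_limits<long double>::digits;

// What the language promises: each step up is at least as precise.
static_assert(kFloatDigits <= kDoubleDigits,
              "double is at least as precise as float");
static_assert(kDoubleDigits <= kLongDoubleDigits,
              "long double is at least as precise as double");
```

If kLongDoubleDigits equals kDoubleDigits, long double offers no extra precision on that compiler.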
There is no decimal type in C++, but libraries such as GMP provide similar functionality.
Given the uncertainty of size across CPU and OS, it’s a best practice to avoid many of these types and instead use types that have specific sizes. These are found in the Standard Library or in game engine APIs. Here’s how much simpler that makes everything:
Meaning | C# Type | C++ Type | Unreal Type |
---|---|---|---|
Boolean | bool | bool | bool |
8-bit signed integer | sbyte | int8_t | int8 |
8-bit unsigned integer | byte | uint8_t | uint8 |
16-bit signed integer | short | int16_t | int16 |
16-bit unsigned integer | ushort | uint16_t | uint16 |
8-bit character | N/A | char8_t | CHAR8 |
16-bit character | char | char16_t | CHAR16 |
32-bit character | N/A | char32_t | CHAR32 |
32-bit signed integer | int | int32_t | int32 |
32-bit unsigned integer | uint | uint32_t | uint32 |
64-bit signed integer | long | int64_t | int64 |
64-bit unsigned integer | ulong | uint64_t | uint64 |
32-bit floating point number | float | float | float |
64-bit floating point number | double | double | double |
128-bit floating point number | decimal | N/A | N/A |
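In the Standard Library, the fixed-size integer types live in the <cstdint> header. Here is a minimal sketch of using them. The PackRgba function is a hypothetical example, not something from a real API.

```cpp
#include <climits>  // CHAR_BIT
#include <cstdint>  // std::int8_t, std::uint8_t, std::uint32_t, ...

// Unlike int or long, these widths are exact wherever the types exist,
// which includes every mainstream desktop, console, and mobile compiler.
static_assert(sizeof(std::int8_t) * CHAR_BIT == 8, "exactly 8 bits");
static_assert(sizeof(std::uint16_t) * CHAR_BIT == 16, "exactly 16 bits");
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32, "exactly 32 bits");
static_assert(sizeof(std::uint64_t) * CHAR_BIT == 64, "exactly 64 bits");

// Hypothetical example: pack four 8-bit channels into one 32-bit value.
// Exact sizes make the bit layout identical on every platform.
constexpr std::uint32_t PackRgba(
    std::uint8_t r, std::uint8_t g, std::uint8_t b, std::uint8_t a)
{
    return (static_cast<std::uint32_t>(r) << 24) |
           (static_cast<std::uint32_t>(g) << 16) |
           (static_cast<std::uint32_t>(b) << 8) |
           static_cast<std::uint32_t>(a);
}
```

Calling PackRgba(0x12, 0x34, 0x56, 0x78) yields 0x12345678 regardless of how wide int or long happen to be.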
Literals
Now that we know all these types, let’s express them by writing some literals. First, and most obviously, booleans:
Literal | Type | Value |
---|---|---|
true | bool | 1 |
false | bool | 0 |
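The Value column matters more than it would in C#, because a C++ bool converts to an integer without complaint. The sketch below shows this. CountSet is a made-up helper name.

```cpp
// Converting a bool to int yields exactly the values in the table.
static_assert(static_cast<int>(true) == 1, "true converts to 1");
static_assert(static_cast<int>(false) == 0, "false converts to 0");

// Hypothetical helper: counts how many flags are set by letting
// each bool convert to int in the addition. C# would reject this.
constexpr int CountSet(bool a, bool b, bool c)
{
    return a + b + c;
}
```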
Next are integers. They are written in four parts:
Part | Meaning |
---|---|
0x, 0X, 0, 0b, 0B, or none | The chosen base: hexadecimal (0x or 0X), octal (0), or binary (0b or 0B). None means decimal. |
0123456789abcdefABCDEF', 01234567', or 01' | Digits of the chosen base. ' characters are optional separators like _ in C#. |
u, U, or none | Whether the integer is unsigned. None always means signed for decimal. For hexadecimal, octal, and binary, none means signed if the value fits in the signed type of a size, otherwise unsigned of that size. |
l, L, ll, LL, or none | The size classification. None means “the smallest size that can fit the value”, going from int to long and then to long long. Note: this part may come before or after u or U, if specified. |
Here are some examples:
Literal | Type | Base | Signed | Size |
---|---|---|---|---|
123 | int | Decimal (default) | Signed (default) | int (default) |
5000000000 | long (long long on Windows) | Decimal (default) | Signed (default) | long (default) |
123u | unsigned int | Decimal (default) | Unsigned (explicit) | int (default) |
123ul | unsigned long | Decimal (default) | Unsigned (explicit) | long (explicit) |
123lu | unsigned long | Decimal (default) | Unsigned (explicit) | long (explicit) |
0x123456 | int | Hexadecimal (explicit) | Signed (default) | int (default) |
0xffffffff | unsigned int | Hexadecimal (explicit) | Unsigned (default) | int (default) |
0xffffffffff | long (long long on Windows) | Hexadecimal (explicit) | Signed (default) | long (default) |
0xFFFFFFFFll | long long | Hexadecimal (explicit) | Signed (default) | long long (explicit) |
0b10101010'01010101'10101010'01010101 | unsigned int | Binary (explicit) | Unsigned (default) | int (default) |
0123 | int | Octal (explicit) | Signed (default) | int (default) |
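One way to double-check a row of this table is to ask the compiler for the literal's type with decltype. This is a sketch that assumes a C++14 or newer compiler (for binary literals and ' separators). The constant names are invented for the example.

```cpp
#include <type_traits>  // std::is_same

// Types that follow from the suffix alone, on every platform.
static_assert(std::is_same<decltype(123), int>::value, "no suffix");
static_assert(std::is_same<decltype(123u), unsigned int>::value, "u suffix");
static_assert(std::is_same<decltype(123ul), unsigned long>::value, "ul suffix");
static_assert(std::is_same<decltype(123lu), unsigned long>::value, "lu suffix");
static_assert(std::is_same<decltype(0xFFFFFFFFll), long long>::value, "ll suffix");
static_assert(std::is_same<decltype(0123), int>::value, "octal, no suffix");

// Too big for a 32-bit int: the type depends on the platform
// (long or long long), but it is always at least 64 bits wide.
static_assert(sizeof(5000000000) >= 8, "needs 64 bits");

// Where int is 32 bits, this hexadecimal literal quietly becomes unsigned.
static_assert(sizeof(int) != 4 ||
              std::is_same<decltype(0xffffffff), unsigned int>::value,
              "hexadecimal falls back to unsigned");

// The same values written in different bases.
constexpr int kOctal = 0123;         // 1*64 + 2*8 + 3
constexpr int kBinary = 0b1010;      // 8 + 2
constexpr int kHex = 0x1F;           // 16 + 15
constexpr int kMillion = 1'000'000;  // separators do not change the value
```

The leading 0 in 0123 is the classic trap here: it is 83, not 123.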
Next up are floating point literals, which are also written in four parts:
Part | Meaning |
---|---|
0x, 0X, or none | The chosen base: hexadecimal. None means decimal. |
0123456789abcdefABCDEF.' | Digits of the chosen base. ' characters are optional separators like _ in C#. May end in . for whole numbers. |
e, E, p, or P, then an optional + or -, then digits from 0123456789, or none | Exponent x. Use e for decimal, which multiplies the digits by 10^x, and p for hexadecimal, which multiplies the digits by 2^x. Always required for hexadecimal and required for decimal if there’s no . in the digits. |
f, F, l, L, or none | Size classification of float (f) or long double (l). None means double. |
Here are some example floating point literals:
Literal | Type | Base |
---|---|---|
12.34 | double | Decimal |
12.34f | float | Decimal |
12.34F | float | Decimal |
12.34e2 | double | Decimal |
12.34e-2 | double | Decimal |
12.34e-2f | float | Decimal |
12.e1 | double | Decimal |
12'34.56'78f | float | Decimal |
0x12p2 | double | Hexadecimal |
0x12.p2 | double | Hexadecimal |
0x12'34'56.78p2f | float | Hexadecimal |
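The same decltype trick works for floating point literals. This sketch assumes a C++17 compiler, since hexadecimal floating point literals were only standardized then. The constant names are made up for illustration, and every value was chosen to be exactly representable so the comparisons are safe.

```cpp
#include <type_traits>  // std::is_same

// The suffix picks the type.
static_assert(std::is_same<decltype(12.34), double>::value, "no suffix");
static_assert(std::is_same<decltype(12.34f), float>::value, "f suffix");
static_assert(std::is_same<decltype(12.34F), float>::value, "F suffix");
static_assert(std::is_same<decltype(12.34L), long double>::value, "L suffix");
static_assert(std::is_same<decltype(0x12p2), double>::value, "hex, no suffix");

// Decimal exponents scale by powers of 10.
constexpr double kDecimalExp = 12.e1;     // 12 * 10^1
constexpr double kNegativeExp = 25e-2;    // 25 * 10^-2

// Hexadecimal exponents scale by powers of 2, not 16.
constexpr double kHexExp = 0x12p2;        // 18 * 2^2
constexpr double kHexFraction = 0x1.8p1;  // 1.5 * 2^1

// Separators are ignored, just like in integer literals.
constexpr double kSeparated = 1'000.5;
```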
Finally, we have character literals which take several forms:
Form | Meaning |
---|---|
'c' | char type if c fits, otherwise int type, with character c |
u8'c' | char8_t type with UTF-8 character c |
u'c' | char16_t type with UTF-16 character c |
U'c' | char32_t type with UTF-32 character c |
L'c' | wchar_t type with character c |
'abc' | int type representing multiple characters abc |
Characters can be anything in their set (e.g. UTF-8) except ', \, and the newline character. To get those, and other special characters, use an escape sequence:
Meaning | Escape Sequence | Note | Example |
---|---|---|---|
Single quote | \' | | |
Double quote | \" | | |
Question mark | \? | | |
Backslash | \\ | | |
Bell | \a | | |
Backspace | \b | | |
Form feed | \f | | |
Line feed | \n | | |
Carriage return | \r | | |
Tab | \t | | |
Vertical tab | \v | | |
Octal value | \ABC | ABC is the octal value, one to three digits | \0 is NUL |
Hexadecimal value | \xAB | AB is the hexadecimal value | \x41 is A |
16-bit Unicode code point | \uABCD | ABCD is the code point in hexadecimal | \u03b1 is α |
32-bit Unicode code point | \UABCDEFGH | ABCDEFGH is the code point in hexadecimal | \U0001F389 is 🎉 |
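A few of these escapes can be checked at compile time. The sketch below assumes an ASCII-compatible character set, which holds for all mainstream compilers but is not required by the language. The constant names are hypothetical.

```cpp
// Three ways to spell the same character (assumes ASCII: 'A' is 65).
constexpr char kPlain = 'A';
constexpr char kHexEscape = '\x41';    // hexadecimal 41 is 65
constexpr char kOctalEscape = '\101';  // octal 101 is 64 + 1 = 65

// Characters that cannot appear unescaped between single quotes.
constexpr char kSingleQuote = '\'';
constexpr char kBackslash = '\\';
constexpr char kLineFeed = '\n';

// The shortest octal escape is the NUL character.
constexpr char kNul = '\0';

// Unicode escapes name a code point in hexadecimal.
constexpr char16_t kAlpha = u'\u03b1';      // Greek small letter alpha
constexpr char32_t kParty = U'\U0001F389';  // party popper emoji
```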
Here are some example character literals:
Literal | Type | Decimal Value |
---|---|---|
'A' | char | 65 |
'?' | char | 63 |
u8'A' | char8_t | 65 |
u'α' | char16_t | 945 |
U'\x1f389' | char32_t | 127881 |
'ab' | int | Implementation-defined (commonly 24930) |
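As with the other literals, the prefix decides the type, and decltype can confirm it. This is a minimal sketch that again assumes an ASCII-compatible character set. It leaves out u8'A' because its type changed from char to char8_t in C++20, and it leaves out 'ab' because many compilers warn about multi-character literals. The constant names are made up.

```cpp
#include <type_traits>  // std::is_same

// The prefix picks the character type.
static_assert(std::is_same<decltype('A'), char>::value, "no prefix");
static_assert(std::is_same<decltype(u'A'), char16_t>::value, "u prefix");
static_assert(std::is_same<decltype(U'A'), char32_t>::value, "U prefix");
static_assert(std::is_same<decltype(L'A'), wchar_t>::value, "L prefix");

// Character literals are just small integers, so they compare with numbers.
constexpr char kLetterA = 'A';
constexpr char kQuestion = '?';
constexpr char32_t kEmoji = U'\x1f389';

// Unlike C#, where char is always 16 bits, a plain C++ char is one byte.
static_assert(sizeof('A') == 1, "char is one byte");
static_assert(sizeof(U'A') >= 4, "char32_t holds at least 32 bits");
```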
Conclusion
C++ literals are similar to C# literals, but different in several ways. You can often write the exact same code in both languages and get the same effect. There are several edge cases though, so it’s important to know some of these details about how the language works.
Next week we’ll dive into variables!
#1 by Maxime on June 2nd, 2020 ·
Hi !
Can you add the link of part 3 in the Table of Contents of this Part 2 and 1 please ?
Thanks :)
#2 by jackson on June 2nd, 2020 ·
Done. Thanks for reminding me. :)
#3 by Etherlord on July 27th, 2020 ·
“0x, 0X, 0, 0b, 0B or none || The chosen base: decimal, hexadecimal, octal, or binary. None means decimal.” – it could confuse someone, reads as 0x being decimal, 0X being hexadecimal, 0 being octal, 0b being binary.
Last token for floating point literals is described as “f, F, l, L, or none”, but it should be “f, F, d, D”
“Characters can be anything in their set (e.g. UTF-8) except ‘, \n, and the newline character” – suggests \n is something different than a newline character
Thanks again for an article.
#4 by jackson on July 27th, 2020 ·
Thanks for pointing these issues out. I’ve made changes to the article for all three of them and it’s better for it. In the second case though, floating point literals can indeed be suffixed with l or L. It was the note afterward that falsely mentioned a d suffix instead of l meaning long double. The other changes were as straight forward as your suggestions.