C++ is a middle-level programming language developed by Bjarne Stroustrup starting in 1979 at Bell Labs. C++ runs on a variety of platforms, such as Windows, Mac OS, and the various versions of UNIX. This tutorial adopts a simple and practical approach to describe the concepts of C++.
Structure of a program
The best way to learn a programming language is by writing programs. Typically, the first program beginners write is a program called "Hello World", which simply prints "Hello World" to your computer screen. Although it is very simple, it contains all the fundamental components C++ programs have:
// my first program in C++
#include <iostream>

int main ()
{
  std::cout << "Hello World!";
}

Hello World!
The listing above shows the C++ code for this program, followed by the result when the program is executed by a computer. The line numbers used in the discussion below count lines from the top of the listing; they make discussing programs and researching errors easier, but they are not part of the program.
Let's examine this program line by line:
- Line 1:
// my first program in C++
- Two slash signs indicate that the rest of the line is a comment inserted by the programmer but which has no effect on the behavior of the program. Programmers use them to include short explanations or observations concerning the code or program. In this case, it is a brief introductory description of the program.
- Line 2:
#include <iostream>
- Lines beginning with a hash sign (#) are directives read and interpreted by what is known as the preprocessor. They are special lines interpreted before the compilation of the program itself begins. In this case, the directive #include <iostream> instructs the preprocessor to include a section of standard C++ code, known as header iostream, that makes it possible to perform standard input and output operations, such as writing the output of this program (Hello World) to the screen.
- Line 3: A blank line.
- Blank lines have no effect on a program. They simply improve readability of the code.
- Line 4:
int main ()
- This line initiates the declaration of a function. Essentially, a function is a group of code statements which are given a name: in this case, this gives the name "main" to the group of code statements that follow. Functions will be discussed in detail in a later chapter, but essentially, their definition is introduced with a succession of a type (int), a name (main) and a pair of parentheses (()), optionally including parameters.
The function named main is a special function in all C++ programs; it is the function called when the program is run. The execution of all C++ programs begins with the main function, regardless of where the function is actually located within the code.
- Lines 5 and 7:
{ and }
- The open brace ({) at line 5 indicates the beginning of main's function definition, and the closing brace (}) at line 7 indicates its end. Everything between these braces is the function's body that defines what happens when main is called. All functions use braces to indicate the beginning and end of their definitions.
- Line 6:
std::cout << "Hello World!";
- This line is a C++ statement. A statement is an expression that can actually produce some effect. It is the meat of a program, specifying its actual behavior. Statements are executed in the same order that they appear within a function's body.
This statement has three parts: First, std::cout, which identifies the standard character output device (usually, this is the computer screen). Second, the insertion operator (<<), which indicates that what follows is inserted into std::cout. Finally, a sentence within quotes ("Hello World!"), which is the content inserted into the standard output.
Notice that the statement ends with a semicolon (;). This character marks the end of the statement, just as the period ends a sentence in English. All C++ statements must end with a semicolon character. One of the most common syntax errors in C++ is forgetting to end a statement with a semicolon.
You may have noticed that not all the lines of this program perform actions when the code is executed. There is a line containing a comment (beginning with //). There is a line with a directive for the preprocessor (beginning with #). There is a line that defines a function (in this case, the main function). And, finally, a line with a statement ending with a semicolon (the insertion into cout), which was within the block delimited by the braces ({ }) of the main function.
The program has been structured in different lines and properly indented, in order to make it easier to understand for the humans reading it. But C++ does not have strict rules on indentation or on how to split instructions in different lines. For example, instead of
int main ()
{
  std::cout << "Hello World!";
}
We could have written:
int main () { std::cout << "Hello World!"; }
all in a single line, and this would have had exactly the same meaning as the preceding code.
In C++, the separation between statements is specified with an ending semicolon (;), with the separation into different lines not mattering at all for this purpose. Many statements can be written in a single line, or each statement can be in its own line. The division of code into different lines serves only to make it more legible and schematic for the humans that may read it, but has no effect on the actual behavior of the program.
Now, let's add an additional statement to our first program:
// my second program in C++
#include <iostream>

int main ()
{
  std::cout << "Hello World! ";
  std::cout << "I'm a C++ program";
}

Hello World! I'm a C++ program
In this case, the program performed two insertions into std::cout in two different statements. Once again, the separation into different lines of code simply gives greater readability to the program, since main would have been just as valid if defined in this way:
int main () { std::cout << "Hello World! "; std::cout << "I'm a C++ program"; }
The source code could have also been divided into more code lines instead:
int main ()
{
  std::cout <<
    "Hello World! ";
  std::cout
    << "I'm a C++ program";
}
And the result would again have been exactly the same as in the previous examples.
Preprocessor directives (those that begin with #) are outside this general rule, since they are not statements. They are lines read and processed by the preprocessor before proper compilation begins. Preprocessor directives must be specified in their own line and, because they are not statements, do not have to end with a semicolon (;).
Using namespace std
If you have seen C++ code before, you may have seen cout being used instead of std::cout. Both name the same object: the first one uses its unqualified name (cout), while the second qualifies it directly within the namespace std (as std::cout).
cout is part of the standard library, and all the elements in the standard C++ library are declared within what is called a namespace: the namespace std.
In order to refer to the elements in the std namespace, a program must either qualify each and every use of elements of the library (as we have done by prefixing cout with std::), or introduce visibility of its components. The most typical way to introduce visibility of these components is by means of a using directive:
using namespace std;
The above directive allows all elements in the std namespace to be accessed in an unqualified manner (without the std:: prefix).
With this in mind, the last example can be rewritten to make unqualified uses of cout as:
// my second program in C++
#include <iostream>
using namespace std;

int main ()
{
  cout << "Hello World! ";
  cout << "I'm a C++ program";
}

Hello World! I'm a C++ program
Both ways of accessing the elements of the std namespace (explicit qualification and a using directive) are valid in C++ and produce exactly the same behavior. For simplicity, and to improve readability, the examples in these tutorials will more often use the latter approach, although note that explicit qualification is the only way to guarantee that name collisions never happen.
Variables and types
The usefulness of the "Hello World" programs shown in the previous chapter is rather questionable. We had to write several lines of code, compile them, and then execute the resulting program, just to obtain the result of a simple sentence written on the screen. It certainly would have been much faster to type the output sentence ourselves.
However, programming is not limited only to printing simple texts on the screen. In order to go a little further on and to become able to write programs that perform useful tasks that really save us work, we need to introduce the concept of variables.
Let's imagine that I ask you to remember the number 5, and then I ask you to also memorize the number 2 at the same time. You have just stored two different values in your memory (5 and 2). Now, if I ask you to add 1 to the first number I said, you should be retaining the numbers 6 (that is 5+1) and 2 in your memory. Then we could, for example, subtract these values and obtain 4 as result.
The whole process described above is a simile of what a computer can do with two variables. The same process can be expressed in C++ with the following set of statements:
a = 5;
b = 2;
a = a + 1;
result = a - b;
Obviously, this is a very simple example, since we have only used two small integer values, but consider that your computer can store millions of numbers like these at the same time and conduct sophisticated mathematical operations with them.
We can now define a variable as a portion of memory to store a value.
Each variable needs a name that identifies it and distinguishes it from the others. For example, in the previous code the variable names were a, b, and result, but we could have called the variables any names we could have come up with, as long as they were valid C++ identifiers.
Identifiers
A valid identifier is a sequence of one or more letters, digits, or underscore characters (_). Spaces, punctuation marks, and symbols cannot be part of an identifier. In addition, identifiers should begin with a letter. They can also begin with an underscore character (_), but such identifiers are, in most cases, considered reserved for compiler-specific keywords or external identifiers, as are identifiers containing two successive underscore characters anywhere. In no case can they begin with a digit.
C++ uses a number of keywords to identify operations and data descriptions; therefore, identifiers created by a programmer cannot match these keywords. The standard reserved keywords that cannot be used for programmer-created identifiers are:
alignas, alignof, and, and_eq, asm, auto, bitand, bitor, bool, break, case, catch, char, char16_t, char32_t, class, compl, const, constexpr, const_cast, continue, decltype, default, delete, do, double, dynamic_cast, else, enum, explicit, export, extern, false, float, for, friend, goto, if, inline, int, long, mutable, namespace, new, noexcept, not, not_eq, nullptr, operator, or, or_eq, private, protected, public, register, reinterpret_cast, return, short, signed, sizeof, static, static_assert, static_cast, struct, switch, template, this, thread_local, throw, true, try, typedef, typeid, typename, union, unsigned, using, virtual, void, volatile, wchar_t, while, xor, xor_eq
Specific compilers may also have additional specific reserved keywords.
Very important: The C++ language is a "case sensitive" language. That means that an identifier written in capital letters is not equivalent to another one with the same name but written in small letters. Thus, for example, the RESULT variable is not the same as the result variable or the Result variable. These are three different identifiers identifying three different variables.
Fundamental data types
The values of variables are stored somewhere in an unspecified location in the computer memory as zeros and ones. Our program does not need to know the exact location where a variable is stored; it can simply refer to it by its name. What the program needs to be aware of is the kind of data stored in the variable. It's not the same to store a simple integer as it is to store a letter or a large floating-point number; even though they are all represented using zeros and ones, they are not interpreted in the same way, and in many cases, they don't occupy the same amount of memory.
Fundamental data types are basic types implemented directly by the language that represent the basic storage units supported natively by most systems. They can mainly be classified into:
- Character types: They can represent a single character, such as 'A' or '$'. The most basic type is char, which is a one-byte character. Other types are also provided for wider characters.
- Numerical integer types: They can store a whole number value, such as 7 or 1024. They exist in a variety of sizes, and can either be signed or unsigned, depending on whether they support negative values or not.
- Floating-point types: They can represent real values, such as 3.14 or 0.01, with different levels of precision, depending on which of the three floating-point types is used.
- Boolean type: The boolean type, known in C++ as bool, can only represent one of two states, true or false.
Here is the complete list of fundamental types in C++:

| Group | Type names* | Notes on size / precision |
|---|---|---|
| Character types | char | Exactly one byte in size. At least 8 bits. |
| | char16_t | Not smaller than char. At least 16 bits. |
| | char32_t | Not smaller than char16_t. At least 32 bits. |
| | wchar_t | Can represent the largest supported character set. |
| Integer types (signed) | signed char | Same size as char. At least 8 bits. |
| | signed short int | Not smaller than char. At least 16 bits. |
| | signed int | Not smaller than short. At least 16 bits. |
| | signed long int | Not smaller than int. At least 32 bits. |
| | signed long long int | Not smaller than long. At least 64 bits. |
| Integer types (unsigned) | unsigned char | (same size as their signed counterparts) |
| | unsigned short int | |
| | unsigned int | |
| | unsigned long int | |
| | unsigned long long int | |
| Floating-point types | float | |
| | double | Precision not less than float |
| | long double | Precision not less than double |
| Boolean type | bool | |
| Void type | void | no storage |
| Null pointer | decltype(nullptr) | |

* The names of certain integer types can be abbreviated without their signed and int components: only the remaining part is required to identify the type. For example, signed short int can be abbreviated as signed short, short int, or simply short; they all identify the same fundamental type. The exception is signed char, which cannot be shortened to char, since char is a distinct type.
Within each of the groups above, the difference between types is only their size (i.e., how much they occupy in memory): the first type in each group is the smallest, and the last is the largest, with each type being at least as large as the one preceding it in the same group. Other than that, the types in a group have the same properties.
Note in the table above that other than char (which has a size of exactly one byte), none of the fundamental types has a standard size specified (only a minimum size, at most). Therefore, a type is not required to be (and in many cases is not) exactly this minimum size. This does not mean that these types are of an undetermined size, but that there is no standard size across all compilers and machines; each compiler implementation may specify the sizes for these types that best fit the architecture where the program is going to run. This rather generic size specification for types gives the C++ language a lot of flexibility to be adapted to work optimally on all kinds of platforms, both present and future.
Type sizes above are expressed in bits; the more bits a type has, the more distinct values it can represent, but at the same time, the more space it consumes in memory:

| Size | Unique representable values | Notes |
|---|---|---|
| 8-bit | 256 | = 2^8 |
| 16-bit | 65 536 | = 2^16 |
| 32-bit | 4 294 967 296 | = 2^32 (~4 billion) |
| 64-bit | 18 446 744 073 709 551 616 | = 2^64 (~18 billion billion) |

For integer types, having more representable values means that the range of values they can represent is greater; for example, a 16-bit unsigned integer would be able to represent 65536 distinct values in the range 0 to 65535, while its signed counterpart would be able to represent, in most cases, values between -32768 and 32767. Note that the range of positive values is approximately halved in signed types compared to unsigned types, due to the fact that one of the 16 bits is used for the sign; this is a relatively modest difference in range, and seldom justifies the use of unsigned types based purely on the range of positive values they can represent.
For floating-point types, the size affects their precision, by having more or fewer bits for their significand and exponent.
If the size or precision of the type is not a concern, then char, int, and double are typically selected to represent characters, integers, and floating-point values, respectively. The other types in their respective groups are only used in very particular cases.
The properties of fundamental types in a particular system and compiler implementation can be obtained by using the numeric_limits classes (see standard header <limits>). If for some reason, types of specific sizes are needed, the library defines certain fixed-size type aliases in header <cstdint>.
The types described above (characters, integers, floating-point, and boolean) are collectively known as arithmetic types. But two additional fundamental types exist: void, which identifies the lack of type; and the type of nullptr (listed as decltype(nullptr) in the table above), which is a special type of pointer. Both types will be discussed further in a coming chapter about pointers.
C++ supports a wide variety of types based on the fundamental types discussed above; these other types are known as compound data types, and are one of the main strengths of the C++ language. We will also see them in more detail in future chapters.
Declaration of variables
C++ is a strongly-typed language, and requires every variable to be declared with its type before its first use. This informs the compiler of the size to reserve in memory for the variable and how to interpret its value. The syntax to declare a new variable in C++ is straightforward: we simply write the type followed by the variable name (i.e., its identifier). For example:
int a;
float mynumber;
These are two valid declarations of variables. The first one declares a variable of type int with the identifier a. The second one declares a variable of type float with the identifier mynumber. Once declared, the variables a and mynumber can be used within the rest of their scope in the program.
If declaring more than one variable of the same type, they can all be declared in a single statement by separating their identifiers with commas. For example:
int a, b, c;
This declares three variables (a, b and c), all of them of type int, and has exactly the same meaning as:
int a;
int b;
int c;
To see what variable declarations look like in action within a program, let's have a look at the entire C++ code of the example about your mental memory proposed at the beginning of this chapter:
// operating with variables

#include <iostream>
using namespace std;

int main ()
{
  // declaring variables:
  int a, b;
  int result;

  // process:
  a = 5;
  b = 2;
  a = a + 1;
  result = a - b;

  // print out the result:
  cout << result;

  // terminate the program:
  return 0;
}

4
Don't be worried if something other than the variable declarations themselves looks a bit strange to you. Most of it will be explained in more detail in coming chapters.
Initialization of variables
When the variables in the example above are declared, they have an undetermined value until they are assigned a value for the first time. But it is possible for a variable to have a specific value from the moment it is declared. This is called the initialization of the variable.
In C++, there are three ways to initialize variables. They are all equivalent and are reminiscent of the evolution of the language over the years:
The first one, known as c-like initialization (because it is inherited from the C language), consists of appending an equal sign followed by the value to which the variable is initialized:
type identifier = initial_value;
For example, to declare a variable of type int called x and initialize it to a value of zero from the same moment it is declared, we can write:
int x = 0;
A second method, known as constructor initialization (introduced by the C++ language), encloses the initial value between parentheses (()):
type identifier (initial_value);
For example:
int x (0);
Finally, a third method, known as uniform initialization, is similar to the above, but uses curly braces ({}) instead of parentheses (this was introduced by the revision of the C++ standard in 2011):
type identifier {initial_value};
For example:
int x {0};
All three ways of initializing variables are valid and equivalent in C++.
// initialization of variables

#include <iostream>
using namespace std;

int main ()
{
  int a=5;               // initial value: 5
  int b(3);              // initial value: 3
  int c{2};              // initial value: 2
  int result;            // initial value undetermined

  a = a + b;
  result = a - c;
  cout << result;

  return 0;
}

6
Type deduction: auto and decltype
When a new variable is initialized, the compiler can figure out what the type of the variable is automatically from the initializer. For this, it suffices to use auto as the type specifier for the variable:
int foo = 0;
auto bar = foo;  // the same as: int bar = foo;
Here, bar is declared as having an auto type; therefore, the type of bar is the type of the value used to initialize it: in this case it uses the type of foo, which is int.
Variables that are not initialized can also make use of type deduction with the decltype specifier:
int foo = 0;
decltype(foo) bar;  // the same as: int bar;
Here, bar is declared as having the same type as foo.
auto and decltype are powerful features added to the language in C++11. But the type deduction features they introduce are meant to be used either when the type cannot be obtained by other means or when using it improves code readability. The two examples above were likely neither of these use cases. In fact they probably decreased readability, since, when reading the code, one has to search for the type of foo to actually know the type of bar.
Introduction to strings
Fundamental types represent the most basic types handled by the machines where the code may run. But one of the major strengths of the C++ language is its rich set of compound types, of which the fundamental types are mere building blocks.
An example of a compound type is the string class. Variables of this type are able to store sequences of characters, such as words or sentences. A very useful feature!
A first difference with fundamental data types is that in order to declare and use objects (variables) of this type, the program needs to include the header where the type is defined within the standard library (header <string>):
// my first string
#include <iostream>
#include <string>
using namespace std;

int main ()
{
  string mystring;
  mystring = "This is a string";
  cout << mystring;
  return 0;
}

This is a string
As you can see in the previous example, strings can be initialized with any valid string literal, just like numerical type variables can be initialized to any valid numerical literal. As with fundamental types, all initialization formats are valid with strings:
string mystring = "This is a string";
string mystring ("This is a string");
string mystring {"This is a string"};
Strings can also perform all the other basic operations that fundamental data types can, like being declared without an initial value and changing their value during execution:
// my first string
#include <iostream>
#include <string>
using namespace std;

int main ()
{
  string mystring;
  mystring = "This is the initial string content";
  cout << mystring << endl;
  mystring = "This is a different string content";
  cout << mystring << endl;
  return 0;
}

This is the initial string content
This is a different string content
Note: inserting the endl manipulator ends the line (printing a newline character and flushing the stream).
The string class is a compound type. As you can see in the example above, compound types are used in the same way as fundamental types: the same syntax is used to declare variables and to initialize them.
C++ Tutorial, reviewed by Wanem Club on October 14, 2017
Comments
As noted above, comments do not affect the operation of the program; however, they provide an important tool to document directly within the source code what the program does and how it operates.
C++ supports two ways of commenting code:
The first of them, known as line comment, discards everything from where the pair of slash signs (//) are found up to the end of that same line. The second one, known as block comment, discards everything between the /* characters and the first appearance of the */ characters, with the possibility of including multiple lines.
Let's add comments to our second program:
If comments are included within the source code of a program without using the comment character combinations //, /* or */, the compiler takes them as if they were C++ expressions, most likely causing the compilation to fail with one or several error messages.