"# Getting started with C++ using Jupyter notebook: Part 1\n",
"\n",
"This notebook will guide you through the initial steps for an introduction to C++. \n",
"Usually, C++ programs have to be \"compiled\" and then the compiled programs can be executed. For educational purposes, the instant feedback one gets in *interpreted* languages like Python is often considered \"easier\". There have been a few attempts to create interpreters for C++, which execute C++ programs line by line without having to compile it first. Cling was such an interpreter. Much of the functionality of cling is now directly available in a normal clang compiler installation as the tool *clang-repl*. There are Jupyter notebook kernels using clang-repl, but while testing for this course, we found several weaknesses, and decided to use this somewhat older kernel, based on the cling interpreter.\n",
"\n",
"Some caveats though:\n",
"\n",
"- C++ is a compiled language. To make it work in an intrepreter, additional rules are required, which are outside standard C++ language rules. Learning to structure C++ programs to suit the interpreter will impose an unnatural organisation, outside what is considered best practice in the developer community.\n",
"- cling works for some straight forward code in C++, but it's C++ coverage is NOT EVEN VAGUELY complete. A failure of cling to understand a piece of code does not necessarily mean the syntax is incorrect. Reason with what you know about C++. Use the real clang compiler or a different interpreter based on the latest compiler (see next point!). And while in this course, ask me!\n",
"- The Cling version on JupyterHub JSC is based on an older version of LLVM, compared to what you have available on the command line. Some of the newest language features are likely to be missing in the current version of cling. Since C++20 and C++23 are by and large backwards compatible with C++17, we can use this notebook to learn about a subset of features.\n",
"- It is necessary to restart cling based Jupyter kernels often. Quite often it is because of inadequate language support. For instance, standard input does not work in Jupyter-cling, i.e., the C++ code can not ask the user for input and then process what the user types. Such things are trivial in normal C++, but not inside this notebook. We will skip anything that needs standard input in our notebooks.\n",
"\n",
"All that said, to test a single line of code, this interpreter provides a very practical shortcut over writing a full program and going through the write - compile - run cycle. While learning, we need a lot of really small programs, and we will use this Jupyter notebook with xeus-cling kernel to illustrate syntax when possible.\n",
"\n",
"Each \"cell\" in this notebook contains either some text I wrote or a little bit of code. To \"execute\" a cell, you press Shift+Enter on your keyboard. Use the Help menu above to find out about keyboard shortcuts for Jupyter notebook. This first exercise session of the course is mostly intended as \"self study\" with the help of this notebook. Insert new cells using the insert menu above. Either type some code to test something in the new cells, or make it into a \"markdown\" cell by typing Esc+m. You can then type your own notes along side the material given. Ask if you need anything explained!\n"
"// Hopefully, you have read the introductory chapters in the course book and understand\n",
"// the purpose of the include statements. The specific include files we are using here\n",
"// will be explained later.\n",
"\n",
"// The contents of this cell are to set up this notebook. You don't need to understand\n",
"// how this works right now. Run this and move on!\n",
"#include <iostream>\n",
"#include <type_traits>\n",
"#include <string>\n",
"#include <vector>\n",
"#include <map>\n",
"#include <cmath>\n",
"#include <algorithm>\n",
"#include <typeinfo>\n",
"\n",
"using namespace std;\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you are running this notebook on our JupyterJSC Hub, please run the following as it is. Otherwise, you have to adjust the include path to the location on your system where headers for software such as boost are stored. **Run only one of the following cells. If the first one works, leave the second one alone. If not, uncomment the code in the second cell and run it.** "
"To write something to the screen (standard output) in a C++ program, you say `std::cout << something` or, if you are already using the `std` namespace (more on that in a bit), you can say `cout << something` as above. Try it below! "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"using namespace std;\n",
"cout << \"Hello, world!\\n\";"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Lesson not-to-learn:\n",
"Notice that if you run the cell above without the semi-colon `;` at the end, no error is reported. That is non-standard behaviour only accepted in this interpreted environment. **All C++ statements end in a semi-colon**. The additional rule that the `cling` interpreter uses, which allows skipping the last semi-colon is this:\n",
"\n",
" - if the final semi-colon in a cell is missing, print the value of the expression\n",
"\n",
"The rule makes no sense outside the interpreter. Since we are in the interpreter right now, we might as well use it below to invoke this \"auto-print\" behaviour, but remember: that **will not work in a real C++ program!** When you need to write something on the screen, use `cout`. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To write more than one thing, you can \"chain\" your outputs together like this:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"cout << \"Cube of 13 is \" << 13 * 13 * 13 << \",\\n and for 14 that is \" << 14 * 14 * 14 << \"\\n\";"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Observe where the output goes to a new line above. `cout` does not automatically print a line. You start writing in a new line when you explicitly write the newline character `'\\n'`. Can you change the code above so that all of the text is in one line? How about starting a new line where we did, but making the 'C' and the 'a' in the two lines aligned?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Values"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following few cells should be fairly obvious. Just execute them, and verify that you understand the output. Is the output always what you expect?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"1 + 2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you got an error above (cling gets all confused many times!), please first try to restart the notebook and run the cells again. If the problem does not go away, ask me."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"1 - 2"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"1 * 2"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"1 / 2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The above expression sometimes confuses some people, especially if coming from python. In python, the answer will be 0.5. Here, since 1 and 2 are both integers, we perform integer arithmetic, i.e. division with a remainder. What you got was the quotient, which really is 0. To get the remainder, run the next cell:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"1 % 2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, let's repeat that, but this time, using \"unsigned integer\" values. Unsigned integers are non-negative integers. To write unsigned integer literal values, you put a 'U' character after the integer value."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"1U + 2U"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"1U - 2U"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"1U * 2U"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"1U / 2U"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"1U % 2U"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here, the unexpected value was probably `1U - 2U`. This has the value $4294967295 = 2^{32}-1$. Integers and unsigned integers are commonly represented by 32 binary bits. Unsigned integers have modular arithmetic between $0$ and $2^{32} - 1$, like a clock, but with $4294967296$ hours instead of $24$. If you keep decreasing beyond the lower bound, you appear at the other end. If you keep increasing past the maximum, you start from the left side again. Observe:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"4294967295U + 1U"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"4294967295U + 2U"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's calculate with floating point decimal numbers. Decimal double precision floating point numbers are written using ordinary decimal point notation. If the fractional part is all $0$, you don't need to write it. So, `1.0` can be written as `1.` etc.:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"1. + 2."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"1. - 2."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"1. * 2."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"1. / 2."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The above should look fairly straightforward. Since we are dealing with decimal numbers, we are not using integer arithmetic, and we get $0.5$ when we divide $1.$ by $2.$. Now, let me say it one more time that the above lines are not full C++ statements, but mere expressions. If you want to do the operations above and print the results, you would do something like this:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"cout << 1. / 2. << \"\\n\";"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For the computer, $1.$ above is not simply $1$ with a dot attached. In the second chapter of the companion book for this course, we describe the binary representation of floating point numbers. The example program to find out the binary bits of a number, `binform.cc` requires more C++ than what this version of `cling` supports. But a version compatible with C++17 is available to you in the course set up. You can include that file and get bit representations of numbers as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#include <binform.hh>\n",
"using namespace cxx_course;\n",
"// The header declares functions in the namespace cxx_course."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"showbits(33);\n",
"showbits(33.0);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, try running that function for a few numbers: (suggestions) 1, 1.0, 2, 2.5, -1.3333, -1.0, 4, 4.0, 3, 3.0, ..."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"showbits(3.1415927F)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"showbits(1.0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Types, values, variables and declarations\n",
"\n",
"In computer memory, all information is stored in *bits*, which can be either 0 or 1.\n",
"Depending on the \"type\" you associate with a bunch of bits, they may represent different values. For example, consider these bits ...\n",
"\n",
"```\n",
" 0100 0000 0100 1001 0000 1111 1101 1011\n",
"```\n",
"\n",
"If you decide that they represent a 32 bit integer, they represent the number **1078530011**. If you say it is a 32-bit floating point number (decimal number), the exact same bits represent the number **3.1415927**. It's not possible to tell what kind of value is stored in a block of memory by staring hard at the bits. The programming language associates \"types\" with stored (and literal) values, which define the number of bits in a value of that type and the rules governing the interpretation of those bits (among other things).\n",
"\n",
"There are a few types built into the C++ language, called imaginatively, \"built-in types\". These are,\n",
"\n",
" - `int`, `unsigned int` : (typically) 32-bit signed and unsigned integers\n",
" - `long`, `unsigned long` : (typically) 64-bit signed and unsigned integers\n",
" - `short`, `unsigned short`: (typically) 16-bit signed and unsigned integers\n",
" - `char`, `unsigned char`: 8-bit signed and unsigned integers, interpreted as characters\n",
" - `float` : 32-bit floating point numbers (no \"unsigned float\"!)\n",
" - `double` : 64-bit floating point numbers (no \"unsigned double\"!)\n",
" - `bool` : A type storing true or false\n",
" \n",
"Besides these built-in types, C++ provides mechanisms for programmers to create new data types whose behaviour (what operations make sense on that type, what happens when you add, subtract etc. etc.) is programmable. User defined data types are created using C++ **class**es (or **struct**s), which will be discussed in detail later in the course. The C++ standard library introduces a lot of very useful classes. These are written using the *same mechanisms* as those which are available to you to create your classes. The standard library classes are, by and large, user defined types, and will be called so in the following.\n",
"\n",
"The important takeaway from all this is that in order to know how to make meaningful calculations with a sequence of bits, the computer needs to know what type we associate with those bits. In other words, to interpret a \"value\" from a sequence of bits, the bit representation must be combined with type information. An \"object\" is a concrete example of a sequence of bits representing a value of a certain type.\n",
"\n",
"Like lots of programming languages, you can create \"named\" objects, or *variables*, to store and reuse values. *variables* are convenient stores of useful values so that we can use them in subsequent work, change the stored values etc. If you ever owned an old fashioned calculator, where you could memorize some calculated values and restore them later, that's roughly the role of a variable.\n",
"\n",
"### Declarations\n",
"Here is how you create variables in C++:\n",
"\n",
"```c++\n",
"auto x = 1; // create an integer variable\n",
"auto y = 1.0; // create a double variable\n",
"auto z = 2U; // create an unsigned integer variable\n",
"```\n",
"\n",
"A variable stores a value. If you have a value you want to store, assign it to a name of your choosing with the `auto` keyword before it. \n",
"\n",
"Instead of the `=` sign above, we could braces `{` and `}` to set th initial values.\n",
"\n",
"```c++\n",
"auto x{1}; // create an integer variable\n",
"auto y{1.0}; // create a double variable\n",
"auto z{2U}; // create an unsigned integer variable\n",
"```\n",
"\n",
"Another alternative syntax for declarations is as follows. In these declarations, we explicitly tell the type of the values we want:\n",
"\n",
"```c++\n",
"int x = 1;\n",
"double y = 1.0;\n",
"unsigned int z = 1U;\n",
"```\n",
"\n",
"or using braces:\n",
"\n",
"```c++\n",
"int x{1};\n",
"double y{1.0};\n",
"unsigned int z{1U};\n",
"```\n",
"\n",
"When using braces with explicit type specification above, the compiler can apply some additional checks. For instance, in `int x{1}`, we have told the \n",
"compiler that we are trying to create an integer. If we then try to initialize it with `1.4`, it will report it as an error. `int x = 1.4`, on the other hand,\n",
"will be accepted silently, converting the `1.4` in to the specified type, `int`. We will have a variable of `int` type with a value `1`. If we wrote `int i{1.4}`,\n",
"it is not a very sane thing to do. There are some possibilities:\n",
"\n",
" - we wanted to write double, but wrote int by mistake\n",
" - we wanted to write 14 but typed an extra dot at the wrong place\n",
" - other equally weird scenarios, none of them is \"exactly what we want\"\n",
" \n",
"It is useful if the compiler gives us an error in such situations. The brace `{`, `}` syntax for initialisation is therefore somewhat more restrictive, and\n",
"should be preferred. It also allows us to set variables to their default initial values:\n",
"```c++\n",
"unsigned long counter{};\n",
"```\n",
"In the above, we create an unsigned long integer type with its default initial value, $0$. Since we are telling the type directly, this can work, whereas `auto counter{}` would be meaningless, as `auto` infers the type from the initialiser expression. When using explicit typenames to declare a variable, we can completely skip the initial value, e.g., `double x;`. `x` is then created uninitialised. There are special cases where even such an expression will lead to `x` being nicely zero initialised, but it is much better to always initialise with a literal value or with the default value explicitly. It is not recommended to leave the initializer out for built in types. For user defined types, C++ offers so much control that the creater of that user defined type gets to decide whether `UserdefinedType X` should create `X` uninitialised, zero initialised or initialised to any other sensible value."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using variables in expresions\n",
"\n",
"If you write an expression using variable names, when evaluating (finding the value of) that expression, the variable names are substituted by their values. \n",
"\n",
"Let's also note here that the symbol `=` in C++ means \"assign\". `X = Y` means take the value of the entity on the right and put it in the object with the name `X`. We are not asking whether `X` is equal to `Y`. For that comparison, like in many other languages, C++ uses `==`. If `X` is an integer, `5 == X` is always a valid expression, which is either true or false, but `5 = X` is syntactically meaningless, because you can not give a new value to $5$."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Quiz: What does `X = X + 1` mean ?\n",
"\n",
"\n",
" A. X cancels out on both sides and we have 0 equals 1\n",
" B. Whatever value X had before, increase it by 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Aside:** Historically, the choice of `=` for assignment (so that we have to type `==` when we want to check if two values are equal) goes back to a time before keyboard inputs and the Fortran language. It is a small shift to get used to, and no other token proposed for this very important operation, `:=`, `->`, `<-` etc. consist of a single character. In a typical C++ program, you will need assignment far more often than equality comparison, so that most programmers support this notation."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Static typing\n",
"\n",
"C++ is a \"statically typed\" language. Every time you bring a variable to existence to store something for a while, the compiler must be given a \"type\" for the values to be stored in it. In C++, the type is associated with a variable at the point it enters the code, through the \"declaration\". Throughout the life of that variable, it has the same type. It is not possible for a variable to change from being a string to being an integer. This point is important to understand for people who have programmed in python and are learning C++. The following is perfectly acceptable in python:\n",
"\n",
"```python\n",
"# Python code to make a contrast\n",
"v = \"1\" + \"2\" + \"3\"\n",
"print(v)\n",
"v = int(v)\n",
"```\n",
"If you run the above in a Jupyter notebook with a python3 kernel, you can verify the type of the variable `v` after each statement, using `type(v)` after each statement. You will see that `v` becomes an `int` after the last line above, whereas it starts out being a `str`. Such a thing will never be allowed in C++. The type of an entity is set when you declare it. It can not be changed by assignment. Allowing that would make reasoning about properties of a variable extremely hard: the type of a variable in one given line of code might end up depending on the code path taken to reach a particular line. To emphasize it further, consider this:\n",
"\n",
"```python\n",
"# Python code to make a contrast\n",
"v = \"1\" + \"2\" + \"3\"\n",
"for i in range(10):\n",
" print(v)\n",
" if len(v) > 7:\n",
" v = int(v)\n",
" v = v + v\n",
"\n",
"```\n",
"\n",
"How the variable `v` is changed in the line `v = v + v` depends on the type of `v`. If it is a string, \"123\", it will become \"123123\", by concatenation. If is an integer, 123, it will become 246. We have to consider the immediate history of the variable to know what will actually happen in the line `v = v + v`. Statically typed languages avoid this entire category of problems by making types of variables constants. In any C++ code, a variable which starts out as an object of type T will end its life as an object of type T.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Quiz: Does the variable declaration with `auto` affect the fact that C++ has compile time fixed types?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The answer is no. Because in every version of the variable declaration with `auto` you saw above, the compiler can infer and set a unique type for the object you create. The `auto` keyword just spares the human programmer from typing everything by hand, but `double X{1.5};` and `auto X{1.5};` are completely equivalent."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Type promotion in expressions\n",
"\n",
"The usual `+`, `-`, `*` and `/` operators can be used with the integral and floating point number types. For integral types, `/` is the integer division and `%` is the remainder operator. If the operands to the arithmetic operation have the same type, i.e., `int + int`, the result also has the same type. If the operands have different types, the smaller *type* is promoted to the larger *type*, so that the result ends up having the larger type (see below). Floating point types are to be considered \"larger\" for this purpose."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"auto m = 99;\n",
"auto y{23}, z{44};\n",
"auto x = y * z + m; // y * z + m is an int, like y, z and m. Therefore, x is an int\n",
"auto low = 10;\n",
"auto high = 13;\n",
"auto lq = high / low; // lq is of type int, and a value 1\n",
"auto lr = high % low; // lr is of type int and value 3\n",
"auto fq = 13.0 / low; // fq is type double and value 1.3\n",
"float f = 1.2; \n",
"auto v = f * BIGVAL; // v is a float\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"low & high"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The above result is because `low` is 1010 in binary, and `high` is 1101, and `low & high` is **bitwise and** operation on the two integers. Similarly there is *bitwise or* `|`, *bitwise exclusive or* `^` and *bitwise not* `~`.\n",
"\n",
"**Exercise:** Use the showbits() function from earlier to print out the bits of `low`, `high` and `low & high`, to understand what the bitwise `&` operator did. Do the same for the other bitwise operators."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There are also logical operators working on boolean values *true* and *false*, `&&`, `||`, `^^` and `!`. For variables which are not of type `bool`, when they are used with these logical operations, they are converted to a true/false value. Typically, values were all bits are zero are interpreted as false, and every other value is regarded as true. Guess the output of the following cell before executing it!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"// This is valid code, although the cling interpreter seems to have problems with it.\n",
"low && high"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Logical operators in C++ can be written alternatively as `and`, `or` etc. For example, `a && b` could be written as `a and b`, `a or b` can be written as `a or b`, for greater readability. Full table for alternative tokens can be found [here](https://en.cppreference.com/w/cpp/language/operator_alternative)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"low and high"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Fused assignment and arithmetic/bitwise operators\n",
"\n",
"It is possible to shorten `a = a + b` to `a += b`, `a = a - b` to `a -= b` and so on for all arithmetic operators. The same can be done for bitwise operators, i.e., `a = a & b` can be written as `a &= b` and so on. By contrast, there is no fused assignment for logical `&&`, `||` operators. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"low &= high ; \n",
"low"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As we have discussed before, `==` serves as the comparison for equality in C++. The expression `a == b` is either true or false, depending on the values of `a` and `b`. The expression `a = b` is quite different: this means \"store the value of b in a \\[converting the value to the type of a if necessary\\]\". Other comparison operators are `!=` for \"not equal to\", and `<`, `>`, `<=`, `>=`, for \"less than\", \"greater than\", \"less than or equal to\" and \"greater than or equal to\" respectively. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Increment and decrement operators\n",
"\n",
"Like C, the operators `++` and `--` are used to indicate increment and decrement of a given variable, respectively. Both operators can be placed before or after the variable name for built in numeric types, `int`, `long`, `short`, `char` (and unsigned counterparts), `double`, `float` and `long double`. For user defined types, the meaning of the operator is decided by the implementor of that type. In all cases, there is a slight difference in the meaning between the prefix (e.g., `++i`) and postfix (e.g., `i++`) forms. To understand that, we have to start recognising \"value of an expression\" and \"side effects of an expression\". \n",
"\n",
"`2 * x` is an expression involving the variable `x`. If `x` is currently `1`, `2 * x` has a *value* of `2`. When we evaluate the expression `2 * x` there are no side effects on `x`. After we have evaluated `2 * x`, `x` is still `1`. But, `x = 4` is also an expression. Its value is the value of `x` after the assignment, i.e., `4`. But it also has a side effect. Whatever `x` was before, it has now changed to a value of `4`. Prefix and postfix `++` (or `--`) operators have the same side effect. They both increase a variable by 1. But they have different values. The value of a prefix `++` operator is the already incremented value, whereas the value of the postfix operator is the old value. With examples:\n",
"\n",
"```c++\n",
"auto i{0};\n",
"auto k = ++i; // i is now 1, but so is k\n",
"cout << i << \"\\t\" << k << \"\\n\";\n",
"auto l = i++; // i is now 2, but l is 1\n",
"cout << i << \"\\t\" << l << \"\\n\";\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"{\n",
" auto i{0};\n",
" auto k = ++i; // i is now 1, but so is k\n",
" cout << i << \"\\t\" << k << \"\\n\";\n",
" auto l = i++; // i is now 2, but l is 1\n",
" cout << i << \"\\t\" << l << \"\\n\";\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Pre-in(de)crement : In(de)crease and use the new value\n",
"- Post-in(de)crement : In(de)crease, but use the old value\n",
"\n",
"Typically, the implementation of the post-increment or post-decrement operators involves more steps (e.g., make a backup, change, return the backup value) than the prefix versions (change, return the changed value). It is therefore a good habbit to cultivate to always write `++i` by default, and only do `i++` when you have a compelling reason to do so.\n",
"\n",
"**Aside:** The name \"C++\", for our language, is therefore not the best choice! It means, in its own syntax, \"increment C, and then use the old C\"! It should have been \"++C\"!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "C++2b",
"language": "C++",
"name": "cling-cpp2b"
},
"language_info": {
"codemirror_mode": "c++",
"file_extension": ".c++",
"mimetype": "text/x-c++src",
"name": "c++"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
%% Cell type:markdown id: tags:
# Getting started with C++ using Jupyter notebook: Part 1
This notebook will guide you through the initial steps for an introduction to C++.
Usually, C++ programs have to be "compiled" and then the compiled programs can be executed. For educational purposes, the instant feedback one gets in *interpreted* languages like Python is often considered "easier". There have been a few attempts to create interpreters for C++, which execute C++ programs line by line without having to compile it first. Cling was such an interpreter. Much of the functionality of cling is now directly available in a normal clang compiler installation as the tool *clang-repl*. There are Jupyter notebook kernels using clang-repl, but while testing for this course, we found several weaknesses, and decided to use this somewhat older kernel, based on the cling interpreter.
Some caveats though:
- C++ is a compiled language. To make it work in an intrepreter, additional rules are required, which are outside standard C++ language rules. Learning to structure C++ programs to suit the interpreter will impose an unnatural organisation, outside what is considered best practice in the developer community.
- cling works for some straight forward code in C++, but it's C++ coverage is NOT EVEN VAGUELY complete. A failure of cling to understand a piece of code does not necessarily mean the syntax is incorrect. Reason with what you know about C++. Use the real clang compiler or a different interpreter based on the latest compiler (see next point!). And while in this course, ask me!
- The Cling version on JupyterHub JSC is based on an older version of LLVM, compared to what you have available on the command line. Some of the newest language features are likely to be missing in the current version of cling. Since C++20 and C++23 are by and large backwards compatible with C++17, we can use this notebook to learn about a subset of features.
- It is necessary to restart cling based Jupyter kernels often. Quite often it is because of inadequate language support. For instance, standard input does not work in Jupyter-cling, i.e., the C++ code can not ask the user for input and then process what the user types. Such things are trivial in normal C++, but not inside this notebook. We will skip anything that needs standard input in our notebooks.
All that said, to test a single line of code, this interpreter provides a very practical shortcut over writing a full program and going through the write - compile - run cycle. While learning, we need a lot of really small programs, and we will use this Jupyter notebook with xeus-cling kernel to illustrate syntax when possible.
Each "cell" in this notebook contains either some text I wrote or a little bit of code. To "execute" a cell, you press Shift+Enter on your keyboard. Use the Help menu above to find out about keyboard shortcuts for Jupyter notebook. This first exercise session of the course is mostly intended as "self study" with the help of this notebook. Insert new cells using the insert menu above. Either type some code to test something in the new cells, or make it into a "markdown" cell by typing Esc+m. You can then type your own notes along side the material given. Ask if you need anything explained!
// Hopefully, you have read the introductory chapters in the course book and understand
// the purpose of the include statements. The specific include files we are using here
// will be explained later.
// The contents of this cell are to set up this notebook. You don't need to understand
// how this works right now. Run this and move on!
#include <iostream>
#include <type_traits>
#include <string>
#include <vector>
#include <map>
#include <cmath>
#include <algorithm>
#include <typeinfo>
using namespace std;
```
%% Cell type:markdown id: tags:
If you are running this notebook on our JupyterJSC Hub, please run the following as it is. Otherwise, you have to adjust the include path to the location on your system where headers for software such as boost are stored. **Run only one of the following cells. If the first one works, leave the second one alone. If not, uncomment the code in the second cell and run it.**
To write something to the screen (standard output) in a C++ program, you say `std::cout << something` or, if you are already using the `std` namespace (more on that in a bit), you can say `cout << something` as above. Try it below!
%% Cell type:code id: tags:
``` C++
using namespace std;
cout << "Hello, world!\n";
```
%% Cell type:markdown id: tags:
### Lesson not-to-learn:
Notice that if you run the cell above without the semi-colon `;` at the end, no error is reported. That is non-standard behaviour only accepted in this interpreted environment. **All C++ statements end in a semi-colon**. The additional rule that the `cling` interpreter uses, which allows skipping the last semi-colon is this:
- if the final semi-colon in a cell is missing, print the value of the expression
The rule makes no sense outside the interpreter. Since we are in the interpreter right now, we might as well use it below to invoke this "auto-print" behaviour, but remember: that **will not work in a real C++ program!** When you need to write something on the screen, use `cout`.
%% Cell type:markdown id: tags:
To write more than one thing, you can "chain" your outputs together like this:
%% Cell type:code id: tags:
``` C++
cout << "Cube of 13 is " << 13 * 13 * 13 << ",\n and for 14 that is " << 14 * 14 * 14 << "\n";
```
%% Cell type:markdown id: tags:
Observe where the output goes to a new line above. `cout` does not automatically print a line. You start writing in a new line when you explicitly write the newline character `'\n'`. Can you change the code above so that all of the text is in one line? How about starting a new line where we did, but making the 'C' and the 'a' in the two lines aligned?
%% Cell type:markdown id: tags:
## Values
%% Cell type:markdown id: tags:
The following few cells should be fairly obvious. Just execute them, and verify that you understand the output. Is the output always what you expect?
%% Cell type:code id: tags:
``` C++
1 + 2
```
%% Cell type:markdown id: tags:
If you got an error above (cling gets all confused many times!), please first try to restart the notebook and run the cells again. If the problem does not go away, ask me.
%% Cell type:code id: tags:
``` C++
1 - 2
```
%% Cell type:code id: tags:
``` C++
1 * 2
```
%% Cell type:code id: tags:
``` C++
1 / 2
```
%% Cell type:markdown id: tags:
The above expression sometimes confuses some people, especially if coming from python. In python, the answer will be 0.5. Here, since 1 and 2 are both integers, we perform integer arithmetic, i.e. division with a remainder. What you got was the quotient, which really is 0. To get the remainder, run the next cell:
%% Cell type:code id: tags:
``` C++
1 % 2
```
%% Cell type:markdown id: tags:
Now, let's repeat that, but this time, using "unsigned integer" values. Unsigned integers are non-negative integers. To write unsigned integer literal values, you put a 'U' character after the integer value.
%% Cell type:code id: tags:
``` C++
1U + 2U
```
%% Cell type:code id: tags:
``` C++
1U - 2U
```
%% Cell type:code id: tags:
``` C++
1U * 2U
```
%% Cell type:code id: tags:
``` C++
1U / 2U
```
%% Cell type:code id: tags:
``` C++
1U % 2U
```
%% Cell type:markdown id: tags:
Here, the unexpected value was probably `1U - 2U`. This has the value $4294967295 = 2^{32}-1$. Integers and unsigned integers are commonly represented by 32 binary bits. Unsigned integers have modular arithmetic between $0$ and $2^{32} - 1$, like a clock, but with $4294967296$ hours instead of $24$. If you keep decreasing beyond the lower bound, you appear at the other end. If you keep increasing past the maximum, you start from the left side again. Observe:
%% Cell type:code id: tags:
``` C++
4294967295U + 1U
```
%% Cell type:code id: tags:
``` C++
4294967295U + 2U
```
%% Cell type:markdown id: tags:
Now let's calculate with floating point decimal numbers. Decimal double precision floating point numbers are written using ordinary decimal point notation. If the fractional part is all $0$, you don't need to write it. So, `1.0` can be written as `1.` etc.:
%% Cell type:code id: tags:
``` C++
1. + 2.
```
%% Cell type:code id: tags:
``` C++
1. - 2.
```
%% Cell type:code id: tags:
``` C++
1. * 2.
```
%% Cell type:code id: tags:
``` C++
1. / 2.
```
%% Cell type:markdown id: tags:
The above should look fairly straightforward. Since we are dealing with decimal numbers, we are not using integer arithmetic, and we get $0.5$ when we divide $1.$ by $2.$. Now, let me say it one more time that the above lines are not full C++ statements, but mere expressions. If you want to do the operations above and print the results, you would do something like this:
%% Cell type:code id: tags:
``` C++
cout << 1. / 2. << "\n";
```
%% Cell type:markdown id: tags:
For the computer, $1.$ above is not simply $1$ with a dot attached. In the second chapter of the companion book for this course, we describe the binary representation of floating point numbers. The example program to find out the binary bits of a number, `binform.cc` requires more C++ than what this version of `cling` supports. But a version compatible with C++17 is available to you in the course set up. You can include that file and get bit representations of numbers as follows:
%% Cell type:code id: tags:
``` C++
#include <binform.hh>
using namespace cxx_course;
// The header declares functions in the namespace cxx_course.
```
%% Cell type:code id: tags:
``` C++
showbits(33);
showbits(33.0);
```
%% Cell type:markdown id: tags:
Now, try running that function for a few numbers: (suggestions) 1, 1.0, 2, 2.5, -1.3333, -1.0, 4, 4.0, 3, 3.0, ...
%% Cell type:code id: tags:
``` C++
showbits(3.1415927F)
```
%% Cell type:code id: tags:
``` C++
showbits(1.0)
```
%% Cell type:markdown id: tags:
## Types, values, variables and declarations
In computer memory, all information is stored in *bits*, which can be either 0 or 1.
Depending on the "type" you associate with a bunch of bits, they may represent different values. For example, consider these bits ...
```
0100 0000 0100 1001 0000 1111 1101 1011
```
If you decide that they represent a 32 bit integer, they represent the number **1078530011**. If you say it is a 32-bit floating point number (decimal number), the exact same bits represent the number **3.1415927**. It's not possible to tell what kind of value is stored in a block of memory by staring hard at the bits. The programming language associates "types" with stored (and literal) values, which define the number of bits in a value of that type and the rules governing the interpretation of those bits (among other things).
There are a few types built into the C++ language, called imaginatively, "built-in types". These are,
- `int`, `unsigned int` : (typically) 32-bit signed and unsigned integers
- `long`, `unsigned long` : (typically) 64-bit signed and unsigned integers
- `short`, `unsigned short`: (typically) 16-bit signed and unsigned integers
- `char`, `unsigned char`: 8-bit signed and unsigned integers, interpreted as characters
- `float` : 32-bit floating point numbers (no "unsigned float"!)
- `double` : 64-bit floating point numbers (no "unsigned double"!)
- `bool` : A type storing true or false
Besides these built-in types, C++ provides mechanisms for programmers to create new data types whose behaviour (what operations make sense on that type, what happens when you add, subtract etc. etc.) is programmable. User defined data types are created using C++ **class**es (or **struct**s), which will be discussed in detail later in the course. The C++ standard library introduces a lot of very useful classes. These are written using the *same mechanisms* as those which are available to you to create your classes. The standard library classes are, by and large, user defined types, and will be called so in the following.
The important takeaway from all this is that in order to know how to make meaningful calculations with a sequence of bits, the computer needs to know what type we associate with those bits. In other words, to interpret a "value" from a sequence of bits, the bit representation must be combined with type information. An "object" is a concrete example of a sequence of bits representing a value of a certain type.
Like lots of programming languages, you can create "named" objects, or *variables*, to store and reuse values. *variables* are convenient stores of useful values so that we can use them in subsequent work, change the stored values etc. If you ever owned an old fashioned calculator, where you could memorize some calculated values and restore them later, that's roughly the role of a variable.
### Declarations
Here is how you create variables in C++:
```c++
autox=1;// create an integer variable
autoy=1.0;// create a double variable
autoz=2U;// create an unsigned integer variable
```
A variable stores a value. If you have a value you want to store, assign it to a name of your choosing with the `auto` keyword before it.
Instead of the `=` sign above, we could braces `{` and `}` to set th initial values.
```c++
autox{1};// create an integer variable
autoy{1.0};// create a double variable
autoz{2U};// create an unsigned integer variable
```
Another alternative syntax for declarations is as follows. In these declarations, we explicitly tell the type of the values we want:
```c++
intx=1;
doubley=1.0;
unsignedintz=1U;
```
or using braces:
```c++
intx{1};
doubley{1.0};
unsignedintz{1U};
```
When using braces with explicit type specification above, the compiler can apply some additional checks. For instance, in `int x{1}`, we have told the
compiler that we are trying to create an integer. If we then try to initialize it with `1.4`, it will report it as an error. `int x = 1.4`, on the other hand,
will be accepted silently, converting the `1.4` in to the specified type, `int`. We will have a variable of `int` type with a value `1`. If we wrote `int i{1.4}`,
it is not a very sane thing to do. There are some possibilities:
- we wanted to write double, but wrote int by mistake
- we wanted to write 14 but typed an extra dot at the wrong place
- other equally weird scenarios, none of them is "exactly what we want"
It is useful if the compiler gives us an error in such situations. The brace `{`, `}` syntax for initialisation is therefore somewhat more restrictive, and
should be preferred. It also allows us to set variables to their default initial values:
```c++
unsignedlongcounter{};
```
In the above, we create an unsigned long integer type with its default initial value, $0$. Since we are telling the type directly, this can work, whereas `auto counter{}` would be meaningless, as `auto` infers the type from the initialiser expression. When using explicit typenames to declare a variable, we can completely skip the initial value, e.g., `double x;`. `x` is then created uninitialised. There are special cases where even such an expression will lead to `x` being nicely zero initialised, but it is much better to always initialise with a literal value or with the default value explicitly. It is not recommended to leave the initializer out for built in types. For user defined types, C++ offers so much control that the creater of that user defined type gets to decide whether `UserdefinedType X` should create `X` uninitialised, zero initialised or initialised to any other sensible value.
%% Cell type:markdown id: tags:
## Using variables in expresions
If you write an expression using variable names, when evaluating (finding the value of) that expression, the variable names are substituted by their values.
Let's also note here that the symbol `=` in C++ means "assign". `X = Y` means take the value of the entity on the right and put it in the object with the name `X`. We are not asking whether `X` is equal to `Y`. For that comparison, like in many other languages, C++ uses `==`. If `X` is an integer, `5 == X` is always a valid expression, which is either true or false, but `5 = X` is syntactically meaningless, because you can not give a new value to $5$.
%% Cell type:markdown id: tags:
Quiz: What does `X = X + 1` mean ?
A. X cancels out on both sides and we have 0 equals 1
B. Whatever value X had before, increase it by 1
%% Cell type:markdown id: tags:
**Aside:** Historically, the choice of `=` for assignment (so that we have to type `==` when we want to check if two values are equal) goes back to a time before keyboard inputs and the Fortran language. It is a small shift to get used to, and no other token proposed for this very important operation, `:=`, `->`, `<-` etc. consist of a single character. In a typical C++ program, you will need assignment far more often than equality comparison, so that most programmers support this notation.
%% Cell type:markdown id: tags:
## Static typing
C++ is a "statically typed" language. Every time you bring a variable to existence to store something for a while, the compiler must be given a "type" for the values to be stored in it. In C++, the type is associated with a variable at the point it enters the code, through the "declaration". Throughout the life of that variable, it has the same type. It is not possible for a variable to change from being a string to being an integer. This point is important to understand for people who have programmed in python and are learning C++. The following is perfectly acceptable in python:
```python
# Python code to make a contrast
v="1"+"2"+"3"
print(v)
v=int(v)
```
If you run the above in a Jupyter notebook with a python3 kernel, you can verify the type of the variable `v` after each statement, using `type(v)` after each statement. You will see that `v` becomes an `int` after the last line above, whereas it starts out being a `str`. Such a thing will never be allowed in C++. The type of an entity is set when you declare it. It can not be changed by assignment. Allowing that would make reasoning about properties of a variable extremely hard: the type of a variable in one given line of code might end up depending on the code path taken to reach a particular line. To emphasize it further, consider this:
```python
# Python code to make a contrast
v="1"+"2"+"3"
foriinrange(10):
print(v)
iflen(v)>7:
v=int(v)
v=v+v
```
How the variable `v` is changed in the line `v = v + v` depends on the type of `v`. If it is a string, "123", it will become "123123", by concatenation. If is an integer, 123, it will become 246. We have to consider the immediate history of the variable to know what will actually happen in the line `v = v + v`. Statically typed languages avoid this entire category of problems by making types of variables constants. In any C++ code, a variable which starts out as an object of type T will end its life as an object of type T.
%% Cell type:markdown id: tags:
Quiz: Does the variable declaration with `auto` affect the fact that C++ has compile time fixed types?
%% Cell type:markdown id: tags:
The answer is no. Because in every version of the variable declaration with `auto` you saw above, the compiler can infer and set a unique type for the object you create. The `auto` keyword just spares the human programmer from typing everything by hand, but `double X{1.5};` and `auto X{1.5};` are completely equivalent.
%% Cell type:markdown id: tags:
### Type promotion in expressions
The usual `+`, `-`, `*` and `/` operators can be used with the integral and floating point number types. For integral types, `/` is the integer division and `%` is the remainder operator. If the operands to the arithmetic operation have the same type, i.e., `int + int`, the result also has the same type. If the operands have different types, the smaller *type* is promoted to the larger *type*, so that the result ends up having the larger type (see below). Floating point types are to be considered "larger" for this purpose.
%% Cell type:code id: tags:
``` C++
auto m = 99;
auto y{23}, z{44};
auto x = y * z + m; // y * z + m is an int, like y, z and m. Therefore, x is an int
auto low = 10;
auto high = 13;
auto lq = high / low; // lq is of type int, and a value 1
auto lr = high % low; // lr is of type int and value 3
auto BIGVAL = 5000'000'000UL; // unsigned long
auto fq = 13.0 / low; // fq is type double and value 1.3
float f = 1.2;
auto v = f * BIGVAL; // v is a float
```
%% Cell type:code id: tags:
``` C++
low & high
```
%% Cell type:markdown id: tags:
The above result is because `low` is 1010 in binary, and `high` is 1101, and `low & high` is **bitwise and** operation on the two integers. Similarly there is *bitwise or*`|`, *bitwise exclusive or*`^` and *bitwise not*`~`.
**Exercise:** Use the showbits() function from earlier to print out the bits of `low`, `high` and `low & high`, to understand what the bitwise `&` operator did. Do the same for the other bitwise operators.
%% Cell type:markdown id: tags:
There are also logical operators working on boolean values *true* and *false*, `&&`, `||`, `^^` and `!`. For variables which are not of type `bool`, when they are used with these logical operations, they are converted to a true/false value. Typically, values were all bits are zero are interpreted as false, and every other value is regarded as true. Guess the output of the following cell before executing it!
%% Cell type:code id: tags:
``` C++
// This is valid code, although the cling interpreter seems to have problems with it.
low && high
```
%% Cell type:markdown id: tags:
Logical operators in C++ can be written alternatively as `and`, `or` etc. For example, `a && b` could be written as `a and b`, `a or b` can be written as `a or b`, for greater readability. Full table for alternative tokens can be found [here](https://en.cppreference.com/w/cpp/language/operator_alternative).
%% Cell type:code id: tags:
``` C++
low and high
```
%% Cell type:markdown id: tags:
### Fused assignment and arithmetic/bitwise operators
It is possible to shorten `a = a + b` to `a += b`, `a = a - b` to `a -= b` and so on for all arithmetic operators. The same can be done for bitwise operators, i.e., `a = a & b` can be written as `a &= b` and so on. By contrast, there is no fused assignment for logical `&&`, `||` operators.
%% Cell type:code id: tags:
``` C++
low &= high ;
low
```
%% Cell type:markdown id: tags:
As we have discussed before, `==` serves as the comparison for equality in C++. The expression `a == b` is either true or false, depending on the values of `a` and `b`. The expression `a = b` is quite different: this means "store the value of b in a \[converting the value to the type of a if necessary\]". Other comparison operators are `!=` for "not equal to", and `<`, `>`, `<=`, `>=`, for "less than", "greater than", "less than or equal to" and "greater than or equal to" respectively.
%% Cell type:markdown id: tags:
### Increment and decrement operators
Like C, the operators `++` and `--` are used to indicate increment and decrement of a given variable, respectively. Both operators can be placed before or after the variable name for built in numeric types, `int`, `long`, `short`, `char` (and unsigned counterparts), `double`, `float` and `long double`. For user defined types, the meaning of the operator is decided by the implementor of that type. In all cases, there is a slight difference in the meaning between the prefix (e.g., `++i`) and postfix (e.g., `i++`) forms. To understand that, we have to start recognising "value of an expression" and "side effects of an expression".
`2 * x` is an expression involving the variable `x`. If `x` is currently `1`, `2 * x` has a *value* of `2`. When we evaluate the expression `2 * x` there are no side effects on `x`. After we have evaluated `2 * x`, `x` is still `1`. But, `x = 4` is also an expression. Its value is the value of `x` after the assignment, i.e., `4`. But it also has a side effect. Whatever `x` was before, it has now changed to a value of `4`. Prefix and postfix `++` (or `--`) operators have the same side effect. They both increase a variable by 1. But they have different values. The value of a prefix `++` operator is the already incremented value, whereas the value of the postfix operator is the old value. With examples:
```c++
autoi{0};
autok=++i;// i is now 1, but so is k
cout<<i<<"\t"<<k<<"\n";
autol=i++;// i is now 2, but l is 1
cout<<i<<"\t"<<l<<"\n";
```
%% Cell type:code id: tags:
``` C++
{
auto i{0};
auto k = ++i; // i is now 1, but so is k
cout << i << "\t" << k << "\n";
auto l = i++; // i is now 2, but l is 1
cout << i << "\t" << l << "\n";
}
```
%% Cell type:markdown id: tags:
- Pre-in(de)crement : In(de)crease and use the new value
- Post-in(de)crement : In(de)crease, but use the old value
Typically, the implementation of the post-increment or post-decrement operators involves more steps (e.g., make a backup, change, return the backup value) than the prefix versions (change, return the changed value). It is therefore a good habbit to cultivate to always write `++i` by default, and only do `i++` when you have a compelling reason to do so.
**Aside:** The name "C++", for our language, is therefore not the best choice! It means, in its own syntax, "increment C, and then use the old C"! It should have been "++C"!