From a987984274eedfd30ef722543fc7dd632291cc4d Mon Sep 17 00:00:00 2001
From: Sandipan Mohanty <s.mohanty@fz-juelich.de>
Date: Thu, 11 May 2023 09:29:52 +0200
Subject: [PATCH] Add day 4 notebooks

---
 notebooks/Containers.ipynb | 669 +++++++++++++++++++++++++++++++++++++
 notebooks/STLMisc.ipynb    | 353 +++++++++++++++++++
 2 files changed, 1022 insertions(+)
 create mode 100644 notebooks/Containers.ipynb
 create mode 100644 notebooks/STLMisc.ipynb

diff --git a/notebooks/Containers.ipynb b/notebooks/Containers.ipynb
new file mode 100644
index 0000000..12398da
--- /dev/null
+++ b/notebooks/Containers.ipynb
@@ -0,0 +1,669 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "exciting-honey",
+   "metadata": {},
+   "source": [
+    "# Containers from the standard template library\n",
+    "\n",
+    "There are many ways to represent \"collections\" of objects in C++ : names of people, all prime numbers up to an integer, tables of merchandise with the associated prices ... The C++ standard library already gives us containers to cover a large number of use cases. \n",
+    "\n",
+    "All containers in C++ have a compile time fixed element type. The type of the elements is uniform through the whole the container. That type is highly customizable, but it can not vary from element to element, and it can not be changed after creating the container. This is because the element type becomes part of the name of the container-type as you will see.\n",
+    "\n",
+    "## `std::vector`\n",
+    "\n",
+    "The most frequently used container is `std::vector` representing a sequence of values stored consecutively in memory.\n",
+    "To use it, we need to include the header `vector`, i.e., write `#include <vector>` at the top somewhere. After that, we can create variables of the `vector` type."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "rubber-queue",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#pragma cling add_include_path(\"/p/project/training2312/local/include\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "emotional-texture",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#include <iostream>\n",
+    "#include <vector>\n",
+    "// ...\n",
+    "std::vector V{1, 2, 3, 4, 5, 6, 7, 8, 9};"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "lightweight-expression",
+   "metadata": {},
+   "source": [
+    "The elements of a vector are accessed using the indexing operator `[]`. The first element has index 0, the second has index 1 etc."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "contemporary-percentage",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "std::cout << V[0] << \"\\n\";"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "subjective-afghanistan",
+   "metadata": {},
+   "source": [
+    "The size of the vector is available using the member function `size()`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "purple-taylor",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "std::cout << \"There are \" << V.size() << \" elements in the vector\\n\";"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "entire-rochester",
+   "metadata": {},
+   "source": [
+    "## Exercise:\n",
+    "You know (i) how to access an element. (ii) how many elements there are. (iii) how to write a for loop. \n",
+    "\n",
+    "Write a `for` loop to print all the elements of the vector."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "aging-province",
+   "metadata": {},
+   "source": [
+    "For a non-constant `std::vector` object, element access using the indexing operator `[]` produces an L-value reference. It is possible to use, for example, `V[1]` on the left hand side in an assignment expression, although `V[1]` is not a name, but the result of an expression. The class `vector` is designed in such a way that the indexing produces an L-value reference."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "genuine-interface",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "for (auto i = 0UL; i < V.size(); ++i) {\n",
+    "    V[i] = i * i * i;\n",
+    "}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "phantom-lyric",
+   "metadata": {},
+   "source": [
+    "A very common pattern with vectors, or for that matter with any C++ container, is to perform an action for each element. This is perfectly suited for the \"range based for loop\"."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "heated-paint",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "for (auto&& element : V) {\n",
+    "    std::cout << element << \"\\n\";\n",
+    "}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "blank-narrow",
+   "metadata": {},
+   "source": [
+    "The range based for loop can be used when, what we want to do with the elements depends on the elements but does not depend on the index explicitly. Like in this example. Using `auto&&` to declare the loop variable, so that for each loop iteration, we create and work with a reference to an existing element in the vector rather than creating a copy. If we really need a copy, one can simply replace the `auto&&` with an `auto`."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dried-cliff",
+   "metadata": {},
+   "source": [
+    "Above, when we created the vector with `std::vector V{1,2,3,4,5,6,7,8,9}`, we made a `vector` of `int` elements of size 9, storing the values 1 through 9. The type of the elements stored in a vector variable, like `V` here, is fixed at compile time. In this case it is `int` because we initialized it with a sequence of `int` variables. If the sequence of values in the initializer does not have a uniform type, we have an error!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "pressed-madness",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "// std::vector notV{1, 2.0, 3}; // If you uncomment this, you may have to restart the kernel"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "coupled-stationery",
+   "metadata": {},
+   "source": [
+    "Once initialized like that, `V` can only store `int`s. Before C++17, one had to explicitly state the type of the elements, something like this:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "approved-ethiopia",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "std::vector<int> precxx17_V{1, 2, 3, 4, 5, 6, 7, 8, 9};"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "agricultural-yesterday",
+   "metadata": {},
+   "source": [
+    "That syntax is still valid, and is still useful in C++20, e.g., when you want to create a vector of integers without filling it with anything.\n",
+    "\n",
+    "The further down the memory lane you decide to go, the uglier the syntax you have to work with:\n",
+    "```c++\n",
+    "std::vector<int> precxx11_V(9);\n",
+    "for (int i=0; i<9; ++i) precxx11_V[i] = i;\n",
+    "```\n",
+    "Let's stay in the present."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "selective-tyler",
+   "metadata": {},
+   "source": [
+    "The angular brackets in typenames like `vector` contain the so called **template arguments**. In this instance, it specifies what type of element our vector is going to contain. Even when we do not explicitly specify the element type, like for the vecotr `V` above, it is still there. The compiler is doing some extra work and filling out some text for us: it looks at the initializer list, deduces the type of the elements, augments our attempt to initialize `std::vector` by replacing that with `std::vector<T>` with `T` being the inferred type of the elements. You can see that it has created vectors of exactly the same type for `V` and `precxx17_V` above: "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "covered-ballet",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "typeof(V);\n",
+    "typeof(precxx17_V);"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "tracked-algebra",
+   "metadata": {},
+   "source": [
+    "The `std::allocator<int>` you see in the output above is a utility `std::vector` uses to manage dynamic memory. The important point to remember is that the following are equivalent.\n",
+    "    \n",
+    "```c++\n",
+    "std::vector A{1, 2, 3, 4, 5};\n",
+    "std::vector<int> B{1, 2, 3, 4, 5};\n",
+    "```\n",
+    "The true name is in both cases, `std::vector<int, std::allocator<int> >`.\n",
+    "    \n",
+    "The compiler just does a bit more for us in the more recent language versions. The compiled binary code is identical. Consider the two following declarations: "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "median-spirit",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "std::vector A{1, 2, 3};\n",
+    "std::vector B{1.0, 2.0, 3.0};\n",
+    "typeof(A);\n",
+    "typeof(B);"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "lovely-avenue",
+   "metadata": {},
+   "source": [
+    "Superficially, we specified only `std::vector` as the types for both `A` and `B`. But there is no such type as just plain `std::vector`. `std::vector` is what is called a class template. We should be writing the full name of the type as printed above. That could get tiring very quickly! That's why the compiler helps us. The mechanism by which the compiler infers the type of the elements of a vector here is called **Class Template Argument Deduction**, or CTAD. All standard library containers are CTAD enabled. Later in the course, you will learn how to teach the compiler how to do CTAD for your own hand written classes.\n",
+    "\n",
+    "CTAD can not work unless we have a non-empty initializer list. To create empty vectors, we have to explicitly write what type of elements a vector will contain. That can not be left to be determined at program execution time.\n",
+    "\n",
+    "**Take away:** there is always a fixed element type, even when you don't write it explicitly, and it is burried in the true name of the vector (or any other standard container)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "interesting-alexandria",
+   "metadata": {},
+   "source": [
+    "Although the type of the elements is a compile time constant, the size of a `vector` is not. They can grow at program execution time. In the following example, we create a vector of integers, and fill it with with new values in a loop as long as the new value is less than 1000. "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "tired-lincoln",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "std::vector<double> nums; // empty vector, but the element type is double\n",
+    "auto curr = 1.;\n",
+    "while (curr < 1000.) {\n",
+    "    nums.push_back(curr);\n",
+    "    curr *= 1.1;\n",
+    "}\n",
+    "std::cout << \"Content of the nums vector ...\\n\";\n",
+    "for (auto&& m : nums) std::cout << m << \"\\n\";\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dated-elder",
+   "metadata": {},
+   "source": [
+    "Above, we iterated over the final `vector` using a range based for loop. We can do it in two other ways. One is of course the plain `for` loop over valid indexes. The other is using the so called \"iterators\". Iterators are usually defined for the container types. You can imagine them to be like little arrows which can point to one element along the sequence at any time. If you increment (decrement) the iterator, it moves to point to the next(previous) element. To see the element an iterator points to, you \"dereference\" an iterator. Containers usually have `begin()` and `end()` defined, so that `container.begin()` gives us an iterator pointing to the first element in the sequence, and `container.end()` gives us an iterator pointing to a non-existing element one past the last element in the sequence. They can be used like this:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "olympic-scotland",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "for (auto it = nums.begin(); it != nums.end(); ++it) {\n",
+    "    std::cout << *it << \"\\n\";\n",
+    "}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "aerial-budapest",
+   "metadata": {},
+   "source": [
+    "Containers define their own iterator types, and it is possible to iterate over the elements using the iterators. `std::vector` is special in the sense that it also gives us indexed access to elements, so that a plain `for` loop is possible.\n",
+    "\n",
+    "```c++\n",
+    "    for (auto i = 0UL; i < nums.size(); ++i)\n",
+    "        std::cout << nums[i] << \"\\n\";\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "optimum-assumption",
+   "metadata": {},
+   "source": [
+    "We saw how to insert elements at the back of a vector. It is also possible to insert anywhere in the middle. To insert into a vector, we need an iterator to mark a position to the right of the insertion point. Say, we want to insert a new element $-1$ into our first vector `V`, before the first element. We can get an iterator to the first element using the function `V.begin()`. The insertion will look like this:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "mexican-works",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "V.insert(V.begin(), -1);"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "union-checkout",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "// To verify...\n",
+    "for (auto&& elem: V) std::cout << elem << \"\\n\";"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "confused-bulgarian",
+   "metadata": {},
+   "source": [
+    "How about inserting $100$ before the 5th element ? The fifth element is reached by advancing the iterator four times from the beginning. So, ... "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "italian-happiness",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "auto pos = V.begin();\n",
+    "std::advance(pos, 4);\n",
+    "V.insert(pos, 100);"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "sixth-generic",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "// To verify...\n",
+    "for (auto&& elem: V) std::cout << elem << \"\\n\";"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "gross-london",
+   "metadata": {},
+   "source": [
+    "## `std::array`\n",
+    "\n",
+    "`std::array` is the proper way to make **fixed length arrays** in C++. In many ways it is like `std::vector`, but it does not have any operations which can change its size. No `push_back`, no insertions or deletions. Not just that, its size must be known to the compiler. This is enforced by requiring that the size is a template parameter, i.e., goes inside the angular brackets. Anything that goes into those angular brackets must be known at compilation time. This is how we create arrays in C++:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "executed-candy",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#include <array>\n",
+    "std::array<double, 10> masses; // presumably to store masses of 10 entities."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "inclusive-exemption",
+   "metadata": {},
+   "source": [
+    "If we have an explicit initializer list, the CTAD mechanism discussed above can infer both the type and the number of elements in an array. So, in real life, we can write the above as:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "helpful-albany",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "std::array masses{0.1, 0.2, 0.1, 0.1, 0.1, 0.8, 0.1, 0.2, 0.1, 0.1};"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "valid-drain",
+   "metadata": {},
+   "source": [
+    "Like a vector, you can access elements using the `[]` indexing operator, and the size using the `size()` member function. The 3 ways of iterating over a vector described above work for `std::array` as well."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "institutional-sauce",
+   "metadata": {},
+   "source": [
+    "## `std::list`\n",
+    "Not all containers allow index access. For instance, `std::list` is a doubly linked linked-list type. Let's create a linked list and a vector, fill them both with the same elements, and then compare the locations in memory of the elements in both cases:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ethical-attachment",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#include <list>\n",
+    "#include <vector>\n",
+    "\n",
+    "std::vector<double> vd;\n",
+    "std::list<double> ld;\n",
+    "for (auto x = 0.0; x < 1.0; x += 0.1) {\n",
+    "    vd.push_back(x);\n",
+    "    ld.push_back(x);\n",
+    "}\n",
+    "std::cout << \"Elements and their memory locations for vector: \\n\";\n",
+    "for (auto&& element : vd) std::cout << element << \"\\t\" << (size_t)(&element) << \"\\n\";\n",
+    "std::cout << \"Elements and their memory locations for list: \\n\";\n",
+    "for (auto&& element : ld) std::cout << element << \"\\t\" << (size_t)(&element) << \"\\n\";"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dangerous-aggregate",
+   "metadata": {},
+   "source": [
+    "You see that the elements of the vector are placed very regularly, in memory locations separated by 8 bytes, which is the size of a double. This means, the elements are stored in adjacent locations in memory with no gaps. In the case of the list, the elements are stored scattered in different areas of memory. The design of the list is very very different than the vector, although the standard library is written in such a way that code can be written in a generic way so as to work for both. Ask in the class for an in-depth discussion of what linked lists are, if you are curious. For now, just note that providing an indexing operator `[]` for a list would be possible, but very slow.\n",
+    "\n",
+    "For most part, you can pretend that `std::list<T>` does not exist and use `std::vector<T>` for all your needs for sequencial containers. For small arrays whose size does not change very often, e.g., an array storing the names of the week days, the number of neighbour sites in a lattice etc., one could use `std::array`, to emphasize the fact that the number of elements in those arrays is not going to change.\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "sorted-spare",
+   "metadata": {},
+   "source": [
+    "## Associative containers `std::set`, `std::map`, `std::unordered_set`, `std::unordered_map`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "forward-western",
+   "metadata": {},
+   "source": [
+    "The containers `std::set` and `std::unordered_set` store one instance of each element. For instance, imagine that you are trying to generate a list of all the people who logged in to a computer during a day, irrespective of how often they logged in. You start with a vector or list or each username from the log. Just put all of them in an `std::set`. It will ignore a new insertion if that element is already there, and add it if it isn't."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ready-neighborhood",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#include <set>\n",
+    "std::vector<std::string> allusernames{\"gandalf\", \"cirdan\", \"galadriel\", \"gandalf\", \"arwen\", \"gandalf\"};\n",
+    "std::set<std::string> uniquenames;\n",
+    "for (auto&& elem : allusernames) uniquenames.insert(elem);\n",
+    "for (auto&& elem: uniquenames) std::cout << elem << \"\\n\";"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "floppy-tablet",
+   "metadata": {},
+   "source": [
+    "Notice that not only does `std::set` keep only unique entries, it sorts them too. `std::set` manages to keep track of unique elements by always maintaining a sorted sequence of elements. `std::unordered_set` is similar in that it also keeps only unique elements, but (shockingly!) it does not keep them in order! In the cell below, repeat the experiment above using `std::unordered_set` rather than `std::set`!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "certified-skiing",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "// Repeat the above, but with std::unordered_set instead of std::set\n",
+    "#include <unordered_set>\n",
+    "std::vector<std::string> allusernames{\"gandalf\", \"cirdan\", \"galadriel\", \"gandalf\", \"arwen\", \"gandalf\"};\n",
+    "std::unordered_set<std::string> uniquenames;\n",
+    "for (auto&& elem : allusernames) uniquenames.insert(elem);\n",
+    "for (auto&& elem: uniquenames) std::cout << elem << \"\\n\";"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "hollywood-cliff",
+   "metadata": {},
+   "source": [
+    "The containers `std::map` and `std::unordered_map` are key -> value mappings. For instance, if we want to keep track of scores in a multi-player game. Conceptually, we will have something like\n",
+    "\n",
+    "| Team       | Points |\n",
+    "|-----------:|-------:|\n",
+    "| Griffindor | 203 |\n",
+    "| Hufflepuff | 212 |\n",
+    "| Ravenclaw  | 350 |\n",
+    "| Slytherin  | 363 |\n",
+    "\n",
+    "This is a mapping from strings to integers. If we had a vector where the indexes were strings instead of integral numbers, we could write something like `v[\"Griffindor\"] += 160`. That is the role of the containers `std::map` and `std::unordered_map`.\n",
+    "\n",
+    "```c++\n",
+    "std::map<std::string, int> score;\n",
+    "score[\"Griffindor\"] = 203;\n",
+    "score[\"Hufflepuff\"] = 212;\n",
+    "score[\"Ravenclaw\"] = 350;\n",
+    "score[\"Slytherin\"] = 363;\n",
+    "```\n",
+    "Let's try that ..."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "amino-category",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#include <map>\n",
+    "#include <iostream>\n",
+    "std::map<std::string, int> score;\n",
+    "score[\"Griffindor\"] = 203;\n",
+    "score[\"Hufflepuff\"] = 212;\n",
+    "score[\"Ravenclaw\"] = 350;\n",
+    "score[\"Slytherin\"] = 363;\n",
+    "\n",
+    "for (auto&& [key, value] : score) {\n",
+    "    std::cout << key << \"\\t:\\t\" << value << \"\\n\";\n",
+    "}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "veterinary-shanghai",
+   "metadata": {},
+   "source": [
+    "Notice that we created the `map` empty, and simply started assigning to particular keys. `map` is designed in such a way that when you access a key that was not previously present, you insert that key with the default value of the value type (0 for numeric types). We don't need to resize it beforehand or use special append operations like `push_back` for `vector`. This is convenient, but that comes at a price: if you now try to give Griffindor 160 more points and misspell ..."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "introductory-queens",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "score[\"griffindor\"] += 160;\n",
+    "for (auto&& [key, value] : score) {\n",
+    "    std::cout << key << \"\\t:\\t\" << value << \"\\n\";\n",
+    "}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "julian-speaking",
+   "metadata": {},
+   "source": [
+    "`map` simply assumed there is a new key called \"griffindor\" and created a new entry for it. Sometimes, we want to be strict about whether or not keys exist and only modify content rather than inserting a new element. You can do that by using the `at` function rather than the `[]` indexing operator in the lines where you want to prevent insertion..."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "incoming-closing",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#include <map>\n",
+    "std::map<std::string, int> score{{\"Griffindor\", 203}, {\"Hufflepuff\", 212}, {\"Ravenclaw\", 350}, {\"Slytherin\", 363}};\n",
+    "\n",
+    "score.at(\"Griffindor\") += 160; // This is OK\n",
+    "// score.at(\"griffindor\") += 160; // This will throw an exception\n",
+    "for (auto&& [key, value] : score) {\n",
+    "    std::cout << key << \"\\t:\\t\" << value << \"\\n\";\n",
+    "}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "verbal-messaging",
+   "metadata": {},
+   "source": [
+    "As with the `std::set`, `std::map` sorts its contents by the keys. Notice how we initialized the `map` with some key value pairs above. One should think of containers like `std::map` as a sequence of (key, value) pairs, and that is how we initialized above. In our range based for loop, we also get, for each position along the sequence a pair as the current element. The pair is represented by `std::pair`, and its parts are actually called `first` and `second`. Before C++17, one had to write the range for loop above a bit differently:\n",
+    "\n",
+    "```c++\n",
+    "for (auto&& entry : score) {\n",
+    "    std::cout << entry.first << \"\\t:\\t\" << entry.second << \"\\n\";\n",
+    "}\n",
+    "```\n",
+    "\n",
+    "The syntax is still valid, but the newer syntax `auto [key, value] : score` allows us to choose what we want to call the components, so that we can attach better names which suit our context.\n",
+    "\n",
+    "For a small map like the above, the entire content is easily reviewed. If on the other hand, we had a map (or any of the other associative containers)  with 10000 entries, and we want to check if it contains a particular key, we can do that as follows:\n",
+    "\n",
+    "```c++\n",
+    "score.contains(\"Ravenclaw\"); // This will give you true or false\n",
+    "```\n",
+    "\n",
+    "Unfortunately, we can not try that in this interpreter, as the `contains` function was only introduced in C++20.\n",
+    "\n",
+    "Before C++20, we used to check for the existence of a particular key as follows. It makes if you think about it, but the new `contains` function is a more direct expression of intent."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "heard-forum",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "score.find(\"Ravenclaw\") != score.end()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "heard-inside",
+   "metadata": {},
+   "source": [
+    "As you can imagine, `std::unordered_map` can also be used in place of the `map` in the above, except that it will not keep its contents sorted. The unordered map is a type of container commonly known as hash tables, whereas the `map` is based on a B-tree. The python equivalent is a dictionary.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "supreme-claim",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "YourCXX",
+   "language": "c++",
+   "name": "yourcxx"
+  },
+  "language_info": {
+   "codemirror_mode": "c++",
+   "file_extension": ".c++",
+   "mimetype": "text/x-c++src",
+   "name": "c++"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/notebooks/STLMisc.ipynb b/notebooks/STLMisc.ipynb
new file mode 100644
index 0000000..d61068c
--- /dev/null
+++ b/notebooks/STLMisc.ipynb
@@ -0,0 +1,353 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "29ef6566",
+   "metadata": {},
+   "source": [
+    "# More Standard library utilities"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1c38a3cb",
+   "metadata": {},
+   "source": [
+    "## Random Number generation\n",
+    "\n",
+    "The C++ standard library has a fairly elaborate and sophisticated machinery for generation of pseudo-random numbers. These are in the header `<random>`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cd20cf95",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#pragma cling add_include_path(\"/p/project/training2312/local/include\")\n",
+    "#include <iostream>\n",
+    "#include <random>\n",
+    "#include <map>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "73e9b17a",
+   "metadata": {},
+   "source": [
+    "Conceptually, the random number generation has two important components:\n",
+    "\n",
+    "    - A mathematical function to generate random uniformly distributed random bits\n",
+    "    - A mechanism to convert the randomly generated bits into random numbers distributed according to some distribution\n",
+    "    \n",
+    "The first part is the job of what are known as random \"engines\". These are mathematical sequences (therefore not really \"random\") where \n",
+    "\n",
+    "    - The correlation between subsequent values, second neighbour values, third, fourth ... neighbour values are (should be) all zero. (Independent)\n",
+    "    - The probability distribution of the generated values is the same along the generated sequence (Identically distributed)\n",
+    "    \n",
+    "The literature on pseudo-random number generation is quite rich with many high quality sequences. The C++ standard library random engines are based on *peer reviewed publications* on the subject at the time of the standardization of the random numbers in 2011. The available engines vary a lot in speed and randomness. The documentation of the available engines can be found at [the cppreference site](https://en.cppreference.com/w/cpp/numeric/random).\n",
+    "\n",
+    "Unless you have special requirements based on your research area, you could use the default random engine of your compiler vendor, imaginatively named `std::default_random_engine`. I have used the Mersenne Twister engine, available as `std::mt19937_64`, scientific research work on off-lattice all-atom Monte Carlo simulations of proteins for over 17 years. So, let's try it!\n",
+    "\n",
+    "How does one create a random number engine ?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c9bb7038",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "std::default_random_engine engine;"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "50f34f6c",
+   "metadata": {},
+   "source": [
+    "Now, just call this `engine` as if it were a function, with no arguments..."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4ad955b0",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "engine()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "263a6643",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "// again!\n",
+    "engine()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "da4fd69f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "// and again!\n",
+    "engine()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a2dc2ed1",
+   "metadata": {},
+   "source": [
+    "Every time you call `engine`, you get as the answer, the next number in its pseudo-random number sequence. The output is a 64 bit integer with 64 random bits. With this in hand, one could make a function so that instead of getting these long integers, we get uniformly distributed real numbers between 0 and 1. From there one can imagine generating different kinds of interesting distributions... normal distributions, Poisson distributions, Student's T distribution ... But that work has already been done for you!\n",
+    "\n",
+    "Here is the code you need to write to get an adaptor that converts the uniform random bits we got from the engine into 64 bit integers (`long`) distributed according to the Poisson distribution, with a mean $\\mu=9.5$.  "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "add4c131",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "const double mu = 9.5;\n",
+    "std::poisson_distribution<long> poisson{mu};"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a8a55864",
+   "metadata": {},
+   "source": [
+    "Our random variate with a Poisson distribution is now generated by calling `poisson` with the `engine` as its input!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2db2def8",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "poisson(engine)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "113e5b34",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "// again\n",
+    "poisson(engine)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ac50cab9",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "// and again!\n",
+    "poisson(engine)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f68e8f04",
+   "metadata": {},
+   "source": [
+    "Is it any good ? Let's create a histogram by using `std::map`, generate 100000 values and compare with what's expected from the Poisson distribution..."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "efad520e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "std::map<long, unsigned long> count; // our histogram\n",
+    "\n",
+    "const auto N = 1000000UL;\n",
+    "for (auto i = 0UL; i < N; ++i) {\n",
+    "    auto r = poisson(engine); \n",
+    "    // every time we execute the above, r will get a different value.\n",
+    "    // Those different values will be distributed according to the Poisson distribution. \n",
+    "    ++count[r]; // Increment the count for r\n",
+    "}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e6a96459",
+   "metadata": {},
+   "source": [
+    "We should compare the result with the Poisson distribution, $P(n) = \\frac{\\mu^n}{n!} \\exp(-\\mu)$. The first part, $\\frac{\\mu^n}{n!}$ can be calculated as $\\frac{\\mu}{n}\\times\\frac{\\mu}{n-1}\\times\\dots\\frac{\\mu}{1}$. The exponential can then be multiplied later."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a5ce457f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "// For comparison, we need the Poisson distribution function mapping (n, mu) -> prob\n",
+    "double poisfunc(long n, double mu) {\n",
+    "    double ans{1.0};\n",
+    "    while (n > 0) {\n",
+    "        ans *= (mu/n);\n",
+    "        --n;\n",
+    "    }\n",
+    "    ans *= std::exp(-mu);\n",
+    "    return ans;\n",
+    "}\n",
+    "// Although I strongly recommend that you use the new C++ function style, i.e.,\n",
+    "// auto poisfunc(long n, double mu) -> double\n",
+    "// the Jupyter-cling interpreted environment chokes and dies upon seeing that.\n",
+    "// Let's be kind to this tool here. But, when your tools can do real C++, you should\n",
+    "// prefer the new syntax."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c20a155a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "std::cout << \"Outcome\\tFrequency\\tProbability\\tExpected\\n\";\n",
+    "for (auto&& [outcome, frequency] : count) {\n",
+    "    std::cout << outcome << \"\\t\" << frequency << \"\\t\" \n",
+    "        << frequency * 1.0 / N << \"\\t\" << poisfunc(outcome, mu) << \"\\n\";\n",
+    "}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0ba5ad3c",
+   "metadata": {},
+   "source": [
+    "For those of you who are not familiar with random experiments, and are worried about the \"discrepancies\" above: the experiment above is akin to the following. You know that a balanced coin toss should have 50-50 chance for heads and tails. If you do the experiment 10 times, do you *always* get 5 heads and 5 tails ? No. You might get 7-3 result once, 2-8 another time 5-5 another and so on. As the number of experiments becomes very very large, your results approach the theoretical values more and more. We performed one set of experiments above. Repeating it again will give us another set of frequencies. The larger the sample count, the more our observed frequenies resemble the theoretical expectations. Try generating $1000000$ samples instead of $1000$ and see if the numbers are closer!"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1d71e6dc",
+   "metadata": {},
+   "source": [
+    "You can now imagine what it would be like to use Mersenne twister engine `std::mt19937_64` or the Ranlux engine `std::ranlux_48` instead of your system default. You can also imagine what you would do if, instead of a Poisson distribution, you need a Normal distribution (`std::normal_distribution`), binomial distribution (`std::binomial_distribution`), Chi squared distribution (`std::chi_squared_distribution`), Gamma distribution, Cauchy distribution etc. No library other than what is built into the C++ standard library is required for any of this."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6eb54088",
+   "metadata": {},
+   "source": [
+    "## Time measurements\n",
+    "\n",
+    "How long did it take our generator above to generate 1 million random numbers ? We frequently need to do such measurements, and the standard library has the required tools in [the chrono library](https://en.cppreference.com/w/cpp/header/chrono). Let's illustrate by timing our above mentioned loop and then explain."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1333f30d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#include <chrono>"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0816b36e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "{\n",
+    "    std::map<long, unsigned long> count; // our histogram\n",
+    "\n",
+    "    auto N = 1000000UL;\n",
+    "    auto t0 = std::chrono::steady_clock::now();\n",
+    "    for (auto i = 0UL; i < N; ++i) {\n",
+    "        auto r = poisson(engine); \n",
+    "        // every time we execute the above, r will get a different value.\n",
+    "        // Those different values will be distributed according to the Poisson distribution. \n",
+    "        ++count[r]; // Increment the count for r\n",
+    "    }\n",
+    "    auto t1 = std::chrono::steady_clock::now();\n",
+    "    std::cout << \"Generation of distributed random numbers took \"\n",
+    "        << std::chrono::duration<double>(t1 - t0).count() << \" seconds\\n\";\n",
+    "}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c50f7b82",
+   "metadata": {},
+   "source": [
+    "There are different kinds of clocks in the `chrono` library. For most of your purposes, you can use the steady clock. The syntax is intuitive. The `now()` function gives us a time point. Subtracting them gives us a time duration, for which we chose `double` as the representation. The default unit for durations is seconds. So, we got our value in seconds. If we wanted to output milliseconds, we would do this:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1af0b341",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "std::map<long, unsigned long> count; // our histogram\n",
+    "\n",
+    "auto N = 1000000UL;\n",
+    "auto t0 = std::chrono::steady_clock::now();\n",
+    "for (auto i = 0UL; i < N; ++i) {\n",
+    "    auto r = poisson(engine); \n",
+    "    // every time we execute the above, r will get a different value.\n",
+    "    // Those different values will be distributed according to the Poisson distribution. \n",
+    "    ++count[r]; // Increment the count for r\n",
+    "}\n",
+    "auto t1 = std::chrono::steady_clock::now();\n",
+    "std::cout << \"Generation of distributed random numbers took \"\n",
+    "    << std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count() << \" milliseconds\\n\";"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ad5932f7",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "YourCXX",
+   "language": "c++",
+   "name": "yourcxx"
+  },
+  "language_info": {
+   "codemirror_mode": "c++",
+   "file_extension": ".c++",
+   "mimetype": "text/x-c++src",
+   "name": "c++"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
-- 
GitLab