{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Think Vector\n", "\n", "<div class=\"dateauthor\">\n", "20 June 2022 | Jan H. Meinke\n", "</div>" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Dot product" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Many compute kernels include loops. Let's take the dot (or scalar) product, for example, for two vectors *v* and *w*:\n", "\n", "$$s = \\sum_{i=0}^n v_i w_i $$\n", "\n", "We can write this as" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import numpy" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" }, "tags": [] }, "outputs": [], "source": [ "# Create a random number generator that uses the Mersenne-Twister bit generator.\n", "rng = numpy.random.Generator(numpy.random.MT19937())" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "v = rng.uniform(size=1000)\n", "w = rng.uniform(size=1000)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "s = 0\n", "for i in range(len(v)):\n", " s += v[i] * w[i]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "print(\"v·w = %.3f\" % s)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "We can also think of this as a map operation followed by a reduction:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Map" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "A map operation applies an operation (a function) to each element of an array separately." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# Maps can be expressed using list comprehensions\n", "vw = numpy.array([v[i] * w[i] for i in range(len(v))])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Reduce" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "A reduction takes the elements of an array and *reduces* them to a single value. A typical example of a reduction is the summation over all elements." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# Now let's do the reduction\n", "s = 0\n", "for i in range(len(v)):\n", " s += vw[i]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "We didn't have to use numpy.array for these operations." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "print(\"v·w = %.3f\" % s)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "We are using numpy arrays, which define a number of operations on whole arrays, so we can think of them as vectors. The two operations above can be rewritten." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Map-Reduce with NumPy" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Map-reduce is a very common pattern. We first transform the elements of an array and then reduce the resulting array to a single element." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# Map\n", "vw = v * w\n", "# Reduce\n", "s = vw.sum()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "print(\"v·w = %.3f\" % s)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "The dot product is such a common operation that numpy has a special function for it." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# Everything in one call\n", "s = v.dot(w)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "print(\"v·w = %.3f\" % s)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Since Python 3.6 this can also be written as ``v@w``." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "**Exercise:**\n", "\n", "Let's go back through the cells and and add the ``%%timeit`` magic at the top of each cell. Start with cell 3 and work your way back down." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The last one is not just the simplest, it's also the fastest." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## ufunc" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Functions that act on one array (or several arrays of the same shape) and return a vector of the same shape are called ``ufuncs``. When we wrote vw = v * w, we executed the ufunc \\__mul\\__. Functions, like ``dot`` that have a different output shape than input shape are called generalized ufuncs." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Stencils" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Stencils use the value of the neighboring elements to update the current position, e.g., " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "$$a[i, j] = \\frac{1}{4} (a[i - 1, j] + a[i + 1, j] + a[i, j - 1] + a[i, j + 1])$$\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## A system with fixed boundaries" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Let's look at an example. We start with a 2d array of random values and fix the left boundary to a value of 0 and the right boundary to a value of 1. We do not want to change these boundary values. The top and bottom boundaries are connected so that our system forms a cylinder (periodic boundary conditions along y)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "A_orig = numpy.random.random((10, 10))\n", "A_orig[:, 0] = 0\n", "A_orig[:, -1] = 1" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "A = A_orig.copy()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "B = A.copy()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Explicit nested for loop" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "for i in range(A.shape[0]):\n", " for j in range(1, A.shape[1] - 1):\n", " B[i, j] = 0.25 * (A[(i + 1) % A.shape[0], j] + A[i - 1, j] \n", " + A[i, (j + 1)] + A[i, j - 1])" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "plt.subplot(1, 3, 1)\n", "plt.imshow(A, interpolation=\"nearest\")\n", "plt.subplot(1, 3, 2)\n", "plt.imshow(B, interpolation=\"nearest\")\n", "plt.subplot(1, 3, 3)\n", "plt.imshow(A-B, interpolation=\"nearest\")\n", "print(\"|A-B| = %.3f\" % numpy.linalg.norm(A-B))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Now, we assign B to A and repeat. Note that |A-B| becomes smaller with each iteration unless you happen to run into an oscillating minimum." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "A = B.copy()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "[Repeat](#Explicit-nested-for-loop)." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "The operation for each *i* and *j* is independent since *A* is only read. This is an ideal candidate for parallel calculations and we can write it in form of vector operations." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Element-wise operation" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "A = A_orig.copy()\n", "B = numpy.empty_like(A)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "B = 0.25*(numpy.roll(A, 1,axis=0) + numpy.roll(A, -1, axis=0) + numpy.roll(A, 1,axis=1) + numpy.roll(A, -1,axis=1))\n", "# Keep our boundaries at a fixed value\n", "B[:, 0] = 0\n", "B[:, -1] = 1" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "plt.subplot(1, 3, 1)\n", "plt.imshow(A, interpolation=\"nearest\")\n", "plt.subplot(1, 3, 2)\n", "plt.imshow(B, interpolation=\"nearest\")\n", "plt.subplot(1, 3, 3)\n", "plt.imshow(A-B, interpolation=\"nearest\")\n", "print(\"|A-B| = %.3f\" % numpy.linalg.norm(A-B))\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "**Exercises:**\n", "\n", "a) use `%%timeit` to compare the `for`-loop and the `numpy.roll()` versions of the stencil update\n", " \n", "b) Combine all the step of a single iteration in one cell, execute it multiple times, and watch how the system proceeds. Does the norm of the difference get smaller?\n", "\n", " Tip: If you use <Ctrl+Enter> to execute the cell, the cursor stays on the cell and you can execute again right away.\n", "\n", "c) Create a loop that terminates if the sum of the differences becomes smaller than some small value epsilon or the maximum number of allowed iterations has been reached. Plot the final state." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Programming exercise Mandelbrot" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "The Mandelbrot set is the set of points *c* in the complex plane for which" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "$$z_{i+1} = z_i^2 + c$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "does not diverge.\n", "\n", "The series diverges if $|z_i|>2$ for any *i*." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Since it is impracticable to calculate an infinite number of iterations, one usually sets an upper limit for the number of iterations (the maximum of *i* ), for example, 20. To get pretty pictures, we can map the value of *i* for which $|z_i|>2$ is true for the first time to a color." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "### Interlude complex numbers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For real numbers $\\sqrt{-1}$ is not defined, but if we just imagine (i.e., define) it to be $i:=\\sqrt{-1}$ we get complex numbers." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Complex numbers have two components a real and an imaginary part. The imaginary part is marked by *i*, e.g., 0 + 0i, 1 + 0.5i, 0.4 + 2i, or 0 + 3.14i. You can think of these as coordinates in a plane where the x-axis represents the real and the y-axis represents the imaginary part. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "### Complex numbers in a plane" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true }, "tags": [] }, "outputs": [], "source": [ "C = numpy.array([0 + 0j, 1 + 0.5j, 0.4 + 2j, 3.14j])\n", "plt.scatter(C.real, C.imag)\n", "plt.xlabel(\"Real\")\n", "ax = plt.ylabel(\"Imaginary\")\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "### Complex arithmetic" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" }, "tags": [] }, "source": [ "If we want to add two complex numbers, we add the real and imaginary parts separately just like we would do for vectors, e.g., " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" }, "tags": [] }, "source": [ "(1 + 0.5i) + (0.57 + 3.14i) = (1.57 + 3.64i)." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" }, "tags": [] }, "source": [ "Multiplying to complex numbers is more interesting: " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" }, "tags": [] }, "source": [ "(1 + 0.5i)(0.57 + 3.14i) = 0.57 + 3.14i + 0.285i + 1.57i<sup>2</sup>. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" }, "tags": [] }, "source": [ "Since $i^2 = -1$ this becomes 0.57 + 3.14i + 0.285i - 1.57 = -1 + 3.425i." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Lucky for us Python knows how to work with complex numbers. It uses `1j` to represent the imaginary number, e.g., `c = 0.4 + 2j`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" }, "tags": [] }, "outputs": [], "source": [ "print(f\"(1 + 0.5j) + (0.57 + 3.14j) = {(1 + 0.5j) + (0.57 + 3.14j):.3f}\\n(1 + 0.5j) * (0.57 + 3.14j) = {(1 + 0.5j) * (0.57 + 3.14j):.3f}.\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" }, "tags": [] }, "source": [ "In short, we can use complex numbers just like any other numerical type. Here is a function that calculates the series and return the iteration at which $|z| > 2$:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Escape time algorithm" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "def escape_time(p, maxtime):\n", " \"\"\"Perform the Mandelbrot iteration until it's clear that p diverges\n", " or the maximum number of iterations has been reached.\n", " \n", " Parameters\n", " ----------\n", " p: complex\n", " point in the complex plane\n", " maxtime: int\n", " maximum number of iterations to perform before p is considered in \n", " the Mandelbrot set.\n", " \"\"\"\n", " z = 0j # This is a complex number in Python\n", " for i in range(maxtime):\n", " z = z ** 2 + p\n", " if abs(z) > 2:\n", " return i\n", " return maxtime" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "M = [[escape_time(re + im, 40) for re in numpy.linspace(-2.2, 1, 640)] for im in numpy.linspace(-1.2j, 1.2j, 480)]\n", "plt.imshow(M)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Now, it's your turn. Rewrite the above algorithm using NumPy vector operations." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "**Hints:** \n", "\n", "* You can use `numpy.meshgrid()` to generate your 2D array of points.\n", " \n", " If you have `x = numpy.array([-1.1, -1, 0, 1.2])` and `y = numpy([-0.5j, 0j, 0.75j])` and call `XX, YY = numpy.meshgrid(x, y)`, it returns two arrays of shape 3 by 4. The first one contains 3 rows where each row is a copy of x. The second one contains 4 columns where each colomn is a copy of y.\n", " \n", " Now you can add those two array to get points in the complex plane. `P = XX + YY`.\n", " \n", "* You somehow need to mask the points that already diverged in future iterations.\n", "* You don't have to put this in a function" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" }, "tags": [] }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "HPC Python 2022 (local)", "language": "python", "name": "hpcpy22" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.6" } }, "nbformat": 4, "nbformat_minor": 4 }