import numpy
Think Vector
Dot product
Many compute kernels include loops. Let's take the dot (or scalar) product, for example, for two vectors v and w:
We can write this as
# Create a random number generator that uses the Mersenne-Twister bit generator.
rng = numpy.random.Generator(numpy.random.MT19937())
v = rng.uniform(size=1000)
w = rng.uniform(size=1000)
s = 0
for i in range(len(v)):
s += v[i] * w[i]
print("v·w = %.3f" % s)
We can also think of this as a map operation followed by a reduction:
Map
A map operation applies an operation (a function) to each element of an array separately.
# Maps can be expressed using list comprehensions
vw = numpy.array([v[i] * w[i] for i in range(len(v))])
Reduce
A reduction takes the elements of an array and reduces them to a single value. A typical example of a reduction is the summation over all elements.
# Now let's do the reduction
s = 0
for i in range(len(v)):
s += vw[i]
We didn't have to use numpy.array for these operations.
print("v·w = %.3f" % s)
We are using numpy arrays, which define a number of operations on whole arrays, so we can think of them as vectors. The two operations above can be rewritten.
Map-Reduce with NumPy
Map-reduce is a very common pattern. We first transform the elements of an array and then reduce the resulting array to a single element.
# Map
vw = v * w
# Reduce
s = vw.sum()
print("v·w = %.3f" % s)
The dot product is such a common operation that numpy has a special function for it.
# Everything in one call
s = v.dot(w)
print("v·w = %.3f" % s)
Since Python 3.6 this can also be written as v@w
.
Exercise:
Let's go back through the cells and and add the %%timeit
magic at the top of each cell. Start with cell 3 and work your way back down.
The last one is not just the simplest, it's also the fastest.
ufunc
Functions that act on one array (or several arrays of the same shape) and return a vector of the same shape are called ufuncs
. When we wrote vw = v * w, we executed the ufunc \mul\. Functions, like dot
that have a different output shape than input shape are called generalized ufuncs
.
Stencils
%matplotlib inline
import matplotlib.pyplot as plt
Stencils use the value of the neighboring elements to update the current position, e.g.,
A system with fixed boundaries
Let's look at an example. We start with a 2d array of random values and fix the left boundary to a value of 0 and the right boundary to a value of 1. We do not want to change these boundary values. The top and bottom boundaries are connected so that our system forms a cylinder (periodic boundary conditions along y).
A_orig = numpy.random.random((10, 10))
A_orig[:, 0] = 0
A_orig[:, -1] = 1
A = A_orig.copy()
B = A.copy()
Explicit nested for loop
for i in range(A.shape[0]):
for j in range(1, A.shape[1] - 1):
B[i, j] = 0.25 * (A[(i + 1) % A.shape[0], j] + A[i - 1, j]
+ A[i, (j + 1)] + A[i, j - 1])
plt.subplot(1, 3, 1)
plt.imshow(A, interpolation="nearest")
plt.subplot(1, 3, 2)
plt.imshow(B, interpolation="nearest")
plt.subplot(1, 3, 3)
plt.imshow(numpy.abs(A-B), interpolation="nearest")
print("|A-B| = %.3f" % numpy.linalg.norm(A-B))
Now, we assign B to A and repeat. Note that |A-B| becomes smaller with each iteration unless you happen to run into an oscillating minimum.
A = B.copy()
The operation for each i and j is independent since A is only read. This is an ideal candidate for parallel calculations and we can write it in form of vector operations.
Element-wise operation
A = A_orig.copy()
B = numpy.empty_like(A)
B = 0.25*(numpy.roll(A, 1,axis=0) + numpy.roll(A, -1, axis=0) + numpy.roll(A, 1,axis=1) + numpy.roll(A, -1,axis=1))
# Keep our boundaries at a fixed value
B[:, 0] = 0
B[:, -1] = 1
plt.subplot(1, 3, 1)
plt.imshow(A, interpolation="nearest")
plt.subplot(1, 3, 2)
plt.imshow(B, interpolation="nearest")
plt.subplot(1, 3, 3)
plt.imshow(numpy.abs(A-B), interpolation="nearest")
print("|A-B| = %.3f" % numpy.linalg.norm(A-B))
Exercises:
a) use %%timeit
to compare the for
-loop and the numpy.roll()
versions of the stencil update
b) Combine all the step of a single iteration in one cell, execute it multiple times, and watch how the system proceeds. Does the norm of the difference get smaller?
Tip: If you use <Ctrl+Enter> to execute the cell, the cursor stays on the cell and you can execute again right away.
c) Create a loop that terminates if the sum of the differences becomes smaller than some small value epsilon or the maximum number of allowed iterations has been reached. Plot the final state.
Programming exercise Mandelbrot
The Mandelbrot set is the set of points c in the complex plane for which
does not diverge.
The series diverges if for any i.
Since it is impracticable to calculate an infinite number of iterations, one usually sets an upper limit for the number of iterations (the maximum of i ), for example, 20. To get pretty pictures, we can map the value of i for which is true for the first time to a color.
Interlude complex numbers
For real numbers is not defined, but if we just imagine (i.e., define) it to be we get complex numbers.
Complex numbers have two components a real and an imaginary part. The imaginary part is marked by i, e.g., 0 + 0i, 1 + 0.5i, 0.4 + 2i, or 0 + 3.14i. You can think of these as coordinates in a plane where the x-axis represents the real and the y-axis represents the imaginary part.
Complex numbers in a plane
C = numpy.array([0 + 0j, 1 + 0.5j, 0.4 + 2j, 3.14j])
plt.scatter(C.real, C.imag)
plt.xlabel("Real")
ax = plt.ylabel("Imaginary")
Complex arithmetic
If we want to add two complex numbers, we add the real and imaginary parts separately just like we would do for vectors, e.g.,
(1 + 0.5i) + (0.57 + 3.14i) = (1.57 + 3.64i).
Multiplying to complex numbers is more interesting:
(1 + 0.5i)(0.57 + 3.14i) = 0.57 + 3.14i + 0.285i + 1.57i2.
Since this becomes 0.57 + 3.14i + 0.285i - 1.57 = -1 + 3.425i.
Lucky for us Python knows how to work with complex numbers. It uses 1j
to represent the imaginary number, e.g., c = 0.4 + 2j
.
print(f"(1 + 0.5j) + (0.57 + 3.14j) = {(1 + 0.5j) + (0.57 + 3.14j):.3f}\n(1 + 0.5j) * (0.57 + 3.14j) = {(1 + 0.5j) * (0.57 + 3.14j):.3f}.")
In short, we can use complex numbers just like any other numerical type. Here is a function that calculates the series and return the iteration at which :
Escape time algorithm
def escape_time(p, maxtime):
"""Perform the Mandelbrot iteration until it's clear that p diverges
or the maximum number of iterations has been reached.
Parameters
----------
p: complex
point in the complex plane
maxtime: int
maximum number of iterations to perform before p is considered in
the Mandelbrot set.
"""
z = 0j # This is a complex number in Python
for i in range(maxtime):
z = z ** 2 + p
if abs(z) > 2:
return i
return maxtime
M = [[escape_time(re + im, 40) for re in numpy.linspace(-2.2, 1, 640)] for im in numpy.linspace(-1.2j, 1.2j, 480)]
plt.imshow(M)
Now, it's your turn. Rewrite the above algorithm using NumPy vector operations.
Hints:
You can use
numpy.meshgrid()
to generate your 2D array of points.If you have
x = numpy.array([-1.1, -1, 0, 1.2])
andy = numpy([-0.5j, 0j, 0.75j])
and callXX, YY = numpy.meshgrid(x, y)
, it returns two arrays of shape 3 by 4. The first one contains 3 rows where each row is a copy of x. The second one contains 4 columns where each colomn is a copy of y.Now you can add those two array to get points in the complex plane.
P = XX + YY
.You somehow need to mask the points that already diverged in future iterations.
You don't have to put this in a function