{ "cells": [ { "cell_type": "markdown", "metadata": { "editable": true, "slideshow": { "slide_type": "slide" }, "tags": [] }, "source": [ "# Writing language bindings" ] }, { "cell_type": "markdown", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "<div class=\"dateauthor\">\n", "12 June 2024 | Jan H. Meinke\n", "</div>" ] }, { "cell_type": "markdown", "metadata": { "editable": true, "slideshow": { "slide_type": "slide" }, "tags": [] }, "source": [ "## Why bindings?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "* Use existing optimized code as a library\n", "* Avoid overhead of calling a binary via `popen` and communication via pipes\n", "* Cleaner code" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Python is often used as a \"glue language\" that combines calls to compiled programs and libraries to get the job done. Many libraries provide Python bindings that allow you to call them from Python. NumPy, for example, can call vendor-optimized routines to perform linear algebra operations at near-peak machine performance. Many other libraries, however, don't come with Python bindings.\n", "\n", "Python also allows you to call other programs from within Python and pipe their input and output. This often requires conversion of data, for example, into text and back, which can result in a lot of overhead.\n", "\n", "These are just two cases in which you might want to write your own Python bindings."
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "There are wrapper programs such as [swig](http://www.swig.org/), [sip](https://www.riverbankcomputing.com/software/sip/intro), or [binder](http://cppbinder.readthedocs.io/en/latest/) that can help you generate bindings, but you will frequently need to tune the generated wrappers.\n", "\n", "In this notebook we'll use [cffi](https://cffi.readthedocs.io/en/latest/) to wrap a single function with simple data types. We use [Cython][] to wrap more complex C/C++ functions and even C++ classes and compare this with [PyBind11](https://pybind11.readthedocs.io/en/latest/). Finally we use [f2py](https://docs.scipy.org/doc/numpy/f2py/) to generate bindings for a Fortran code and compare it to wrapping the same Fortran code with [Cython][].\n", "\n", "[Cython]: http://cython.org" ] }, { "cell_type": "markdown", "metadata": { "editable": true, "jp-MarkdownHeadingCollapsed": true, "slideshow": { "slide_type": "slide" }, "tags": [] }, "source": [ "## Preparations\n", "\n", "Before we can look at the bindings, we need to build our libraries. Open a terminal, switch to the tutorial directory and run\n", "\n", "```bash\n", "./build.sh\n", "```\n", "\n", "Wait until the build has finished and then continue with this notebook.\n", "\n", "**Tip:** You can open a terminal from within JupyterLab by going to File->New->Terminal. To get the right environment in a terminal `source $PROJECT_training2421/hpcpy24`." ] }, { "cell_type": "markdown", "metadata": { "editable": true, "slideshow": { "slide_type": "skip" }, "tags": [] }, "source": [ "## Ctypes" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "While ``ctypes`` is a Python standard module it is not very convenient and I'm not going to talk about it. You can look at the [documentation](https://docs.python.org/3/library/ctypes.html) instead if you want to learn more." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Foreign function interface" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Let's start with a simple function signature. This function is declared in the header file ``text_stats.h`` and is part of the library ``libtext_stats.so``." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "```c\n", "/** Counts the occurrences of a string in a file.\n", " * \n", " * @param filename name of file to open\n", " * @param word string to look for in file\n", " *\n", " * @return number of occurrences of word in file with filename\n", " */\n", "int word_frequency(char* filename, char* word);\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "A quick way to access this function is to use the module ``cffi``." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Foreign function interface\n", "### Calling word_frequency from Python" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "editable": true, "slideshow": { "slide_type": "fragment" }, "tags": [] }, "outputs": [], "source": [ "import cffi" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "editable": true, "slideshow": { "slide_type": "fragment" }, "tags": [] }, "outputs": [], "source": [ "ffi = cffi.FFI()\n", "ffi.cdef(\"\"\"\n", " int word_frequency(char* filename, char* word);\n", "\"\"\") # The definition is the same as in the header file.\n", "\n", "TS = ffi.dlopen(\"./code/text_stats/build/libtext_stats.so\")\n", "wc = TS.word_frequency(b\"test.txt\", b\"you\") # Need to use byte type in Python 3."
] }, { "cell_type": "markdown", "metadata": { "editable": true, "slideshow": { "slide_type": "notes" }, "tags": [] }, "source": [ "What if word_frequency had been written in Fortran?" ] }, { "cell_type": "markdown", "metadata": { "editable": true, "slideshow": { "slide_type": "skip" }, "tags": [] }, "source": [ "```Fortran\n", "function word_frequency(filename, word)\n", " implicit none\n", " character(len=*), intent(in) :: filename\n", " character(len=*), intent(in) :: word\n", " integer :: word_frequency\n", " ...\n", "end function word_frequency\n", "```" ] }, { "cell_type": "markdown", "metadata": { "editable": true, "slideshow": { "slide_type": "skip" }, "tags": [] }, "source": [ "We can access Fortran functions almost like C functions. The exact function name may differ, though. The default symbol \n", "when compiled with ifort or gfortran is ``word_frequency_``. This can be changed with the option `-fno-underscoring` (gcc) or `-assume nounderscore` (Intel).\n", "\n", "Here, we are using gfortran. " ] }, { "cell_type": "markdown", "metadata": { "editable": true, "slideshow": { "slide_type": "skip" }, "tags": [] }, "source": [ "### Exercise\n", "Use the terminal that you used earlier to run `build.sh` or open a new one. Make sure you are in the \n", "tutorial directory. Source `hpcpy24` using `source $PROJECT/hpcpy24`. Change into code/text_stats/ and compile \n", "the file word_frequency.F90 with the following command:\n", "\n", "```bash\n", "gfortran word_frequency.F90 -shared -O2 -o build/libwf.so -fPIC\n", "```\n", "\n", "Then run\n", "\n", "```bash\n", "nm build/libwf.so | grep word_frequency\n", "```\n", "\n", "to check the symbol.\n", "\n", "Change the cell below to use libwf.so instead of libtext_stats.so. 
Don't forget to adjust the function declaration.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "editable": true, "slideshow": { "slide_type": "skip" }, "tags": [] }, "outputs": [], "source": [ "ffi = cffi.FFI()\n", "ffi.cdef(\"\"\"\n", " int word_frequency(char* filename, char* word);\n", "\"\"\") # The definition is the same as in the header file.\n", "\n", "TS = ffi.dlopen(\"./code/text_stats/build/libtext_stats.so\")\n", "wc = TS.word_frequency(b\"test.txt\", b\"you\") # Need to use byte type in Python 3." ] }, { "cell_type": "markdown", "metadata": { "editable": true, "slideshow": { "slide_type": "notes" }, "tags": [] }, "source": [ "If you compiled the library with the option `-fno-underscoring`, you could use the original declaration without underscore with libwf.so.\n", "\n", "**Note**: There is no way to *reload* a library using cffi." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## ISO_C_BINDING" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Fortran 2003 improved the interoperability between Fortran and C with the intrinsic module `iso_c_binding`, which provides C-compatible data kinds, and with the `bind` attribute. The function definition can be changed to" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "```Fortran\n", "function word_frequency(filename, word) bind(C)\n", " use iso_c_binding\n", " implicit none\n", " character(kind=c_char), dimension(*), intent(in) :: filename\n", " character(kind=c_char), dimension(*), intent(in) :: word\n", " integer(c_int) :: word_frequency\n", " ...\n", "end function word_frequency\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Now, the name of the function will always be `word_frequency`. 
`bind` takes an optional argument, the name under which the function should be known to C: `bind(c, name=\"wf\")` would let us call the function as `wf(filename, word)` from C (and Python).\n", "\n", "To learn more about CFFI look at its [documentation](https://cffi.readthedocs.io/en/latest/)." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Cython" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "%load_ext cython" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Cython generates C code that is compiled to an extension. It can trivially call C (and even C++) \n", "functions, which we can use to write Python bindings. But first an annotated example of a \"normal\"\n", "Cython module. The following code will make the function `cysum` available to Cython and Python:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "%%cython -a --compile-args=-w\n", "import numpy\n", "cimport numpy # Make C-style calls available\n", "cimport cython\n", "@cython.boundscheck(False) # Turn off boundary checks for array access.\n", "# Functions defined with `def` are visible from Python and Cython\n", "def cysum(numpy.ndarray[numpy.float_t, ndim=1] a): # Define the input type as 1d ndarray of floats\n", " \"\"\"Sum up the elements of a.\n", " \n", " Parameters\n", " ----------\n", " a : ndarray\n", " array to sum over\n", " \n", " Returns\n", " -------\n", " res: float\n", " sum of the elements of a\n", " \"\"\"\n", " cdef double res = 0.0 # Define a C-only variable using `cdef`; double matches numpy.float_t\n", " cdef int n = len(a)\n", " for i in range(n):\n", " res += a[i]\n", " return res\n", " \n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's wrap `word_frequency` using Cython. 
We need to pass compile and link arguments to the call as described in [Adding compiler options][CompilerOptions].\n", "\n", "[CompilerOptions]: ./Speeding%20up%20your%20code%20with%20Cython.ipynb#Adding-compiler-options\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "%%cython -I code/text_stats -L code/text_stats/build -l text_stats\n", "cdef extern from \"text_stats.h\":\n", " cpdef int word_frequency(char* filename, char* word)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "word_frequency(b\"text.txt\", b\"you\") # Need to use byte type in Python 3." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" }, "tags": [] }, "source": [ "**Note** Unfortunately, this doesn't work the way it's supposed to *inside JupyterLab*. Although `-L` should add the path to the library to the search path of the linker, the linker still doesn't find the library. To make it work, I added the path to libtext_stats.so to the `LD_LIBRARY_PATH` when the kernel is loaded." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Adapting types" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "The function `word_frequency` takes bytes instead of strings because the C function uses char\\*, which is mapped to bytes in Python 3. We would prefer a Python function that takes strings, though. We can get one by calling our C function from a Python function within Cython and taking care of the argument conversion ourselves."
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "```Cython\n", "cdef extern from \"text_stats.h\":\n", " cdef int word_frequency(char* filename, char* word)\n", " \n", "def wordfrequency(filename, word):\n", " \"\"\"Counts the occurrences of a string in a file.\"\"\"\n", " # We first need to encode the strings\n", " filenameb = filename.encode('UTF-8')\n", " wordb = word.encode('UTF-8')\n", " # Now we can convert them to C strings\n", " cdef char* filenamec = filenameb\n", " cdef char* wordc = wordb\n", " # And finally pass them to our C function\n", " return word_frequency(filenamec, wordc)\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The function wordfrequency takes two Python strings and encodes them to get the proper byte representation that is then passed to the original C function. A complete implementation with doc string and compiler arguments can look like this:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "%%cython -I code/text_stats -L code/text_stats/build -ltext_stats\n", "cdef extern from \"text_stats.h\":\n", " cdef int word_frequency(char* filename, char* word)\n", " \n", "def wordfrequency(filename, word):\n", " \"\"\"Counts the occurrences of a string in a file.\n", "\n", " Parameters\n", " ----------\n", " filename: string\n", " name of file to open\n", " word: string\n", " string to look for in file\n", "\n", " Returns\n", " -------\n", " ct: int \n", " number of occurrences of word in file with filename\n", " \"\"\"\n", " # We first need to encode the strings\n", " filenameb = filename.encode('UTF-8')\n", " wordb = word.encode('UTF-8')\n", " # Now we can convert them to C strings\n", " cdef char* filenamec = filenameb\n", " cdef char* wordc = wordb\n", " # And finally pass them to our C function\n", " return word_frequency(filenamec, wordc)"
] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "wordfrequency(\"text.txt\", \"you\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "### Wrapping Fortran" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "We can do almost the same thing as above to wrap the Fortran function `word_frequency`. We don't have a header file, so we skip that part:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "%%cython -Lcode/text_stats/build -lwf\n", "cdef extern:\n", " cdef int word_frequency_(char* filename, char* word)\n", " \n", "def wordfrequency2(filename, word):\n", " \"\"\"Counts the occurrences of a string in a file.\"\"\"\n", " # We first need to encode the strings\n", " filenameb = filename.encode('UTF-8')\n", " wordb = word.encode('UTF-8')\n", " # Now we can convert them to C strings\n", " cdef char* filenamec = filenameb\n", " cdef char* wordc = wordb\n", " # And finally pass them to our Fortran function\n", " return word_frequency_(filenamec, wordc)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "wordfrequency2(\"text.txt\", \"you\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "If we cannot or don't want to change the original source code (for example, by adding `bind` or using the kinds from iso_c_binding), or if we don't have access to the source code in the first place, we can write a wrapper in Fortran that includes the binding. Look [here](http://www.fortran90.org/src/best-practices.html#interfacing-with-c) for an example."
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Wrapping C++" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "We can use Cython to wrap C++ as well. Let's start with a simple 3d point class." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "```c++\n", "#pragma once\n", "\n", "#include <vector>\n", "\n", "class Point3D{\n", "public:\n", " Point3D(const double x, const double y, const double z);\n", " Point3D(const std::vector<double> r);\n", " void translate(const double dx, const double dy, const double dz);\n", " void translate(const std::vector<double> dr);\n", " const std::vector<double> coordinates();\n", "private:\n", " double _x, _y, _z;\n", "};\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "For starters, I'm only going to wrap `Point3D(x, y, z)`, `translate(dx, dy, dz)`, and `coordinates()`." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Cython needs to know that we are dealing with C++ now. Usually, this is added to the setup.py or .pxd file, but it can be done in a notebook cell, too. That's what the second line in the following cell is doing.\n", "\n", "First, you need to import the header and then define the functions that should be made available to Cython. Since these are methods of a class, you define that too using the **cppclass** keyword.\n", "\n", "An important part of C++ is the standard library. Cython comes with a number of prepared wrappers, for example, for std::vector. These are imported from libcpp." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "%%cython -Icode/point -Lcode/point/build -lpoint\n", "# distutils: language = c++\n", "from libcpp.vector cimport vector\n", "\n", "cdef import from \"point3d.h\":\n", " cdef cppclass Point3D:\n", " Point3D(double x, double y, double z)\n", " void translate(double dx, double dy, double dz)\n", " vector[double] coordinates()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "As before, so far these functions are only available to Cython. To make them available to Python, you have to write a wrapper. There are two new functions that are used to deal with classes: initialization is done in \\_\\_cinit\\_\\_ and there's a corresponding destructor function called \\_\\_dealloc\\_\\_." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "By convention a pointer to the object called `thisptr` is kept as part of the wrapper object." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "%%cython -Icode/point -Lcode/point/build -lpoint\n", "# distutils: language = c++\n", "from libcpp.vector cimport vector\n", "\n", "cdef import from \"point3d.h\":\n", " cdef cppclass Point3D:\n", " Point3D(double x, double y, double z)\n", " void translate(double dx, double dy, double dz)\n", " vector[double] coordinates()\n", " \n", "cdef class PyPoint3D: # This is an extension type (aka cdef class)\n", " cdef Point3D *thisptr\n", " \n", " def __cinit__(self, double x, double y, double z):\n", " self.thisptr = new Point3D(x, y, z)\n", " \n", " def __dealloc__(self):\n", " del self.thisptr" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "You can now construct an object of type PyPoint3D, which keeps a pointer to its own instance of Point3D." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "origin = PyPoint3D(0,0,0)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Now, let's wrap the two functions as well."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "%%cython --compile-args=-Icode/point --link-args=-Lcode/point/build --link-args=-lpoint\n", "# distutils: language = c++\n", "from libcpp.vector cimport vector\n", "\n", "cdef import from \"point3d.h\":\n", " cdef cppclass Point3D:\n", " Point3D(double x, double y, double z)\n", " void translate(double dx, double dy, double dz)\n", " vector[double] coordinates()\n", " \n", "cdef class PyPoint3D:\n", " cdef Point3D *thisptr\n", " \n", " def __cinit__(self, x, y, z):\n", " self.thisptr = new Point3D(x, y, z)\n", " \n", " def __dealloc__(self):\n", " del self.thisptr\n", " \n", " def translate(self, dx, dy, dz):\n", " \"\"\"Move this point by (dx, dy, dz).\n", " \n", " Parameters\n", " ----------\n", " dx : float\n", " shift along x-axis\n", " dy : float\n", " shift along y-axis\n", " dz : float\n", " shift along z-axis\n", " \"\"\"\n", " self.thisptr.translate(dx, dy, dz)\n", " \n", " def coordinates(self):\n", " \"\"\"Get the coordinates of this point.\n", " \n", " Returns\n", " -------\n", " r : list\n", " coordinates of this point.\n", " \"\"\"\n", " return self.thisptr.coordinates()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Now, let's construct a point, shift it and return its coordinates."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "p = PyPoint3D(1,1,1)\n", "p.translate(-0.5, -0.5, -0.5)\n", "p.coordinates()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "editable": true, "slideshow": { "slide_type": "skip" }, "tags": [] }, "outputs": [], "source": [ "t_point_cython = %timeit -o p = PyPoint3D(1,1,1); p.translate(-0.5, -0.5, -0.5);p.coordinates()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## PyBind11" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "A powerful alternative to Cython for writing bindings for C++ code is PyBind11. In contrast to Cython, which adds some additional keywords to Python, PyBind11 is a header-only library that makes Python types available in C++ and allows you to write Python bindings in C++. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Let's start with the `word_frequency` example again. 
This is the PyBind11 code that wraps this function:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "```c++\n", "#include <pybind11/pybind11.h>\n", "\n", "extern \"C\" {\n", " #include <text_stats.h>\n", "}\n", "\n", "namespace py = pybind11; // This is purely for convenience\n", "\n", "PYBIND11_MODULE(text_stats, m){\n", " m.doc() = \"Some functions that provide statistical information about a text.\";\n", " m.def(\"word_frequency\", &word_frequency, R\"doc(Counts the occurrences of a string in a file.\n", "\n", "Parameters\n", "----------\n", "filename: string\n", " name of file to open\n", "word: string\n", " string to look for in file\n", "\n", "Returns\n", "-------\n", "ct: int \n", " number of occurrences of word in file with filename \n", ")doc\");\n", "\n", "}\n", "\n", "```\n", " " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "### Compiling the extension" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "This code can be compiled like this:\n", "\n", "```bash\n", "g++ -O3 -shared -fpic -std=c++14 `python3-config --includes` `python -m pybind11 --includes` -I code/text_stats code/text_stats/text_stats_bind.cpp -o text_stats.so `python3-config --cflags --ldflags` -L code/text_stats/build -ltext_stats \n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "editable": true, "slideshow": { "slide_type": "skip" }, "tags": [] }, "outputs": [], "source": [ "!g++ -O3 -shared -fpic -std=c++14 `python3-config --includes` `python -m pybind11 --includes` -I code/text_stats code/text_stats/text_stats_bind.cpp -o text_stats.so `python3-config --cflags --ldflags` -L code/text_stats/build -ltext_stats " ] }, { "cell_type": "markdown", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "### Using the extension" ] }, { "cell_type": "code", 
"execution_count": null, "metadata": { "editable": true, "slideshow": { "slide_type": "fragment" }, "tags": [] }, "outputs": [], "source": [ "import text_stats" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "editable": true, "slideshow": { "slide_type": "fragment" }, "tags": [] }, "outputs": [], "source": [ "text_stats.word_frequency(\"text.txt\", \"you\")" ] }, { "cell_type": "markdown", "metadata": { "editable": true, "slideshow": { "slide_type": "notes" }, "tags": [] }, "source": [ "Note that we didn't have to convert our string at all. It's done automatically by PyBind11." ] }, { "cell_type": "markdown", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "### Wrapping a class with Pybind11" ] }, { "cell_type": "markdown", "metadata": { "editable": true, "slideshow": { "slide_type": "notes" }, "tags": [] }, "source": [ "PyBind11 can deal with classes, too. The following code wraps the Point3D class:" ] }, { "cell_type": "markdown", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "```c++\n", "#include <vector>\n", "#include <pybind11/pybind11.h>\n", "#include <pybind11/stl.h>\n", "#include <point3d.h>\n", "\n", "namespace py = pybind11; // This is purely for convenience\n", "\n", "PYBIND11_MODULE(points, m){\n", " m.doc() = \"A collection of functions and objects to deal with 3D points.\";\n", "\n", " py::class_<Point3D>(m, \"Point3D\")\n", " .def(py::init<double, double, double>())\n", " .def(py::init<std::vector<double>>())\n", " .def(\"translate\", py::overload_cast<double, double, double>(&Point3D::translate),\n", " R\"doc(Move this point by (dx, dy, dz).\n", " \n", " Parameters\n", " ----------\n", " dx : float\n", " shift along x-axis\n", " dy : float\n", " shift along y-axis\n", " dz : float\n", " shift along z-axis\n", " )doc\")\n", " .def(\"coordinates\", &Point3D::coordinates,\n", " R\"doc(Get the coordinates of this point.\n", " 
\n", " Returns\n", " -------\n", " r : list\n", " coordinates of this point.\n", " )doc\")\n", " .def(\"rotate\", &Point3D::rotate, R\"doc(Rotates this point about x, y, and z\n", " \n", " The rotation is performed as if this point was first rotated by alpha around the x-axis, \n", " then rotated by beta around the y-axis, and finally rotated by gamma around the z-axis.\n", " \n", " Parameters\n", " ----------\n", " alpha: float\n", " rotation around x-axis in rad\n", " beta: float \n", " rotation around y-axis in rad\n", " gamma: float \n", " rotation around z-axis in rad\n", " )doc\");\n", "}\n", "\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## F2Py" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "The methods to wrap Fortran that we looked at so far relied on Fortran's C interface/compatibility. For Fortran 77/90/95 code you can also use `f2py` to generate bindings in a very convenient way. F2Py is distributed together with NumPy." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "In Fortran, you can use a module to store the points." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "```Fortran\n", "! 
points.f90\n", "module points\n", "implicit none\n", "\n", "real, allocatable, dimension(:) :: x\n", "real, allocatable, dimension(:) :: y\n", "real, allocatable, dimension(:) :: z\n", "\n", "contains\n", " subroutine init(N)\n", " integer, intent(in) :: N\n", " \n", " allocate(x(N))\n", " allocate(y(N))\n", " allocate(z(N))\n", " \n", " end subroutine init\n", "\n", " subroutine coordinates(idx, tx, ty, tz)\n", " integer, intent(in) :: idx\n", " real, intent(out) :: tx\n", " real, intent(out) :: ty\n", " real, intent(out) :: tz\n", " tx = x(idx)\n", " ty = y(idx)\n", " tz = z(idx)\n", " end subroutine coordinates\n", "\n", " ...\n", " \n", " subroutine finalize\n", " deallocate (x, y, z)\n", " end subroutine finalize\n", " \n", "end module points\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note**: Remember that you can use '!' to call programs as if you were doing it from the terminal." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy\n", "numpy.__file__" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "buildlog = !f2py -c code/point/points.f90 -m points_f\n", "print('\\n'.join(buildlog[:8]))\n", "print('...')\n", "print('\\n'.join(buildlog[-1:]))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "from points_f import points" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "points.init(1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "points.set(1, 1.0, 1.0, 1.0)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ 
"points.translate(1, -0.5, -0.5, -0.5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "x, y, z = points.coordinates(1)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Note that f2py honors the intent defined in the Fortran module. The subroutine coordinates takes one input value idx and has three \"return arguments\" tx, ty, tz that contain the coordinates of the particle idx. F2py converts this into a Python function that takes one argument and returns a tuple with three values." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Wrapping F77" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Fortran 77 doesn't know anything about intents nor modules, so how can we use f2py to generate nice Python bindings to older Fortran code?\n", "\n", "In Fortran 77, the translate function might be written like this:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```Fortran\n", " subroutine translate(idx, dx, dy, dz, x, y, z, N)\n", " implicit none\n", " integer idx, N\n", " real*8 dx, dy, dz\n", " real*8 x(N), y(N), z(N)\n", " x(idx) = x(idx) + dx\n", " y(idx) = y(idx) + dy\n", " z(idx) = z(idx) + dz\n", " end subroutine translate\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### cf2py comments for better bindings" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "We can use f2py comments to add the intents like this:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "```Fortran\n", " subroutine translate(idx, dx, dy, dz, x, y, z, N)\n", " implicit none\n", " integer idx, N\n", " real*8 dx, dy, dz\n", " real*8 x(N), y(N), z(N)\n", "cf2py 
intent(in) idx, N\n", "cf2py intent(in) dx, dy, dz\n", "cf2py intent(in,out) x, y, z\n", "cf2py depend(N) x, y, z\n", " x(idx) = x(idx) + dx\n", " y(idx) = y(idx) + dy\n", " z(idx) = z(idx) + dz\n", " end subroutine translate\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "The Fortran compiler will ignore the comments, but f2py will use them to generate the proper wrapper." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Although f2py is part of NumPy, little work has been done on it to improve support for modern Fortran. This will hopefully [change](https://www.youtube.com/watch?v=56M40Y2jl9Y)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "HPC Python 2024", "language": "python", "name": "hpcpy24" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.3" } }, "nbformat": 4, "nbformat_minor": 4 }