    NumPy - an HPC perspective

    19 Nov 2020 | Olav Zimmermann

Python is an interpreted language and as such extremely flexible, allowing everything, including code itself, to be defined at runtime. This entails that the Python interpreter performs a lot of magic behind the scenes (e.g. type inference) to keep things as simple and productive as possible for the programmer. This flexibility comes at the price of markedly reduced runtime speed compared to compiled languages such as C++. Many problems encoded in Python do not require the full range of this flexibility and could easily trade it for improved runtime performance without giving up the development productivity that is key to Python's success.

    Background

NumPy is a library, written in C, that enables fast numerical computing in Python. Since the release of its predecessor Numeric in 1995, it has become the first choice for numeric data processing in Python. The main sacrifice made to enable its high speed is the restriction to containers with a uniform memory layout, i.e., arrays whose items all have the same static datatype.

Despite this apparently severe restriction, NumPy is quite versatile, and a full introduction to NumPy would require a course of its own. Hence we will focus on the concepts that allowed its wide adoption by developers and HPC vendors alike. For an overview of the available functionality, the NumPy website https://www.numpy.org is a good starting point.

Note: NumPy is not part of the vanilla Python distribution and needs to be installed separately. Once installed, it can be used like any other Python library.

The NumPy BLAS and LAPACK bindings

NumPy does not implement everything itself. Instead it exploits the fact that with BLAS (low-level routines) and LAPACK (high-level routines, based on BLAS), two widely adopted interfaces for linear algebra routines exist, and that for almost all distributions and operating systems at least one library supporting the BLAS/LAPACK interface is available.

While not mandatory, NumPy can be linked against high-performance linear algebra libraries such as Intel's MKL, AMD's BLIS, or IBM's ESSL, which follow the BLAS/LAPACK APIs. These libraries can give you speed-ups of up to two orders of magnitude compared to the standard LAPACK library. In fact, a number of distributions provide NumPy builds that are statically linked to Intel's MKL, and when building from source, NumPy prefers the optimized libraries (MKL, BLIS) over the slower, more hardware-independent ones (OpenBLAS, ATLAS).

    Notes:

• Not all routines of MKL are necessarily faster than their counterparts in OpenBLAS; random number generation is one example.
• By default NumPy prefers Intel MKL over AMD BLIS, even on AMD processors. MKL works on AMD processors, but by default it falls back to the much slower SSE2 routines instead of the fast AVX2 and AVX512 BLAS routines it uses when it detects Intel hardware. While there are hacks to change that behaviour, it may be more reasonable to change the order of preference for the different BLAS libraries on AMD platforms.

    You can check which BLAS library has been linked during your build of NumPy by calling the show_config() function.

    import numpy as np
    print(np.__version__)  # the NumPy version in use
    np.show_config()       # prints which BLAS/LAPACK libraries NumPy was linked against

    ndarray

The main datatype in NumPy is an n-dimensional array object called ndarray. By default it has C-array memory layout (last index fastest, 0-based indices) and a uniform, fixed element type defined at instantiation of the ndarray. Element sizes range from 1 byte (bool, uint8) to 16 bytes (complex128), with the default being float64.
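
    A minimal sketch of these defaults (the array names are purely illustrative):

    import numpy as np

    a = np.zeros((2, 3))            # default element type is float64
    print(a.dtype, a.itemsize)      # float64 8
    print(a.flags['C_CONTIGUOUS'])  # True: C layout, last index fastest
    b = np.asfortranarray(a)        # same values, Fortran (column-major) layout
    print(b.flags['F_CONTIGUOUS'])  # True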

    introspection, shape

Several pieces of metadata of an ndarray can be obtained, among them a.ndim, a.size, a.nbytes, a.flags, and type(a). The array interface enables in-depth introspection, such as obtaining the address of an object, e.g. for debugging purposes.

    myarr = np.array([[1, 2, 3], [4, 5, 6]])
    print(f"myarr is a {myarr.ndim}-dimensional {type(myarr)} of type {myarr.dtype} and shape {myarr.shape}. It starts at address {hex(myarr.__array_interface__['data'][0])}.")
    print(f"Each of its {myarr.size} {myarr.dtype}-elements has a size of {myarr.dtype.itemsize} bytes, therefore it has {myarr.nbytes} bytes in total and its strides (in bytes) are {myarr.strides}.")
    print("The flags stored for this array are:")
    print(myarr.flags)

    NumPy's ndarray as a foundation

ndarrays support important concepts that enable vectorized processing of data, such as broadcasting, ufuncs, and reductions (see the sketch below). And why restrict this to numeric data? Almost anything can be cast into an ndarray, and labeled access is possible as well.
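
    A minimal sketch of how broadcasting, ufuncs, and reductions interact (the array contents are purely illustrative):

    import numpy as np

    a = np.arange(12.0).reshape(3, 4)  # 3x4 array of float64
    col_means = a.mean(axis=0)         # reduction over axis 0 -> shape (4,)
    centered = a - col_means           # broadcasting: (3,4) - (4,) -> (3,4)
    abs_dev = np.abs(centered)         # ufunc, applied elementwise
    print(abs_dev.sum())               # reduction over all elements

    Labeled access, in turn, works through structured datatypes: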

    # a structured datatype: each row holds a name, an income, and four quarterly revenues
    dt = np.dtype([('name', 'S10'), ('income', np.int16),
                   ('revenue', [('q1', np.float32), ('q2', np.float32),
                                ('q3', np.float32), ('q4', np.float32)])])
    # 'test.dat' is a comma-separated file with two header rows
    f = np.loadtxt('test.dat', dtype=dt, skiprows=2, delimiter=',')
    print(f)
    print(f"this array has {f.ndim} dimension(s)")

However, note that most functionality in NumPy is geared towards arrays of uniform numerical type. The above datatype treats the entire row as a single (albeit very complicated) type, so the data in the table become a one-dimensional array. If you need spreadsheet-like functionality, NumPy's 'structured arrays' or 'record arrays', which are based on ndarrays, are not adequate. Luckily, ndarrays can be extended in functionality and even reimplemented in a compatible way.

    Design features of the ndarray API

Due to their dual interface in Python and C and their often (though not always) regular memory layout, ndarrays can be mapped relatively easily to data types described in other language contexts.

A couple of clever design choices add to the popularity of ndarrays as a standard API for high performance computing in Python:

• ndarrays support both C-style and Fortran-style memory layouts, as well as memory alignment and view interfaces.
• A mixin class, NDArrayOperatorsMixin, provides all the necessary standard Python interfaces like __add__, __lt__ etc., making custom array containers more convenient to write.
• ndarray.__array_ufunc__ and __array_function__ are interfaces that separate the function API from the execution engine. By overriding them, a container can intercept calls to ufuncs and non-ufuncs and dispatch these calls to alternative implementations (see the sketch after this list). Several important packages already use this approach.
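
    A minimal sketch of such a custom container; the class LoggedArray and its logging behaviour are purely illustrative:

    import numpy as np
    from numpy.lib.mixins import NDArrayOperatorsMixin

    class LoggedArray(NDArrayOperatorsMixin):
        # wraps an ndarray and logs every ufunc call before delegating to NumPy
        def __init__(self, data):
            self._data = np.asarray(data)

        def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
            print(f"intercepted {ufunc.__name__}")
            # unwrap LoggedArray arguments, then dispatch to the real ufunc
            args = [x._data if isinstance(x, LoggedArray) else x for x in inputs]
            return LoggedArray(getattr(ufunc, method)(*args, **kwargs))

        def __repr__(self):
            return f"LoggedArray({self._data!r})"

    a = LoggedArray([1.0, 4.0, 9.0])
    print(a + 1)       # the mixin maps '+' to np.add, which is intercepted
    print(np.sqrt(a))  # direct ufunc calls are intercepted as well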

    Packages that implement the ndarray interface:

    | package/module | extension(s) | other HPC-relevant characteristics |
    | --- | --- | --- |
    | cuPy | cuPy array: subset of ndarray functionality | ndarrays on GPUs |
    | pandas | pandas dataframe: labeled data, non-numerical data, multi-indexing, grouping | fast CSV parser |
    | cuDF | cuDF dataframe: subset of pandas dataframe functionality | pandas dataframes on GPUs |
    | dask | dask array: only a subset of ndarray functionality | tiled ndarrays larger than main memory, distributed processing on multiple nodes |
    | dask | dask dataframe: only a subset of pandas dataframe functionality | tiled dataframes larger than main memory, distributed processing on multiple nodes |
    | dask-cuDF | cuDF dataframe: subset of pandas dataframe functionality | tiled dataframes on multiple GPUs and multiple nodes |
    | sparse | ndarray functionality on sparse arrays (COO layout) | |
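
    For example, a dask array is used almost exactly like an ndarray while being processed in tiles (a sketch, assuming dask is installed; shapes and chunk sizes are illustrative):

    import dask.array as da

    x = da.random.random((10000, 10000), chunks=(1000, 1000))  # tiled ndarray
    m = x.mean(axis=0)      # same call as in NumPy, but evaluated lazily
    print(m.compute()[:5])  # triggers the (possibly distributed) computation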

Notable packages that do not implement the ndarray interface are numpy.matrix, scipy.sparse, PyTorch, and JAX, while TensorFlow has an experimental interface.

    End