Commit 8ef8f334 authored by Jan Meinke

Wrap up notebook

parent a01794c8
Branches main
%% Cell type:markdown id:4d4d8c6e-50c4-412a-9d09-c8e9aa8c5c52 tags:
# Miscellaneous
<div class="dateauthor">
14 June 2024 | Olav Zimmermann
</div>
%% Cell type:markdown id:5291e2ec-7168-40c1-8348-4495f1126940 tags:
## Python as an HPC language
%% Cell type:markdown id:b7a6c503-493e-45e9-b9b7-ec170561ddfe tags:
### benefits
- established as an alternative to C++, Fortran, etc. for many HPC use cases
- maintains high developer productivity
- speed improvements without learning another language
- sometimes module replacement just works
- many ways to start and expand
- many HPC tools/frameworks for Python are open source
- not only more speed but also higher (energy) efficiency
%% Cell type:markdown id:726e2aa8-ce42-430f-a418-138627e0b5d4 tags:
### no free lunch: challenges
- many tricks of the trade
- some features not implemented or implemented differently
- parallel: flops/byte bottleneck, overhead, deadlocks, resilience
- task based computing: debugging
- mixed language: environment, not many tools (debuggers, profilers)
- licenses, longevity, security
- infrastructure access: heterogeneity, schedulers, resilience
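
The flops/byte bottleneck mentioned above can be made concrete with a roofline-style estimate. The following is a minimal sketch: the function names and the machine numbers (1000 GFLOP/s peak, 100 GB/s memory bandwidth) are assumptions chosen for illustration, not measurements of any real system.

```python
def arithmetic_intensity(flops, bytes_moved):
    """Floating-point operations performed per byte of memory traffic."""
    return flops / bytes_moved


def attainable_gflops(intensity, peak_gflops, bandwidth_gb_s):
    """Roofline estimate: limited by compute peak or by memory bandwidth."""
    return min(peak_gflops, intensity * bandwidth_gb_s)


# y = a * x + y on float64 arrays: 2 flops per element (multiply, add),
# 24 bytes per element (read x, read y, write y).
axpy_intensity = arithmetic_intensity(flops=2, bytes_moved=24)

# Hypothetical machine: 1000 GFLOP/s peak, 100 GB/s memory bandwidth.
axpy_limit = attainable_gflops(axpy_intensity, peak_gflops=1000.0, bandwidth_gb_s=100.0)
print(f"{axpy_intensity:.3f} flops/byte -> at most {axpy_limit:.1f} GFLOP/s")
```

Under these assumed numbers the kernel is limited by memory bandwidth, so adding more cores would not help unless the data traffic per flop is reduced.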
%% Cell type:markdown id:4a01e44c-d039-4a11-a5dd-420676807709 tags:
## Things we did not cover but which may be worth looking at...
## Tools:
- Profilers: [scalene](https://github.com/plasma-umass/scalene), [Intel Advisor](https://www.intel.com/content/www/us/en/developer/tools/oneapi/advisor.html)
- Debuggers: [Linaro DDT](https://docs.linaroforge.com/24.0.1/html/forge/ddt/get_started_ddt/python_debugging.html)
- Parallelisation frameworks: [Joblib](https://joblib.readthedocs.io/en/stable/), [Ray](https://docs.ray.io/en/latest/), [RAPIDS](https://rapids.ai/), [Legion](https://legion.stanford.edu/)([cuNumeric](https://github.com/nv-legate/cunumeric))
- Combinations: [ipython on MPI](https://ipyparallel.readthedocs.io/en/latest/reference/mpi.html), [Dask on CUDA](https://docs.rapids.ai/api/dask-cuda/stable/), [Dask on Ray](https://docs.ray.io/en/latest/ray-more-libs/dask-on-ray.html), etc.
- [Pandas](https://pandas.pydata.org/) and its HPC derivatives: [cuDF](https://github.com/rapidsai/cudf), [modin](https://github.com/modin-project/modin), [vaex](https://vaex.io/)
- ML frameworks and derivatives: [TensorFlow](https://www.tensorflow.org/), [PyTorch](https://pytorch.org/), [JAX](https://github.com/google/jax), [HeAT](https://github.com/helmholtz-analytics/heat) (=Helmholtz Analytics Toolkit)
- IDEs: [VS Code](https://code.visualstudio.com/), [PyCharm](https://www.jetbrains.com/pycharm/), etc.
%% Cell type:markdown id:c6984029-51b8-4186-a30f-6a3a00d59e54 tags:
## big data:
### have a look into:
- indexing, hashing, precalculation for random access
- compression, memory mapped files for faster data availability
- [HDF5](https://docs.h5py.org/en/stable/), [NetCDF](https://github.com/Unidata/netcdf4-python), [SIONlib](https://www.fz-juelich.de/en/ias/jsc/services/user-support/jsc-software-tools/sionlib), [MPI-I/O](https://mpi4py.readthedocs.io/en/stable/tutorial.html#mpi-io) for parallel file access
    - see also https://www.fz-juelich.de/en/ias/jsc/education/training-courses/training-materials/course-material-parallel-i-o-and-portable-data-formats
- scalable database management systems for complex data (many with Python API):
    - object-relational *
    - array *
    - graph *
    - in-memory *
    - key-value stores
    - object *
    - etc. (* = database management system)
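
As a small illustration of indexing combined with memory-mapped files for random access, here is a sketch using only the standard library. The record layout (`RECORD`), the helper names `write_records` and `read_value`, and the file name are assumptions made up for this example.

```python
import mmap
import os
import struct
import tempfile

RECORD = struct.Struct("<qd")  # one record: int64 key, float64 value (16 bytes)


def write_records(path, records):
    """Write fixed-size records and return a key -> byte offset index."""
    index = {}
    with open(path, "wb") as f:
        for key, value in records:
            index[key] = f.tell()
            f.write(RECORD.pack(key, value))
    return index


def read_value(path, index, key):
    """Fetch one record through a read-only memory map."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Only the pages that are touched get loaded from disk.
            _, value = RECORD.unpack_from(mm, index[key])
            return value


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "records.bin")
    index = write_records(path, [(k, k * 0.5) for k in range(1000)])
    value_742 = read_value(path, index, 742)
    print(value_742)
```

In a real application one would probably keep the map open across many lookups instead of remapping the file for every read, and store the index alongside the data.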
%% Cell type:markdown id:7416f9c5-5919-4192-ae7a-30f6fb9eda72 tags:
## perhaps the next big waves on the HPC Python horizon:
- [Python 3.13](https://docs.python.org/3.13/whatsnew/3.13.html) (final release scheduled for October 2024) can be built either without the GIL (free threading) or with a JIT compiler; both options are experimental
- Python compilers and compiled Python-like languages: [codon](https://github.com/exaloop/codon), [bend](https://github.com/HigherOrderCO/Bend), [Taichi](https://github.com/taichi-dev/taichi), [Mojo](https://www.modular.com/mojo)
- AI assisted coding: [ChatGPT](https://realpython.com/chatgpt-coding-mentor-python/), [GitHub Copilot](https://realpython.com/github-copilot-python/), [HPC-GPT for HPC programming](https://dl.acm.org/doi/fullHtml/10.1145/3624062.3624172), etc.
- HPC on cloud(s)

There are many HPC Python pages on the web, including other courses and tutorials:
https://abpcomputing.web.cern.ch/guides/hpc_python/
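
To find out whether a given interpreter is one of the free-threaded (no GIL) builds of Python 3.13 mentioned above, a sketch along the following lines may help. The helper name `gil_status` is made up for this example; `sys._is_gil_enabled()` only exists from Python 3.13 on, so the code assumes that older interpreters always run with the GIL.

```python
import sys
import sysconfig


def gil_status():
    """Report whether this is a free-threaded build and whether the GIL is active."""
    # Py_GIL_DISABLED is set to 1 for free-threaded builds, otherwise 0 or unset.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() was added in Python 3.13; assume the GIL is on if it is missing.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True
    return {"free_threaded_build": free_threaded_build, "gil_enabled": gil_enabled}


status = gil_status()
print(sys.version_info[:2], status)
```

Even on a free-threaded build the GIL can be switched back on at runtime, for example when an extension module that does not declare free-threading support is imported, which is why the sketch reports both values.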
%% Cell type:code id:8569bac9-b9fd-4e1f-b398-16f3c474cf3a tags:
``` python
```