From 551cde26dfb8b42fd2f8ff3de46611a28fe5511c Mon Sep 17 00:00:00 2001 From: Jens Henrik Goebbert <j.goebbert@fz-juelich.de> Date: Wed, 24 Apr 2024 08:41:19 +0200 Subject: [PATCH] update dask --- .../2_dask/1-Introduction-to-Dask.ipynb | 52 - .../2_dask/1_dask_MonteCarloPi.ipynb | 374 ++++++++ .../2_dask/2_dask_example.ipynb | 892 ------------------ 3 files changed, 374 insertions(+), 944 deletions(-) delete mode 100644 day2_hpcenv/4_parallel-programming/2_dask/1-Introduction-to-Dask.ipynb create mode 100644 day2_hpcenv/4_parallel-programming/2_dask/1_dask_MonteCarloPi.ipynb delete mode 100644 day2_hpcenv/4_parallel-programming/2_dask/2_dask_example.ipynb diff --git a/day2_hpcenv/4_parallel-programming/2_dask/1-Introduction-to-Dask.ipynb b/day2_hpcenv/4_parallel-programming/2_dask/1-Introduction-to-Dask.ipynb deleted file mode 100644 index d603fb5..0000000 --- a/day2_hpcenv/4_parallel-programming/2_dask/1-Introduction-to-Dask.ipynb +++ /dev/null @@ -1,52 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "linear-bangkok", - "metadata": {}, - "source": [ - "### HIGH THROUGHPUT COMPUTING WITH DASK\n", - "\n", - "**Organisers:** Alan O’Cais, David Swenson \n", - "**Website:** https://www.cecam.org/workshop-details/1022\n", - "\n", - "**Synopsis:**\n", - "High-throughput (task-based) computing is a flexible approach to parallelisation. It involves splitting a problem into loosely-coupled tasks. A scheduler then orchestrates the parallel execution of those tasks, allowing programs to adaptively scale their resource usage. E-CAM has extended the data-analytics framework Dask with a capable and efficient library to handle such workloads. This workshop will be held as a series of virtual seminars/tutorials on tools in the Dask HPC ecosystem.\n", - "\n", - "**Programme:**\n", - "- 21 January 2021, 3pm CET (2pm UTC): Dask - a flexible library for parallel computing in Python\n", - " - YouTube link: https://youtu.be/Tl8rO-baKuY\n", - " - GitHub Repo: https://github.com/jacobtomlinson/dask-video-tutorial-2020\n", - "\n", - "- 4 February 2021, 3pm CET (2pm UTC): Dask-Jobqueue - a library that integrates Dask with standard HPC queuing systems, such as SLURM or PBS\n", - " - YouTube link: https://youtu.be/iNxhHXzmJ1w\n", - " - GitHub Repo: https://github.com/ExaESM-WP4/workshop-Dask-Jobqueue-cecam-2021-02\n", - "\n", - "- 11 February 2021, 3pm CET (2pm UTC) : Jobqueue-Features - a library that enables functionality aimed at enhancing scalability\n", - " - YouTube link: https://youtu.be/FpMua8iJeTk\n", - " - GitHub Repo: https://github.com/E-CAM/jobqueue_features_workshop_materials" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.5" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/day2_hpcenv/4_parallel-programming/2_dask/1_dask_MonteCarloPi.ipynb b/day2_hpcenv/4_parallel-programming/2_dask/1_dask_MonteCarloPi.ipynb new file mode 100644 index 0000000..d81f977 --- /dev/null +++ b/day2_hpcenv/4_parallel-programming/2_dask/1_dask_MonteCarloPi.ipynb @@ -0,0 +1,374 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Dask local cluster example" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + 
"## What is Dask? (https://docs.dask.org/en/latest/)\n", + "\n", + "* combine a blocked algorithm approach\n", + "* with dynamic and memory aware task scheduling\n", + "* to realise a parallel out-of-core NumPy clone\n", + "* optimized for interactive computational workloads\n", + "\n", + "-----------------------------------" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### WORKSHOP on DASK - HIGH THROUGHPUT COMPUTING WITH DASK\n", + "\n", + "**Organisers:** Alan O’Cais, David Swenson \n", + "**Website:** https://www.cecam.org/workshop-details/1022\n", + "\n", + "**Synopsis:**\n", + "High-throughput (task-based) computing is a flexible approach to parallelisation. It involves splitting a problem into loosely-coupled tasks. A scheduler then orchestrates the parallel execution of those tasks, allowing programs to adaptively scale their resource usage. E-CAM has extended the data-analytics framework Dask with a capable and efficient library to handle such workloads. This workshop will be held as a series of virtual seminars/tutorials on tools in the Dask HPC ecosystem.\n", + "\n", + "**Programme:**\n", + "- 21 January 2021, 3pm CET (2pm UTC): Dask - a flexible library for parallel computing in Python\n", + " - YouTube link: https://youtu.be/Tl8rO-baKuY\n", + " - GitHub Repo: https://github.com/jacobtomlinson/dask-video-tutorial-2020\n", + "\n", + "- 4 February 2021, 3pm CET (2pm UTC): Dask-Jobqueue - a library that integrates Dask with standard HPC queuing systems, such as SLURM or PBS\n", + " - YouTube link: https://youtu.be/iNxhHXzmJ1w\n", + " - GitHub Repo: https://github.com/ExaESM-WP4/workshop-Dask-Jobqueue-cecam-2021-02\n", + "\n", + "- 11 February 2021, 3pm CET (2pm UTC) : Jobqueue-Features - a library that enables functionality aimed at enhancing scalability\n", + " - YouTube link: https://youtu.be/FpMua8iJeTk\n", + " - GitHub Repo: https://github.com/E-CAM/jobqueue_features_workshop_materials\n", + " \n", + "------------------------------------" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Example problem: Monte-Carlo estimate of $\\pi$\n", + "\n", + "<img src=\"https://upload.wikimedia.org/wikipedia/commons/8/84/Pi_30K.gif\" width=\"25%\" align=left alt=\"PI monte-carlo estimate\"/>\n", + "\n", + "## Problem description\n", + "\n", + "Suppose we want to estimate the number $\\pi$ using a [Monte-Carlo method](https://en.wikipedia.org/wiki/Pi#Monte_Carlo_methods), i.e. obtain a numerical estimate based on a random sampling approach, and that we want at least single precision floating point accuracy.\n", + "\n", + "We take advantage of the fact that the area of a quarter circle with unit radius is $\\pi/4$ and that hence the probability of a randomly chosen point inside a unit square to lie within that circle is $\\pi/4$ as well.\n", + "\n", + "So for N randomly chosen pairs $(x, y)$ with $x\\in[0, 1)$ and $y\\in[0, 1)$ we count the number $N_{circ}$ of pairs that also satisfy $(x^2 + y^2) < 1$ and estimage $\\pi \\approx 4 \\cdot N_{circ} / N$." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Monte-Carlo estimate with NumPy on a single CPU\n", + "\n", + "* NumPy is the fundamental package for scientific computing with Python (https://numpy.org/).\n", + "* It contains a powerful n-dimensional array object and useful random number capabilities." 
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "import numpy"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "def calculate_pi_single(size_in_bytes):\n",
+ " \n",
+ " \"\"\"Calculate pi using a Monte Carlo method.\"\"\"\n",
+ " \n",
+ " rand_array_shape = (int(size_in_bytes / 8 / 2), 2)\n",
+ " \n",
+ " # 2D random array with positions (x, y)\n",
+ " xy = numpy.random.uniform(low=0.0, high=1.0, size=rand_array_shape)\n",
+ " \n",
+ " # check if position (x, y) is in unit circle\n",
+ " xy_inside_circle = (xy ** 2).sum(axis=1) < 1\n",
+ "\n",
+ " # pi is the fraction of points in circle x 4\n",
+ " pi = 4 * xy_inside_circle.sum() / xy_inside_circle.size\n",
+ "\n",
+ " print(f\"\\nfrom {xy.nbytes / 1e9} GB randomly chosen positions\")\n",
+ " print(f\" pi estimate: {pi}\")\n",
+ " print(f\" pi error: {abs(pi - numpy.pi)}\\n\")\n",
+ " \n",
+ " return pi"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Let's calculate...\n",
+ "\n",
+ "Observe how the error decreases with an increasing number of randomly chosen positions!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "%time pi = calculate_pi_single(size_in_bytes=10_000_000) # 10 MB\n",
+ "%time pi = calculate_pi_single(size_in_bytes=100_000_000) # 100 MB\n",
+ "%time pi = calculate_pi_single(size_in_bytes=1_000_000_000) # 1 GB"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Are we already better than single precision floating point resolution?"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "numpy.finfo(numpy.float32)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## We won't be able to scale the problem to several Gigabytes or Terabytes!"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Problems\n",
+ "\n",
+ "* the NumPy-only approach runs on a single CPU core and is slow (we could parallelise it with the [multiprocessing](https://docs.python.org/3.8/library/multiprocessing.html) and/or [threading](https://docs.python.org/3.8/library/threading.html) libraries)\n",
+ "* frontend/login node resources are shared, so the CPU, memory (and IO bandwidth) demands of different users will collide"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Monte-Carlo estimate with Dask on multiple CPUs\n",
+ "\n",
+ "We define a Dask cluster with 8 CPUs and 24 GB of memory."
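+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Before sizing the cluster, it can help to check what the current node actually provides. The cell below is a small sketch that relies on Linux-specific `os` calls."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# sketch: inspect the resources of this node (Linux-only os calls)\n",
+ "import os\n",
+ "\n",
+ "# CPU cores usable by this process (respects affinity/cgroup masks)\n",
+ "usable_cores = len(os.sched_getaffinity(0))\n",
+ "\n",
+ "# total physical memory in GB via POSIX sysconf keys\n",
+ "total_memory_gb = os.sysconf(\"SC_PAGE_SIZE\") * os.sysconf(\"SC_PHYS_PAGES\") / 1e9\n",
+ "\n",
+ "print(f\"usable CPU cores: {usable_cores}\")\n",
+ "print(f\"total node memory: {total_memory_gb:.1f} GB\")"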
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "import dask.distributed"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "cluster = dask.distributed.LocalCluster(\n",
+ " n_workers=1, threads_per_worker=8, memory_limit=24e9,\n",
+ " ip=\"0.0.0.0\"\n",
+ ")\n",
+ "\n",
+ "client = dask.distributed.Client(cluster)\n",
+ "client"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Use dask.array for randomly chosen positions"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "import numpy, dask.array"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "def calculate_pi_dask(size_in_bytes, number_of_chunks):\n",
+ " \n",
+ " \"\"\"Calculate pi using a Monte Carlo method.\"\"\"\n",
+ " \n",
+ " array_shape = (int(size_in_bytes / 8 / 2), 2)\n",
+ " chunk_size = (int(array_shape[0] / number_of_chunks), 2)\n",
+ " \n",
+ " # 2D random positions array using dask.array\n",
+ " xy = dask.array.random.uniform(\n",
+ " low=0.0, high=1.0, size=array_shape,\n",
+ " # the chunk size determines the number of tasks\n",
+ " chunks=chunk_size )\n",
+ " \n",
+ " xy_inside_circle = (xy ** 2).sum(axis=1) < 1\n",
+ "\n",
+ " pi = 4 * xy_inside_circle.sum() / xy_inside_circle.size\n",
+ " \n",
+ " # start Dask calculation\n",
+ " pi = pi.compute()\n",
+ "\n",
+ " print(f\"\\nfrom {xy.nbytes / 1e9} GB randomly chosen positions\")\n",
+ " print(f\" pi estimate: {pi}\")\n",
+ " print(f\" pi error: {abs(pi - numpy.pi)}\\n\")\n",
+ " display(xy)\n",
+ " \n",
+ " return pi"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Let's calculate again...\n",
+ "Observe how the wall time decreases for the 1 Gigabyte and 10 Gigabyte random-sample $\\pi$ estimates!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "%time pi = calculate_pi_dask(size_in_bytes=1_000_000_000, number_of_chunks=10) # 1 GB"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "%time pi = calculate_pi_dask(size_in_bytes=10_000_000_000, number_of_chunks=100) # 10 GB"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Let's go larger than memory...\n",
+ "Because Dask splits the computation into small, manageable tasks, we can scale up easily!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "%time pi = calculate_pi_dask(size_in_bytes=100_000_000_000, number_of_chunks=250) # 100 GB"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Are we now better than single precision floating point resolution?\n",
+ "Not at all, and certainly not if we require an order of magnitude better than that..."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "numpy.finfo(numpy.float32)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## We could increase the local cluster CPU resources...\n",
+ "However, a local Dask cluster is always limited by the memory/CPU resources of a single compute node. A sketch of how a batch-queue based cluster could go beyond this follows below."
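+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "To go beyond a single node, Dask-Jobqueue (see the workshop programme above) can start workers through the batch system. The cell below is only a sketch: it assumes the `dask_jobqueue` package is installed, and the SLURM queue, account and resource settings are placeholders that need to match your system."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# sketch only: requires the dask_jobqueue package; queue/account are placeholders\n",
+ "import dask.distributed\n",
+ "import dask_jobqueue\n",
+ "\n",
+ "slurm_cluster = dask_jobqueue.SLURMCluster(\n",
+ "    queue=\"batch\",        # placeholder SLURM partition\n",
+ "    account=\"myproject\",  # placeholder SLURM account\n",
+ "    cores=8,              # cores per batch job\n",
+ "    memory=\"24GB\",        # memory per batch job\n",
+ "    walltime=\"00:30:00\",\n",
+ ")\n",
+ "slurm_cluster.scale(jobs=4)  # ask the batch system for 4 worker jobs\n",
+ "\n",
+ "slurm_client = dask.distributed.Client(slurm_cluster)\n",
+ "# dask.array code such as calculate_pi_dask() above now runs across all worker jobs"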
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# %time pi = calculate_pi_dask(size_in_bytes=1_000_000_000_000, number_of_chunks=2_500) # 1 TB"
+ ]
+ }
+ ],
+ "metadata": {
+ "anaconda-cloud": {},
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.4"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/day2_hpcenv/4_parallel-programming/2_dask/2_dask_example.ipynb b/day2_hpcenv/4_parallel-programming/2_dask/2_dask_example.ipynb
deleted file mode 100644
index 23e4668..0000000
--- a/day2_hpcenv/4_parallel-programming/2_dask/2_dask_example.ipynb
+++ /dev/null
@@ -1,892 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# Dask local cluster example"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## What is Dask? (https://docs.dask.org/en/latest/)\n",
- "\n",
- "* combine a blocked algorithm approach\n",
- "* with dynamic and memory aware task scheduling\n",
- "* to realise a parallel out-of-core NumPy clone\n",
- "* optimized for interactive computational workloads\n",
- "\n",
- "-----------------------------------"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# Example problem: Monte-Carlo estimate of $\\pi$\n",
- "\n",
- "<img src=\"https://upload.wikimedia.org/wikipedia/commons/8/84/Pi_30K.gif\" width=\"25%\" align=left alt=\"PI monte-carlo estimate\"/>\n",
- "\n",
- "## Problem description\n",
- "\n",
- "Suppose we want to estimate the number $\\pi$ using a [Monte-Carlo method](https://en.wikipedia.org/wiki/Pi#Monte_Carlo_methods), i.e. obtain a numerical estimate based on a random sampling approach, and that we want at least single precision floating point accuracy.\n",
- "\n",
- "We take advantage of the fact that the area of a quarter circle with unit radius is $\\pi/4$ and that hence the probability of a randomly chosen point inside a unit square to lie within that circle is $\\pi/4$ as well.\n",
- "\n",
- "So for N randomly chosen pairs $(x, y)$ with $x\\in[0, 1)$ and $y\\in[0, 1)$ we count the number $N_{circ}$ of pairs that also satisfy $(x^2 + y^2) < 1$ and estimage $\\pi \\approx 4 \\cdot N_{circ} / N$."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Monte-Carlo estimate with NumPy on a single CPU\n",
- "\n",
- "* NumPy is the fundamental package for scientific computing with Python (https://numpy.org/).\n",
- "* It contains a powerful n-dimensional array object and useful random number capabilities."
- ] - }, - { - "cell_type": "code", - "execution_count": 18, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy" - ] - }, - { - "cell_type": "code", - "execution_count": 19, - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "def calculate_pi_single(size_in_bytes):\n", - " \n", - " \"\"\"Calculate pi using a Monte Carlo method.\"\"\"\n", - " \n", - " rand_array_shape = (int(size_in_bytes / 8 / 2), 2)\n", - " \n", - " # 2D random array with positions (x, y)\n", - " xy = numpy.random.uniform(low=0.0, high=1.0, size=rand_array_shape)\n", - " \n", - " # check if position (x, y) is in unit circle\n", - " xy_inside_circle = (xy ** 2).sum(axis=1) < 1\n", - "\n", - " # pi is the fraction of points in circle x 4\n", - " pi = 4 * xy_inside_circle.sum() / xy_inside_circle.size\n", - "\n", - " print(f\"\\nfrom {xy.nbytes / 1e9} GB randomly chosen positions\")\n", - " print(f\" pi estimate: {pi}\")\n", - " print(f\" pi error: {abs(pi - numpy.pi)}\\n\")\n", - " \n", - " return pi" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Let's calculate...\n", - "\n", - "Observe how the error decreases with an increasing number of randomly chosen positions!" - ] - }, - { - "cell_type": "code", - "execution_count": 20, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "from 0.01 GB randomly chosen positions\n", - " pi estimate: 3.1451904\n", - " pi error: 0.0035977464102070478\n", - "\n", - "CPU times: user 25 ms, sys: 8.79 ms, total: 33.8 ms\n", - "Wall time: 31.3 ms\n", - "\n", - "from 0.1 GB randomly chosen positions\n", - " pi estimate: 3.14238272\n", - " pi error: 0.0007900664102069577\n", - "\n", - "CPU times: user 224 ms, sys: 44.5 ms, total: 269 ms\n", - "Wall time: 261 ms\n", - "\n", - "from 1.0 GB randomly chosen positions\n", - " pi estimate: 3.141662784\n", - " pi error: 7.01304102070921e-05\n", - "\n", - "CPU times: user 1.94 s, sys: 424 ms, total: 2.37 s\n", - "Wall time: 2.28 s\n" - ] - } - ], - "source": [ - "%time pi = calculate_pi_single(size_in_bytes=10_000_000) # 10 MB\n", - "%time pi = calculate_pi_single(size_in_bytes=100_000_000) # 100 MB\n", - "%time pi = calculate_pi_single(size_in_bytes=1_000_000_000) # 1 GB" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Are we already better than single precision floating point resolution?" - ] - }, - { - "cell_type": "code", - "execution_count": 21, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "finfo(resolution=1e-06, min=-3.4028235e+38, max=3.4028235e+38, dtype=float32)" - ] - }, - "execution_count": 21, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "numpy.finfo(numpy.float32)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## We won't be able to scale the problem to several Gigabytes or Terabytes!" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Problems\n", - "\n", - "* slowness of the numpy-only single CPU approach! 
(we could scale the problem using the [multiprocessing](https://docs.python.org/3.8/library/multiprocessing.html) and/or [threading](https://docs.python.org/3.8/library/threading.html) libraries)\n", - "* frontend/login node compute resources are shared and CPU, memory (and IO bandwidth) user demands will collide" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Monte-Carlo estimate with Dask on multiple CPUs\n", - "\n", - "We define a Dask cluster with 8 CPUs and 24 GB of memory." - ] - }, - { - "cell_type": "code", - "execution_count": 22, - "metadata": {}, - "outputs": [], - "source": [ - "import dask.distributed" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "<div>\n", - " <div style=\"width: 24px; height: 24px; background-color: #e1e1e1; border: 3px solid #9D9D9D; border-radius: 5px; position: absolute;\"> </div>\n", - " <div style=\"margin-left: 48px;\">\n", - " <h3 style=\"margin-bottom: 0px;\">Client</h3>\n", - " <p style=\"color: #9D9D9D; margin-bottom: 0px;\">Client-7f9fc6c4-5433-11ed-8324-3cecef1f6772</p>\n", - " <table style=\"width: 100%; text-align: left;\">\n", - "\n", - " <tr>\n", - " \n", - " <td style=\"text-align: left;\"><strong>Connection method:</strong> Cluster object</td>\n", - " <td style=\"text-align: left;\"><strong>Cluster type:</strong> distributed.LocalCluster</td>\n", - " \n", - " </tr>\n", - "\n", - " \n", - " <tr>\n", - " <td style=\"text-align: left;\">\n", - " <strong>Dashboard: </strong> <a href=\"http://134.94.0.100:8787/status\" target=\"_blank\">http://134.94.0.100:8787/status</a>\n", - " </td>\n", - " <td style=\"text-align: left;\"></td>\n", - " </tr>\n", - " \n", - "\n", - " </table>\n", - "\n", - " \n", - " <details>\n", - " <summary style=\"margin-bottom: 20px;\"><h3 style=\"display: inline;\">Cluster Info</h3></summary>\n", - " <div class=\"jp-RenderedHTMLCommon jp-RenderedHTML jp-mod-trusted jp-OutputArea-output\">\n", - " <div style=\"width: 24px; height: 24px; background-color: #e1e1e1; border: 3px solid #9D9D9D; border-radius: 5px; position: absolute;\">\n", - " </div>\n", - " <div style=\"margin-left: 48px;\">\n", - " <h3 style=\"margin-bottom: 0px; margin-top: 0px;\">LocalCluster</h3>\n", - " <p style=\"color: #9D9D9D; margin-bottom: 0px;\">7914e7e8</p>\n", - " <table style=\"width: 100%; text-align: left;\">\n", - " <tr>\n", - " <td style=\"text-align: left;\">\n", - " <strong>Dashboard:</strong> <a href=\"http://134.94.0.100:8787/status\" target=\"_blank\">http://134.94.0.100:8787/status</a>\n", - " </td>\n", - " <td style=\"text-align: left;\">\n", - " <strong>Workers:</strong> 1\n", - " </td>\n", - " </tr>\n", - " <tr>\n", - " <td style=\"text-align: left;\">\n", - " <strong>Total threads:</strong> 8\n", - " </td>\n", - " <td style=\"text-align: left;\">\n", - " <strong>Total memory:</strong> 22.35 GiB\n", - " </td>\n", - " </tr>\n", - " \n", - " <tr>\n", - " <td style=\"text-align: left;\"><strong>Status:</strong> running</td>\n", - " <td style=\"text-align: left;\"><strong>Using processes:</strong> True</td>\n", - "</tr>\n", - "\n", - " \n", - " </table>\n", - "\n", - " <details>\n", - " <summary style=\"margin-bottom: 20px;\">\n", - " <h3 style=\"display: inline;\">Scheduler Info</h3>\n", - " </summary>\n", - "\n", - " <div style=\"\">\n", - " <div>\n", - " <div style=\"width: 24px; height: 24px; background-color: #FFF7E5; border: 3px solid #FF6132; border-radius: 5px; position: absolute;\"> </div>\n", - " <div 
style=\"margin-left: 48px;\">\n", - " <h3 style=\"margin-bottom: 0px;\">Scheduler</h3>\n", - " <p style=\"color: #9D9D9D; margin-bottom: 0px;\">Scheduler-96886d6c-baf6-48eb-95dc-2cca09abbe70</p>\n", - " <table style=\"width: 100%; text-align: left;\">\n", - " <tr>\n", - " <td style=\"text-align: left;\">\n", - " <strong>Comm:</strong> tcp://134.94.0.100:42495\n", - " </td>\n", - " <td style=\"text-align: left;\">\n", - " <strong>Workers:</strong> 1\n", - " </td>\n", - " </tr>\n", - " <tr>\n", - " <td style=\"text-align: left;\">\n", - " <strong>Dashboard:</strong> <a href=\"http://134.94.0.100:8787/status\" target=\"_blank\">http://134.94.0.100:8787/status</a>\n", - " </td>\n", - " <td style=\"text-align: left;\">\n", - " <strong>Total threads:</strong> 8\n", - " </td>\n", - " </tr>\n", - " <tr>\n", - " <td style=\"text-align: left;\">\n", - " <strong>Started:</strong> Just now\n", - " </td>\n", - " <td style=\"text-align: left;\">\n", - " <strong>Total memory:</strong> 22.35 GiB\n", - " </td>\n", - " </tr>\n", - " </table>\n", - " </div>\n", - " </div>\n", - "\n", - " <details style=\"margin-left: 48px;\">\n", - " <summary style=\"margin-bottom: 20px;\">\n", - " <h3 style=\"display: inline;\">Workers</h3>\n", - " </summary>\n", - "\n", - " \n", - " <div style=\"margin-bottom: 20px;\">\n", - " <div style=\"width: 24px; height: 24px; background-color: #DBF5FF; border: 3px solid #4CC9FF; border-radius: 5px; position: absolute;\"> </div>\n", - " <div style=\"margin-left: 48px;\">\n", - " <details>\n", - " <summary>\n", - " <h4 style=\"margin-bottom: 0px; display: inline;\">Worker: 0</h4>\n", - " </summary>\n", - " <table style=\"width: 100%; text-align: left;\">\n", - " <tr>\n", - " <td style=\"text-align: left;\">\n", - " <strong>Comm: </strong> tcp://134.94.0.100:40747\n", - " </td>\n", - " <td style=\"text-align: left;\">\n", - " <strong>Total threads: </strong> 8\n", - " </td>\n", - " </tr>\n", - " <tr>\n", - " <td style=\"text-align: left;\">\n", - " <strong>Dashboard: </strong> <a href=\"http://134.94.0.100:46353/status\" target=\"_blank\">http://134.94.0.100:46353/status</a>\n", - " </td>\n", - " <td style=\"text-align: left;\">\n", - " <strong>Memory: </strong> 22.35 GiB\n", - " </td>\n", - " </tr>\n", - " <tr>\n", - " <td style=\"text-align: left;\">\n", - " <strong>Nanny: </strong> tcp://134.94.0.100:33715\n", - " </td>\n", - " <td style=\"text-align: left;\"></td>\n", - " </tr>\n", - " <tr>\n", - " <td colspan=\"2\" style=\"text-align: left;\">\n", - " <strong>Local directory: </strong> /tmp/dask-worker-space/worker-pxxiovlw\n", - " </td>\n", - " </tr>\n", - "\n", - " \n", - "\n", - " \n", - "\n", - " </table>\n", - " </details>\n", - " </div>\n", - " </div>\n", - " \n", - "\n", - " </details>\n", - "</div>\n", - "\n", - " </details>\n", - " </div>\n", - "</div>\n", - " </details>\n", - " \n", - "\n", - " </div>\n", - "</div>" - ], - "text/plain": [ - "<Client: 'tcp://134.94.0.100:42495' processes=1 threads=8, memory=22.35 GiB>" - ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "cluster = dask.distributed.LocalCluster(\n", - " n_workers=1, threads_per_worker=8, memory_limit=24e9,\n", - " ip=\"0.0.0.0\"\n", - ")\n", - "\n", - "client = dask.distributed.Client(cluster)\n", - "client" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Use dask.array for randomly chosen positions" - ] - }, - { - "cell_type": "code", - "execution_count": 23, - "metadata": {}, - "outputs": [], - "source": [ - "import 
numpy, dask.array" - ] - }, - { - "cell_type": "code", - "execution_count": 24, - "metadata": {}, - "outputs": [], - "source": [ - "def calculate_pi_dask(size_in_bytes, number_of_chunks):\n", - " \n", - " \"\"\"Calculate pi using a Monte Carlo method.\"\"\"\n", - " \n", - " array_shape = (int(size_in_bytes / 8 / 2), 2)\n", - " chunk_size = (int(array_shape[0] / number_of_chunks), 2)\n", - " \n", - " # 2D random positions array using dask.array\n", - " xy = dask.array.random.uniform(\n", - " low=0.0, high=1.0, size=array_shape,\n", - " # specify chunk size, i.e. task number\n", - " chunks=chunk_size )\n", - " \n", - " xy_inside_circle = (xy ** 2).sum(axis=1) < 1\n", - "\n", - " pi = 4 * xy_inside_circle.sum() / xy_inside_circle.size\n", - " \n", - " # start Dask calculation\n", - " pi = pi.compute()\n", - "\n", - " print(f\"\\nfrom {xy.nbytes / 1e9} GB randomly chosen positions\")\n", - " print(f\" pi estimate: {pi}\")\n", - " print(f\" pi error: {abs(pi - numpy.pi)}\\n\")\n", - " display(xy)\n", - " \n", - " return pi" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Let's calculate again...\n", - "Observe the wall time decreases of the 1 Gigabyte and 10 Gigabyte random sample $\\pi$ estimates!" - ] - }, - { - "cell_type": "code", - "execution_count": 25, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "from 1.0 GB randomly chosen positions\n", - " pi estimate: 3.141517184\n", - " pi error: 7.546958979309792e-05\n", - "\n" - ] - }, - { - "data": { - "text/html": [ - "<table>\n", - " <tr>\n", - " <td>\n", - " <table>\n", - " <thead>\n", - " <tr>\n", - " <td> </td>\n", - " <th> Array </th>\n", - " <th> Chunk </th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " \n", - " <tr>\n", - " <th> Bytes </th>\n", - " <td> 0.93 GiB </td>\n", - " <td> 95.37 MiB </td>\n", - " </tr>\n", - " \n", - " <tr>\n", - " <th> Shape </th>\n", - " <td> (62500000, 2) </td>\n", - " <td> (6250000, 2) </td>\n", - " </tr>\n", - " <tr>\n", - " <th> Count </th>\n", - " <td> 1 Graph Layer </td>\n", - " <td> 10 Chunks </td>\n", - " </tr>\n", - " <tr>\n", - " <th> Type </th>\n", - " <td> float64 </td>\n", - " <td> numpy.ndarray </td>\n", - " </tr>\n", - " </tbody>\n", - " </table>\n", - " </td>\n", - " <td>\n", - " <svg width=\"75\" height=\"170\" style=\"stroke:rgb(0,0,0);stroke-width:1\" >\n", - "\n", - " <!-- Horizontal lines -->\n", - " <line x1=\"0\" y1=\"0\" x2=\"25\" y2=\"0\" style=\"stroke-width:2\" />\n", - " <line x1=\"0\" y1=\"12\" x2=\"25\" y2=\"12\" />\n", - " <line x1=\"0\" y1=\"24\" x2=\"25\" y2=\"24\" />\n", - " <line x1=\"0\" y1=\"36\" x2=\"25\" y2=\"36\" />\n", - " <line x1=\"0\" y1=\"48\" x2=\"25\" y2=\"48\" />\n", - " <line x1=\"0\" y1=\"60\" x2=\"25\" y2=\"60\" />\n", - " <line x1=\"0\" y1=\"72\" x2=\"25\" y2=\"72\" />\n", - " <line x1=\"0\" y1=\"84\" x2=\"25\" y2=\"84\" />\n", - " <line x1=\"0\" y1=\"96\" x2=\"25\" y2=\"96\" />\n", - " <line x1=\"0\" y1=\"108\" x2=\"25\" y2=\"108\" />\n", - " <line x1=\"0\" y1=\"120\" x2=\"25\" y2=\"120\" style=\"stroke-width:2\" />\n", - "\n", - " <!-- Vertical lines -->\n", - " <line x1=\"0\" y1=\"0\" x2=\"0\" y2=\"120\" style=\"stroke-width:2\" />\n", - " <line x1=\"25\" y1=\"0\" x2=\"25\" y2=\"120\" style=\"stroke-width:2\" />\n", - "\n", - " <!-- Colored Rectangle -->\n", - " <polygon points=\"0.0,0.0 25.412616514582485,0.0 25.412616514582485,120.0 0.0,120.0\" style=\"fill:#ECB172A0;stroke-width:0\"/>\n", - "\n", - " <!-- Text -->\n", - " <text x=\"12.706308\" 
y=\"140.000000\" font-size=\"1.0rem\" font-weight=\"100\" text-anchor=\"middle\" >2</text>\n", - " <text x=\"45.412617\" y=\"60.000000\" font-size=\"1.0rem\" font-weight=\"100\" text-anchor=\"middle\" transform=\"rotate(-90,45.412617,60.000000)\">62500000</text>\n", - "</svg>\n", - " </td>\n", - " </tr>\n", - "</table>" - ], - "text/plain": [ - "dask.array<uniform, shape=(62500000, 2), dtype=float64, chunksize=(6250000, 2), chunktype=numpy.ndarray>" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "CPU times: user 83.3 ms, sys: 17.7 ms, total: 101 ms\n", - "Wall time: 686 ms\n" - ] - } - ], - "source": [ - "%time pi = calculate_pi_dask(size_in_bytes=1_000_000_000, number_of_chunks=10) # 1 GB" - ] - }, - { - "cell_type": "code", - "execution_count": 26, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "from 10.0 GB randomly chosen positions\n", - " pi estimate: 3.141718944\n", - " pi error: 0.00012629041020684184\n", - "\n" - ] - }, - { - "data": { - "text/html": [ - "<table>\n", - " <tr>\n", - " <td>\n", - " <table>\n", - " <thead>\n", - " <tr>\n", - " <td> </td>\n", - " <th> Array </th>\n", - " <th> Chunk </th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " \n", - " <tr>\n", - " <th> Bytes </th>\n", - " <td> 9.31 GiB </td>\n", - " <td> 95.37 MiB </td>\n", - " </tr>\n", - " \n", - " <tr>\n", - " <th> Shape </th>\n", - " <td> (625000000, 2) </td>\n", - " <td> (6250000, 2) </td>\n", - " </tr>\n", - " <tr>\n", - " <th> Count </th>\n", - " <td> 1 Graph Layer </td>\n", - " <td> 100 Chunks </td>\n", - " </tr>\n", - " <tr>\n", - " <th> Type </th>\n", - " <td> float64 </td>\n", - " <td> numpy.ndarray </td>\n", - " </tr>\n", - " </tbody>\n", - " </table>\n", - " </td>\n", - " <td>\n", - " <svg width=\"75\" height=\"170\" style=\"stroke:rgb(0,0,0);stroke-width:1\" >\n", - "\n", - " <!-- Horizontal lines -->\n", - " <line x1=\"0\" y1=\"0\" x2=\"25\" y2=\"0\" style=\"stroke-width:2\" />\n", - " <line x1=\"0\" y1=\"6\" x2=\"25\" y2=\"6\" />\n", - " <line x1=\"0\" y1=\"12\" x2=\"25\" y2=\"12\" />\n", - " <line x1=\"0\" y1=\"18\" x2=\"25\" y2=\"18\" />\n", - " <line x1=\"0\" y1=\"25\" x2=\"25\" y2=\"25\" />\n", - " <line x1=\"0\" y1=\"31\" x2=\"25\" y2=\"31\" />\n", - " <line x1=\"0\" y1=\"37\" x2=\"25\" y2=\"37\" />\n", - " <line x1=\"0\" y1=\"43\" x2=\"25\" y2=\"43\" />\n", - " <line x1=\"0\" y1=\"50\" x2=\"25\" y2=\"50\" />\n", - " <line x1=\"0\" y1=\"56\" x2=\"25\" y2=\"56\" />\n", - " <line x1=\"0\" y1=\"62\" x2=\"25\" y2=\"62\" />\n", - " <line x1=\"0\" y1=\"68\" x2=\"25\" y2=\"68\" />\n", - " <line x1=\"0\" y1=\"75\" x2=\"25\" y2=\"75\" />\n", - " <line x1=\"0\" y1=\"81\" x2=\"25\" y2=\"81\" />\n", - " <line x1=\"0\" y1=\"87\" x2=\"25\" y2=\"87\" />\n", - " <line x1=\"0\" y1=\"93\" x2=\"25\" y2=\"93\" />\n", - " <line x1=\"0\" y1=\"100\" x2=\"25\" y2=\"100\" />\n", - " <line x1=\"0\" y1=\"106\" x2=\"25\" y2=\"106\" />\n", - " <line x1=\"0\" y1=\"112\" x2=\"25\" y2=\"112\" />\n", - " <line x1=\"0\" y1=\"120\" x2=\"25\" y2=\"120\" style=\"stroke-width:2\" />\n", - "\n", - " <!-- Vertical lines -->\n", - " <line x1=\"0\" y1=\"0\" x2=\"0\" y2=\"120\" style=\"stroke-width:2\" />\n", - " <line x1=\"25\" y1=\"0\" x2=\"25\" y2=\"120\" style=\"stroke-width:2\" />\n", - "\n", - " <!-- Colored Rectangle -->\n", - " <polygon points=\"0.0,0.0 25.412616514582485,0.0 25.412616514582485,120.0 0.0,120.0\" style=\"fill:#8B4903A0;stroke-width:0\"/>\n", - "\n", - " <!-- Text 
-->\n", - " <text x=\"12.706308\" y=\"140.000000\" font-size=\"1.0rem\" font-weight=\"100\" text-anchor=\"middle\" >2</text>\n", - " <text x=\"45.412617\" y=\"60.000000\" font-size=\"1.0rem\" font-weight=\"100\" text-anchor=\"middle\" transform=\"rotate(-90,45.412617,60.000000)\">625000000</text>\n", - "</svg>\n", - " </td>\n", - " </tr>\n", - "</table>" - ], - "text/plain": [ - "dask.array<uniform, shape=(625000000, 2), dtype=float64, chunksize=(6250000, 2), chunktype=numpy.ndarray>" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "CPU times: user 564 ms, sys: 56.4 ms, total: 621 ms\n", - "Wall time: 4.43 s\n" - ] - } - ], - "source": [ - "%time pi = calculate_pi_dask(size_in_bytes=10_000_000_000, number_of_chunks=100) # 10 GB" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Let's go larger than memory...\n", - "Because Dask splits the computation into single managable tasks, we can scale up easily!" - ] - }, - { - "cell_type": "code", - "execution_count": 32, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "from 100.0 GB randomly chosen positions\n", - " pi estimate: 3.14160807168\n", - " pi error: 1.541809020677576e-05\n", - "\n" - ] - }, - { - "data": { - "text/html": [ - "<table>\n", - " <tr>\n", - " <td>\n", - " <table>\n", - " <thead>\n", - " <tr>\n", - " <td> </td>\n", - " <th> Array </th>\n", - " <th> Chunk </th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " \n", - " <tr>\n", - " <th> Bytes </th>\n", - " <td> 93.13 GiB </td>\n", - " <td> 381.47 MiB </td>\n", - " </tr>\n", - " \n", - " <tr>\n", - " <th> Shape </th>\n", - " <td> (6250000000, 2) </td>\n", - " <td> (25000000, 2) </td>\n", - " </tr>\n", - " <tr>\n", - " <th> Count </th>\n", - " <td> 1 Graph Layer </td>\n", - " <td> 250 Chunks </td>\n", - " </tr>\n", - " <tr>\n", - " <th> Type </th>\n", - " <td> float64 </td>\n", - " <td> numpy.ndarray </td>\n", - " </tr>\n", - " </tbody>\n", - " </table>\n", - " </td>\n", - " <td>\n", - " <svg width=\"75\" height=\"170\" style=\"stroke:rgb(0,0,0);stroke-width:1\" >\n", - "\n", - " <!-- Horizontal lines -->\n", - " <line x1=\"0\" y1=\"0\" x2=\"25\" y2=\"0\" style=\"stroke-width:2\" />\n", - " <line x1=\"0\" y1=\"6\" x2=\"25\" y2=\"6\" />\n", - " <line x1=\"0\" y1=\"12\" x2=\"25\" y2=\"12\" />\n", - " <line x1=\"0\" y1=\"18\" x2=\"25\" y2=\"18\" />\n", - " <line x1=\"0\" y1=\"24\" x2=\"25\" y2=\"24\" />\n", - " <line x1=\"0\" y1=\"31\" x2=\"25\" y2=\"31\" />\n", - " <line x1=\"0\" y1=\"37\" x2=\"25\" y2=\"37\" />\n", - " <line x1=\"0\" y1=\"44\" x2=\"25\" y2=\"44\" />\n", - " <line x1=\"0\" y1=\"50\" x2=\"25\" y2=\"50\" />\n", - " <line x1=\"0\" y1=\"56\" x2=\"25\" y2=\"56\" />\n", - " <line x1=\"0\" y1=\"62\" x2=\"25\" y2=\"62\" />\n", - " <line x1=\"0\" y1=\"69\" x2=\"25\" y2=\"69\" />\n", - " <line x1=\"0\" y1=\"75\" x2=\"25\" y2=\"75\" />\n", - " <line x1=\"0\" y1=\"82\" x2=\"25\" y2=\"82\" />\n", - " <line x1=\"0\" y1=\"88\" x2=\"25\" y2=\"88\" />\n", - " <line x1=\"0\" y1=\"94\" x2=\"25\" y2=\"94\" />\n", - " <line x1=\"0\" y1=\"100\" x2=\"25\" y2=\"100\" />\n", - " <line x1=\"0\" y1=\"107\" x2=\"25\" y2=\"107\" />\n", - " <line x1=\"0\" y1=\"113\" x2=\"25\" y2=\"113\" />\n", - " <line x1=\"0\" y1=\"120\" x2=\"25\" y2=\"120\" style=\"stroke-width:2\" />\n", - "\n", - " <!-- Vertical lines -->\n", - " <line x1=\"0\" y1=\"0\" x2=\"0\" y2=\"120\" style=\"stroke-width:2\" />\n", - " <line x1=\"25\" y1=\"0\" 
x2=\"25\" y2=\"120\" style=\"stroke-width:2\" />\n", - "\n", - " <!-- Colored Rectangle -->\n", - " <polygon points=\"0.0,0.0 25.412616514582485,0.0 25.412616514582485,120.0 0.0,120.0\" style=\"fill:#8B4903A0;stroke-width:0\"/>\n", - "\n", - " <!-- Text -->\n", - " <text x=\"12.706308\" y=\"140.000000\" font-size=\"1.0rem\" font-weight=\"100\" text-anchor=\"middle\" >2</text>\n", - " <text x=\"45.412617\" y=\"60.000000\" font-size=\"1.0rem\" font-weight=\"100\" text-anchor=\"middle\" transform=\"rotate(-90,45.412617,60.000000)\">6250000000</text>\n", - "</svg>\n", - " </td>\n", - " </tr>\n", - "</table>" - ], - "text/plain": [ - "dask.array<uniform, shape=(6250000000, 2), dtype=float64, chunksize=(25000000, 2), chunktype=numpy.ndarray>" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "CPU times: user 3.73 s, sys: 374 ms, total: 4.1 s\n", - "Wall time: 38.8 s\n" - ] - } - ], - "source": [ - "%time pi = calculate_pi_dask(size_in_bytes=100_000_000_000, number_of_chunks=250) # 100 GB" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Are we now better than single precision floating point resolution?\n", - "Not at all, if we require an order of magnitude better..." - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "finfo(resolution=1e-06, min=-3.4028235e+38, max=3.4028235e+38, dtype=float32)" - ] - }, - "execution_count": 7, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "numpy.finfo(numpy.float32)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## We could increase the local cluster CPU resources...\n", - "However, the above Dask cluster size is always limited by the memory/CPU resources of a single compute node." - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [], - "source": [ - "# %time pi = calculate_pi(size_in_bytes=1_000_000_000_000, number_of_chunks=2_500) # 1 TB" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "------------------------------------\n", - "\n", - "### More on Dask - HIGH THROUGHPUT COMPUTING WITH DASK\n", - "\n", - "**Organisers:** Alan O’Cais, David Swenson \n", - "**Website:** https://www.cecam.org/workshop-details/1022\n", - "\n", - "**Synopsis:**\n", - "High-throughput (task-based) computing is a flexible approach to parallelisation. It involves splitting a problem into loosely-coupled tasks. A scheduler then orchestrates the parallel execution of those tasks, allowing programs to adaptively scale their resource usage. E-CAM has extended the data-analytics framework Dask with a capable and efficient library to handle such workloads. 
This workshop will be held as a series of virtual seminars/tutorials on tools in the Dask HPC ecosystem.\n", - "\n", - "**Programme:**\n", - "- 21 January 2021, 3pm CET (2pm UTC): Dask - a flexible library for parallel computing in Python\n", - " - YouTube link: https://youtu.be/Tl8rO-baKuY\n", - " - GitHub Repo: https://github.com/jacobtomlinson/dask-video-tutorial-2020 \n", - " \n", - "4 February 2021, 3pm CET (2pm UTC): Dask-Jobqueue - a library that integrates Dask with standard HPC queuing systems, such as SLURM or PBS\n", - " - YouTube link: https://youtu.be/iNxhHXzmJ1w\n", - " - GitHub Repo: https://github.com/ExaESM-WP4/workshop-Dask-Jobqueue-cecam-2021-02 \n", - " \n", - "- 11 February 2021, 3pm CET (2pm UTC) : Jobqueue-Features - a library that enables functionality aimed at enhancing scalability\n", - " - YouTube link: https://youtu.be/FpMua8iJeTk\n", - " - GitHub Repo: https://github.com/E-CAM/jobqueue_features_workshop_materials" - ] - } - ], - "metadata": { - "anaconda-cloud": {}, - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.6" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} -- GitLab