Introduction-to-Pandas--master.ipynb 673 KiB
Andreas Herten's avatar
Andreas Herten committed
       "      <td>True</td>\n",
       "      <td>0.220000</td>\n",
       "      <td>42.040000</td>\n",
       "      <td>42.838333</td>\n",
       "      <td>0.583333</td>\n",
       "      <td>...</td>\n",
       "      <td>7.226667</td>\n",
       "      <td>132.061667</td>\n",
       "      <td>4.806585e+07</td>\n",
       "      <td>816298.000000</td>\n",
       "      <td>7.215000</td>\n",
       "      <td>112500.0</td>\n",
       "      <td>1.265738e+09</td>\n",
       "      <td>1.5</td>\n",
       "      <td>1.5</td>\n",
       "      <td>2.891667</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5.333333</td>\n",
       "      <td>3.0</td>\n",
       "      <td>8.0</td>\n",
       "      <td>73.601667</td>\n",
       "      <td>10.0</td>\n",
       "      <td>True</td>\n",
       "      <td>0.168333</td>\n",
       "      <td>19.628333</td>\n",
       "      <td>20.313333</td>\n",
       "      <td>0.191667</td>\n",
       "      <td>...</td>\n",
       "      <td>2.725000</td>\n",
       "      <td>48.901667</td>\n",
       "      <td>4.975288e+07</td>\n",
       "      <td>818151.000000</td>\n",
       "      <td>7.210000</td>\n",
       "      <td>112500.0</td>\n",
       "      <td>1.265738e+09</td>\n",
       "      <td>1.5</td>\n",
       "      <td>1.5</td>\n",
       "      <td>1.986667</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>5.333333</td>\n",
       "      <td>3.0</td>\n",
       "      <td>8.0</td>\n",
       "      <td>43.990000</td>\n",
       "      <td>10.0</td>\n",
       "      <td>True</td>\n",
       "      <td>0.138333</td>\n",
       "      <td>12.810000</td>\n",
       "      <td>13.305000</td>\n",
       "      <td>0.135000</td>\n",
       "      <td>...</td>\n",
       "      <td>1.426667</td>\n",
       "      <td>27.735000</td>\n",
       "      <td>5.511165e+07</td>\n",
       "      <td>820465.666667</td>\n",
       "      <td>7.253333</td>\n",
       "      <td>112500.0</td>\n",
       "      <td>1.265738e+09</td>\n",
       "      <td>1.5</td>\n",
       "      <td>1.5</td>\n",
       "      <td>1.745000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5.333333</td>\n",
       "      <td>3.0</td>\n",
       "      <td>8.0</td>\n",
       "      <td>31.225000</td>\n",
       "      <td>10.0</td>\n",
       "      <td>True</td>\n",
       "      <td>0.116667</td>\n",
       "      <td>9.325000</td>\n",
       "      <td>9.740000</td>\n",
       "      <td>0.088333</td>\n",
       "      <td>...</td>\n",
       "      <td>1.066667</td>\n",
       "      <td>19.353333</td>\n",
       "      <td>5.325783e+07</td>\n",
       "      <td>819558.166667</td>\n",
       "      <td>7.288333</td>\n",
       "      <td>112500.0</td>\n",
       "      <td>1.265738e+09</td>\n",
       "      <td>1.5</td>\n",
       "      <td>1.5</td>\n",
       "      <td>1.275000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>5.333333</td>\n",
       "      <td>3.0</td>\n",
       "      <td>8.0</td>\n",
       "      <td>24.896667</td>\n",
       "      <td>10.0</td>\n",
       "      <td>True</td>\n",
       "      <td>0.140000</td>\n",
       "      <td>7.468333</td>\n",
       "      <td>7.790000</td>\n",
       "      <td>0.070000</td>\n",
       "      <td>...</td>\n",
       "      <td>0.771667</td>\n",
       "      <td>14.950000</td>\n",
       "      <td>6.075634e+07</td>\n",
       "      <td>815307.666667</td>\n",
       "      <td>7.225000</td>\n",
       "      <td>112500.0</td>\n",
       "      <td>1.265738e+09</td>\n",
       "      <td>1.5</td>\n",
       "      <td>1.5</td>\n",
       "      <td>1.496667</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>5.333333</td>\n",
       "      <td>3.0</td>\n",
       "      <td>8.0</td>\n",
       "      <td>20.215000</td>\n",
       "      <td>10.0</td>\n",
       "      <td>True</td>\n",
       "      <td>0.106667</td>\n",
       "      <td>6.165000</td>\n",
       "      <td>6.406667</td>\n",
       "      <td>0.051667</td>\n",
       "      <td>...</td>\n",
       "      <td>0.630000</td>\n",
       "      <td>12.271667</td>\n",
       "      <td>6.060652e+07</td>\n",
       "      <td>815456.333333</td>\n",
       "      <td>7.201667</td>\n",
       "      <td>112500.0</td>\n",
       "      <td>1.265738e+09</td>\n",
       "      <td>1.5</td>\n",
       "      <td>1.5</td>\n",
       "      <td>0.990000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>6 rows × 21 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "             id  Tasks/Node  Threads/Task  Runtime Program / s  Scale  \\\n",
       "Nodes                                                                   \n",
       "1      5.333333         3.0           8.0           185.023333   10.0   \n",
       "2      5.333333         3.0           8.0            73.601667   10.0   \n",
       "3      5.333333         3.0           8.0            43.990000   10.0   \n",
       "4      5.333333         3.0           8.0            31.225000   10.0   \n",
       "5      5.333333         3.0           8.0            24.896667   10.0   \n",
       "6      5.333333         3.0           8.0            20.215000   10.0   \n",
       "\n",
       "       Plastic  Avg. Neuron Build Time / s  Min. Edge Build Time / s  \\\n",
       "Nodes                                                                  \n",
       "1         True                    0.220000                 42.040000   \n",
       "2         True                    0.168333                 19.628333   \n",
       "3         True                    0.138333                 12.810000   \n",
       "4         True                    0.116667                  9.325000   \n",
       "5         True                    0.140000                  7.468333   \n",
       "6         True                    0.106667                  6.165000   \n",
       "\n",
       "       Max. Edge Build Time / s  Min. Init. Time / s  ...  Presim. Time / s  \\\n",
       "Nodes                                                 ...                     \n",
       "1                     42.838333             0.583333  ...          7.226667   \n",
       "2                     20.313333             0.191667  ...          2.725000   \n",
       "3                     13.305000             0.135000  ...          1.426667   \n",
       "4                      9.740000             0.088333  ...          1.066667   \n",
       "5                      7.790000             0.070000  ...          0.771667   \n",
       "6                      6.406667             0.051667  ...          0.630000   \n",
       "\n",
       "       Sim. Time / s  Virt. Memory (Sum) / kB  Local Spike Counter (Sum)  \\\n",
       "Nodes                                                                      \n",
       "1         132.061667             4.806585e+07              816298.000000   \n",
       "2          48.901667             4.975288e+07              818151.000000   \n",
       "3          27.735000             5.511165e+07              820465.666667   \n",
       "4          19.353333             5.325783e+07              819558.166667   \n",
       "5          14.950000             6.075634e+07              815307.666667   \n",
       "6          12.271667             6.060652e+07              815456.333333   \n",
       "\n",
       "       Average Rate (Sum)  Number of Neurons  Number of Connections  \\\n",
       "Nodes                                                                 \n",
       "1                7.215000           112500.0           1.265738e+09   \n",
       "2                7.210000           112500.0           1.265738e+09   \n",
       "3                7.253333           112500.0           1.265738e+09   \n",
       "4                7.288333           112500.0           1.265738e+09   \n",
       "5                7.225000           112500.0           1.265738e+09   \n",
       "6                7.201667           112500.0           1.265738e+09   \n",
       "\n",
       "       Min. Delay  Max. Delay  Unaccounted Time / s  \n",
       "Nodes                                                \n",
       "1             1.5         1.5              2.891667  \n",
       "2             1.5         1.5              1.986667  \n",
       "3             1.5         1.5              1.745000  \n",
       "4             1.5         1.5              1.275000  \n",
       "5             1.5         1.5              1.496667  \n",
       "6             1.5         1.5              0.990000  \n",
       "\n",
       "[6 rows x 21 columns]"
      ]
     },
     "execution_count": 97,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby(\"Nodes\").mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Pivoting\n",
    "\n",
    "* Combines categorically similar columns\n",
    "* Creates a hierarchical index\n",
    "* The hierarchical index is respected during plotting!\n",
    "* A pivot table has three *layers*; if confused, think about these questions:\n",
    "    - `index`: »What's on the `x` axis?«\n",
    "    - `values`: »What value do I want to plot?«\n",
    "    - `columns`: »What categories do I want [to be in the legend]?«\n",
    "* All three can be populated from columns of the base data frame\n",
    "* Values are aggregated if needed (`pivot_table` takes the mean by default)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "df_demo[\"H\"] = [(-1)**n for n in range(5)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>H</th>\n",
       "      <th>-1</th>\n",
       "      <th>1</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>F</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>-3.918282</th>\n",
       "      <td>NaN</td>\n",
       "      <td>7.389056</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>-2.504068</th>\n",
       "      <td>NaN</td>\n",
       "      <td>1.700594</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>-1.918282</th>\n",
       "      <td>NaN</td>\n",
       "      <td>0.515929</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>-0.213769</th>\n",
       "      <td>0.972652</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0.518282</th>\n",
       "      <td>2.952492</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "H                -1         1\n",
       "F                            \n",
       "-3.918282       NaN  7.389056\n",
       "-2.504068       NaN  1.700594\n",
       "-1.918282       NaN  0.515929\n",
       "-0.213769  0.972652       NaN\n",
       " 0.518282  2.952492       NaN"
      ]
     },
     "execution_count": 99,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_pivot = df_demo.pivot_table(\n",
    "    index=\"F\",\n",
    "    values=\"G\",\n",
    "    columns=\"H\"\n",
    ")\n",
    "df_pivot"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 100,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "df_pivot.plot();"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "exercise": "task",
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Task 7\n",
    "<a name=\"task7\"></a>\n",
    "\n",
    "* Create a pivot table based on the NEST `df` data frame\n",
    "* Let the `x` axis show the number of nodes; display the values of the simulation time `\"Sim. Time / s\"` for each combination of tasks per node and threads per task\n",
    "* Plot the result as a bar plot\n",
    "* Done? [pollev.com/aherten538](https://pollev.com/aherten538)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 101,
   "metadata": {
    "exercise": "solution",
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 864x288 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "df.pivot_table(\n",
    "    index=[\"Nodes\"],\n",
    "    columns=[\"Tasks/Node\", \"Threads/Task\"],\n",
    "    values=\"Sim. Time / s\",\n",
    ").plot(kind=\"bar\", figsize=(12, 4));"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "exercise": "task",
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "<a name=\"taskb\"></a>\n",
    "\n",
    "* Bonus task\n",
    "    - Use `Sim. Time / s` and `Presim. Time / s` as values to show\n",
    "    - Show a stack of those two values inside the pivot table"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## The End\n",
    "\n",
    "* Pandas works on data frames\n",
    "* Slice frames to your liking\n",
    "* Plot frames\n",
    "    - Together with Matplotlib, Seaborn, others\n",
    "* Pivot tables are next-level greatness\n",
    "* Thanks for being here! 😍"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "exercise": "task"
   },
   "source": [
    "<span class=\"feedback\">Tell me what you think about this tutorial! <a href=\"mailto:a.herten@fz-juelich.de\">a.herten@fz-juelich.de</a></span>"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}