Introduction-to-Pandas.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "title-slide",
    "slideshow": {
     "id": "title-slide",
     "slide_type": "slide"
    }
   },
   "source": [
    "# *Introduction to* Data Analysis and Plotting with Pandas\n",
    "## JSC Tutorial\n",
    "\n",
    "Andreas Herten, Forschungszentrum Jülich, 26 February 2019"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## My Motivation\n",
    "\n",
    "* I like Python\n",
    "* I like plotting data\n",
    "* I like sharing\n",
    "* I think Pandas is awesome and you should use it too"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Tutorial Setup\n",
    "\n",
    "* 60 minutes (we might do this again for some advanced stuff if you want to)\n",
    "* Alternating between lecture and hands-on\n",
    "* Direct feedback via https://www.polleverywhere.com/"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "* Please open Jupyter Notebook of this session\n",
    "    - … either on your **local machine** (which has Pandas)  \n",
    "    Download here\n",
    "    - … or on the **JSC Jupyter service** at https://jupyter-jsc.fz-juelich.de/\n",
    "        - Either `pip install --user pandas seaborn` once in a shell and `cp $PROJECT_cjsc/herten1/pandas/notebook.ipynb ~/`\n",
    "        - Or \n",
    "            1. `ln -s $PROJECT_cjsc/herten1/pandas ~/.local/share/jupyter/kernels/` and \n",
    "            2. `cp $PROJECT_cjsc/herten1/pandas/notebook-with-kernel.ipynb ~/`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## About Pandas\n",
    "\n",
    "<img style=\"float: right; max-width: 200px;\" width=\"200px\" src=\"img/adorable-animal-animal-photography-1661535.jpg\" />\n",
    "\n",
    "* Python package (Python 2, Python 3)\n",
    "* For data analysis\n",
    "* With data structures (multi-dimensional table; time series), operations\n",
    "* Name from »**Pan**el **Da**ta« (multi-dimensional time series in economics)\n",
    "* Since 2008\n",
    "* https://pandas.pydata.org/\n",
    "* Install [via PyPI](https://pypi.org/project/pandas/): `pip install pandas`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Pandas Cohabitation\n",
    "\n",
    "* Pandas works great together with other established Python tools\n",
    "    * [Jupyter Notebooks](https://jupyter.org/)\n",
    "    * Plotting with [`matplotlib`](https://matplotlib.org/)\n",
    "    * Modelling with [`statsmodels`](https://www.statsmodels.org/stable/index.html), [`scikit-learn`](https://scikit-learn.org/)\n",
    "    * Nicer plots with [`seaborn`](https://seaborn.pydata.org/), [`altair`](https://altair-viz.github.io/), [`plotly`](https://plot.ly/)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## First Steps"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import pandas"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'0.24.1'"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.__version__"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\u001b[0;31mClass docstring:\u001b[0m\n",
       "    pandas - a powerful data analysis and manipulation library for Python\n",
       "    =====================================================================\n",
       "    \n",
       "    **pandas** is a Python package providing fast, flexible, and expressive data\n",
       "    structures designed to make working with \"relational\" or \"labeled\" data both\n",
       "    easy and intuitive. It aims to be the fundamental high-level building block for\n",
       "    doing practical, **real world** data analysis in Python. Additionally, it has\n",
       "    the broader goal of becoming **the most powerful and flexible open source data\n",
       "    analysis / manipulation tool available in any language**. It is already well on\n",
       "    its way toward this goal.\n",
       "    \n",
       "    Main Features\n",
       "    -------------\n",
       "    Here are just a few of the things that pandas does well:\n",
       "    \n",
       "      - Easy handling of missing data in floating point as well as non-floating\n",
       "        point data.\n",
       "      - Size mutability: columns can be inserted and deleted from DataFrame and\n",
       "        higher dimensional objects\n",
       "      - Automatic and explicit data alignment: objects can be explicitly aligned\n",
       "        to a set of labels, or the user can simply ignore the labels and let\n",
       "        `Series`, `DataFrame`, etc. automatically align the data for you in\n",
       "        computations.\n",
       "      - Powerful, flexible group by functionality to perform split-apply-combine\n",
       "        operations on data sets, for both aggregating and transforming data.\n",
       "      - Make it easy to convert ragged, differently-indexed data in other Python\n",
       "        and NumPy data structures into DataFrame objects.\n",
       "      - Intelligent label-based slicing, fancy indexing, and subsetting of large\n",
       "        data sets.\n",
       "      - Intuitive merging and joining data sets.\n",
       "      - Flexible reshaping and pivoting of data sets.\n",
       "      - Hierarchical labeling of axes (possible to have multiple labels per tick).\n",
       "      - Robust IO tools for loading data from flat files (CSV and delimited),\n",
       "        Excel files, databases, and saving/loading data from the ultrafast HDF5\n",
       "        format.\n",
       "      - Time series-specific functionality: date range generation and frequency\n",
       "        conversion, moving window statistics, moving window linear regressions,\n",
       "        date shifting and lagging, etc."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%pdoc pd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## DataFrames\n",
    "### It's all about DataFrames\n",
    "\n",
    "* Main data containers of Pandas\n",
    "    - Linear: `Series`\n",
    "    - Multi Dimension: `DataFrame`\n",
    "* `Series` is *only* special case of `DataFrame`\n",
    "* → Talk about `DataFrame`s, mention some special `Series` cases"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## DataFrames\n",
    "### Construction\n",
    "\n",
    "* To show features of `DataFrame`, let's construct one!\n",
    "* Many construction possibilities\n",
    "    - From lists, dictionaries, `numpy` objects\n",
    "    - From CSV, HDF5, JSON, Excel, HTML, fixed-width files\n",
    "    - From pickled Pandas data\n",
    "    - From clipboard\n",
    "    - *From Feather, Parquest, SAS, SQL, Google BigQuery, STATA*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## DataFrames\n",
    "\n",
    "### Examples, finally"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "ages  = [41, 56, 56, 57, 39, 59, 43, 56, 38, 60]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>41</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>57</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>39</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>59</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>43</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>38</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>60</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    0\n",
       "0  41\n",
       "1  56\n",
       "2  56\n",
       "3  57\n",
       "4  39\n",
       "5  59\n",
       "6  43\n",
       "7  56\n",
       "8  38\n",
       "9  60"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.DataFrame(ages)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>41</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>56</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    0\n",
       "0  41\n",
       "1  56\n",
       "2  56"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_ages = pd.DataFrame(ages)\n",
    "df_ages.head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "* Let's add names to ages; put everything into a `dict()`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'Names': ['Liu', 'Rowland', 'Rivers', 'Waters', 'Rice', 'Fields', 'Kerr', 'Romero', 'Davis', 'Hall'], 'Ages': [41, 56, 56, 57, 39, 59, 43, 56, 38, 60]}\n"
     ]
    }
   ],
   "source": [
    "data = {\n",
    "    \"Names\": [\"Liu\", \"Rowland\", \"Rivers\", \"Waters\", \"Rice\", \"Fields\", \"Kerr\", \"Romero\", \"Davis\", \"Hall\"],\n",
    "    \"Ages\": ages\n",
    "}\n",
    "print(data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Names</th>\n",
       "      <th>Ages</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Liu</td>\n",
       "      <td>41</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Rowland</td>\n",
       "      <td>56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Rivers</td>\n",
       "      <td>56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Waters</td>\n",
       "      <td>57</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Names  Ages\n",
       "0      Liu    41\n",
       "1  Rowland    56\n",
       "2   Rivers    56\n",
       "3   Waters    57"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sample = pd.DataFrame(data)\n",
    "df_sample.head(4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "* Two columns now; one for names, one for ages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['Names', 'Ages'], dtype='object')"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sample.columns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "* `DataFrame` always have indexes; auto-generated or custom"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "RangeIndex(start=0, stop=10, step=1)"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sample.index"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "* Make `Names` be index with `.set_index()`\n",
    "* `inplace=True` will modifiy the parent frame (*I don't like it*)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Ages</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Names</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Liu</th>\n",
       "      <td>41</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Rowland</th>\n",
       "      <td>56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Rivers</th>\n",
       "      <td>56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Waters</th>\n",
       "      <td>57</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Rice</th>\n",
       "      <td>39</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Fields</th>\n",
       "      <td>59</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Kerr</th>\n",
       "      <td>43</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Romero</th>\n",
       "      <td>56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Davis</th>\n",
       "      <td>38</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Hall</th>\n",
       "      <td>60</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         Ages\n",
       "Names        \n",
       "Liu        41\n",
       "Rowland    56\n",
       "Rivers     56\n",
       "Waters     57\n",
       "Rice       39\n",
       "Fields     59\n",
       "Kerr       43\n",
       "Romero     56\n",
       "Davis      38\n",
       "Hall       60"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sample.set_index(\"Names\", inplace=True)\n",
    "df_sample"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "* Some more operations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Ages</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>10.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>50.500000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>9.009255</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>38.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>41.500000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>56.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>56.750000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>60.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            Ages\n",
       "count  10.000000\n",
       "mean   50.500000\n",
       "std     9.009255\n",
       "min    38.000000\n",
       "25%    41.500000\n",
       "50%    56.000000\n",
       "75%    56.750000\n",
       "max    60.000000"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sample.describe()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>Names</th>\n",
       "      <th>Liu</th>\n",
       "      <th>Rowland</th>\n",
       "      <th>Rivers</th>\n",
       "      <th>Waters</th>\n",
       "      <th>Rice</th>\n",
       "      <th>Fields</th>\n",
       "      <th>Kerr</th>\n",
       "      <th>Romero</th>\n",
       "      <th>Davis</th>\n",
       "      <th>Hall</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Ages</th>\n",
       "      <td>41</td>\n",
       "      <td>56</td>\n",
       "      <td>56</td>\n",
       "      <td>57</td>\n",
       "      <td>39</td>\n",
       "      <td>59</td>\n",
       "      <td>43</td>\n",
       "      <td>56</td>\n",
       "      <td>38</td>\n",
       "      <td>60</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "Names  Liu  Rowland  Rivers  Waters  Rice  Fields  Kerr  Romero  Davis  Hall\n",
       "Ages    41       56      56      57    39      59    43      56     38    60"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sample.T"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['Liu', 'Rowland', 'Rivers', 'Waters', 'Rice', 'Fields', 'Kerr',\n",
       "       'Romero', 'Davis', 'Hall'],\n",
       "      dtype='object', name='Names')"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sample.T.columns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "* Also: Arithmetic operations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Ages</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Names</th>\n",