Commit 24abe2d6 authored by Andreas Herten

Add README

parent d64a5f9b
# JSC Tutorial: Data Analysis and Plotting with Pandas
Repository for a small-ish internal tutorial held on 26 February 2019 at JSC.
Material to be found at
http://herten1.pages.jsc.fz-juelich.de/jsc-pandas-introduction/
## Setup
One **master** Notebook is used to generate three Sub-Notebooks:
1. Slides
2. Exercises: Tasks
3. Exercises: Solutions
The slides Notebook is then converted to an HTML presentation (and also to a static PDF); all material is served to GitLab Pages via CI.
In case you're interested in the details, read on.
### Splitting Notebooks
To keep everything in one single Notebook file and avoid dealing with diverging content, `Introducton-to-Pandas--master.ipynb` is the **master** Notebook which contains all the information. All of it!
Cell metadata specifies if a Notebook cell should be treated specially. *Special* could be:
* A cell should end up in the presentation Notebook (default)
* A cell should end up in the tasks Notebook
* A cell should end up in the solution Notebook
* … and combinations of these
Since Notebooks are just JSON, I wrote a small parser in Python which suits my needs: `notebook-task-filter.py`.
The script can be launched directly and is reasonably well documented. It works by providing *tags* of cells to `keep` and *tags* of cells to `remove`. For instance,
```bash
./notebook-task-filter.py $< --keep task --keep solution --remove nopresentation
```
would look into Notebook cells and keep those which are *tagged* `task` or `solution` and remove those which are tagged `nopresentation`. One special tag for removal exists, `all`, which removes everything except what's marked to keep. A tag in the sense used here is a value of a JSON key in the cell's metadata; the key is `"exercise"` by default (but can be selected via `--basekey`). Example:
```json
{
"exercise": "task"
}
// or
{
"exercise": ["notask", "nopresentation"]
}
```
You can provide more than one tag by making them into a list.
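The keep/remove logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the actual `notebook-task-filter.py`: the function names `cell_tags` and `filter_cells` are made up for this example, and the rule that a `keep` tag wins over a `remove` tag on the same cell is an assumption.

```python
def cell_tags(cell, basekey="exercise"):
    """Return the tags of a cell as a set (hypothetical helper).

    The metadata value may be a single string or a list of strings.
    """
    value = cell.get("metadata", {}).get(basekey, [])
    if isinstance(value, str):
        return {value}
    return set(value)


def filter_cells(notebook, keep=(), remove=(), basekey="exercise"):
    """Return a shallow copy of a Notebook dict with filtered cells.

    Assumption: a cell carrying any `keep` tag is always kept, even if
    it also carries a `remove` tag or if `all` is given for removal.
    """
    keep, remove = set(keep), set(remove)
    kept = []
    for cell in notebook.get("cells", []):
        tags = cell_tags(cell, basekey)
        if tags & keep:
            kept.append(cell)      # explicitly marked to keep
        elif "all" in remove or tags & remove:
            continue               # removed by tag, or by the special `all`
        else:
            kept.append(cell)      # untagged or unrelated tags: leave as-is
    result = dict(notebook)
    result["cells"] = kept
    return result


# Tiny stand-in for a Notebook loaded with json.load()
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["Intro"]},
        {"cell_type": "code", "metadata": {"exercise": "task"}, "source": ["# TODO"]},
        {"cell_type": "code",
         "metadata": {"exercise": ["notask", "nopresentation"]},
         "source": ["x = 1"]},
        {"cell_type": "code", "metadata": {"exercise": "solution"}, "source": ["x = 2"]},
    ]
}

exercises = filter_cells(notebook, keep=["task", "solution"], remove=["nopresentation"])
tasks_only = filter_cells(notebook, keep=["task"], remove=["all"])
```

Here `exercises` keeps the intro, task, and solution cells and drops the one tagged `nopresentation`, while `tasks_only` contains only the task cell.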
Have a look at the provided `Makefile` to see how the individual files for this tutorial have been generated.
If you think this script is useful and want to collaborate on it, let me know and I'll make it into a dedicated repository.
### Presentation from Notebooks
Via `jupyter nbconvert --to slides`, Jupyter Notebooks can be converted to HTML-based slideshows using [reveal.js](https://revealjs.com/). For each cell, one can select in the *Cell Inspector* if the cell should be a `Slide`, `Sub-Slide`, or a `Fragment`.
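The choice made in the *Cell Inspector* ends up in the cell's metadata under `slideshow.slide_type`, so it can also be set programmatically. The following is a small sketch of that idea; the helper `set_slide_type` is hypothetical and not part of this repository.

```python
import json

# Slide types as stored in Notebook cell metadata ("-" means: no special type)
SLIDE_TYPES = {"slide", "subslide", "fragment", "skip", "notes", "-"}


def set_slide_type(cell, slide_type):
    """Set the slideshow type of a Notebook cell dict in place (hypothetical helper)."""
    if slide_type not in SLIDE_TYPES:
        raise ValueError(f"unknown slide type: {slide_type!r}")
    metadata = cell.setdefault("metadata", {})
    metadata.setdefault("slideshow", {})["slide_type"] = slide_type
    return cell


cell = {"cell_type": "markdown", "metadata": {}, "source": ["# Pandas"]}
set_slide_type(cell, "slide")
print(json.dumps(cell["metadata"]))
# prints: {"slideshow": {"slide_type": "slide"}}
```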
This works reasonably well, but to add the Jülich design to the reveal.js HTML some more steps are needed. While they are used here (see `Makefile`), please refer to Jan's repository regarding the [reveal.js Jülich Theme](https://gitlab.version.fz-juelich.de/JanMeinke/revealjstheme-juelich).
A PDF version of the slides is generated with [`decktape`](https://github.com/astefanutti/decktape), an NPM package which uses a headless Chromium instance to generate the PDF pages. It's slow and doesn't look 100 % like the presented slides, but it's the best I could find.
### GitLab Pages
A GitLab Shared Runner is used to serve material to the public web page of the tutorial. The static `index.html` is found in its own (orphan) branch, `pages`. The files are copied from the repository to the public web page as indicated in the CI configuration file `.gitlab-ci.yml`.
Ideally, I'd only check in the `--master.ipynb` Notebook and let the CI create all other material. But there are so many wild dependencies (NPM!!1), I'd rather do it myself.