Commit 7c35698b authored by Andreas Herten

Add forgotten imports

parent 20db0005
......@@ -180,6 +180,7 @@
"cell_type": "code",
"execution_count": 2,
"metadata": {
"exercise": "task",
"slideshow": {
"slide_type": "fragment"
}
......@@ -4379,6 +4380,7 @@
"cell_type": "code",
"execution_count": 56,
"metadata": {
"exercise": "task",
"slideshow": {
"slide_type": "fragment"
}
%% Cell type:markdown id: tags:
# *Introduction to* Data Analysis and Plotting with Pandas
## JSC Tutorial
Andreas Herten, Forschungszentrum Jülich, 26 February 2019
%% Cell type:markdown id: tags:
**Version: Tasks**
%% Cell type:markdown id: tags:
## Outline
* [Task 1](#task1)
* [Task 2](#task2)
* [Task 3](#task3)
* [Task 4](#task4)
* [Task 5](#task5)
* [Task 6](#task6)
* [Task 7](#task7)
* [Bonus Task](#taskb)
%% Cell type:code id: tags:
``` python
import pandas as pd
```
%% Cell type:markdown id: tags:
## Task 1
<a name="task1"></a>
* Create a data frame with
- 10 names of dinosaurs,
- their favourite prime number,
- and their favourite color
* Play around with the frame
* Tell me on the poll when you're done: [pollev.com/aherten538](https://pollev.com/aherten538)
%% Cell type:markdown id: tags:
Jupyter Notebook 101:
* Execute cell: `shift+enter`
* New cell in front of current cell: `a`
* New cell after current cell: `b`
%% Cell type:code id: tags:
``` python
happy_dinos = {
"Dinosaur Name": [],
"Favourite Prime": [],
"Favourite Color": []
}
#df_dinos =
```
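One possible way to fill in the skeleton above, as a minimal sketch: the names, primes, and colors are arbitrary example values, not part of the tutorial data. `pd.DataFrame` accepts a dictionary of equal-length lists, and each key becomes a column.

```python
import pandas as pd

# Example values only; any 10 dinosaurs, primes, and colors will do
happy_dinos = {
    "Dinosaur Name": [
        "Tyrannosaurus", "Triceratops", "Stegosaurus", "Velociraptor",
        "Brachiosaurus", "Ankylosaurus", "Diplodocus", "Allosaurus",
        "Spinosaurus", "Iguanodon",
    ],
    "Favourite Prime": [2, 3, 5, 7, 11, 13, 17, 19, 23, 29],
    "Favourite Color": [
        "blue", "green", "red", "yellow", "purple",
        "orange", "green", "blue", "red", "pink",
    ],
}
df_dinos = pd.DataFrame(happy_dinos)

# A few ways to play around with the frame
first_rows = df_dinos.head(3)                            # first three rows
big_primes = df_dinos[df_dinos["Favourite Prime"] > 10]  # boolean filtering
by_color = df_dinos.groupby("Favourite Color")["Favourite Prime"].mean()
```

In a notebook, putting `df_dinos` on the last line of a cell renders the frame as a table.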
%% Cell type:markdown id: tags:
## Task 2
<a name="task2"></a>
* Read `nest-data.csv` into a `DataFrame`; call it `df`
* Get to know it and play a bit with it
* Tell me when you're done: [pollev.com/aherten538](https://pollev.com/aherten538)
%% Cell type:code id: tags:
``` python
!cat nest-data.csv | head -3
```
%% Output
id,Nodes,Tasks/Node,Threads/Task,Runtime Program / s,Scale,Plastic,Avg. Neuron Build Time / s,Min. Edge Build Time / s,Max. Edge Build Time / s,Min. Init. Time / s,Max. Init. Time / s,Presim. Time / s,Sim. Time / s,Virt. Memory (Sum) / kB,Local Spike Counter (Sum),Average Rate (Sum),Number of Neurons,Number of Connections,Min. Delay,Max. Delay
5,1,2,4,420.42,10,true,0.29,88.12,88.18,1.14,1.20,17.26,311.52,46560664.00,825499,7.48,112500,1265738500,1.5,1.5
5,1,4,4,200.84,10,true,0.15,46.03,46.34,0.70,1.01,7.87,142.97,46903088.00,802865,7.03,112500,1265738500,1.5,1.5
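A minimal sketch of reading the data, assuming `nest-data.csv` sits in the notebook's working directory. So that the example also runs without the file, it parses only the three lines printed above from an in-memory string; with the real file, the single call `pd.read_csv("nest-data.csv")` would be enough.

```python
import io

import pandas as pd

# The three lines printed above, embedded so the sketch runs without the file
csv_head = """id,Nodes,Tasks/Node,Threads/Task,Runtime Program / s,Scale,Plastic,Avg. Neuron Build Time / s,Min. Edge Build Time / s,Max. Edge Build Time / s,Min. Init. Time / s,Max. Init. Time / s,Presim. Time / s,Sim. Time / s,Virt. Memory (Sum) / kB,Local Spike Counter (Sum),Average Rate (Sum),Number of Neurons,Number of Connections,Min. Delay,Max. Delay
5,1,2,4,420.42,10,true,0.29,88.12,88.18,1.14,1.20,17.26,311.52,46560664.00,825499,7.48,112500,1265738500,1.5,1.5
5,1,4,4,200.84,10,true,0.15,46.03,46.34,0.70,1.01,7.87,142.97,46903088.00,802865,7.03,112500,1265738500,1.5,1.5
"""

# With the real file: df = pd.read_csv("nest-data.csv")
df = pd.read_csv(io.StringIO(csv_head))

# Some ways to get to know the frame
column_names = list(df.columns)  # all column headers
summary = df.describe()          # basic statistics of the numeric columns
```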
%% Cell type:markdown id: tags:
## Task 3
<a name="task3"></a>
* Add a column to the NEST data frame called `Virtual Processes` which is the total number of threads across all nodes (i.e. the product of threads per task, tasks per node, and nodes)
* Remember to tell me when you're done: [pollev.com/aherten538](https://pollev.com/aherten538)
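The product described above can be sketched with element-wise column arithmetic. The small frame below is a stand-in that holds only the three relevant columns, with the values of the two rows printed from `nest-data.csv`; on the full `df` the last assignment would be the same.

```python
import pandas as pd

# Stand-in for the NEST frame, reduced to the columns needed here
df = pd.DataFrame({
    "Nodes": [1, 1],
    "Tasks/Node": [2, 4],
    "Threads/Task": [4, 4],
})

# Column arithmetic is element-wise, so this yields one product per row
df["Virtual Processes"] = df["Nodes"] * df["Tasks/Node"] * df["Threads/Task"]
```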
%% Cell type:code id: tags:
``` python
import matplotlib.pyplot as plt
%matplotlib inline
```
%% Cell type:markdown id: tags:
## Task 4
<a name="task4"></a>
* Sort the data frame by the virtual processes
* Plot `"Presim. Time / s"` and `"Sim. Time / s"` of our data frame `df` as a function of the virtual processes
* Use a dashed, red line for `"Presim. Time / s"`, a blue line for `"Sim. Time / s"` (see [API description](https://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.plot))
* Don't forget to label your axes and to add a legend
* Submit when you're done: [pollev.com/aherten538](https://pollev.com/aherten538)
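One way to approach this task with plain Matplotlib, as a sketch rather than the reference solution. The frame is a reduced stand-in: the first two rows reuse values printed from `nest-data.csv`, the third row is made up for illustration. The `Agg` backend line only makes the sketch runnable as a script; inside the notebook, `%matplotlib inline` takes its place.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed in the notebook
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in data, deliberately unsorted; the last row is hypothetical
df = pd.DataFrame({
    "Virtual Processes": [16, 8, 32],
    "Presim. Time / s": [7.87, 17.26, 4.0],
    "Sim. Time / s": [142.97, 311.52, 75.0],
})

# sort_values returns a new, sorted frame
df_sorted = df.sort_values("Virtual Processes")

fig, ax = plt.subplots()
ax.plot(df_sorted["Virtual Processes"], df_sorted["Presim. Time / s"],
        color="red", linestyle="--", label="Presim. Time / s")
ax.plot(df_sorted["Virtual Processes"], df_sorted["Sim. Time / s"],
        color="blue", label="Sim. Time / s")
ax.set_xlabel("Virtual Processes")
ax.set_ylabel("Time / s")
ax.legend()
```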
%% Cell type:markdown id: tags:
## Task 5
<a name="task5"></a>
Use the NEST data frame `df` to:
1. Make the virtual processes the index of the data frame (`.set_index()`)
2. Plot `"Presim. Time / s"` and `"Sim. Time / s"` individually
3. Plot them onto one common canvas!
4. Make them have the same line colors and styles as before
5. Add a legend, add missing labels
* Done? Tell me! [pollev.com/aherten538](https://pollev.com/aherten538)
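The steps above could be sketched with the plotting interface built into pandas. The data is a stand-in (the last row is made up), and the `Agg` backend line is only there so the sketch runs outside a notebook. Passing a dictionary to `style` assigns one Matplotlib format string per column, and plotting two columns in a single `.plot()` call puts them on one common canvas.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed in the notebook
import pandas as pd

# Stand-in data; the last row is hypothetical
df = pd.DataFrame({
    "Virtual Processes": [8, 16, 32],
    "Presim. Time / s": [17.26, 7.87, 4.0],
    "Sim. Time / s": [311.52, 142.97, 75.0],
})

# Step 1: the index becomes the x axis of every pandas plot
df_vp = df.set_index("Virtual Processes")

# Step 2: each column plotted on its own
ax_presim = df_vp["Presim. Time / s"].plot()
ax_presim.figure.clf()  # clear the figure before the combined plot

# Steps 3 to 5: both columns on one canvas, styled as before
ax = df_vp[["Presim. Time / s", "Sim. Time / s"]].plot(
    style={"Presim. Time / s": "r--", "Sim. Time / s": "b-"}
)
ax.set_xlabel("Virtual Processes")
ax.set_ylabel("Time / s")
ax.legend()
```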
%% Cell type:markdown id: tags:
## Task 6
<a name="task6"></a>
* To your `df` NEST data frame, add a column with the unaccounted time (`Unaccounted Time / s`), which is the program runtime minus the average neuron build time, minimal edge build time, minimal initialization time, presimulation time, and simulation time.
(*I know this is technically not super correct, but it will do for our example.*)
* Plot a stacked bar plot of all these columns (except for program runtime) over the virtual processes
* Remember: [pollev.com/aherten538](https://pollev.com/aherten538)
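A possible sketch of this task, using the values of the two rows printed from `nest-data.csv` as a stand-in for the full frame. Summing the component columns row-wise with `.sum(axis=1)` and subtracting the result from the runtime gives the unaccounted time; `stacked=True` then piles the components on top of each other per bar. The `Agg` backend line is only needed outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed in the notebook
import pandas as pd

# Stand-in for the NEST frame: the two rows printed from nest-data.csv
df = pd.DataFrame({
    "Virtual Processes": [8, 16],
    "Runtime Program / s": [420.42, 200.84],
    "Avg. Neuron Build Time / s": [0.29, 0.15],
    "Min. Edge Build Time / s": [88.12, 46.03],
    "Min. Init. Time / s": [1.14, 0.70],
    "Presim. Time / s": [17.26, 7.87],
    "Sim. Time / s": [311.52, 142.97],
})

parts = [
    "Avg. Neuron Build Time / s",
    "Min. Edge Build Time / s",
    "Min. Init. Time / s",
    "Presim. Time / s",
    "Sim. Time / s",
]

# Runtime minus the row-wise sum of all accounted parts
df["Unaccounted Time / s"] = df["Runtime Program / s"] - df[parts].sum(axis=1)

# One stacked bar per virtual-process count, runtime column left out
ax = df.set_index("Virtual Processes")[parts + ["Unaccounted Time / s"]].plot(
    kind="bar", stacked=True
)
ax.set_ylabel("Time / s")
```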
%% Cell type:markdown id: tags:
## Task 7
<a name="task7"></a>
* Create a pivot table based on the NEST `df` data frame
* Let the `x` axis show the number of nodes; display the values of the simulation time `"Sim. Time / s"` for the tasks per node and threads per task configurations
* Please plot a bar plot
* Done? [pollev.com/aherten538](https://pollev.com/aherten538)
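One way to build such a pivot table, sketched on a small stand-in frame: the two single-node rows reuse simulation times printed from `nest-data.csv`, the two-node rows are made up for illustration. The pivot's `index` ends up on the `x` axis of the bar plot, and each (tasks per node, threads per task) combination in `columns` becomes its own bar. `pivot_table` averages duplicate configurations by default.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed in the notebook
import pandas as pd

# Stand-in data; the rows with Nodes == 2 are hypothetical
df = pd.DataFrame({
    "Nodes": [1, 1, 2, 2],
    "Tasks/Node": [2, 4, 2, 4],
    "Threads/Task": [4, 4, 4, 4],
    "Sim. Time / s": [311.52, 142.97, 160.0, 80.0],
})

pivot = df.pivot_table(
    index="Nodes",                           # rows, shown on the x axis
    columns=["Tasks/Node", "Threads/Task"],  # one bar per configuration
    values="Sim. Time / s",
)

ax = pivot.plot(kind="bar")
ax.set_ylabel("Sim. Time / s")
```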
%% Cell type:markdown id: tags:
<a name="taskb"></a>
* Bonus task
- Use `Sim. Time / s` and `Presim. Time / s` as values to show
- Show a stack of those two values inside the pivot table
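One reading of the bonus task, as a sketch only: pass both columns as `values`, put the whole configuration into the pivot's index, and draw a stacked bar so each bar shows presimulation and simulation time on top of each other. The data is a stand-in; the rows with two nodes are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed in the notebook
import pandas as pd

# Stand-in data; the rows with Nodes == 2 are hypothetical
df = pd.DataFrame({
    "Nodes": [1, 1, 2, 2],
    "Tasks/Node": [2, 4, 2, 4],
    "Threads/Task": [4, 4, 4, 4],
    "Presim. Time / s": [17.26, 7.87, 9.0, 4.5],
    "Sim. Time / s": [311.52, 142.97, 160.0, 80.0],
})

# One row per configuration, one column per value
pivot = df.pivot_table(
    index=["Nodes", "Tasks/Node", "Threads/Task"],
    values=["Presim. Time / s", "Sim. Time / s"],
)

# Each bar stacks the two value columns
ax = pivot.plot(kind="bar", stacked=True)
ax.set_ylabel("Time / s")
```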
%% Cell type:markdown id: tags:
<span class="feedback">Tell me what you think about this tutorial! <a href="mailto:a.herten@fz-juelich.de">a.herten@fz-juelich.de</a></span>