Commit 7c35698b authored by Andreas Herten

Add forgotten imports

parent 20db0005
......@@ -180,6 +180,7 @@
"cell_type": "code",
"execution_count": 2,
"metadata": {
"exercise": "task",
"slideshow": {
"slide_type": "fragment"
}
......@@ -4379,6 +4380,7 @@
"cell_type": "code",
"execution_count": 56,
"metadata": {
"exercise": "task",
"slideshow": {
"slide_type": "fragment"
}
......
Source diff could not be displayed: it is too large. Options to address this: view the blob.
%% Cell type:markdown id: tags:
# *Introduction to* Data Analysis and Plotting with Pandas
## JSC Tutorial
Andreas Herten, Forschungszentrum Jülich, 26 February 2019
%% Cell type:markdown id: tags:
**Version: Tasks**
%% Cell type:markdown id: tags:
## Outline
* [Task 1](#task1)
* [Task 2](#task2)
* [Task 3](#task3)
* [Task 4](#task4)
* [Task 5](#task5)
* [Task 6](#task6)
* [Task 7](#task7)
* [Bonus Task](#taskb)
%% Cell type:code id: tags:
``` python
import pandas as pd
```
%% Cell type:markdown id: tags:
## Task 1
<a name="task1"></a>
* Create a data frame with
- 10 names of dinosaurs,
- their favourite prime number,
- and their favourite color
* Play around with the frame
* Tell me in the poll when you're done: [pollev.com/aherten538](https://pollev.com/aherten538)
%% Cell type:markdown id: tags:
Jupyter Notebook 101:
* Execute cell: `shift+enter`
* New cell in front of current cell: `a`
* New cell after current cell: `b`
%% Cell type:code id: tags:
``` python
happy_dinos = {
"Dinosaur Name": [],
"Favourite Prime": [],
"Favourite Color": []
}
#df_dinos =
```
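One way to fill in the skeleton above, as a minimal sketch: the three dinosaurs with their primes and colors are made-up illustration values (the task asks for ten), and the calls at the end are just a few of many ways to explore the frame.

```python
import pandas as pd

# Illustrative values only; extend each list to 10 entries for the task.
happy_dinos = {
    "Dinosaur Name": ["Tyrannosaurus", "Triceratops", "Stegosaurus"],
    "Favourite Prime": [7, 13, 2],
    "Favourite Color": ["red", "green", "blue"],
}
df_dinos = pd.DataFrame(happy_dinos)

# A few ways to play around with the frame
print(df_dinos.head())                          # first rows
print(df_dinos.sort_values("Favourite Prime"))  # sorted by one column
print(df_dinos["Favourite Prime"].max())        # largest favourite prime
```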
%% Cell type:markdown id: tags:
## Task 2
<a name="task2"></a>
* Read `nest-data.csv` into a `DataFrame`; call it `df`
* Get to know it and play a bit with it
* Tell me when you're done: [pollev.com/aherten538](https://pollev.com/aherten538)
%% Cell type:code id: tags:
``` python
!head -3 nest-data.csv
```
%% Output
id,Nodes,Tasks/Node,Threads/Task,Runtime Program / s,Scale,Plastic,Avg. Neuron Build Time / s,Min. Edge Build Time / s,Max. Edge Build Time / s,Min. Init. Time / s,Max. Init. Time / s,Presim. Time / s,Sim. Time / s,Virt. Memory (Sum) / kB,Local Spike Counter (Sum),Average Rate (Sum),Number of Neurons,Number of Connections,Min. Delay,Max. Delay
5,1,2,4,420.42,10,true,0.29,88.12,88.18,1.14,1.20,17.26,311.52,46560664.00,825499,7.48,112500,1265738500,1.5,1.5
5,1,4,4,200.84,10,true,0.15,46.03,46.34,0.70,1.01,7.87,142.97,46903088.00,802865,7.03,112500,1265738500,1.5,1.5
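Reading the data could look like the sketch below. To stay self-contained, it parses the three lines shown in the output above from an in-memory string; in the notebook, `pd.read_csv("nest-data.csv")` would read the actual file instead.

```python
import io
import pandas as pd

# Header and first two data rows, copied from the output above.
csv_text = """id,Nodes,Tasks/Node,Threads/Task,Runtime Program / s,Scale,Plastic,Avg. Neuron Build Time / s,Min. Edge Build Time / s,Max. Edge Build Time / s,Min. Init. Time / s,Max. Init. Time / s,Presim. Time / s,Sim. Time / s,Virt. Memory (Sum) / kB,Local Spike Counter (Sum),Average Rate (Sum),Number of Neurons,Number of Connections,Min. Delay,Max. Delay
5,1,2,4,420.42,10,true,0.29,88.12,88.18,1.14,1.20,17.26,311.52,46560664.00,825499,7.48,112500,1265738500,1.5,1.5
5,1,4,4,200.84,10,true,0.15,46.03,46.34,0.70,1.01,7.87,142.97,46903088.00,802865,7.03,112500,1265738500,1.5,1.5
"""

# With the real file: df = pd.read_csv("nest-data.csv")
df = pd.read_csv(io.StringIO(csv_text))

# Get to know the frame
print(df.shape)       # (rows, columns)
print(df.dtypes)      # column types inferred by pandas
print(df.describe())  # summary statistics of the numeric columns
```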
%% Cell type:markdown id: tags:
## Task 3
<a name="task3"></a>
* Add a column called `Virtual Processes` to the NEST data frame: the total number of threads across all nodes (i.e. the product of threads per task, tasks per node, and nodes)
* Remember to tell me when you're done: [pollev.com/aherten538](https://pollev.com/aherten538)
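A possible sketch of the new column, using only the three relevant columns of the two rows shown in the CSV preview (with the full file, the same line applies to `df` unchanged):

```python
import pandas as pd

# Reduced frame with the two rows from the CSV preview.
df = pd.DataFrame({
    "Nodes": [1, 1],
    "Tasks/Node": [2, 4],
    "Threads/Task": [4, 4],
})

# Column arithmetic is element-wise, so this computes one product per row.
df["Virtual Processes"] = df["Nodes"] * df["Tasks/Node"] * df["Threads/Task"]
print(df)
```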
%% Cell type:code id: tags:
``` python
import matplotlib.pyplot as plt
%matplotlib inline
```
%% Cell type:markdown id: tags:
## Task 4
<a name="task4"></a>
* Sort the data frame by the virtual processes
* Plot `"Presim. Time / s"` and `"Sim. Time / s"` of our data frame `df` as a function of the virtual processes
* Use a dashed, red line for `"Presim. Time / s"`, a blue line for `"Sim. Time / s"` (see [API description](https://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.plot))
* Don't forget to label your axes and to add a legend
* Submit when you're done: [pollev.com/aherten538](https://pollev.com/aherten538)
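The steps above could be sketched as follows with plain Matplotlib. The frame is a reduced stand-in built from the two rows of the CSV preview; the axis label texts are a free choice, not prescribed by the task.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Reduced stand-in for df (two rows from the CSV preview, deliberately unsorted).
df = pd.DataFrame({
    "Virtual Processes": [16, 8],
    "Presim. Time / s": [7.87, 17.26],
    "Sim. Time / s": [142.97, 311.52],
})

# Sort so that the line is drawn from left to right.
df = df.sort_values("Virtual Processes")

fig, ax = plt.subplots()
ax.plot(df["Virtual Processes"], df["Presim. Time / s"],
        color="red", linestyle="--", label="Presim. Time / s")
ax.plot(df["Virtual Processes"], df["Sim. Time / s"],
        color="blue", label="Sim. Time / s")
ax.set_xlabel("Virtual Processes")
ax.set_ylabel("Time / s")
ax.legend()
```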
%% Cell type:markdown id: tags:
## Task 5
<a name="task5"></a>
Use the NEST data frame `df` to:
1. Make the virtual processes the index of the data frame (`.set_index()`)
2. Plot `"Presim. Time / s"` and `"Sim. Time / s"` individually
3. Plot them onto one common canvas!
4. Make them have the same line colors and styles as before
5. Add a legend, add missing labels
* Done? Tell me! [pollev.com/aherten538](https://pollev.com/aherten538)
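A minimal sketch of this task with pandas' own plotting interface, again on a reduced stand-in frame with the two preview rows. Passing the same `ax` to both calls is what puts the lines onto one common canvas; the index is then used as the x axis automatically.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Reduced stand-in for df (two rows from the CSV preview).
df = pd.DataFrame({
    "Virtual Processes": [8, 16],
    "Presim. Time / s": [17.26, 7.87],
    "Sim. Time / s": [311.52, 142.97],
})

# Make the virtual processes the index.
df = df.set_index("Virtual Processes")

# One shared Axes object for both columns.
fig, ax = plt.subplots()
df["Presim. Time / s"].plot(ax=ax, color="red", linestyle="--")
df["Sim. Time / s"].plot(ax=ax, color="blue")
ax.set_ylabel("Time / s")
ax.legend()  # labels default to the column names
```

Dropping the `ax=ax` argument and running each `.plot()` call in its own notebook cell gives the two individual plots.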
%% Cell type:markdown id: tags:
## Task 6
<a name="task6"></a>
* To your NEST data frame `df`, add a column with the unaccounted time (`Unaccounted Time / s`): the program runtime minus the sum of average neuron build time, minimal edge build time, minimal initialization time, presimulation time, and simulation time.
(*I know this is technically not super correct, but it will do for our example.*)
* Plot a stacked bar plot of all these columns (except for program runtime) over the virtual processes
* Remember: [pollev.com/aherten538](https://pollev.com/aherten538)
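One way to approach this, sketched on the two rows of the CSV preview: sum the accounted columns row by row, subtract the sum from the program runtime, and let pandas draw the stacked bars. The helper list `accounted` is a name introduced here for illustration.

```python
import pandas as pd

# Reduced stand-in for df (two rows from the CSV preview).
df = pd.DataFrame({
    "Virtual Processes": [8, 16],
    "Runtime Program / s": [420.42, 200.84],
    "Avg. Neuron Build Time / s": [0.29, 0.15],
    "Min. Edge Build Time / s": [88.12, 46.03],
    "Min. Init. Time / s": [1.14, 0.70],
    "Presim. Time / s": [17.26, 7.87],
    "Sim. Time / s": [311.52, 142.97],
})

# Columns that are subtracted from the program runtime.
accounted = [
    "Avg. Neuron Build Time / s",
    "Min. Edge Build Time / s",
    "Min. Init. Time / s",
    "Presim. Time / s",
    "Sim. Time / s",
]
df["Unaccounted Time / s"] = df["Runtime Program / s"] - df[accounted].sum(axis=1)

# Stacked bars of all parts (without the program runtime) over the virtual processes.
ax = df.set_index("Virtual Processes")[accounted + ["Unaccounted Time / s"]].plot(
    kind="bar", stacked=True
)
ax.set_ylabel("Time / s")
```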
%% Cell type:markdown id: tags:
## Task 7
<a name="task7"></a>
* Create a pivot table based on the NEST `df` data frame
* Let the `x` axis show the number of nodes; display the values of the simulation time `"Sim. Time / s"` for the tasks per node and threads per task configurations
* Please plot a bar plot
* Done? [pollev.com/aherten538](https://pollev.com/aherten538)
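A possible sketch with `DataFrame.pivot_table`, using only the two rows of the CSV preview. Both rows have `Nodes == 1`, so the resulting table has a single row here; with the full file there would be one row per node count. Note that `pivot_table` averages entries that share the same configuration (its default `aggfunc` is the mean).

```python
import pandas as pd

# Reduced stand-in for df (two rows from the CSV preview).
df = pd.DataFrame({
    "Nodes": [1, 1],
    "Tasks/Node": [2, 4],
    "Threads/Task": [4, 4],
    "Sim. Time / s": [311.52, 142.97],
})

# Rows: number of nodes; columns: (tasks per node, threads per task) pairs.
pivot = df.pivot_table(
    index="Nodes",
    columns=["Tasks/Node", "Threads/Task"],
    values="Sim. Time / s",
)
print(pivot)

# One group of bars per node count, one bar per configuration.
ax = pivot.plot(kind="bar")
ax.set_ylabel("Sim. Time / s")
```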
%% Cell type:markdown id: tags:
<a name="taskb"></a>
* Bonus task
- Use `Sim. Time / s` and `Presim. Time / s` as values to show
- Show a stack of those two values inside the pivot table
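The bonus task leaves some room for interpretation. One reading, sketched here under that assumption, moves the whole configuration into the row index and stacks the two time columns per configuration; the data are again the two rows of the CSV preview.

```python
import pandas as pd

# Reduced stand-in for df (two rows from the CSV preview).
df = pd.DataFrame({
    "Nodes": [1, 1],
    "Tasks/Node": [2, 4],
    "Threads/Task": [4, 4],
    "Presim. Time / s": [17.26, 7.87],
    "Sim. Time / s": [311.52, 142.97],
})

# One row per (nodes, tasks per node, threads per task) configuration,
# one column per value.
pivot = df.pivot_table(
    index=["Nodes", "Tasks/Node", "Threads/Task"],
    values=["Presim. Time / s", "Sim. Time / s"],
)

# Stack the two values on top of each other in each bar.
ax = pivot.plot(kind="bar", stacked=True)
ax.set_ylabel("Time / s")
```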
%% Cell type:markdown id: tags:
<span class="feedback">Tell me what you think about this tutorial! <a href="mailto:a.herten@fz-juelich.de">a.herten@fz-juelich.de</a></span>