Commit a85fa92b authored by Jakob Fritz

Changed path of cache to be available across institutes (not only institute-sections)

parent 459aff35
Pipeline #270593 passed
%% Cell type:code id: tags:
``` python
from os.path import dirname, join
import os
import sys
import plotly.io as pio
from datetime import datetime
sys.path.insert(0, os.path.abspath("../.."))
from rsemonitor import filereader
from rsemonitor import dbhandler
from rsemonitor import helper_func
from rsemonitor.plot_funcs import (
plot_number_pubs,
plot_correlation_kw_kw,
plot_correlation_inst_kw,
)
from rsemonitor.website_funcs import printmd
%load_ext ipycache
```
%% Cell type:code id: tags:
``` python
# Define config options for plotly
px_config = dict(
{
"scrollZoom": True,
"modeBarButtonsToRemove": ["lasso2d"],
"toImageButtonOptions": {
"format": "png",
"height": 450,
"width": 650,
"scale": 4,
},
}
)  # "scrollZoom" enables scroll/finger zoom
# Set renderer for plotly to use CDN for .js-files
pio.renderers.default = "plotly_mimetype+notebook_connected"
```
%% Cell type:code id: tags:
``` python
main_dir = dirname(dirname(os.getcwd()))
settings = filereader.yamlreader(join(main_dir, "rsemonitor", "settings.yml"))
variables = filereader.yamlreader(join(main_dir, "rsemonitor", "variables.yml"))
institutes = filereader.yamlreader(join(main_dir, "rsemonitor", "institutes.yml"))
institute = "TEMPLATE_INSTITUTE"
institute_section = "TEMPLATE_INSTITUTE_SECTION"
min_pubyear = 2012
inst_iids = helper_func.get_cid(institutes[institute])
inst_sec_iids = helper_func.get_cid(institutes[institute][institute_section])
# Load DB
con, cur = dbhandler.get_db(filename=join(main_dir, settings["db_file"]))
```
%% Cell type:code id: tags:
``` python
now = datetime.now().year
```
%% Cell type:code id: tags:
``` python
printmd(f"# Monitor for {institute_section}")
```
%% Cell type:markdown id: tags:
## Number of publications over time
The numbers shown here are not cumulative over the years.
The grey bars show the total number of publications for each year. The lines indicate the number of publications containing a specific keyword. As a single publication can contain multiple keywords, the lines are independent and cannot be summed up.
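The counting described above can be illustrated with a minimal sketch. It is not the implementation of `plot_number_pubs`; the records and keyword names below are hypothetical and only show why the keyword lines cannot be summed to the yearly total.

```python
from collections import Counter

# Hypothetical records: (publication year, set of keywords of that publication)
pubs = [
    (2020, {"software", "hpc"}),
    (2020, {"software"}),
    (2021, set()),
    (2021, {"hpc"}),
]

# Grey bars: total number of publications per year (not cumulative)
total_per_year = Counter(year for year, _ in pubs)

# Lines: number of publications per year that contain a given keyword
kw_per_year = Counter((year, kw) for year, kws in pubs for kw in kws)

# A publication with two keywords is counted once per keyword,
# so the keyword counts of a year can exceed the yearly total.
kw_sum_2020 = sum(n for (year, _), n in kw_per_year.items() if year == 2020)
```

In this example the year 2020 has two publications, but the keyword counts add up to three, because one publication carries both keywords.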
%% Cell type:code id: tags:
``` python
%%cache --silent number_plots_fzj.pkl
%%cache --silent ../cache_number_plots_FZJ.pkl
printmd("### FZJ")
printmd("Publications from the whole JuSER publication database.")
plot_number_pubs(
con,
cur,
list_iid="all",
min_pubyear=min_pubyear,
px_config=px_config,
plot_title="FZJ",
)
```
%% Cell type:code id: tags:
``` python
%%cache --silent number_plots_{institute}.pkl
%%cache --silent ../cache_number_plots_{institute}.pkl
printmd(f"### {institute}")
printmd(
"Publications in which at least one institute-section participated. "
+ f"Publications where any institute-section from {institute} "
+ "collaborated with any other institute(-section) are also counted here."
)
plot_number_pubs(
con,
cur,
list_iid=inst_iids,
min_pubyear=min_pubyear,
px_config=px_config,
plot_title=institute,
)
```
%% Cell type:code id: tags:
``` python
printmd(f"### {institute_section}")
printmd(
f"Publications in which {institute_section} participated. "
+ f"Publications where {institute_section} collaborated with any "
+ "other institute-section are also counted here."
)
plot_number_pubs(
con,
cur,
list_iid=inst_sec_iids,
min_pubyear=min_pubyear,
px_config=px_config,
plot_title=institute_section,
)
```
%% Cell type:markdown id: tags:
## Correlation of keywords
This plot shows which keywords often occur together.
The plot is to be read as follows: The publications in JuSER are filtered by whether they contain the keyword on the left. Then it is counted how many of those publications also contain another keyword; these other keywords are shown on top. The fields are colored according to the percentage of publications that contain not only the left keyword, but also the keyword above.
The plot is asymmetric as the number of publications containing the keyword on the left is used as denominator.
Hovering on fields shows additional information (such as total numbers).
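The asymmetry can be seen in a small sketch. This is only an illustration under assumed data, not the code behind `plot_correlation_kw_kw`; the function name `cooccurrence_percent` and the keyword sets are hypothetical.

```python
def cooccurrence_percent(pubs, keywords):
    """Return {left: {top: percent}} with percent = 100 * |left and top| / |left|."""
    result = {}
    for left in keywords:
        # Filter: publications that contain the keyword on the left
        with_left = [kws for kws in pubs if left in kws]
        row = {}
        for top in keywords:
            if not with_left:
                row[top] = 0.0  # no publication contains the left keyword
            else:
                both = sum(1 for kws in with_left if top in kws)
                row[top] = 100.0 * both / len(with_left)
        result[left] = row
    return result


# Hypothetical keyword sets, one set per publication
pubs = [{"a", "b"}, {"a"}, {"a"}, {"a"}, {"b"}]
table = cooccurrence_percent(pubs, ["a", "b"])
```

Here four publications contain `a` and one of them also contains `b`, so `table["a"]["b"]` is 25.0. Only two publications contain `b`, so `table["b"]["a"]` is 50.0: the same overlap gives different percentages because the denominator differs.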
%% Cell type:code id: tags:
``` python
%%cache --silent kw_corr_fzj.pkl
%%cache --silent ../cache_kw_corr_FZJ.pkl
printmd("### FZJ")
printmd("Publications from the whole publication database JuSER.")
plot_correlation_kw_kw(
con,
cur,
list_iid="all",
min_pubyear=min_pubyear,
px_config=px_config,
plot_title="FZJ",
)
```
%% Cell type:code id: tags:
``` python
%%cache --silent kw_corr_{institute}.pkl
%%cache --silent ../cache_kw_corr_{institute}.pkl
printmd(f"### {institute}")
printmd(f"Limited to the publications in which {institute} participated.")
plot_correlation_kw_kw(
con,
cur,
list_iid=inst_iids,
min_pubyear=min_pubyear,
px_config=px_config,
plot_title=institute,
)
```
%% Cell type:code id: tags:
``` python
printmd(f"### {institute_section}")
printmd(f"Limited to the publications in which {institute_section} participated.")
plot_correlation_kw_kw(
con,
cur,
list_iid=inst_sec_iids,
min_pubyear=min_pubyear,
px_config=px_config,
plot_title=institute_section,
)
```
%% Cell type:markdown id: tags:
## Correlation of institutes and keywords
Like the plots shown above, these plots are created by filtering according to the information on the left and then counting how many of the filtered publications satisfy the criterion from the top.
For the plots on the left (keywords vs. institutes or institute-sections): Each row sums up to at least 100%. More than 100% is possible, as multiple institutes (or institute-sections) can be involved in a single publication. A publication from multiple institutes is counted towards each of the participating institutes, leading to a sum of more than 100%.
For the plots on the right (institute(-sections) vs. keywords): The rows do not sum up to any given value. This is because publications could contain all keywords, but also could contain none of the shown keywords.
Hovering on fields shows additional information (such as total numbers).
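A short sketch of why a row of the left plots can exceed 100%. It assumes that every publication is assigned to at least one institute and uses hypothetical data; `institute_share` is an illustrative helper, not part of `rsemonitor`.

```python
def institute_share(pubs, keyword, institutes):
    """Percentage of publications with `keyword` in which each institute was involved."""
    # Filter by the information on the left: the keyword
    selected = [insts for kws, insts in pubs if keyword in kws]
    if not selected:
        return {inst: 0.0 for inst in institutes}
    return {
        inst: 100.0 * sum(1 for insts in selected if inst in insts) / len(selected)
        for inst in institutes
    }


# Hypothetical records: (keywords, participating institutes)
pubs = [
    ({"software"}, {"INST-A", "INST-B"}),  # joint publication, counted for both
    ({"software"}, {"INST-A"}),
    ({"hpc"}, {"INST-B"}),
]
row = institute_share(pubs, "software", ["INST-A", "INST-B"])
row_sum = sum(row.values())
```

The joint publication counts towards both institutes, so the row for `software` is 100.0 for `INST-A` plus 50.0 for `INST-B`, which adds up to 150.0.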
%% Cell type:code id: tags:
``` python
%%cache --silent kw_inst_corr_fzj.pkl
%%cache --silent ../cache_kw_inst_corr_FZJ.pkl
printmd("### FZJ")
printmd(
"These plots contain all publications from the JuSER publication database. "
+ "The additional row/column 'Other' contains only publications where none "
+ "of the institutes mentioned before was involved."
)
plot_correlation_inst_kw(
con,
cur,
inst_dict=institutes,
min_pubyear=min_pubyear,
include_others=True,
px_config=px_config,
plot_title="FZJ",
)
```
%% Cell type:code id: tags:
``` python
%%cache --silent kw_inst_corr_{institute}.pkl
%%cache --silent ../cache_kw_inst_corr_{institute}.pkl
printmd(f"### {institute}")
printmd(
"These plots contain only publications from the JuSER publication database "
+ f"where any institute-section from {institute} was involved. "
+ "Therefore, the entry 'Other' is omitted."
)
plot_correlation_inst_kw(
con,
cur,
inst_dict=institutes[institute],
min_pubyear=min_pubyear,
include_others=False,
px_config=px_config,
plot_title=institute,
)
```