esde / machine-learning / MLAir

Commit e88fd61b
authored 5 years ago by lukas leufen

add join settings to switch between daily and hourly data

parent 4122d1df
2 merge requests: !37 include new development, !36 include using of hourly data
Pipeline #29067 passed 5 years ago (stages: test, pages, deploy)
Showing 2 changed files with 29 additions and 9 deletions
src/join.py: 18 additions, 9 deletions
src/join_settings.py: 11 additions, 0 deletions

src/join.py (+18 −9)

@@ -8,8 +8,9 @@ import pandas as pd
 import datetime as dt
 from typing import Iterator, Union, List, Dict
 from src import helpers
+from src.join_settings import join_settings
 
-join_url_base = 'https://join.fz-juelich.de/services/rest/surfacedata/'
+# join_url_base = 'https://join.fz-juelich.de/services/rest/surfacedata/'
 str_or_none = Union[str, None]
 
 
@@ -21,7 +22,7 @@ class EmptyQueryResult(Exception):
 
 
 def download_join(station_name: Union[str, List[str]], stat_var: dict, station_type: str = None,
-                  network_name: str = None) -> [pd.DataFrame, pd.DataFrame]:
+                  network_name: str = None, sampling: str = "daily") -> [pd.DataFrame, pd.DataFrame]:
     """
     read data from JOIN/TOAR
 
@@ -29,6 +30,7 @@ def download_join(station_name: Union[str, List[str]], stat_var: dict, station_t
     :param stat_var: key as variable like 'O3', values as statistics on keys like 'mean'
     :param station_type: set the station type like "traffic" or "background", can be none
     :param network_name: set the measurement network like "UBA" or "AIRBASE", can be none
+    :param sampling: sampling rate of the downloaded data, either set to daily or hourly (default daily)
     :returns:
         - df - data frame with all variables and statistics
         - meta - data frame with all meta information
@@ -36,8 +38,11 @@ def download_join(station_name: Union[str, List[str]], stat_var: dict, station_t
     # make sure station_name parameter is a list
     station_name = helpers.to_list(station_name)
 
+    # get data connection settings
+    join_url_base, headers = join_settings(sampling)
+
     # load series information
-    vars_dict = load_series_information(station_name, station_type, network_name)
+    vars_dict = load_series_information(station_name, station_type, network_name, join_url_base, headers)
 
     # download all variables with given statistic
     data = None
@@ -49,10 +54,10 @@ def download_join(station_name: Union[str, List[str]], stat_var: dict, station_t
 
             # create data link
             opts = {'base': join_url_base, 'service': 'stats', 'id': vars_dict[var], 'statistics': stat_var[var],
-                    'sampling': 'daily', 'capture': 0, 'min_data_length': 1460}
+                    'sampling': sampling, 'capture': 0, 'min_data_length': 1460}
 
             # load data
-            data = get_data(opts)
+            data = get_data(opts, headers)
 
             # correct namespace of statistics
             stat = _correct_stat_name(stat_var[var])
@@ -70,30 +75,34 @@ def download_join(station_name: Union[str, List[str]], stat_var: dict, station_t
         raise EmptyQueryResult("No data found in JOIN.")
 
 
-def get_data(opts: Dict) -> Union[Dict, List]:
+def get_data(opts: Dict, headers: Dict) -> Union[Dict, List]:
     """
     Download join data using requests framework. Data is returned as json like structure. Depending on the response
     structure, this can lead to a list or dictionary.
     :param opts: options to create the request url
+    :param headers: additional headers information like authorization, can be empty
     :return: requested data (either as list or dictionary)
     """
     url = create_url(**opts)
-    response = requests.get(url)
+    response = requests.get(url, headers=headers)
     return response.json()
 
 
-def load_series_information(station_name: List[str], station_type: str_or_none, network_name: str_or_none) -> Dict:
+def load_series_information(station_name: List[str], station_type: str_or_none, network_name: str_or_none,
+                            join_url_base: str, headers: Dict) -> Dict:
     """
     List all series ids that are available for given station id and network name.
     :param station_name: Station name e.g. DEBW107
     :param station_type: station type like "traffic" or "background"
     :param network_name: measurement network of the station like "UBA" or "AIRBASE"
+    :param join_url_base: base url name to download data from
+    :param headers: additional headers information like authorization, can be empty
     :return: all available series for requested station stored in an dictionary with parameter name (variable) as key
         and the series id as value.
     """
     opts = {"base": join_url_base, "service": "series", "station_id": station_name[0], "station_type": station_type,
             "network_name": network_name}
-    station_vars = get_data(opts)
+    station_vars = get_data(opts, headers)
     vars_dict = {item[3].lower(): item[0] for item in station_vars}
     return vars_dict
 
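
The change to src/join.py threads two values through every request: the `sampling` string goes into the query options, and the `headers` dict goes to `requests.get`. The sketch below illustrates that flow offline, without sending anything. It is not part of the commit: `build_stats_request` is a hypothetical helper, the series id `42` is made up, and the URL layout is an assumption, since the repository's `create_url` is not shown in this diff and may build the URL differently.

```python
from typing import Dict, Tuple
from urllib.parse import urlencode


def build_stats_request(join_url_base: str, headers: Dict[str, str], series_id: int,
                        statistics: str, sampling: str = "daily") -> Tuple[str, Dict[str, str]]:
    """Hypothetical helper: assemble the URL and headers for a stats query.

    Mirrors the option names used in download_join, but does not call the service.
    """
    opts = {"id": series_id, "statistics": statistics, "sampling": sampling,
            "capture": 0, "min_data_length": 1460}
    # Assumed layout <base><service>/?<query>; the real create_url may differ.
    url = f"{join_url_base}stats/?{urlencode(opts)}"
    # Headers are passed through untouched, as get_data does with requests.get.
    return url, dict(headers)


base = "https://join.fz-juelich.de/services/rest/surfacedata/"
daily_url, daily_headers = build_stats_request(base, {}, 42, "mean")
hourly_url, hourly_headers = build_stats_request(
    base, {"Authorization": "Token 12345"}, 42, "mean", sampling="hourly")
```

With real values, the pair would be handed to `requests.get(url, headers=headers)`, which is the call the commit changes in `get_data`.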

src/join_settings.py (new file, 0 → 100644, +11 −0)

+def join_settings(sampling="daily"):
+    if sampling == "daily":
+        TOAR_SERVICE_URL = 'https://join.fz-juelich.de/services/rest/surfacedata/'
+        headers = {}
+    elif sampling == "hourly":
+        TOAR_SERVICE_URL = 'https://join.fz-juelich.de/services/rest/surfacedata/'
+        headers = {"Authorization": "Token 12345"}
+    else:
+        raise NameError(f"Given sampling {sampling} is not supported, choose from either daily or hourly sampling.")
+    return TOAR_SERVICE_URL, headers
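
The new `join_settings` returns the same base URL for both sampling rates and differs only in the headers, with a placeholder token written into the source for hourly data. One possible variation, sketched here and not what the commit does, reads the token from the environment instead. The function name `join_settings_from_env`, the variable name `JOIN_TOKEN`, and the `RuntimeError` for a missing token are all assumptions made for illustration. The `NameError` for an unsupported sampling value is kept as in the commit.

```python
import os
from typing import Dict, Tuple

TOAR_SERVICE_URL = "https://join.fz-juelich.de/services/rest/surfacedata/"


def join_settings_from_env(sampling: str = "daily",
                           token_var: str = "JOIN_TOKEN") -> Tuple[str, Dict[str, str]]:
    """Hypothetical variant of join_settings that takes the token from the environment."""
    if sampling == "daily":
        # Daily data: no extra headers, as in the commit.
        return TOAR_SERVICE_URL, {}
    if sampling == "hourly":
        # Hourly data: build the Authorization header from an environment variable
        # (assumed name) rather than a literal in the source file.
        token = os.environ.get(token_var)
        if not token:
            raise RuntimeError(f"Set {token_var} to request hourly data.")
        return TOAR_SERVICE_URL, {"Authorization": f"Token {token}"}
    raise NameError(f"Given sampling {sampling} is not supported, choose from either daily or hourly sampling.")
```

The return value has the same shape as in the commit, so `join_url_base, headers = join_settings_from_env(sampling)` could stand in for the call in `download_join`.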