get numbers for status of the TOAR-II database
We decided to have a 2-weekly news item on toar-data.org to report about the status of the TOAR-II database.
The actual message is provided here:
https://gitlab.jsc.fz-juelich.de/hedgedoc/s63lZ3DTTjOrHxy5iD24LQ#
Numbers can be taken from the database_statistics REST API (updated every night):
https://toar-data.fz-juelich.de/api/v2/database_statistics/
The number of stations can be taken from the database (as explained for UBA in the following):
getting number of stations from UBA:
SELECT count(distinct(station_id)) FROM timeseries WHERE id IN (SELECT timeseries_id FROM timeseries_timeseries_roles WHERE role_id=1);
determine correspondent role_id for a network:
The role code for a ressource_provider is 5.
At the moment, we have the following ressource_providers in the database:
SELECT * FROM timeseries_roles WHERE role=5;
id | role | status | contact_id
----+------+--------+------------
1 | 5 | 0 | 39
2 | 5 | 0 | 40
5 | 5 | 0 | 5
26 | 5 | 0 | 56
27 | 5 | 0 | 57
814 | 5 | 0 | 522
Information for role with id=1:
SELECT * FROM contacts WHERE id=39;
id | person_id | organisation_id
----+-----------+-----------------
39 | 0 | 35
SELECT * FROM organisations WHERE id=35;
id | name | longname | kind | city | postcode | street_address | homepage | country | contact_url
----+------+-----------------+------+------+----------+----------------+----------+---------+-------------------------
35 | UBA | Umweltbundesamt | 1 | | | | | 84 | mailto:immission@uba.de
This means, that we can get the ids of all timeseries from UBA with the following query:
SELECT timeseries_id FROM timeseries_timeseries_roles WHERE role_id=1;
The above query can be used as a subquery to count distinct UBA stations.
assignment of roles to networks
network | id |
---|---|
German UBA | 1 |
Open AQ | 5 |
AirBase (EEA) | 26 |
NAPS (EC) | 27 |
AQS (EPA) | 29 |
CAPMoN (ECCC) | 814 |
Determine yearly coverage (example Open AQ -- attention: empty time series were ingested!):
SELECT min(data_start_date),max(data_start_date) FROM timeseries WHERE id IN (SELECT timeseries_id FROM timeseries_timeseries_roles WHERE role_id=5) AND data_start_date>'1900-01-01';
Determine species (example Open AQ -- attention: empty time series were ingested!):
SELECT name FROM variables WHERE id IN (SELECT distinct(variable_id) FROM timeseries WHERE id IN (SELECT timeseries_id FROM timeseries_timeseries_roles WHERE role_id=5) AND data_start_date>'1900-01-01');