Commit b0edc2b4 authored by Carsten Hinz

removed a number of prints

parent 210cba66
%% Cell type:markdown id: tags:
### Get Dataset from request
This cell imports all required packages and sets up the logging as well as the information required for requests to the TOAR-DB.
We will receive a warning if the latitudinal resolution does not result in an integer number of cells (for example 1.9°); in that case it is slightly adapted.
%% Cell type:code id: tags:
``` python
from datetime import datetime as dt
from pathlib import Path
import pandas as pd
from toargridding.grids import RegularGrid
from toargridding.toar_rest_client import (
AnalysisServiceDownload,
STATION_LAT,
STATION_LON,
)
from toargridding.metadata import Metadata, TimeSample, AnalysisRequestResult, Coordinates
from toargridding.variables import Coordinate
from toargridding.contributors import contributions_manager_by_id
```
%% Cell type:markdown id: tags:
#### Setup of logging
In the next step we set up the logging, i.e. the level of detail that is displayed as output.
We start with a default setup and restrict the output to information and more critical messages such as warnings and errors.
%% Cell type:code id: tags:
``` python
from toargridding.defaultLogging import toargridding_defaultLogging
import logging
logger = toargridding_defaultLogging()
# show INFO and more critical messages on the shell
logger.addShellLogger(logging.INFO)
# route uncaught exceptions to the logger
logger.logExceptions()
```
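%% Cell type:markdown id: tags:
For reference, the following sketch shows roughly what such a setup looks like with only the standard `logging` module. This is an illustrative stand-in, not the toargridding implementation:
%% Cell type:code id: tags:
``` python
import logging
import sys

# show INFO and more critical messages, comparable to addShellLogger(logging.INFO)
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")

def _log_uncaught(exc_type, exc_value, exc_traceback):
    # comparable to logExceptions(): route uncaught exceptions to the log
    logging.critical("Uncaught exception", exc_info=(exc_type, exc_value, exc_traceback))

sys.excepthook = _log_uncaught
```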
%% Cell type:markdown id: tags:
#### Selection of data and grid
We select a temporal aggregation: three days of data with a daily sampling.
With this sampling we define the metadata for our request. As variable we select ozone and as statistical aggregation a mean.
We also select an offline post-processing of the data: the means calculated for all time series at a station are averaged to create a single value per station.
Last but not least, we select the data quality flag 'AllOK', requesting that all data points pass the quality tests of the provider as well as the automatic tests of the TOAR data infrastructure.
Other options exclude the tests performed by TOAR.
The last step is the definition of the grid. We select a resolution of 2° in latitude and 2.5° in longitude.
%% Cell type:code id: tags:
``` python
# three days of daily aggregated data: 2016-03-01 to 2016-03-03
time = TimeSample(dt(2016,3,1), dt(2016,3,3), "daily")
# ozone mean per time series; the offline post-processing averages all time series of a station
metadata = Metadata.construct("mole_fraction_of_ozone_in_air", time, "mean", "meanTSbyStation", data_quality_flag="AllOK")
# regular grid with 2.0° resolution in latitude and 2.5° in longitude
my_grid = RegularGrid(2.0, 2.5)
```
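%% Cell type:markdown id: tags:
As an illustration of the resolution warning mentioned at the top: a latitudinal resolution of 1.9° does not divide the 180° latitude range into an integer number of cells, so the grid slightly adapts it and warns. The sketch below assumes that `RegularGrid` takes the latitude and longitude resolutions as positional arguments, as in the cell above:
%% Cell type:code id: tags:
``` python
# illustrative only: 180° / 1.9° ≈ 94.7 is not an integer number of cells,
# so creating this grid is expected to emit the warning described above
grid_with_warning = RegularGrid(1.9, 2.5)
```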
%% Cell type:markdown id: tags:
#### Setting up the analysis
We need to prepare our connection to the analysis service of the TOAR database, which will provide us with the temporally and statistically aggregated data.
Besides the URL of the service, we also need to set up two directories on our computer:
- one to save the data provided by the analysis service (called cache)
- a second to store our gridded dataset (called results)
These will be created as examples/cache and examples/results.
%% Cell type:code id: tags:
``` python
endpoint = "https://toar-data-testing.fz-juelich.de/api/v2/analysis/statistics/"
# starts in directory [path/to/toargridding]/examples
toargridding_base_path = Path(".")
cache_dir = toargridding_base_path / "cache"
data_download_dir = toargridding_base_path / "results"
cache_dir.mkdir(exist_ok=True)
data_download_dir.mkdir(exist_ok=True)
analysis_service = AnalysisServiceDownload(endpoint, cache_dir, data_download_dir, use_downloaded=True)
```
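%% Cell type:markdown id: tags:
With `use_downloaded=True`, files that are already in the cache are reused instead of being requested again. As a small optional check, we can list what an earlier run may have left there:
%% Cell type:code id: tags:
``` python
# purely illustrative: list cached files from earlier runs that would be reused
for cached_file in sorted(cache_dir.iterdir()):
    print(cached_file.name)
```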
%% Cell type:markdown id: tags:
#### Download of data and writing to disk
In the next step we download the data and store them on disk.
To obtain the contributors for this dataset, we need to create a dedicated file. This file can be uploaded to the TOAR database to obtain a preformatted list of contributors. The required recipe can be found in the global metadata of the netCDF file.
The request to the database can take several minutes, depending on the overall usage of the services. The `get_data` function checks every 5 minutes whether the data are ready for download. After 30 minutes this cell stops its execution; simply restart the cell to continue checking for the results.
%% Cell type:code id: tags:
``` python
# this cell can run for more than 30 minutes
data = analysis_service.get_data(metadata)
# create the contributors file and reference it in the metadata
contrib = contributions_manager_by_id(metadata.get_id(), data_download_dir)
contrib.extract_contributors_from_data_frame(data.stations_data)
metadata.contributors_metadata_field = contrib.setup_contributors_endpoint_for_metadata()
# grid the station data and obtain an xarray dataset
ds = my_grid.as_xarray(data)
```
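%% Cell type:markdown id: tags:
If the gridded dataset should be kept, it can be written to a netCDF file with xarray; a minimal sketch, where the file name is our own choice:
%% Cell type:code id: tags:
``` python
# write the gridded dataset to the results directory; the file name is arbitrary
out_file = data_download_dir / f"{metadata.get_id()}.nc"
ds.to_netcdf(out_file)
```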
%% Cell type:markdown id: tags:
### Visual inspection
%% Cell type:markdown id: tags:
We are working with raw data here and also want to visualize the station positions. Therefore, we distinguish stations that have valid data from those without valid data.
%% Cell type:code id: tags:
``` python
# calculation of the coordinates for plotting,
# especially the separation of stations with and without results
clean_coords = data.stations_coords
all_na = data.stations_data.isna().all(axis=1)
clean_coords = all_na.to_frame().join(clean_coords)[["latitude", "longitude"]]
all_na_coords = clean_coords[all_na]
not_na_coords = clean_coords[~all_na]
```
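%% Cell type:markdown id: tags:
A quick sanity check on this separation:
%% Cell type:code id: tags:
``` python
# count the stations with and without valid data
print(f"stations with data: {len(not_na_coords)}, without data: {len(all_na_coords)}")
```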
%% Cell type:code id: tags:
``` python
print(ds)
```
%% Cell type:markdown id: tags:
In the next step we prepare a function for plotting the gridded data on a world map. The *discrete* flag influences the creation of the color bar. The *plot_stations* flag allows including the station positions in the map.
%% Cell type:code id: tags:
``` python
import cartopy.crs as ccrs
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np

# definition of the plotting function
def plot_cells(data, stations, na_stations, discrete=True, plot_stations=False):
    fig = plt.figure(figsize=(9, 18))
    ax = plt.axes(projection=ccrs.PlateCarree())
    ax.coastlines()
    gl = ax.gridlines(draw_labels=True)
    gl.top_labels = False
    gl.left_labels = False
    gl.xlocator = mticker.FixedLocator(data.longitude.values)
    gl.ylocator = mticker.FixedLocator(data.latitude.values)
    cmap = mpl.cm.viridis
    if discrete:
        # print(np.unique(data.values))
        # discrete color bar with one bin per value
        bounds = np.arange(8)
        norm = mpl.colors.BoundaryNorm(bounds, cmap.N, extend="both")
        ticks = np.arange(bounds.size + 1)[:-1] + 0.5
        ticklabels = bounds
        im = plt.pcolormesh(
            data.longitude,
            data.latitude,
            data,
            transform=ccrs.PlateCarree(),
            cmap=cmap,
            shading="nearest",
            norm=norm,
        )
        cb = fig.colorbar(im, ax=ax, shrink=0.2, aspect=25)
        cb.set_ticks(ticks)
        cb.set_ticklabels(ticklabels)
    else:
        # print(data)
        # continuous color bar
        im = plt.pcolormesh(
            data.longitude,
            data.latitude,
            data,
            transform=ccrs.PlateCarree(),
            cmap=cmap,
            shading="nearest",
        )
        cb = fig.colorbar(im, ax=ax, shrink=0.2, aspect=25)
    if plot_stations:
        plt.scatter(na_stations["longitude"], na_stations["latitude"], s=1, c="k")
        plt.scatter(stations["longitude"], stations["latitude"], s=1, c="r")
    plt.tight_layout()
    unit = data.attrs["units"]
    plt.title(f"global ozone [{unit}] at {data.time.values} {data.time.units}")
```
%% Cell type:markdown id: tags:
Now we do the actual plotting. We select a single time from the dataset and obtain two maps: first, the mean ozone concentration per grid cell, and second, the number of stations contributing to each grid cell.
%% Cell type:code id: tags:
``` python
# example visualization for a single time point
# print(not_na_coords)
timestep = 2
time = ds.time[timestep]
selected_data = ds.sel(time=time)
# print(selected_data)

# mean ozone concentration per grid cell
plot_cells(selected_data["mean"], not_na_coords, all_na_coords, discrete=False, plot_stations=False)
plt.show()

# number of stations contributing to each grid cell
plot_cells(selected_data["n"], not_na_coords, all_na_coords, discrete=True)
plt.show()

# number of contributing time series per time point
n_observations = ds["n"].sum(["latitude", "longitude"])
plt.plot(ds.time, n_observations)
plt.title("Number of Contributions per Time Point")
plt.xlabel("Time")
plt.ylabel("Number of contributions")
print(np.unique(ds["n"]))
```
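%% Cell type:markdown id: tags:
The printed array of unique station counts can also be turned into a small overview of how the stations are distributed over the grid cells. A short sketch, assuming that `ds["n"]` holds the number of contributing stations per cell as used above:
%% Cell type:code id: tags:
``` python
# distribution of station counts: how many grid cells have n contributing stations?
values, counts = np.unique(ds["n"].values, return_counts=True)
for value, count in zip(values, counts):
    print(f"{value} station(s) in {count} cell(s)")
```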
%% Cell type:markdown id: tags:
Last but not least, we print the downloaded data and their metadata. A look into the metadata in particular can be interesting.
%% Cell type:code id: tags:
``` python
print(data)
```