Commit a38ece8c authored by Carsten Hinz

notebook metadata changes...
%% Cell type:markdown id: tags:
This cell only performs technical setup by importing the required packages.
%% Cell type:code id: tags:
``` python
import logging
from datetime import datetime as dt
from pathlib import Path
from toargridding.toar_rest_client import AnalysisServiceDownload, Connection
from toargridding.grids import RegularGrid
from toargridding.gridding import get_gridded_toar_data
from toargridding.metadata import TimeSample
import xarray as xr
```
%% Cell type:markdown id: tags:
In the next step we set up the logging, i.e. the level of information that is displayed as output.
We start with a default setup and restrict the output to informational and more critical messages.
%% Cell type:code id: tags:
``` python
from toargridding.defaultLogging import toargridding_defaultLogging
#setup of logging
logger = toargridding_defaultLogging()
logger.addShellLogger(logging.INFO)
logger.logExceptions()
```
%% Cell type:markdown id: tags:
### Selection of data and grid:
Here, we come to an interesting point: the selection of the data and the definition of the grid.
As variable we select ozone by providing its name according to the CF convention.
As time sampling we select the interval of one week with a daily sampling.
We want to calculate the mean for each sampling point, i.e. this will produce the daily mean for each time series.
Last but not least, we create a regular grid by providing the desired resolutions.
This will result in a warning, as the latitudinal resolution does not yield an integer number of bins (180/1.9 ≈ 94.74). The grid therefore slightly increases the resolution so that the grid points have a constant spacing.
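The adjustment can be sketched with a few lines of arithmetic. The rounding behaviour shown here is an assumption about `RegularGrid`'s internals, consistent with the warning described above:

```python
# Sketch of the resolution adjustment (assumption about RegularGrid's internals):
# round the number of latitude bins to an integer and recompute the resolution,
# which here becomes slightly finer than requested.
lat_resolution = 1.9
n_bins = round(180 / lat_resolution)   # 180 / 1.9 = 94.74 -> 95 bins
adjusted = 180 / n_bins                # about 1.895 degrees instead of 1.9
print(n_bins, adjusted)
```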
%% Cell type:code id: tags:
``` python
variable = ["mole_fraction_of_ozone_in_air"]
time_sampling = TimeSample( start=dt(2000,1,1), end=dt(2000,1,8), sampling="daily")
statistics = [ "mean" ]
grid = RegularGrid( lat_resolution=1.9, lon_resolution=2.5 )
```
%% Cell type:markdown id: tags:
### Setting up the analysis
We need to prepare our connection to the analysis service of the TOAR database, which will provide us with temporally aggregated data.
Besides the URL of the service, we also need to set up two directories on our computer:
- one to save the data provided by the analysis service (called cache)
- a second to store our gridded dataset (called results)

These will be created as examples/cache and examples/results.
%% Cell type:code id: tags:
``` python
stats_endpoint = "https://toar-data.fz-juelich.de/api/v2/analysis/statistics/"
cache_basepath = Path("cache")
result_basepath = Path("results")
cache_basepath.mkdir(exist_ok=True)
result_basepath.mkdir(exist_ok=True)
analysis_service = AnalysisServiceDownload(stats_endpoint=stats_endpoint, cache_dir=cache_basepath, sample_dir=result_basepath, use_downloaded=True)
```
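%% Cell type:markdown id: tags:
The caching behaviour enabled by `use_downloaded=True` can be sketched as follows. This is a hypothetical helper for illustration, not the actual `AnalysisServiceDownload` code; `cached_or_download` and `fake_download` are made-up names:

```python
from pathlib import Path
import tempfile

def cached_or_download(cache_dir: Path, filename: str, download):
    """Hypothetical sketch of the use_downloaded=True behaviour: reuse a file
    from the cache if it exists, otherwise fetch it and store it there."""
    target = cache_dir / filename
    if not target.exists():
        target.write_bytes(download())
    return target

# tiny demonstration with a fake download function
cache = Path(tempfile.mkdtemp())
calls = []

def fake_download():
    calls.append(1)
    return b"statistics payload"

first = cached_or_download(cache, "ozone_mean.json", fake_download)
second = cached_or_download(cache, "ozone_mean.json", fake_download)
```

On the second call the file already exists, so no download is triggered.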
%% Cell type:markdown id: tags:
### Execution of toargridding
Now we want to request the data from the server and create the binned dataset.
Therefore, we call the function `get_gridded_toar_data` with everything we have prepared until now.
The request will be submitted to the analysis service, which will process it. On our side, we check every 5 minutes whether the processing is finished and stop these checks after 30 minutes.
Restarting this cell allows us to continue the look-up once the data are available.
The obtained data are stored in the cache directory. Before submitting a request, toargridding checks its cache whether the data have already been downloaded.
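The polling described above can be sketched as a small helper. This is an illustrative assumption, not toargridding's actual implementation; `poll_until_ready` and its parameters are hypothetical:

```python
import time

def poll_until_ready(check, interval_s=300, timeout_s=1800):
    """Hypothetical sketch of the polling described above: ask every 5 minutes
    (300 s) whether the request has finished, give up after 30 minutes (1800 s)."""
    waited = 0.0
    while waited < timeout_s:
        if check():
            return True
        time.sleep(interval_s)
        waited += interval_s
    return False
```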
%% Cell type:code id: tags:
``` python
print("\nProcessing request:")
print("--------------------")
datasets, metadatas = get_gridded_toar_data(
    analysis_service=analysis_service,
    grid=grid,
    time=time_sampling,
    variables=variable,
    stats=statistics,
    contributors_path=result_basepath
)
```
%% Cell type:markdown id: tags:
### Saving of results
Last but not least, we want to save our dataset as a netCDF file.
This part is done offline. Please note that the file name for the gridded data also contains the date of creation.
%% Cell type:code id: tags:
``` python
for dataset, metadata in zip(datasets, metadatas):
    dataset.to_netcdf(result_basepath / f"{metadata.get_id()}_{grid.get_id()}.nc")
    print(metadata.get_id())
```
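%% Cell type:markdown id: tags:
To illustrate the file name pattern used above, here is a sketch with made-up ids; the real values come from `metadata.get_id()` and `grid.get_id()`:

```python
from datetime import date
from pathlib import Path

result_basepath = Path("results")
# Hypothetical ids for illustration only; the real values are produced by
# metadata.get_id() (which includes the creation date) and grid.get_id().
metadata_id = f"mean_mole_fraction_of_ozone_in_air_{date.today():%Y%m%d}"
grid_id = "regular_1.9x2.5"
out_file = result_basepath / f"{metadata_id}_{grid_id}.nc"
print(out_file.name)
```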