Commit c119114f authored by Carsten Hinz

restructured notebooks and added some fixes to the README

simplified example 03 to only show how additional arguments are passed
and removed parts already covered in example 02.
parent 5530b926
Visualization of the mean ozone concentration per grid cell on the 3rd January 20…
# About
The TOARgridding tool projects data from the TOAR-II database (https://toar-data.fz-juelich.de/) onto a grid.
The request to the database includes a statistical analysis of the requested value.
The mean and standard deviation of all stations within a cell are computed.
The user can select the
- variable,
- statistical aggregation,
- temporal extent,
- equidistant rectangular lat-lon grid of custom resolution,
- and (optional) filtering according to the station metadata.
This tool handles the request to the database over the REST API and the subsequent processing.
The results of the gridding are provided as [xarray datasets](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.html) for subsequent processing and visualization by the user.
While this project provides ready-to-use examples, it is intended as a library to be used in dedicated analysis scripts. Furthermore, the long-term goal is to provide the gridding as a service over a RESTful API.
This project is in beta and provides the intended basic functionality.
The documentation and this README are work in progress.
# Table of Contents
# Requirements
This project requires Python 3.10 or higher.
For more information, see pyproject.toml.
This package relies on [netCDF](https://www.unidata.ucar.edu/software/netcdf/) and [HDF5](https://www.hdfgroup.org/) for saving data. You might need to install those as dependencies on your operating system.
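For example, on a Debian- or Ubuntu-based system, the corresponding system libraries can typically be installed as follows (the package names are an assumption and may differ between distributions):
```bash
# Debian/Ubuntu example; package names may differ on other distributions
sudo apt-get install libnetcdf-dev libhdf5-dev
```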
The visualization on a map in one example relies on [cartopy](https://scitools.org.uk/cartopy/).
# Installation
Move to the folder you want to download this project to.
We now need to download the source code from the [repository](https://gitlab.jsc.fz-juelich.de/esde/toar-public/toargridding/), either as a ZIP file or via git:
## 1) Download with GIT
Clone the project from its git repository:
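For example, cloning via HTTPS (the `.git` URL is assumed from the repository link above):
```bash
git clone https://gitlab.jsc.fz-juelich.de/esde/toar-public/toargridding.git
cd toargridding
```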
After activating the virtual environment, the notebooks can be run by calling
```bash
jupyter notebook <path to notebook>
```
as pointed out previously.
## 00: Retrieving data and visualization
```bash
jupyter notebook examples/00_download_and_visualization.ipynb
```
The aim of this first example is the creation of a gridded dataset and the visualization of one point in time.
The visualization requires `cartopy`, which has dependencies that might not be installed by `pip`. If you experience any issues, do not hesitate to continue with the next examples.
If you are still curious about the result, we have uploaded the resulting map as the title image of this README.
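If the installation of `cartopy` via `pip` fails, one common workaround (assuming a conda environment is available) is to install it from conda-forge:
```bash
conda install -c conda-forge cartopy
```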
## 01: Retrieval of one week:
```bash
jupyter notebook examples/01_produce_data_one_week.ipynb
```
In this example we want to download ozone data covering one week. We calculate the daily mean for each station before combining the results into a grid with a latitudinal resolution of about 1.9° and a longitudinal resolution of 2.5°.
The results are saved as [netCDF files](https://docs.xarray.dev/en/stable/user-guide/io.html).
As the gridding is done offline, it will be executed for already downloaded files whenever the notebook is rerun. Please note that the file name of the gridded data also contains the date of creation. Therefore you might end up with several copies.
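As a minimal sketch of the subsequent processing (assuming `xarray` and a netCDF backend are installed; the file name below is a placeholder for whatever the notebook wrote to the results folder):
```python
import xarray as xr

# placeholder path: the actual file name contains the dataset id, the grid id and the date of creation
ds = xr.open_dataset("results/<your_gridded_file>.nc")
print(ds)  # shows the gridded variables (e.g. mean and standard deviation per cell) and the global metadata
```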
## 02: Retrieval of several years:
```bash
...
```
This notebook provides an example on how to download data, apply gridding and save the results.
The `AnalysisServiceDownload` caches already obtained data on the local machine.
This allows gridding onto different grids without repeating the request to the TOAR database, the statistical analysis and the subsequent download.
As an example we calculate *dma8epa_strict* on a daily basis for the years 2000 to 2001 for all timeseries in the TOAR database.
The first attempt for this example covered the full range of 19 years in a single request. It turned out that a year-by-year extraction is more reliable.
The subsequent requests also function as a progress report and allow working with the data while further requests are processed.
%% Cell type:markdown id: tags:
# Example with optional parameters
Toargridding has a number of required arguments for a dataset. Those include the time range, variable and statistical analysis. The TOAR-DB has a large number of metadata fields that can be used to further refine this request.
A Python dictionary can be provided to include these other fields. The analysis service provides an error message if a requested parameter does not exist (check for typos) or if the provided value is wrong.
In this example we want to obtain data from 2000 to 2018 (maybe change this if you want your results faster:-)).
The first block contains the imports and the setup of the logging.
%% Cell type:code id: tags:
``` python
from datetime import datetime as dt
from collections import namedtuple
from pathlib import Path
from toargridding.toar_rest_client import AnalysisServiceDownload, Connection
from toargridding.grids import RegularGrid
from toargridding.gridding import get_gridded_toar_data
from toargridding.metadata import TimeSample
```
%% Cell type:markdown id: tags:
We now want to include packages for logging. We want to see some output in the shell, we want to log exceptions, and we may want to have a logfile to review everything later:
%% Cell type:code id: tags:
``` python
import logging
from toargridding.defaultLogging import toargridding_defaultLogging
#setup of logging
logger = toargridding_defaultLogging()
logger.addShellLogger(logging.DEBUG)
logger.logExceptions()
logger.addRotatingLogFile(Path("log/produce_data_withOptional.log"))#we need to explicitly set a logfile
#logger.addSysLogger(logging.DEBUG)
```
%% Cell type:markdown id: tags:
## Preparation of requests
In the next block we prepare the request to the analysis service.
The dictionary `details4Query` adds additional requirements to the request. Here, the two fields *toar1_category* and *type_of_area* are used. Both stand for a classification of stations depending on the surrounding area. It is advised to use only one of them at a time.
`moreOptions` is implemented as a dict to add additional arguments to the query to the REST API.
For example the field *toar1_category* with its possible values Urban, RuralLowElevation, RuralHighElevation and Unclassified can be added
(see page 18 in https://toar-data.fz-juelich.de/sphinx/TOAR_UG_Vol03_Database/build/latex/toardatabase--userguide.pdf).
Or *type_of_area* with urban, suburban and rural on page 20 can be used.
There are many more metadata fields available in the user guide; feel free to look around.
%% Cell type:code id: tags:
``` python
#creation of request.
# helper to keep the configuration together
Config = namedtuple("Config", ["grid", "time", "variables", "stats","moreOptions"])
#uncomment, what you want to test:-)
details4Query ={
#"toar1_category" : "Urban" #uncomment if wished:-)
#"toar1_category" : "RuralLowElevation" #uncomment if wished:-)
#"toar1_category" : "RuralHighElevation" #uncomment if wished:-)
#"type_of_area" : "Urban" #also test Rural, Suburban,
"type_of_area" : "Rural" #also test Rural, Suburban,
#"type_of_area" : "Suburban" #also test Rural, Suburban,
}
#a regular grid with 1.9°x2.5° resolution. A warning will be issued as 1.9° does not result in a natural number of grid cells.
grid = RegularGrid( lat_resolution=1.9, lon_resolution=2.5, )
configs = dict()
# we split the request into one request per year.
for year in range(0,19):
    valid_data = Config(
        grid,
        TimeSample( start=dt(2000+year,1,1), end=dt(2000+year,12,31), sampling="daily"),#possibly adapt the range:-)
        #TimeSample( start=dt(2000+year,1,1), end=dt(2000+year,12,31), sampling="monthly"),#possibly adapt the range:-)
        ["mole_fraction_of_ozone_in_air"],#variable name
        #[ "mean", "dma8epax"],# will start one request after the other...
        [ "mean" ],
        details4Query
    )
    configs[f"test_ta{year}"] = valid_data
```
%% Cell type:markdown id: tags:
We now need to set up our connection to the analysis service of the TOAR database. We also set up directories to store the data.
%% Cell type:code id: tags:
``` python
stats_endpoint = "https://toar-data.fz-juelich.de/api/v2/analysis/statistics/"
cache_basepath = Path("cache")
result_basepath = Path("results")
cache_basepath.mkdir(exist_ok=True)
result_basepath.mkdir(exist_ok=True)
analysis_service = AnalysisServiceDownload(stats_endpoint=stats_endpoint, cache_dir=cache_basepath, sample_dir=result_basepath, use_downloaded=True)
```
%% Cell type:markdown id: tags:
## Download and gridding:
Now we come to the last step: we want to download the data, process them and store them to disk.
CAVE: this cell runs for about 45 minutes per requested year. Therefore we increase the waiting duration to 1 h per request.
The processing is done on the server of the TOAR database.
A restart of the cell continues the request to the REST API. Data are cached on the local computer to prevent repeated downloads.
The download can also take a few minutes.
%% Cell type:code id: tags:
``` python
# maybe adapt the interval for requesting the results and the total duration, before the client pauses the requests.
# as the requests take about 45min, it is more suitable to wait 60min before timing out the requests than the original 30min.
analysis_service.connection.set_request_times(interval_min=5, max_wait_minutes=60)

for person, config in configs.items():
    print(f"\nProcessing {person}:")
    print(f"--------------------")
    datasets, metadatas = get_gridded_toar_data(
        analysis_service=analysis_service,
        grid=config.grid,
        time=config.time,
        variables=config.variables,
        stats=config.stats,
        contributors_path=result_basepath,
        **config.moreOptions
    )
    for dataset, metadata in zip(datasets, metadatas):
        dataset.to_netcdf(result_basepath / f"{metadata.get_id()}_{config.grid.get_id()}.nc")
        print(metadata.get_id())
```
Source diff could not be displayed: it is too large.
%% Cell type:markdown id: tags:
# Example to obtain larger requests:
Toargridding has a number of required arguments for a dataset. Those include the time range, variable and statistical analysis. The TOAR-DB has a large number of metadata fields that can be used to further refine this request.
In this example we want to obtain data from 2000 to 2001.
%% Cell type:markdown id: tags:
#### inclusion of packages
%% Cell type:code id: tags:
``` python
import logging
from datetime import datetime as dt
from collections import namedtuple
from pathlib import Path
from toargridding.toar_rest_client import AnalysisServiceDownload, Connection
from toargridding.grids import RegularGrid
from toargridding.gridding import get_gridded_toar_data
from toargridding.metadata import TimeSample
```
%% Cell type:markdown id: tags:
#### Setup of logging
In the next step we set up the logging, i.e. the level of information that is displayed as output.
We start with a default setup and restrict the output to informational messages and more critical output like warnings and errors.
We also add logging to a file. This will create a new log file at midnight and keep up to 7 log files.
%% Cell type:code id: tags:
``` python
from toargridding.defaultLogging import toargridding_defaultLogging
#setup of logging
logger = toargridding_defaultLogging()
logger.addShellLogger(logging.INFO)
logger.logExceptions()
log_path = Path("log")
log_path.mkdir(exist_ok=True)
logger.addRotatingLogFile( log_path / "produce_data_manyStations.log")#we need to explicitly set a logfile
```
%% Cell type:markdown id: tags:
#### Setting up the analysis
We need to prepare our connection to the analysis service of the TOAR database, which will provide us with temporally and statistically aggregated data.
Besides the URL of the service, we also need to set up two directories on our computer:
- one to save the data provided by the analysis service (called cache)
- a second to store our gridded dataset (called results)
These will be created as examples/cache and examples/results.
%% Cell type:code id: tags:
``` python
stats_endpoint = "https://toar-data.fz-juelich.de/api/v2/analysis/statistics/"
cache_basepath = Path("cache")
result_basepath = Path("results")
cache_basepath.mkdir(exist_ok=True)
result_basepath.mkdir(exist_ok=True)
analysis_service = AnalysisServiceDownload(stats_endpoint=stats_endpoint, cache_dir=cache_basepath, sample_dir=result_basepath, use_downloaded=True)
```
%% Cell type:markdown id: tags:
The following requests will take some time, so we adjust the interval between two checks whether our data are ready for download, as well as the maximum duration of the checking.
We will check every 45 min for up to 12 h.
%% Cell type:code id: tags:
``` python
analysis_service.connection.set_request_times(interval_min=45, max_wait_minutes=12*60)
```
%% Cell type:markdown id: tags:
#### Preparation of requests
The basic idea is to split the request into several parts, here: one year per request.
We also use a container class to keep the configurations together (type: namedtuple).
In the end we have two requests that we want to submit.
%% Cell type:code id: tags:
``` python
#creation of requests
Config = namedtuple("Config", ["grid", "time", "variables", "stats"])
grid = RegularGrid( lat_resolution=1.9, lon_resolution=2.5, )
configs = dict()
# for educational reasons the extraction of only two years is fine:-)
#for year in range(0,19):
for year in range(0,2):
    request_config = Config(
        grid,
        TimeSample( start=dt(2000+year,1,1), end=dt(2000+year,12,31), sampling="daily"),#possibly adapt the range:-)
        ["mole_fraction_of_ozone_in_air"],#variable name
        [ "mean" ]# change to dma8epa_strict
    )
    configs[f"test_ta{year}"] = request_config
```
%% Cell type:markdown id: tags:
#### Execution of toargridding and saving of results
Now we want to request the data from the server and create the gridded dataset.
Therefore, we call the function `get_gridded_toar_data` with everything we have prepared until now.
The request will be submitted to the analysis service, which will process it. On our side, we check at regular intervals whether the processing is finished. After several requests, we stop checking. The setup for this can be found a few cells above.
A restart of this cell continues checking whether the data are available.
The obtained data are stored in the result directory (`result_basepath`). Before submitting a request, toargridding checks its cache to see whether the data have already been downloaded.
This function also creates a file with the extension ".contributors".
Last but not least, we want to save our dataset as a netCDF file.
In the global metadata of this file we can find a recipe on how to obtain a list of contributors from the contributors file created by `get_gridded_toar_data`.
%% Cell type:code id: tags:
``` python
for config_id, config in configs.items():
    print(f"\nProcessing {config_id}:")
    print(f"--------------------")
    datasets, metadatas = get_gridded_toar_data(
        analysis_service=analysis_service,
        grid=config.grid,
        time=config.time,
        variables=config.variables,
        stats=config.stats,
        contributors_path=result_basepath
    )
    for dataset, metadata in zip(datasets, metadatas):
        dataset.to_netcdf(result_basepath / f"{metadata.get_id()}_{config.grid.get_id()}.nc")
        print(metadata.get_id())
```
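%% Cell type:markdown id: tags:
As a minimal sketch (assuming at least one file was written by the loop above), we can reopen a gridded file with `xarray` and print its global attributes, which also contain the hint on how to use the contributors file:
%% Cell type:code id: tags:
``` python
import xarray as xr

# pick one of the files written above; the exact name depends on the request and the grid
example_file = next(result_basepath.glob("*.nc"))
with xr.open_dataset(example_file) as ds:
    for key, value in ds.attrs.items():
        print(f"{key}: {value}")
```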
%% Cell type:markdown id: tags:
# Example with optional parameters
Toargridding has a number of required arguments for a dataset. Those include the time range, variable and statistical analysis. The TOAR-DB has a large number of metadata fields that can be used to further refine this request.
A Python dictionary can be provided to include these other fields. The analysis service provides an error message if a requested parameter does not exist (check for typos) or if the provided value is wrong.
In this example we want to obtain data for a single year.
The first block contains the imports and the setup of the logging.
%% Cell type:markdown id: tags:
#### inclusion of packages
%% Cell type:code id: tags:
``` python
import logging
from datetime import datetime as dt
from collections import namedtuple
from pathlib import Path
from toargridding.toar_rest_client import AnalysisServiceDownload, Connection
from toargridding.grids import RegularGrid
from toargridding.gridding import get_gridded_toar_data
from toargridding.metadata import TimeSample
```
%% Cell type:markdown id: tags:
#### Setup of logging
In the next step we set up the logging, i.e. the level of information that is displayed as output.
We start with a default setup and restrict the output to informational messages and more critical output like warnings and errors.
We also add logging to a file. This will create a new log file at midnight and keep up to 7 log files.
%% Cell type:code id: tags:
``` python
import logging
from toargridding.defaultLogging import toargridding_defaultLogging
#setup of logging
logger = toargridding_defaultLogging()
logger.addShellLogger(logging.INFO)
logger.logExceptions()
log_path = Path("log")
log_path.mkdir(exist_ok=True)
logger.addRotatingLogFile( log_path / "produce_data_withOptional.log")#we need to explicitly set a logfile
```
%% Cell type:markdown id: tags:
#### Setting up the analysis
We need to prepare our connection to the analysis service of the TOAR database, which will provide us with temporally and statistically aggregated data.
Besides the URL of the service, we also need to set up two directories on our computer:
- one to save the data provided by the analysis service (called cache)
- a second to store our gridded dataset (called results)
These will be created as examples/cache and examples/results.
%% Cell type:code id: tags:
``` python
stats_endpoint = "https://toar-data.fz-juelich.de/api/v2/analysis/statistics/"
cache_basepath = Path("cache")
result_basepath = Path("results")
cache_basepath.mkdir(exist_ok=True)
result_basepath.mkdir(exist_ok=True)
analysis_service = AnalysisServiceDownload(stats_endpoint=stats_endpoint, cache_dir=cache_basepath, sample_dir=result_basepath, use_downloaded=True)
```
%% Cell type:markdown id: tags:
The following request will take some time, so we adjust the interval between two checks whether our data are ready for download, as well as the maximum duration of the checking.
We will check every 45 min for up to 12 h.
%% Cell type:code id: tags:
``` python
analysis_service.connection.set_request_times(interval_min=45, max_wait_minutes=12*60)
```
%% Cell type:markdown id: tags:
#### Preparation of requests with station metadata
We restrict our request to one year of daily mean ozone data. In addition we would like to include only urban stations.
We use a container class to keep the configurations together (type: namedtuple).
We also want to refine our station selection by using further metadata.
Therefore, we create the `station_metadata` dictionary. We can use the further metadata stored in the TOAR-DB by providing their names and our desired values. This also discards stations without a provided value for a metadata field. We can find information on the different metadata values in the [documentation](https://toar-data.fz-juelich.de/sphinx/TOAR_UG_Vol03_Database/build/latex/toardatabase--userguide.pdf), for example for *toar1_category* on page 18 and for *type_of_area* on page 20.
%% Cell type:code id: tags:
``` python
Config = namedtuple("Config", ["grid", "time", "variables", "stats", "station_metadata"])
#uncomment, if you want to change the metadata:
station_metadata ={
#"toar1_category" : "Urban" #uncomment if wished:-)
"type_of_area" : "Urban" #also test Rural, Suburban,
}
#a regular grid with 1.9°x2.5° resolution. A warning will be issued as 1.9° does not result in a natural number of grid cells.
grid = RegularGrid( lat_resolution=1.9, lon_resolution=2.5, )
configs = dict()
request_config = Config(
    grid,
    TimeSample( start=dt(2000,1,1), end=dt(2000,12,31), sampling="daily"),
    ["mole_fraction_of_ozone_in_air"],
    [ "mean" ],
    station_metadata
)
configs[f"test_ta"] = request_config
```
%% Cell type:markdown id: tags:
#### Execution of toargridding and saving of results
Now we want to request the data from the TOAR analysis service and create the gridded dataset.
Therefore, we call the function `get_gridded_toar_data` with everything we have prepared until now.
The request will be submitted to the analysis service, which will process it. On our side, we check at regular intervals whether the processing is finished. After several requests, we stop checking. The setup for this can be found a few cells above.
A restart of this cell continues checking whether the data are available.
The obtained data are stored in the result directory (`result_basepath`). Before submitting a request, toargridding checks its cache to see whether the data have already been downloaded.
Last but not least, we want to save our dataset as a netCDF file.
In the global metadata of this file we can find a recipe on how to obtain a list of contributors from the contributors file created by `get_gridded_toar_data`. This function also creates the required file with the extension "*.contributors".
%% Cell type:code id: tags:
``` python
for config_id, config in configs.items():
    print(f"\nProcessing {config_id}:")
    print(f"--------------------")
    datasets, metadatas = get_gridded_toar_data(
        analysis_service=analysis_service,
        grid=config.grid,
        time=config.time,
        variables=config.variables,
        stats=config.stats,
        contributors_path=result_basepath,
        **config.station_metadata
    )
    for dataset, metadata in zip(datasets, metadatas):
        dataset.to_netcdf(result_basepath / f"{metadata.get_id()}_{config.grid.get_id()}.nc")
        print(metadata.get_id())
```