{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Example with optional parameters\n",
    "Toargridding has a number of required arguments for a dataset. These include the time range, the variable and the statistical analysis. The TOAR-DB has a large number of metadata fields that can be used to further refine this request.\n",
    "A Python dictionary can be provided to include these other fields. The analysis service returns an error message if a requested parameter does not exist (check for typos) or if the provided value is invalid.\n",
    "\n",
    "In this example we want to obtain data from 2000 to 2018 (reduce this range if you want your results faster :-)).\n",
    "\n",
    "The first block contains the imports and the setup of the logging."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import logging\n",
    "from datetime import datetime as dt\n",
    "from collections import namedtuple\n",
    "from pathlib import Path\n",
    "\n",
    "from toargridding.toar_rest_client import AnalysisServiceDownload, Connection\n",
    "from toargridding.grids import RegularGrid\n",
    "from toargridding.gridding import get_gridded_toar_data\n",
    "from toargridding.metadata import TimeSample\n",
    "\n",
    "from toargridding.defaultLogging import toargridding_defaultLogging\n",
    "\n",
    "#setup of logging\n",
    "logger = toargridding_defaultLogging()\n",
    "logger.addShellLogger(logging.DEBUG)\n",
    "logger.logExceptions()\n",
    "logger.addRotatingLogFile(Path(\"log/produce_data_withOptional.log\"))#we need to explicitly set a logfile\n",
    "#logger.addSysLogger(logging.DEBUG)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the next block we prepare the request to the analysis service.\n",
    "The dictionary `details4Query` adds further requirements to the request. Here, the two fields *toar1_category* and *type_of_area* are available. Both classify stations according to their surrounding area, so it is advisable to use only one of them at a time. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#creation of request.\n",
    "\n",
    "Config = namedtuple(\"Config\", [\"grid\", \"time\", \"variables\", \"stats\",\"moreOptions\"])\n",
    "\n",
    "#moreOptions is implemented as a dict to add additional arguments to the query to the REST API\n",
    "#For example the field toar1_category with its possible values Urban, RuralLowElevation, RuralHighElevation and Unclassified can be added.\n",
    "#see page 18 in https://toar-data.fz-juelich.de/sphinx/TOAR_UG_Vol03_Database/build/latex/toardatabase--userguide.pdf\n",
    "#or type_of_area with urban, suburban and rural on page 20 can be used\n",
    "details4Query = {\n",
    "    #\"toar1_category\" : \"Urban\" #uncomment if wished:-)\n",
    "    #\"toar1_category\" : \"RuralLowElevation\" #uncomment if wished:-)\n",
    "    #\"toar1_category\" : \"RuralHighElevation\" #uncomment if wished:-)\n",
    "    #\"type_of_area\" : \"Urban\" #also test Rural, Suburban\n",
    "    \"type_of_area\" : \"Rural\" #also test Urban, Suburban\n",
    "    #\"type_of_area\" : \"Suburban\" #also test Urban, Rural\n",
    "}\n",
    "\n",
    "grid = RegularGrid( lat_resolution=1.9, lon_resolution=2.5, )\n",
    "configs = dict()\n",
    "##for educational reasons the extraction of only two years is fine:-)\n",
    "#for year in range(0,19):\n",
    "for year in range(0,2):\n",
    "    valid_data = Config(\n",
    "        grid,\n",
    "        TimeSample( start=dt(2000+year,1,1), end=dt(2000+year,12,31), sampling=\"daily\"),#possibly adapt range:-)\n",
    "        #TimeSample( start=dt(2000+year,1,1), end=dt(2000+year,12,31), sampling=\"monthly\"),#possibly adapt range:-)\n",
    "        [\"mole_fraction_of_ozone_in_air\"],#variable name\n",
    "        #[ \"mean\", \"dma8epax\"],# will start one request after another...\n",
    "        [ \"dma8epa_strict\" ],\n",
    "        #[ \"mean\" ],\n",
    "        details4Query\n",
    "    )\n",
    "    \n",
    "    configs[f\"test_ta{year}\"] = valid_data\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we come to the last step: we want to download the data, process them and store them to disk.\n",
    "\n",
    "Caution: this cell runs for about 45 minutes per requested year. Therefore we increase the maximum waiting time to 1 h per request.\n",
    "The processing is done on the server of the TOAR database.\n",
    "Restarting the cell resumes the request to the REST API. Data are cached on the local computer to prevent repeated downloads.\n",
    "The download itself can also take a few minutes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "stats_endpoint = \"https://toar-data.fz-juelich.de/api/v2/analysis/statistics/\"\n",
    "cache_basepath = Path(\"cache\")\n",
    "result_basepath = Path(\"results\")\n",
    "cache_basepath.mkdir(exist_ok=True)\n",
    "result_basepath.mkdir(exist_ok=True)\n",
    "analysis_service = AnalysisServiceDownload(stats_endpoint=stats_endpoint, cache_dir=cache_basepath, sample_dir=result_basepath, use_downloaded=True)\n",
    "\n",
    "# maybe adapt the interval for requesting the results and the total duration before the client pauses the requests.\n",
    "# as the requests take about 45min, it is more suitable to wait 60min before timing out the requests than the original 30min.\n",
    "analysis_service.connection.set_request_times(interval_min=5, max_wait_minutes=60)\n",
    "for config_name, config in configs.items():\n",
    "    print(f\"\\nProcessing {config_name}:\")\n",
    "    print(\"--------------------\")\n",
    "    datasets, metadatas = get_gridded_toar_data(\n",
    "        analysis_service=analysis_service,\n",
    "        grid=config.grid,\n",
    "        time=config.time,\n",
    "        variables=config.variables,\n",
    "        stats=config.stats,\n",
    "        contributors_path=result_basepath,\n",
    "        **config.moreOptions\n",
    "    )\n",
    "\n",
    "    for dataset, metadata in zip(datasets, metadatas):\n",
    "        dataset.to_netcdf(result_basepath / f\"{metadata.get_id()}_{config.grid.get_id()}.nc\")\n",
    "        print(metadata.get_id())"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "toargridding-8RVrxzmn-py3.11",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}