    MachineLearningTools

    This project contains the source code needed to rerun "IntelliO3-ts v1.0: A neural network approach to predict near-surface ozone concentrations in Germany" by F. Kleinert, L. H. Leufen and M. G. Schultz (2020, submitted to GMD). Moreover, the source code includes some functionality that is not used in the study named above.

    Installation

    We assume that you have downloaded or cloned the project from GitLab or b2share. In the latter case, you can skip the first remark regarding the data_path. Instructions on how to rerun the specific version are given below.

    • Install proj and geos on your machine using the console, e.g. on openSUSE Leap: zypper install proj
    • A C++ compiler is required to install cartopy.
    • graphviz is required to plot the model architecture.
    • Make sure that CUDA 10.0 is installed if you want to use Nvidia GPUs (compatible with TensorFlow 1.13.1).

    Create a virtual environment by executing python3.6 -m venv venv and make sure the venv is activated (source venv/bin/activate). Afterwards, install the requirements matching your system (GPU available or not) into the venv:

    • CPU version: pip install -r requirements.txt
    • GPU version: pip install -r requirements_gpu.txt

    Remarks on the first setup

    1. The source code does not include any data to process. Instead, it checks whether the data are available on your local machine and downloads any missing data. We did not implement a default data_path because we want to let you choose where exactly the data should be stored. Consequently, you have to pass a custom data path to ExperimentSetup in run.py (see the example below). If all required data are already available locally, the program does not download any new data.

    2. Please note that cartopy may cause errors at runtime. If cartopy raises an error, you can try the following (with the venv activated):

      • pip uninstall shapely
      • pip uninstall cartopy
      • pip install --upgrade numpy
      • pip install --no-binary shapely shapely
      • pip install cartopy

      Cartopy is only needed to create one plot showing the station locations and does not affect the neural network itself. If the procedure above does not solve the problem, you can force the workflow to ignore cartopy by passing the first two characters of your hostname (echo $HOSTNAME) as a list containing a single string to the keyword argument hpc_hosts in run.py. The example below assumes that the output of echo $HOSTNAME is "your_hostname".

      Please also consult the installation instructions of the cartopy package itself.

    Example of all remarks given above:

    import [...]
    [...]
    def main(parser_args):
        [...]

        with RunEnvironment():
            ExperimentSetup(parser_args,
                            data_path="<your>/<custom>/<path>",  # <- Remark 1
                            hpc_hosts=["yo"],                    # <- Remark 2
                            [...]

    HPC - JUWELS and HDFML setup

    The following instructions guide you through the installation on JUWELS and HDFML.

    • Clone the repo to the HPC system (we recommend placing it in /p/projects/<project name>).
    • Set up the venv by executing source setupHPC.sh. This script loads all pre-installed modules and creates a venv for all other packages. Furthermore, it creates slurm batch scripts to execute code on compute nodes. You have to enter the HPC project's budget name (--account flag).
    • The default external data path on JUWELS and HDFML is set to /p/project/deepacf/intelliaq/<user>/DATA/toar_<sampling>. To choose a different location, open run.py and add the following keyword argument to ExperimentSetup: data_path="<your>/<custom>/<path>".
    • Execute python run.py on a login node to download example data. The program throws an OSError after downloading.
    • Execute either sbatch run_juwels_develgpus.bash or sbatch run_hdfml_batch.bash to verify that the setup went well.
    • Currently, cartopy is not working on our HPC system; therefore, PlotStations does not create any output.

    HPC JUWELS and HDFML remarks

    Please note that the HPC setup is customised for JUWELS and HDFML. When using another HPC system, you can use the HPC setup files as a skeleton and customise them to your needs.

    Note: The method PartitionCheck currently only checks whether the hostname starts with ju or hdfmll. Therefore, it might be necessary to adapt the if statement in src/run_modules/PartitionCheck._run; a minimal sketch of this kind of check is given below.
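
    For orientation only, the following sketch illustrates what such a hostname prefix check can look like. It assumes that PartitionCheck._run reduces to a simple prefix comparison; the function and variable names below are hypothetical and do not appear in the repository.

    import socket

    # Hypothetical illustration of a hostname prefix check similar to the one
    # described for PartitionCheck._run. Add the prefix of your own HPC system
    # (e.g. "lo" for hosts named "login01") to the tuple below.
    KNOWN_HPC_PREFIXES = ("ju", "hdfmll")

    def is_known_hpc_host(hostname=None):
        """Return True if the hostname starts with one of the known HPC prefixes."""
        hostname = hostname or socket.gethostname()
        return hostname.startswith(KNOWN_HPC_PREFIXES)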

    How to run IntelliO3-ts

    After following the instructions above, you can rerun the trained model by executing (with the venv activated) python run.py --experiment_date=IntelliO3-ts. If you want to train the model from scratch, you have to modify run.py and set create_new_model=True. Please note that the evaluation of bootstrapped input variables takes some time; you can skip this evaluation by setting evaluate_bootstraps=False. A sketch of these settings is shown below.
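
    Mirroring the example from the installation section, the keyword arguments mentioned above would be passed to ExperimentSetup roughly like this (a sketch; only the arguments discussed in this section are shown, everything else is elided as in the example further up):

    [...]
    def main(parser_args):
        [...]

        with RunEnvironment():
            ExperimentSetup(parser_args,
                            data_path="<your>/<custom>/<path>",
                            create_new_model=True,       # train from scratch instead of rerunning the trained model
                            evaluate_bootstraps=False,   # skip the time-consuming bootstrap evaluation
                            [...]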