diff --git a/README.md b/README.md
index baae0af91036da10ba70f154ac875c18908858c3..7696415b9d9ad2168ad54b2d45b2b1606d39d89f 100644
--- a/README.md
+++ b/README.md
@@ -1,22 +1,25 @@
-# MachineLearningTools
-
-This is a collection of all relevant functions used for ML stuff in the ESDE group
-
-## Inception Model
-
-See a description [here](https://towardsdatascience.com/a-simple-guide-to-the-versions-of-the-inception-network-7fc52b863202)
-or take a look on the papers [Going Deeper with Convolutions (Szegedy et al., 2014)](https://arxiv.org/abs/1409.4842)
-and [Network In Network (Lin et al., 2014)](https://arxiv.org/abs/1312.4400).
+# MLAir - Machine Learning on Air Data
+MLAir (Machine Learning on Air data) is an environment that simplifies and accelerates the creation of new machine
+learning (ML) models for the analysis and forecasting of meteorological and air quality time series.
 
 
 # Installation
 
 * Install __proj__ on your machine using the console. E.g. for opensuse / leap `zypper install proj`
-* c++ compiler required for cartopy installation
-
-## HPC - JUWELS and HDFML setup
-The following instruction guide you throug the installation on JUWELS and HDFML.
-* Clone the repo to HPC system (we recommend to place it in `/p/projects/<project name>`.
+* A C++ compiler is required for the installation of the __cartopy__ package
+* Install all requirements from `requirements.txt`, preferably in a virtual environment
+* Installation of MLAir:
+    * Either clone MLAir from its repository in gitlab (link??) and use it without installation
+    * or download the distribution file (?? .whl) and install it via `pip install <??>`. In this case, you can simply
+      import MLAir in any Python script inside your virtual environment using `import mlair`.
+
+## Special instructions for installation on Jülich HPC systems
+
+_Please note that the HPC setup is customised for JUWELS and HDFML. When using another HPC system, you can use the HPC
+setup files as a skeleton and customise them to your needs._
+
+The following instructions guide you through the installation on JUWELS and HDFML.
+* Clone the repo to the HPC system (we recommend placing it in `/p/projects/<project name>`).
 * Setup venv by executing `source setupHPC.sh`. This script loads all pre-installed modules and creates a venv for
   all other packages. Furthermore, it creates slurm/batch scripts to execute code on compute nodes. <br>
 You have to enter the HPC project's budget name (--account flag).
@@ -27,9 +30,6 @@ You have to enter the HPC project's budget name (--account flag).
 * Execute either `sbatch run_juwels_develgpus.bash` or `sbatch run_hdfml_batch.bash` to verify that the setup went
   well.
 * Currently cartopy is not working on our HPC system, therefore PlotStations does not create any output.
-### HPC JUWELS and HDFML remarks
-Please note, that the HPC setup is customised for JUWELS and HDFML. When using another HPC system, you can use the HPC setup files as a skeleton and customise it to your needs.
-
 Note: The method `PartitionCheck` currently only checks if the hostname starts with `ju` or `hdfmll`.
 Therefore, it might be necessary to adopt the `if` statement in `PartitionCheck._run`.
 
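The hostname check mentioned in the note above is the part to adapt when porting the setup to another HPC system. The snippet below is only an illustrative sketch of such a check, not the actual `PartitionCheck._run` implementation; the function name and signature are assumptions, only the `ju` / `hdfmll` prefixes come from the README text.

```python
import socket


def looks_like_supported_host(hostname: str = None) -> bool:
    """Illustrative sketch: report whether the hostname matches the expected HPC prefixes."""
    hostname = hostname or socket.gethostname()
    # JUWELS hostnames start with "ju", HDFML compute nodes with "hdfmll";
    # adapt these prefixes for other systems.
    return hostname.startswith("ju") or hostname.startswith("hdfmll")
```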
@@ -39,8 +39,7 @@ Therefore, it might be necessary to adopt the `if` statement in `PartitionCheck.
 * To use hourly data from ToarDB via JOIN interface, a private token is required. Request your personal access token and
 add it to `src/join_settings.py` in the hourly data section. Replace the `TOAR_SERVICE_URL` and the `Authorization` value.
 To make sure, that this **sensitive** data is not uploaded to the remote server, use the following command to
-prevent git from tracking this file: `git update-index --assume-unchanged src/join_settings.py
-`
+prevent git from tracking this file: `git update-index --assume-unchanged src/join_settings.py`
 
 # Customise your experiment
 
@@ -97,3 +96,10 @@ station-wise std is a decent estimate of the true std.
 scaling values instead of the calculation method. For method *centre*, std can still be None, but is required for
 the *standardise* method. **Important**: Format of given values **must** match internal data format of
 DataPreparation class: `xr.DataArray` with `dims=["variables"]` and one value for each variable.
+
+
+## Inception Model
+
+See a description [here](https://towardsdatascience.com/a-simple-guide-to-the-versions-of-the-inception-network-7fc52b863202)
+or take a look at the papers [Going Deeper with Convolutions (Szegedy et al., 2014)](https://arxiv.org/abs/1409.4842)
+and [Network In Network (Lin et al., 2014)](https://arxiv.org/abs/1312.4400).
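As a rough illustration of the externally given scaling values mentioned in the last hunk above: only the `xr.DataArray` structure with `dims=["variables"]` and one value per variable comes from the README text; the variable names, numbers, and coordinate labels below are made up for the example.

```python
import xarray as xr

# Hypothetical example: one mean/std value per variable along a "variables" dimension.
variables = ["o3", "temp", "relhum"]
mean = xr.DataArray([50.0, 280.0, 80.0], dims=["variables"], coords={"variables": variables})
std = xr.DataArray([20.0, 10.0, 15.0], dims=["variables"], coords={"variables": variables})
```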