Commit 9a771123 by lukas leufen: "split is now included"

docs/_source/customise.rst
Default Workflow
----------------
.. role:: py(code)
    :language: python
MLAir consists of so-called :py:`run_modules` that are executed in a distinct order called a :py:`workflow`. MLAir
provides a :py:`default_workflow`. This workflow runs the run modules :py:`ExperimentSetup`, :py:`PreProcessing`,
:py:`ModelSetup`, :py:`Training`, and :py:`PostProcessing` one by one.
.. figure:: ./_plots/run_modules_schedule.png

    Sketch of the default workflow.
.. code-block:: python

    import mlair

    # create the default MLAir workflow
    DefaultWorkflow = mlair.DefaultWorkflow()
    # execute the default workflow
    DefaultWorkflow.run()
The output of running this default workflow will be structured like the following.
.. code-block::

    INFO: mlair started
    INFO: ExperimentSetup started
    ...
    INFO: ExperimentSetup finished after 00:00:01 (hh:mm:ss)
    INFO: PreProcessing started
    ...
    INFO: PreProcessing finished after 00:00:11 (hh:mm:ss)
    INFO: ModelSetup started
    ...
    INFO: ModelSetup finished after 00:00:01 (hh:mm:ss)
    INFO: Training started
    ...
    INFO: Training finished after 00:02:15 (hh:mm:ss)
    INFO: PostProcessing started
    ...
    INFO: PostProcessing finished after 00:01:37 (hh:mm:ss)
    INFO: mlair finished after 00:04:05 (hh:mm:ss)
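Runtime stamps such as :py:`00:04:05 (hh:mm:ss)` in the log above can be produced with the standard library alone; :py:`format_elapsed` below is a hypothetical helper for illustration, not MLAir's own implementation.

```python
import time

def format_elapsed(seconds: int) -> str:
    """Format a duration in seconds as hh:mm:ss (durations below 24 hours)."""
    # gmtime interprets the seconds as an offset from the epoch, which is
    # sufficient to extract hours, minutes, and seconds of a short duration
    return time.strftime("%H:%M:%S", time.gmtime(seconds))

print(format_elapsed(245))  # 00:04:05
```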
Customised Run Module and Workflow
----------------------------------
It is possible to create new custom run modules. A custom run module must inherit from the base class
:py:`RunEnvironment` and implement the constructor method :py:`__init__()`. This method has to execute the module on
call. In the following example, this is done by a :py:`_run()` method that is called by the initialiser. It is also
possible to pass arguments to the custom run module, as shown.
.. code-block:: python

    import mlair
    import logging

    class CustomStage(mlair.RunEnvironment):
        """A custom MLAir stage for demonstration."""

        def __init__(self, test_string):
            super().__init__()  # always call super init method
            self._run(test_string)  # call a class method

        def _run(self, test_string):
            logging.info("Just running a custom stage.")
            logging.info("test_string = " + test_string)
            epochs = self.data_store.get("epochs")
            logging.info("epochs = " + str(epochs))
If a custom run module is defined, the workflow has to be adjusted accordingly. For this, load the empty
:py:`Workflow` class and add each run module that is required. The order in which modules are added defines the order
of execution when running the workflow.
.. code-block:: python

    # create your custom MLAir workflow
    CustomWorkflow = mlair.Workflow()
    # provide stages without initialisation
    CustomWorkflow.add(mlair.ExperimentSetup, epochs=128)
    # add also keyword arguments for a specific stage
    CustomWorkflow.add(CustomStage, test_string="Hello World")
    # finally execute custom workflow in order of adding
    CustomWorkflow.run()
The output will look like:
.. code-block::

    INFO: mlair started
    ...
    INFO: ExperimentSetup finished after 00:00:12 (hh:mm:ss)
    INFO: CustomStage started
    INFO: Just running a custom stage.
    INFO: test_string = Hello World
    INFO: epochs = 128
    INFO: CustomStage finished after 00:00:01 (hh:mm:ss)
    INFO: mlair finished after 00:00:13 (hh:mm:ss)
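As the comments in the example note, stages are added without initialisation and instantiated only when the workflow runs. A minimal, purely illustrative sketch of such a stage registry follows; this is not MLAir's actual implementation, and all names are hypothetical.

```python
# Hypothetical sketch of a stage registry, not MLAir's actual Workflow class.
class MiniWorkflow:
    def __init__(self):
        self._stages = []  # (stage_class, kwargs) pairs, in order of adding

    def add(self, stage, **kwargs):
        # store the class plus its keyword arguments without instantiating it
        self._stages.append((stage, kwargs))

    def run(self):
        # instantiate (and thereby execute) each stage in the order added
        for stage, kwargs in self._stages:
            stage(**kwargs)

executed = []

class StageA:
    def __init__(self, label):
        executed.append(("A", label))

class StageB:
    def __init__(self):
        executed.append(("B", None))

wf = MiniWorkflow()
wf.add(StageA, label="setup")
wf.add(StageB)
wf.run()
print(executed)  # [('A', 'setup'), ('B', None)]
```

Deferring instantiation like this keeps the stage classes lightweight to declare while guaranteeing that side effects happen only inside :py:`run()`.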
Custom Model
------------
Create your own model to run your personal experiment. To guarantee proper integration into the MLAir workflow, models
must inherit from :py:`AbstractModelClass`. This ensures smooth training and evaluation behaviour.
How to create a customised model?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Create a new model class inheriting from :py:`AbstractModelClass`
.. code-block:: python

    from mlair import AbstractModelClass

    class MyCustomisedModel(AbstractModelClass):

        def __init__(self, shape_inputs: list, shape_outputs: list):
            super().__init__(shape_inputs[0], shape_outputs[0])

            # settings
            self.dropout_rate = 0.1
            self.activation = keras.layers.PReLU

            # apply to model
            self.set_model()
            self.set_compile_options()
            self.set_custom_objects(loss=self.compile_options['loss'])
* Make sure to call :py:`super().__init__()` and at least :py:`set_model()` and :py:`set_compile_options()` in your
  custom init method.
* The shown model expects a single input and output branch provided in a list. Therefore, the shapes of input and
  output are extracted and then provided to the super class initialiser.
* Some general settings like the dropout rate are additionally set in the init method.
* If your model contains custom objects that are not part of the keras or tensorflow frameworks, you need to register
  them as custom objects. To do this, call :py:`set_custom_objects` with arbitrary kwargs. In the shown example, the
  loss has been added for demonstration only, because we use a built-in loss function. Nonetheless, we always encourage
  you to add the loss as a custom object to prevent potential errors when loading an already created model instead of
  training a new one.
* Now build your model inside :py:`set_model()` by using the instance attributes :py:`self.shape_inputs` and
:py:`self.shape_outputs` and storing the model as :py:`self.model`.
.. code-block:: python

    import keras

    class MyCustomisedModel(AbstractModelClass):

        def set_model(self):
            x_input = keras.layers.Input(shape=self.shape_inputs)
            x_in = keras.layers.Conv2D(32, (1, 1), padding='same', name='{}_Conv_1x1'.format("major"))(x_input)
            x_in = self.activation(name='{}_conv_act'.format("major"))(x_in)
            x_in = keras.layers.Flatten(name='{}'.format("major"))(x_in)
            x_in = keras.layers.Dropout(self.dropout_rate, name='{}_Dropout_1'.format("major"))(x_in)
            x_in = keras.layers.Dense(16, name='{}_Dense_16'.format("major"))(x_in)
            x_in = self.activation()(x_in)
            x_in = keras.layers.Dense(self.shape_outputs, name='{}_Dense'.format("major"))(x_in)
            out_main = self.activation()(x_in)
            self.model = keras.Model(inputs=x_input, outputs=[out_main])
* You are free to design your model however you like. Just make sure to save it in the class attribute :py:`model`.
* Additionally, set your custom compile options including the loss definition.
.. code-block:: python

    class MyCustomisedModel(AbstractModelClass):

        def set_compile_options(self):
            self.initial_lr = 1e-2
            self.optimizer = keras.optimizers.SGD(lr=self.initial_lr, momentum=0.9)
            self.lr_decay = mlair.model_modules.keras_extensions.LearningRateDecay(base_lr=self.initial_lr,
                                                                                   drop=.94,
                                                                                   epochs_drop=10)
            self.loss = keras.losses.mean_squared_error
            self.compile_options = {"metrics": ["mse", "mae"]}
* The allocation of the instance parameters :py:`initial_lr`, :py:`optimizer`, and :py:`lr_decay` could also be part of
  the model class' initialiser. The same applies to :py:`self.loss` and :py:`compile_options`, but we recommend using
  the :py:`set_compile_options` method for the definition of parameters that are related to the compile options.
* More important is that the compile options are actually saved. There are three ways to achieve this:
* (1) Set all compile options by passing a dictionary with all options to :py:`self.compile_options`.
* (2) Set all compile options as instance attributes. MLAir will search for these attributes and store them.
* (3) Define your compile options partly as a dictionary and partly as instance attributes (as shown in this example).
* If using (3) and defining the same compile option with different values, MLAir will raise an error.
Incorrect (will raise an error because of a mismatch for the :py:`optimizer` parameter):

.. code-block:: python

    def set_compile_options(self):
        self.optimizer = keras.optimizers.SGD()
        self.loss = keras.losses.mean_squared_error
        self.compile_options = {"optimizer": keras.optimizers.Adam()}
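The mismatch check behind variant (3) can be pictured as a dictionary merge that refuses conflicting values. The sketch below is hypothetical and not MLAir's code; :py:`merge_compile_options` and the string stand-ins for optimizers are illustrative only.

```python
# Hypothetical sketch of merging compile options from instance attributes and
# the compile_options dictionary, raising on conflicting definitions.
def merge_compile_options(attrs, options):
    merged = dict(options)
    for key, value in attrs.items():
        if key in merged and merged[key] != value:
            raise ValueError(f"conflicting values for compile option '{key}'")
        merged[key] = value
    return merged

# attributes and dictionary cover disjoint options: the merge succeeds
attrs = {"optimizer": "SGD", "loss": "mse"}
merged = merge_compile_options(attrs, {"metrics": ["mse", "mae"]})
print(merged)  # {'metrics': ['mse', 'mae'], 'optimizer': 'SGD', 'loss': 'mse'}

# the same option defined twice with different values: an error is raised
try:
    merge_compile_options({"optimizer": "SGD"}, {"optimizer": "Adam"})
except ValueError as err:
    print(err)  # conflicting values for compile option 'optimizer'
```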
Specials for Branched Models
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* If you have a branched model with multiple outputs, you either need to set a single loss for all branch outputs or
  provide one loss function per output, considering the right order.
.. code-block:: python

    class MyCustomisedModel(AbstractModelClass):

        def set_model(self):
            ...
            self.model = keras.Model(inputs=x_input, outputs=[out_minor_1, out_minor_2, out_main])

        def set_compile_options(self):
            self.loss = [keras.losses.mean_absolute_error,  # for out_minor_1
                         keras.losses.mean_squared_error,   # for out_minor_2
                         keras.losses.mean_squared_error]   # for out_main
How to access my customised model?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Once the customised model is created, you can easily access the model with
>>> MyCustomisedModel().model
<your custom model>
The loss is accessible via
>>> MyCustomisedModel().loss
<your custom loss>
You can treat the instance of your model not only as an instance but also as the model itself. If you call a method
that belongs to the model rather than to the model instance, you can apply the command directly on the instance
instead of going through the model attribute.
>>> MyCustomisedModel().model.compile(**kwargs) == MyCustomisedModel().compile(**kwargs)
True
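This kind of delegation from the wrapper instance to the wrapped model is what Python's :py:`__getattr__` hook enables. Below is a minimal sketch of the pattern with hypothetical class names; it is not MLAir's actual implementation.

```python
# Hypothetical stand-in for a keras model with a compile method.
class WrappedModel:
    def compile(self, **kwargs):
        return f"compiled with {sorted(kwargs)}"

class ModelWrapper:
    def __init__(self):
        self.model = WrappedModel()

    def __getattr__(self, name):
        # __getattr__ is only invoked for attributes NOT found on the wrapper
        # itself, so unknown lookups like `compile` fall through to self.model
        return getattr(self.model, name)

m = ModelWrapper()
# calling compile on the wrapper is equivalent to calling it on the model
assert m.compile(optimizer="sgd") == m.model.compile(optimizer="sgd")
```

Because :py:`__getattr__` fires only on failed lookups, attributes defined on the wrapper (such as :py:`model` itself) are never shadowed by the delegation.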
Custom Data Handler
-------------------
docs/_source/get-started.rst
Getting started with MLAir
==========================
.. role:: py(code)
    :language: python
Install MLAir
-------------
MLAir is based on several python frameworks. To work properly, you have to install all packages from the
:py:`requirements.txt` file. Additionally, to support the geographical plotting part it is required to install geo
packages built for your operating system. The names of these packages may differ between systems; we refer
here to the openSUSE / Leap OS. The geo plot can be removed from the :py:`plot_list`, in which case there is no need to
install the geo packages.
Pre-requirements
~~~~~~~~~~~~~~~~
* (geo) Install **proj** on your machine using the console. E.g. for opensuse / leap :py:`zypper install proj`
* (geo) A c++ compiler is required for the installation of the program **cartopy**
* (tf) Currently, TensorFlow-1.13 is mentioned in the requirements. We already tested the TensorFlow-1.15 version and
  could not find any compatibility errors. Please note that tf-1.13 and tf-1.15 each have two distinct branches: the
  default branch for CPU support, and the "-gpu" branch for GPU support. If the GPU version is installed, MLAir will
  make use of the GPU device.

Installation of MLAir
~~~~~~~~~~~~~~~~~~~~~

* Install all requirements from `requirements.txt <https://gitlab.version.fz-juelich.de/toar/machinelearningtools/-/blob/master/requirements.txt>`_,
  preferably in a virtual environment
* Either clone MLAir from the `gitlab repository <https://gitlab.version.fz-juelich.de/toar/machinelearningtools.git>`_
  and use it without installation (beside the requirements),
* or download the distribution file (?? .whl) and install it via :py:`pip install <??>`. In this case, you can simply
  import MLAir in any python script inside your virtual environment using :py:`import mlair`.
How to start with MLAir
-----------------------
In this section, we show three examples how to work with MLAir. Note that for these examples MLAir was installed using
the distribution file. If you are using the git clone, you need to adjust the import path unless the script is
executed directly inside the source directory of MLAir.
Example 1
~~~~~~~~~
...
...
We can see from the terminal that no training was performed. Analysis is now made ...
...
INFO: mlair finished after 00:00:06 (hh:mm:ss)
Customised workflows and models
-------------------------------
Custom Workflow
~~~~~~~~~~~~~~~
MLAir provides a default workflow. If additional steps are to be performed, you have to append custom run modules to
the workflow.
.. code-block:: python

    import mlair
    import logging

    class CustomStage(mlair.RunEnvironment):
        """A custom MLAir stage for demonstration."""

        def __init__(self, test_string):
            super().__init__()  # always call super init method
            self._run(test_string)  # call a class method

        def _run(self, test_string):
            logging.info("Just running a custom stage.")
            logging.info("test_string = " + test_string)
            epochs = self.data_store.get("epochs")
            logging.info("epochs = " + str(epochs))

    # create your custom MLAir workflow
    CustomWorkflow = mlair.Workflow()
    # provide stages without initialisation
    CustomWorkflow.add(mlair.ExperimentSetup, epochs=128)
    # add also keyword arguments for a specific stage
    CustomWorkflow.add(CustomStage, test_string="Hello World")
    # finally execute custom workflow in order of adding
    CustomWorkflow.run()
.. code-block::

    INFO: mlair started
    ...
    INFO: ExperimentSetup finished after 00:00:12 (hh:mm:ss)
    INFO: CustomStage started
    INFO: Just running a custom stage.
    INFO: test_string = Hello World
    INFO: epochs = 128
    INFO: CustomStage finished after 00:00:01 (hh:mm:ss)
    INFO: mlair finished after 00:00:13 (hh:mm:ss)
Custom Model
~~~~~~~~~~~~
Each model has to inherit from the abstract model class to ensure smooth training and evaluation behaviour. It is
required to implement the :py:`set_model` and :py:`set_compile_options` methods. The latter has to set at least the
loss.
.. code-block:: python

    import keras
    from keras.losses import mean_squared_error as mse
    from keras.optimizers import SGD

    from mlair.model_modules import AbstractModelClass

    class MyLittleModel(AbstractModelClass):
        """
        A customised model with a 1x1 Conv, and 3 Dense layers (32, 16,
        window_lead_time). Dropout is used after the Conv layer.
        """

        def __init__(self, window_history_size, window_lead_time, channels):
            super().__init__()
            # settings
            self.window_history_size = window_history_size
            self.window_lead_time = window_lead_time
            self.channels = channels
            self.dropout_rate = 0.1
            self.activation = keras.layers.PReLU
            self.lr = 1e-2
            # apply to model
            self.set_model()
            self.set_compile_options()
            self.set_custom_objects(loss=self.compile_options['loss'])

        def set_model(self):
            # add 1 to window_size to include the current time step t0
            shape = (self.window_history_size + 1, 1, self.channels)
            x_input = keras.layers.Input(shape=shape)
            x_in = keras.layers.Conv2D(32, (1, 1), padding='same')(x_input)
            x_in = self.activation()(x_in)
            x_in = keras.layers.Flatten()(x_in)
            x_in = keras.layers.Dropout(self.dropout_rate)(x_in)
            x_in = keras.layers.Dense(32)(x_in)
            x_in = self.activation()(x_in)
            x_in = keras.layers.Dense(16)(x_in)
            x_in = self.activation()(x_in)
            x_in = keras.layers.Dense(self.window_lead_time)(x_in)
            out = self.activation()(x_in)
            self.model = keras.Model(inputs=x_input, outputs=[out])

        def set_compile_options(self):
            self.compile_options = {"optimizer": SGD(lr=self.lr),
                                    "loss": mse,
                                    "metrics": ["mse"]}
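The input shape computed at the start of :py:`set_model` above follows directly from the constructor arguments: the +1 adds the current time step t0 to the history window. The helper below is purely illustrative, with example numbers that are not from the source.

```python
# Illustrative sketch of the shape arithmetic in set_model above.
def input_shape(window_history_size: int, channels: int) -> tuple:
    # one extra row for the current time step t0, a singleton spatial axis,
    # and one column per input channel
    return (window_history_size + 1, 1, channels)

print(input_shape(13, 2))  # (14, 1, 2)
```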