Fix bugs caused by model name refactoring and TensorFlow update

Bug

Error description

  1. During an HPC run, a variable expected to be an xarray object was actually None.

  2. There are some issues with the model creation if training is skipped.

Error message

Error for 1)

2021-11-26 20:45:28,227 - INFO: calculate_error_metrics finished after 0:00:07 (hh:mm:ss)  [time_tracking.py:__exit__:134]
2021-11-26 20:45:28,227 - ERROR: 'NoneType' object has no attribute 'coords'  [run_environment.py:__exit__:137]
Traceback (most recent call last):
  File "/p/home/jusers/leufen1/hdfml/intelliaq/mlair_tf2/mlair/mlair/workflows/abstract_workflow.py", line 30, in run
    stage(**self._registry_kwargs[pos])
  File "/p/home/jusers/leufen1/hdfml/intelliaq/mlair_tf2/mlair/mlair/run_modules/post_processing.py", line 99, in __init__
    self._run()
  File "/p/home/jusers/leufen1/hdfml/intelliaq/mlair_tf2/mlair/mlair/run_modules/post_processing.py", line 129, in _run
    skill_score_competitive, _, skill_score_climatological, errors = self.calculate_error_metrics()
  File "/p/home/jusers/leufen1/hdfml/intelliaq/mlair_tf2/mlair/mlair/run_modules/post_processing.py", line 911, in calculate_error_metrics
    for n in external_data.coords[self.model_type_dim].values]
AttributeError: 'NoneType' object has no attribute 'coords'
  2. When loading the model, Keras is sometimes unable to load it via keras.models.load_model.

First guess on error origin

  1. None type check is not performed properly

Error origin

In the following lines, the None check must be performed before the renaming starts:

external_data = self._get_external_data(station, path)  # test data
external_data.coords[self.model_type_dim] = [{self.forecast_indicator: self.model_display_name}.get(n, n)
                                              for n in external_data.coords[self.model_type_dim].values]
# test errors
if external_data is not None:
    ....
  2. The origin is less clear. To solve the issue, we could switch from loading the full model to loading only the weights. The model itself has already been built from the model class and does not need to be rebuilt from scratch.

Solution

  1. Move the second statement of the code snippet inside the if statement so that .coords is never accessed on None.
  external_data = self._get_external_data(station, path)  # test data
- external_data.coords[self.model_type_dim] = [{self.forecast_indicator: self.model_display_name}.get(n, n)
-                                              for n in external_data.coords[self.model_type_dim].values]
  # test errors
  if external_data is not None:
+     external_data.coords[self.model_type_dim] = [{self.forecast_indicator: self.model_display_name}.get(n, n)
+                                                  for n in external_data.coords[self.model_type_dim].values]
      ....
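
The guard-then-rename order from the diff above can be sketched in isolation. This is a minimal illustration, not the MLAir code: it uses a plain list in place of the xarray coordinate, and the function name `rename_model_labels` as well as the labels "nn", "ols" and "persi" are assumptions made up for the example.

```python
from typing import List, Optional


def rename_model_labels(labels: Optional[List[str]],
                        forecast_indicator: str,
                        display_name: str) -> Optional[List[str]]:
    """Replace the internal forecast label with the display name.

    Hypothetical helper: None is passed through untouched, so nothing
    is accessed on a missing data set.
    """
    if labels is None:  # check for None first, as in the fix
        return None
    mapping = {forecast_indicator: display_name}
    # labels without a mapping entry keep their original name
    return [mapping.get(n, n) for n in labels]


# Illustrative labels only; the real values come from the external data.
renamed = rename_model_labels(["nn", "ols", "persi"], "nn", "MyModel")
missing = rename_model_labels(None, "nn", "MyModel")
```

With the guard in place, a station without external data yields None instead of raising the AttributeError shown in the traceback.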
  2. Replace load_model in the model class by load_weights:
class AbstractModelClass:
    ....
    def load_model(self, name: str, compile: bool = False):
        hist = self.model.history
-       self.model = keras.models.load_model(name)
+       self.model.load_weights(name)
        self.model.history = hist
        if compile is True:
            self.model.compile(**self.compile_options)
class PostProcessing:
    ....
-   def _load_model(self) -> keras.models:
+   def _load_model(self) -> AbstractModelClass:
        try:
            model = self.data_store.get("best_model")
        except NameNotFoundInDataStore:
            logging.info("No model was saved in data store. Try to load model from experiment path.")
            model_name = self.data_store.get("model_name", "model")
-           model_class: AbstractModelClass = self.data_store.get("model", "model")
-           model = keras.models.load_model(model_name, custom_objects=model_class.custom_objects)
+           model: AbstractModelClass = self.data_store.get("model", "model")
+           model.load_model(model_name)
        return model
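
The idea behind the second fix is that the architecture is already built from the model class, so only the weights need to be restored. It can be sketched without Keras. Everything below is a hypothetical stand-in: `TinyModel` and `ModelWrapper` are invented for illustration and are not the Keras or MLAir API, and the weights are stored as JSON purely to keep the example self-contained.

```python
import json
import os
import tempfile


class TinyModel:
    """Hypothetical stand-in for a built Keras model (not the Keras API)."""

    def __init__(self, n_weights: int):
        self.weights = [0.0] * n_weights  # architecture is fixed at build time
        self.history = None

    def save_weights(self, path: str) -> None:
        with open(path, "w") as f:
            json.dump(self.weights, f)

    def load_weights(self, path: str) -> None:
        with open(path) as f:
            loaded = json.load(f)
        if len(loaded) != len(self.weights):
            raise ValueError("stored weights do not match the built architecture")
        self.weights = loaded


class ModelWrapper:
    """Hypothetical analogue of AbstractModelClass: owns an already-built model."""

    def __init__(self, n_weights: int):
        self.model = TinyModel(n_weights)

    def load_model(self, name: str) -> None:
        hist = self.model.history      # keep the history across the reload
        self.model.load_weights(name)  # restore weights only, no rebuild
        self.model.history = hist


# Simulate one run that stores weights and a later run that skips training.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weights.json")
    trained = ModelWrapper(3)
    trained.model.weights = [0.1, 0.2, 0.3]
    trained.model.save_weights(path)

    restored = ModelWrapper(3)  # built from the class, as described in the issue
    restored.load_model(path)
```

Because the wrapper never deserializes a full model object, no custom_objects lookup is needed in this pattern. The weights file only has to match the architecture that the class builds.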