Fix bugs caused by model name refactoring and TensorFlow update
Bug
Error description
- During an HPC run, a variable expected to be an xarray object was actually None.
- There are some issues with model creation if training is skipped.
Error message
Error for 1)
2021-11-26 20:45:28,227 - INFO: calculate_error_metrics finished after 0:00:07 (hh:mm:ss) [time_tracking.py:__exit__:134]
2021-11-26 20:45:28,227 - ERROR: 'NoneType' object has no attribute 'coords' [run_environment.py:__exit__:137]
Traceback (most recent call last):
  File "/p/home/jusers/leufen1/hdfml/intelliaq/mlair_tf2/mlair/mlair/workflows/abstract_workflow.py", line 30, in run
    stage(**self._registry_kwargs[pos])
  File "/p/home/jusers/leufen1/hdfml/intelliaq/mlair_tf2/mlair/mlair/run_modules/post_processing.py", line 99, in __init__
    self._run()
  File "/p/home/jusers/leufen1/hdfml/intelliaq/mlair_tf2/mlair/mlair/run_modules/post_processing.py", line 129, in _run
    skill_score_competitive, _, skill_score_climatological, errors = self.calculate_error_metrics()
  File "/p/home/jusers/leufen1/hdfml/intelliaq/mlair_tf2/mlair/mlair/run_modules/post_processing.py", line 911, in calculate_error_metrics
    for n in external_data.coords[self.model_type_dim].values]
AttributeError: 'NoneType' object has no attribute 'coords'
- When loading the model, keras is sometimes not able to load it via keras.models.load_model.
First guess on error origin
- None type check is not performed properly
Error origin
In the following lines, the None check must happen before the renaming starts:
external_data = self._get_external_data(station, path)  # test data
external_data.coords[self.model_type_dim] = [{self.forecast_indicator: self.model_display_name}.get(n, n)
                                             for n in external_data.coords[self.model_type_dim].values]
# test errors
if external_data is not None:
    ....
- The origin is not entirely clear. To solve the issue, we could change the code from loading the model to loading only the weights. The model itself has already been built from the model class and does not need to be rebuilt from scratch.
Solution
- Move the second statement shown in the code piece inside the if statement to prevent unintentionally accessing .coords on None.
  external_data = self._get_external_data(station, path)  # test data
- external_data.coords[self.model_type_dim] = [{self.forecast_indicator: self.model_display_name}.get(n, n)
-                                              for n in external_data.coords[self.model_type_dim].values]
  # test errors
  if external_data is not None:
+     external_data.coords[self.model_type_dim] = [{self.forecast_indicator: self.model_display_name}.get(n, n)
+                                                  for n in external_data.coords[self.model_type_dim].values]
      ....
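The guard described above can be sketched in isolation as a small helper. This is only an illustration, not the MLAir code: `rename_model_type` and `make_fake_data` are hypothetical names, and the stand-in object mimics just the `.coords[dim].values` access used in the snippet, so the sketch runs without xarray.

```python
from types import SimpleNamespace
from typing import Any, Optional


def rename_model_type(external_data: Optional[Any], dim: str, old: str, new: str) -> Optional[Any]:
    """Rename label `old` to `new` along coordinate `dim`, skipping missing data."""
    if external_data is None:
        # No external data was found, so there is nothing to rename.
        return None
    current = external_data.coords[dim].values
    # Same dict-lookup trick as in the issue: unknown labels are kept as they are.
    external_data.coords[dim] = [{old: new}.get(n, n) for n in current]
    return external_data


def make_fake_data(labels):
    """Hypothetical stand-in exposing only `.coords[dim].values` (assumption, not xarray)."""
    return SimpleNamespace(coords={"type": SimpleNamespace(values=list(labels))})
```

Calling `rename_model_type(None, ...)` returns None instead of raising the AttributeError from the traceback, because the attribute access only happens after the check.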
- Replace load_model in the model class with load_weights.
  class AbstractModelClass:
      ....
      def load_model(self, name: str, compile: bool = False):
          hist = self.model.history
-         self.model = keras.models.load_model(name)
+         self.model.load_weights(name)
          self.model.history = hist
          if compile is True:
              self.model.compile(**self.compile_options)

  class PostProcessing:
      ....
-     def _load_model(self) -> keras.models:
+     def _load_model(self) -> AbstractModelClass:
          try:
              model = self.data_store.get("best_model")
          except NameNotFoundInDataStore:
              logging.info("No model was saved in data store. Try to load model from experiment path.")
              model_name = self.data_store.get("model_name", "model")
-             model_class: AbstractModelClass = self.data_store.get("model", "model")
-             model = keras.models.load_model(model_name, custom_objects=model_class.custom_objects)
+             model: AbstractModelClass = self.data_store.get("model", "model")
+             model.load_model(model_name)
          return model
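The idea behind the second fix (build the architecture from the model class, then restore only the parameters) can be illustrated without keras. The following is a rough sketch under stated assumptions: `TinyModel`, `ModelWrapper`, and `demo` are hypothetical stand-ins, the "architecture" is reduced to a weight count, and JSON replaces the real weight file format.

```python
import json
import os
import tempfile


class TinyModel:
    """Hypothetical stand-in for an already built keras model."""

    def __init__(self, n_weights: int):
        # The "architecture" here is just the number of weights.
        self.weights = [0.0] * n_weights
        self.history = None

    def save_weights(self, path: str) -> None:
        with open(path, "w") as f:
            json.dump(self.weights, f)

    def load_weights(self, path: str) -> None:
        with open(path) as f:
            loaded = json.load(f)
        if len(loaded) != len(self.weights):
            raise ValueError("stored weights do not match the built architecture")
        self.weights = loaded


class ModelWrapper:
    """Hypothetical stand-in for a model class that builds its own architecture."""

    def __init__(self, n_weights: int = 3):
        self.model = TinyModel(n_weights)

    def load_model(self, name: str) -> None:
        hist = self.model.history      # keep the existing history
        self.model.load_weights(name)  # restore parameters only, no rebuild
        self.model.history = hist


def demo() -> ModelWrapper:
    """Save weights from one wrapper and load them into a freshly built one."""
    trained = ModelWrapper()
    trained.model.weights = [0.1, 0.2, 0.3]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "weights.json")
        trained.model.save_weights(path)
        fresh = ModelWrapper()
        fresh.model.history = "history placeholder"
        fresh.load_model(path)
    return fresh
```

Because the wrapper never replaces `self.model`, nothing has to be deserialized from the file except the parameters, which is why no custom objects are needed in this pattern. The trade-off is that the stored weights must match the architecture that the class builds.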