Option for monthly splitting of input data
The new splitting approach, in which the monthly pickle files are converted directly to tfrecords files as part of a specific sub-dataset (i.e. the training, validation or test dataset), only allows splitting on a yearly basis.
E.g. something like
partition = {"train": {"2017": [1,2,3,4,5,6,7,8,9,10]}, "val": {"2017": [11]}, "test": {"2017": [12]}} in era5_dataset_v2.py is not feasible so far and would result in the whole 2017 data ending up in all subsets.
One possible approach is to loop over the partitions and invoke PyStager for each partition separately, although this may leave nodes idle for small sub-datasets such as the validation or test dataset (only one month each in the example above).
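The loop over partitions could be sketched roughly as follows. This is only an illustration under assumptions: `months_per_subset`, `assert_disjoint`, `stage_all` and the `stage_subset` callback are hypothetical names, and the callback merely stands in for whatever a per-subset PyStager invocation would look like. The actual PyStager interface is not shown here.

```python
from typing import Callable, Dict, List, Tuple

# Layout taken from the example above: subset -> year (as string) -> list of months.
Partition = Dict[str, Dict[str, List[int]]]
YearMonth = Tuple[int, int]


def months_per_subset(partition: Partition) -> Dict[str, List[YearMonth]]:
    """Flatten the partition into sorted (year, month) pairs per subset."""
    flat: Dict[str, List[YearMonth]] = {}
    for subset, years in partition.items():
        pairs: List[YearMonth] = []
        for year, months in years.items():
            for month in months:
                if not 1 <= month <= 12:
                    raise ValueError(f"invalid month {month} in subset '{subset}'")
                pairs.append((int(year), month))
        flat[subset] = sorted(pairs)
    return flat


def assert_disjoint(flat: Dict[str, List[YearMonth]]) -> None:
    """Raise if the same (year, month) is assigned to more than one subset."""
    seen: Dict[YearMonth, str] = {}
    for subset, pairs in flat.items():
        for pair in pairs:
            if pair in seen:
                raise ValueError(
                    f"{pair} is assigned to both '{seen[pair]}' and '{subset}'"
                )
            seen[pair] = subset


def stage_all(
    partition: Partition,
    stage_subset: Callable[[str, List[YearMonth]], None],
) -> None:
    """Call stage_subset once per subset with only that subset's months.

    stage_subset is a hypothetical placeholder for the per-subset staging step.
    """
    flat = months_per_subset(partition)
    assert_disjoint(flat)
    for subset, pairs in flat.items():
        stage_subset(subset, pairs)


partition = {
    "train": {"2017": list(range(1, 11))},
    "val": {"2017": [11]},
    "test": {"2017": [12]},
}

# Stand-in callback that only records how many months each subset would receive.
calls = []
stage_all(partition, lambda subset, pairs: calls.append((subset, len(pairs))))
print(calls)
```

With the example partition, the callback is invoked three times, and the calls for the validation and test subsets each receive a single month. That is where the idle-node concern would arise if every call were allotted the same number of nodes.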
Edited by Michael Langguth