implement lazy data preprocessing
lazy data loading
if possible, load the data lazily on first access only
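The "load on first access" idea could be sketched like this (a sketch, not the project's actual class; the `LazyData` name and `loader` callback are assumptions):

```python
class LazyData:
    """Defer an expensive load until the data is first accessed."""

    def __init__(self, loader):
        self._loader = loader  # callable performing the expensive load
        self._data = None      # nothing is loaded yet

    @property
    def data(self):
        # the expensive load runs exactly once, on first access
        if self._data is None:
            self._data = self._loader()
        return self._data
```

Repeated accesses after the first one return the cached result without calling the loader again.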
store the data locally in the data path under a separate folder, e.g.
create a checksum from the name and always reuse the stored data if the checksum matches (this will replace all previous steps and save a lot of computation time)
if this works, this can be used for all subsets because the data was preloaded as "preprocessing". For all subsets it would be sufficient to use lazy loading
check out how to create a checksum
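One straightforward option (a sketch, assuming the relevant settings can be rendered as strings; `hashlib` is Python stdlib):

```python
import hashlib

def create_checksum(*args, **kwargs):
    """Build a reproducible hex checksum from arbitrary settings.

    All positional and keyword arguments are rendered into one string
    (keyword arguments sorted by key, so call order does not matter)
    and hashed with MD5.
    """
    parts = [repr(a) for a in args]
    parts += [f"{k}={kwargs[k]!r}" for k in sorted(kwargs)]
    return hashlib.md5("|".join(parts).encode()).hexdigest()
```

Identical settings then always map to the same checksum, which can double as the file name of the stored data.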
target_data (data already loaded, interpolated, KZF applied; only creating history, labels, ... still has to be performed); additional attributes are stored for the
add a parameter lazy_preprocessing (default False) to trigger lazy preprocessing
compare the checksum and try to load the data
continue with the missing steps
there must be a check regarding variables and start/end point. Data must be reloaded if the start date is earlier than what is available in the stored data (maybe there could be a check for the case that there is no data at the starting point, which would trigger an unintended re-preprocessing of the data) -> NO check for start and end. We assume that the data are first used with the total time range.
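The load-or-recompute flow from the steps above could look like this (a sketch; the folder layout, pickle storage, and the `preprocess`/`finish` callback names are assumptions, not the project's actual API):

```python
import os
import pickle

def lazy_load_or_preprocess(checksum, data_path, preprocess, finish):
    """Try to load preprocessed data by checksum, otherwise compute it.

    preprocess: expensive part (load, interpolate, apply KZF)
    finish:     cheap remaining steps (create history, labels, ...)
    """
    folder = os.path.join(data_path, "lazy_data")
    os.makedirs(folder, exist_ok=True)
    file_path = os.path.join(folder, f"{checksum}.pickle")
    if os.path.exists(file_path):
        # checksum matches a stored file -> skip expensive preprocessing
        with open(file_path, "rb") as f:
            data = pickle.load(f)
    else:
        data = preprocess()
        with open(file_path, "wb") as f:
            pickle.dump(data, f)
    # the cheap remaining steps always run, even on a cache hit
    return finish(data)
```

Note this sketch performs no start/end check, matching the decision above that the data are first used with the total time range.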
Check these links out:
These links describe how a class can be stored
It seems that it is not possible to create a checksum for a class directly. Maybe there is a way to create a string that summarises all essential properties of a class and compute a checksum from that string?
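That workaround could be sketched as follows (a sketch; the `DataHandler` name, the attribute collection via `vars()`, and the exclusion list are assumptions): render the instance's essential attributes into a sorted string and hash that string instead of the class itself.

```python
import hashlib

class DataHandler:
    def __init__(self, station, variables, start, end):
        self.station = station
        self.variables = variables
        self.start = start
        self.end = end
        self._cache = None  # volatile state, must not affect the checksum

    def checksum(self, exclude=("_cache",)):
        """Hash a string that summarises all essential attributes."""
        items = sorted(
            (k, repr(v)) for k, v in vars(self).items() if k not in exclude
        )
        summary = "|".join(f"{k}={v}" for k, v in items)
        return hashlib.md5(summary.encode()).hexdigest()

h1 = DataHandler("station01", ["o3"], "2010-01-01", "2015-12-31")
h2 = DataHandler("station01", ["o3"], "2010-01-01", "2015-12-31")
assert h1.checksum() == h2.checksum()  # same properties -> same checksum
```

Sorting the attribute items makes the checksum independent of attribute insertion order; excluding volatile fields like caches keeps it stable across runs.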