load ifs data
Implement a data loader that can load locally stored IFS forecast data. Should look similar to existing era5 data loader.
- be able to load IFS data
- trigger IFS loading by using `ifs` in `data_origin`
Design choices
- IFS data contain two temporal axes: init time (every 12 hours), valid time (hourly)
- for now: create a single time series consisting of the closest combination of init and valid time (use only t0+0h to t0+11h of each init time).
- for future: think about information for ti>t0 of each sample. It may be better to use the init time of t0 and all of its forecast steps for the future time steps. This breaks the single time series approach, but it is similar to the filter approach and prevents data leakage. Maybe this should be part of another issue.
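The two selection rules above can be sketched in plain Python. This is only an illustration, not MLAir code: the function names, the dict-based forecast layout, and the assumption that init times fall on 00 and 12 UTC are all hypothetical, and the delay until a forecast becomes available is ignored.

```python
from datetime import datetime, timedelta

INIT_INTERVAL_H = 12  # init times every 12 hours (from the issue text)


def latest_init(t):
    """Floor a timestamp to the most recent init time (assumed 00 or 12 UTC)."""
    hour = (t.hour // INIT_INTERVAL_H) * INIT_INTERVAL_H
    return t.replace(hour=hour, minute=0, second=0, microsecond=0)


def single_series(forecasts):
    """Build one hourly series from the first 12 lead hours of each init time.

    `forecasts` is a hypothetical layout: dict of init time -> list of values,
    where the list index is the lead time in hours.
    """
    series = {}
    for init in sorted(forecasts):
        for lead, value in enumerate(forecasts[init][:INIT_INTERVAL_H]):
            series[init + timedelta(hours=lead)] = value
    return series


def sample_sources(t0, n_history, n_future):
    """Return (ti, init, lead) for each step of a sample around t0.

    ti < t0: closest combination of init and valid time.
    ti >= t0: forecast steps of the init time that belongs to t0.
    """
    init_t0 = latest_init(t0)
    sources = []
    for offset in range(-n_history, n_future + 1):
        ti = t0 + timedelta(hours=offset)
        init = latest_init(ti) if offset < 0 else init_t0
        lead = int((ti - init).total_seconds() // 3600)
        sources.append((ti, init, lead))
    return sources


# Two init times with 24 forecast steps each (dummy values).
init_a = datetime(2020, 1, 1, 0)
init_b = datetime(2020, 1, 1, 12)
forecasts = {init_a: list(range(24)), init_b: list(range(100, 124))}
series = single_series(forecasts)

# Sample with t0 = 10:00, two past steps and three future steps.
sources = sample_sources(datetime(2020, 1, 1, 10), n_history=2, n_future=3)
```

In the example, the valid time 12:00 in `series` comes from the 12:00 init (lead 0), whereas in `sources` the steps at 12:00 and 13:00 stay on the 00:00 init (leads 12 and 13), so no information from a forecast issued after t0 enters the sample.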
Discussions
Usage of operational forecast data poses some problems with the current setup of MLAir. Until now, raw time series always contained a single time dimension, which is then transformed into two during sample setup. With IFS data, this second dimension already exists as the NWP model's lead time, so none of the methods implemented so far (interpolation, filter, ...) can handle this data.
Changing the general behaviour, e.g. always adding window dimension 0, is a huge refactoring step.
Designing a new data handler just for IFS data harms compatibility with all other data handlers and produces a lot of almost duplicated code.
Changing the filter calculation methods might therefore be the simplest solution.
TODO
- refac: expand dims in all data loaders (era5_local, join) by the dimension `window` so that data can be merged with IFS data. The window dimension has the single entry `0`. If the final dataframe has only the `0` entry in this dimension, remove the dimension again. If no IFS data are loaded, the returned data are the same as before!
- implement/adjust: the ClimateFIR filter should be able to use data with two time dimensions as input. Since data are already structured with two time dimensions after the first filter iteration, it should be possible to use such data from the beginning.
- new data handler for IFS data? Skip interpolation (or apply it later, after the data are restructured?), create time series data for each init time (ti<t0: closest combination of init and valid time, ti>=t0: most recent forecast, be aware of running time 01 and 13 local time), maybe interpolate now, calculate filter.
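The first TODO item (expand by a window dimension, merge, drop the dimension again if only entry 0 remains) could look roughly like the NumPy sketch below. The helper names, the axis order (time, variable, window), and padding missing window entries with NaN are assumptions made for illustration; the actual loaders work on labelled data and may merge differently.

```python
import numpy as np


def add_window_dim(data):
    """Append a trailing window axis of length 1 (the single entry 0)."""
    return np.expand_dims(data, axis=-1)


def drop_trivial_window_dim(data):
    """Remove the window axis again if it only holds the single entry 0."""
    return np.squeeze(data, axis=-1) if data.shape[-1] == 1 else data


def merge_on_variables(a, b):
    """Concatenate along the variable axis, NaN-padding the shorter window axis."""
    n_window = max(a.shape[-1], b.shape[-1])

    def pad(x):
        missing = n_window - x.shape[-1]
        pad_width = [(0, 0)] * (x.ndim - 1) + [(0, missing)]
        return np.pad(x, pad_width, constant_values=np.nan)

    return np.concatenate([pad(a), pad(b)], axis=1)


era5 = np.arange(6.0).reshape(3, 2)  # (time, variable), no lead time
ifs = np.ones((3, 2, 4))             # (time, variable, window) with 4 lead times

# With IFS data: the window axis survives the merge.
merged = drop_trivial_window_dim(merge_on_variables(add_window_dim(era5), ifs))

# Without IFS data: the round trip returns the data unchanged.
unchanged = drop_trivial_window_dim(add_window_dim(era5))
```

Here `merged` has shape (3, 4, 4), with the ERA5 variables filled only at window entry 0, while `unchanged` is identical to `era5`, matching the requirement that nothing changes when no IFS data are loaded.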