# Changelog

All notable changes to this project will be documented in this file.

## v1.3.0 - 2021-02-24 - competitors and improved transformation

### general:

* release of official MLAir logo (#274)
* new transformation schema for better independence of MLAir and data handler (#272)
* competing models can be included in postprocessing for direct comparison (#198)

### new features:

* new helper functions for geographic issues (#280)
* default data handler and inheritances can use min/max and log transformation (#276, #275)
* include IntelliO3-ts model as reference via automatic download (#131)

### technical:

* experiment name now always includes target sampling type (#263)
* competitive skill score plot is refactored (#260)
* bug fix for climatological skill scores (#259)
* bug fix for custom objects handling (#277)
* bug fix for monitoring plots when multiple output branches are used (#278)
* updated requirements and dependencies to newer versions (#262, #273)
* HPC scripts are updated to work properly with parallel data processing (#281)

## v1.2.1 - 2021-02-08 - bug fix for recursive import error

### general:

* applied bug fix

### technical:

* bug fix for recursive import error (#269)

## v1.2.0 - 2020-12-18 - parallel preprocessing and improved data handlers

### general:

* new plots
* parallelism for faster preprocessing
* improved data handler with mixed sampling types
* enhanced test coverage

### new features:

* station map plot now highlights subsets on the map and displays the number of stations for each subset (#227, #231)
* two new data availability plots `PlotAvailabilityHistogram` (#191, #192, #223)
* introduced parallel code in preprocessing if the system supports parallelism (#164, #224, #225)
* data handler `DataHandlerMixedSampling` (and inheritances) supports an offset parameter to end inputs at a different time than 00 hours (#220)
* args for data handler `DataHandlerMixedSampling` (and inheritances) that differ for input and target can now be parsed as a tuple (#229)

### technical:

* added templates for release and bug issues (#189)
* improved test coverage (#236, #238, #239, #240, #241, #242, #243, #244, #245)
* station map plot now includes the number of stations for each subset (#231)
* postprocessing plots are encapsulated in try except statements (#107)
* updated git settings (#213)
* bug fix for data handler (#235)
* reordering and bug fix for preprocessing reporting (#207, #232)
* bug fix for outdated system path style (#226)
* new plots are included in default plot list (#211)
* `helpers/join` connection to ToarDB (e.g. used by DefaultDataHandler) now reports which variable could not be loaded (#222)
* plot `PlotBootstrapSkillScore` can now additionally highlight specific variables, but this is not yet included in postprocessing (#201)
* data handler `DataHandlerMixedSampling` now has reduced data loading (#221)

## v1.1.0 - 2020-11-18 - hourly resolution support and new data handlers

### general:

* MLAir can be used with 1H resolution data from JOIN
* new data handlers to use the Kolmogorov-Zurbenko filter and mixed sampling types

### new features:

* new data handler `DataHandlerKzFilter` to use Kolmogorov-Zurbenko filter (kz filter) on inputs (#195)
* new data handler `DataHandlerMixedSampling` that can use mixed sampling types for input and target (#197)
* new data handler `DataHandlerMixedSamplingWithFilter` that uses kz filter and mixed sampling (#197)
* new data handler `DataHandlerSeparationOfScales` that uses filter-dependent time step sizes on filtered inputs with mixed sampling (#196)

### technical:

* bug fix for very short time series in TimeSeriesPlot (#215)
* bug fix for variable dictionary when using hourly resolution (#212)
* variable naming for data from JOIN interface harmonised (#206)
* transformation setup is now separated for inputs and targets (#202)
* bug fix in PlotClimatologicalSkillScore if only a single station is used (#193)
* preprocessed data is now stored inside experiment and not in the data folder

## v1.0.0 - 2020-10-08 - official release of new version 1.0.0

### general:

- This is the first official release of MLAir ready for use
- updated license and installation instructions

### technical:

- restructured order of packages in requirements

## v0.12.2 - 2020-10-01 - HDFML support

### general:

- HDFML support

### technical:

- installation script for HDFML adjusted, #183

## v0.12.1 - 2020-09-28 - examples in notebook

### general:

- introduced a notebook documentation for easy starting, #174
- updated special installation instructions for the Juelich HPC systems, #172

### new features:

- input and output shape names are consistently renamed to `input_shape` and `output_shape`, #175

### technical:

- it is possible to assign a custom name to a run module (e.g. used in logging), #173

## v0.12.0 - 2020-09-21 - Documentation and Bugfixes

### general:

- improved documentation includes installation instructions and many examples from the paper, #153
- bugfixes (see technical)

### new features:

- `MyLittleModel` is now a pure feed-forward network (before it had a CNN part), #168

### technical:

- new compile options check to ensure its execution, #154
- bugfix for key errors in time series plot, #169
- bugfix for not used kwargs in `DefaultDataHandler`, #170
- `trainable` parameter is renamed to `train_model` to prevent confusion with the tf trainable parameter, #162
- fixed HPC installation failure, #159

## v0.11.0 - 2020-08-24 - Advanced Data Handling for MLAir

### general

- Introduce advanced data handling with much more flexibility (independent of TOAR DB, custom data handling is pluggable), #144
- default data handler is still using TOAR DB

### new features

- default data handler using TOAR DB refactored according to advanced data handling, #140, #141, #152
- data sets are handled as collections, #142, and are iterable in a standard way (StandardIterator) and optimised for keras (KerasIterator), #143
- automatically moving station map plot, #136

### technical

- model modules available from package, #139
- renaming of parameter time dimension, #151
- refactoring of README.md, #138

## v0.10.0 - 2020-07-15 - MLAir is official name, Workflows, easy Model plug-in

### general

- Official project name is released: MLAir (Machine Learning on Air data)
- a model class can now easily be plugged into MLAir, #121
- introduced new concept of workflows, #134

### new features

- workflows are used to execute a sequence of run modules, #134
- default workflows for standard and the Juelich HPC systems are available, custom workflows can be defined, #134
- seasonal decomposition is available for conditional quantile plot, #112
- map plot is created with coordinates, #108
- `flatten_tails` are now more general and easier to customise, #114
- model classes have custom compile options (replaces `set_loss`), #110
- model can be set in ExperimentSetup from outside, #121
- default experiment settings can be queried using `get_defaults()`, #123
- training and model settings are reported as MarkDown and Tex tables, #145

### technical

- Juelich HPC systems are supported and installation scripts are available, #106
- data store is tracked, I/O is saved and illustrated in a plot, #116
- batch size, epoch parameter have to be defined in ExperimentSetup, #127, #122
- automatic documentation with sphinx, #109
- default experiment settings are updated, #123
- refactoring of experiment path and its default naming, #124
- refactoring of some parameter names, #146
- preparation for package distribution with pip, #119
- all run scripts are updated to run with workflows, #134
- the experiment folder is restructured, #130

## v0.9.0 - 2020-04-15 - faster bootstraps, extreme value upsampling

### general

- improved and faster bootstrap workflow
- new plot PlotAvailability
- extreme values upsampling
- improved runtime environment

### new features

- entire bootstrap workflow has been refactored and is much faster now, can be skipped with `evaluate_bootstraps=False`, #60
- upsampling of extreme values, set with parameter `extreme_values=[your_values_standardised]` (e.g. `[1, 2]`) and `extremes_on_right_tail_only=<True/False>` if only right tail of distribution is affected or both, #58, #87
- minimal data length property (in total and for all subsets), #76
- custom objects in model class to load customised model objects like padding class, loss, #72
- new plot for data availability: `PlotAvailability`, #103
- introduced (default) `plot_list` to specify which plots to draw
- latex and markdown information on sample sizes for each station, #90

### technical

- implemented tests on gpu and from scratch for develop, release and master branches, #95
- usage of tensorflow 1.13.1 (gpu / cpu), separated in 2 different requirements, #81
- new abstract plot class to have uniform plot class design
- New time tracking wrapper to use for functions or classes
- improved logger (info on display, debug into file), #73, #85, #88
- improved run environment, especially for error handling, #86
- prefix `general` in data store scope is now optional and can be skipped. If given scope is not `general`, it is treated as subscope, #82
- all 2D Padding classes are now selected by `Padding2D(padding_name=<padding_type>)` e.g. `Padding2D(padding_name="SymPad2D")`, #78
- custom learning rate (or lr_decay) is optional now, #71