To find the state of this project's repository at the time of any of these versions, check out the tags.
Changelog
All notable changes to this project will be documented in this file.
v1.5.0 - 2021-11-11 -
general:
- introduces method to estimate sample uncertainty
- improved multiprocessing
- last release with tensorflow v1 support
new features:
- test set sample uncertainty estimation during postprocessing (#333 (closed))
- support of Kolmogorov Zurbenko filter for data handlers with filters (#334 (closed))
technical:
- new communication scheme for multiprocessing (#321 (closed), #322 (closed))
- improved error reporting (#323 (closed))
- feature importance now returns unaggregated results (#335 (closed))
- error metrics are reported for all competitors (#332 (closed))
- minor bugfixes and refactorings (#330 (closed), #326 (closed), #329 (closed), #325 (closed), #324 (closed), #320 (closed), #337 (closed))
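For readers unfamiliar with the Kolmogorov-Zurbenko filter mentioned under new features: it is commonly defined as a moving average applied several times in succession. The following is a minimal pure-Python sketch of that idea, not MLAir's implementation. The function names `moving_average` and `kz_filter` and the edge handling (shrinking the window at the borders) are assumptions made for illustration.

```python
def moving_average(values, window):
    """Centered moving average; near the edges only the part of the
    window that fits inside the series is averaged (an assumption)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def kz_filter(values, window, iterations):
    """Apply the moving average `iterations` times in a row."""
    result = list(values)
    for _ in range(iterations):
        result = moving_average(result, window)
    return result
```

A larger `window` or more `iterations` gives a smoother result; with `iterations=0` the series is returned unchanged.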
v1.4.0 - 2021-07-27 - new model classes and data handlers, improved usability and transparency
general:
- many technical adjustments to improve usability and transparency of MLAir
- new FCN and CNN classes for easy NN model creation
- new plots
new features:
- new FCN class that can be customized in many ways (#284 (closed))
- also new CNN class (#289 (closed))
- added new bootstrap analysis method: mean bootstrapping (#300 (closed))
- new data handler using FIR filters (#306 (closed))
- performance measures are now stored in local files (#286 (closed))
- histogram plots for inputs and targets (#299 (closed))
- periodogram plots for filtered data (#298 (closed))
technical:
- a calling run script can be stored inside the experiment folder if a reference to this script is passed as an argument (#99 (closed))
- new callback to track epoch-runtime (#312 (closed))
- added switch to use multiprocessing (#297 (closed))
- customize maximum number of parallel processes (#308 (closed))
- support non-monotonic window lead times (#313 (closed))
- resolved bug with FileExistsError (#311 (closed))
- resolved bug if no chemical is used at all (#307 (closed))
- min/max scaler now scales between -1 and 1 (#302 (closed))
- added missing offset parameter to some data handlers (#305 (closed))
- improved data store logging (#304 (closed))
- improved logging message on station removal in preprocessing (#294 (closed))
- limited number of retries in JOIN module (#296 (closed))
- adjusted competing skill score plot (#301 (closed))
- transformation parameter check (#295 (closed))
- implemented lazy data preprocessing for selected data handlers (#292 (closed))
- fix bug in separation of scales data handler (#290 (closed))
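The min/max scaler change listed above (#302) means scaled values now lie between -1 and 1. As a rough illustration of that mapping, and not MLAir's own code, here is a small sketch; the function name `min_max_scale` and its `feature_range` parameter are hypothetical.

```python
def min_max_scale(values, feature_range=(-1.0, 1.0)):
    """Linearly map `values` so that min -> lower bound and max -> upper bound."""
    lower, upper = feature_range
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant series has no spread to scale.
        raise ValueError("cannot scale constant data")
    span = v_max - v_min
    return [lower + (v - v_min) * (upper - lower) / span for v in values]
```

With the default range, the smallest value maps to -1, the largest to 1, and the midpoint of the data range to 0.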
v1.3.0 - 2021-02-24 - competitors and improved transformation
general:
- release of official MLAir logo (#274 (closed))
- new transformation schema for better independence of MLAir and data handler (#272 (closed))
- competing models can be included in postprocessing for direct comparison (#198 (closed))
new features:
- new helper functions for geographic issues (#280 (closed))
- default data handler and inheritances can use min/max and log transformation (#276 (closed), #275 (closed))
- include IntelliO3-ts model as reference via automatic download (#131 (closed))
technical:
- experiment name now always includes target sampling type (#263 (closed))
- competitive skill score plot is refactored (#260 (closed))
- bug fix for climatological skill scores (#259 (closed))
- bug fix for custom objects handling (#277 (closed))
- bug fix for monitoring plots when multiple output branches are used (#278 (closed))
- update requirements to newer version and dependencies (#262 (closed), #273 (closed))
- HPC scripts are updated to work properly with parallel data processing (#281 (closed))
v1.2.1 - 2021-02-08 - bug fix for recursive import error
general:
- applied bug fix
technical:
- bug fix for recursive import error (#269 (closed))
v1.2.0 - 2020-12-18 - parallel preprocessing and improved data handlers
general:
- new plots
- parallelism for faster preprocessing
- improved data handler with mixed sampling types
- enhanced test coverage
new features:
- station map plot now highlights subsets on the map and displays the number of stations for each subset (#227 (closed), #231 (closed))
- two new data availability plots `PlotAvailabilityHistogram` (#191 (closed), #192 (closed), #223 (closed))
- introduced parallel code in preprocessing if system supports parallelism (#164 (closed), #224 (closed), #225 (closed))
- data handler `DataHandlerMixedSampling` (and inheritances) supports an offset parameter to end inputs at a different time than 00 hours (#220 (closed))
- args for data handler `DataHandlerMixedSampling` (and inheritances) that differ for input and target can now be passed as a tuple (#229 (closed))
technical:
- added templates for release and bug issues (#189 (closed))
- improved test coverage (#236 (closed), #238 (closed), #239 (closed), #240 (closed), #241 (closed), #242 (closed), #243 (closed), #244 (closed), #245 (closed))
- station map plot now includes the number of stations for each subset (#231 (closed))
- postprocessing plots are encapsulated in try except statements (#107 (closed))
- updated git settings (#213 (closed))
- bug fix for data handler (#235 (closed))
- reordering and bug fix for preprocessing reporting (#207 (closed), #232 (closed))
- bug fix for outdated system path style (#226 (closed))
- new plots are included in default plot list (#211 (closed))
- `helpers/join` connection to ToarDB (e.g. used by DefaultDataHandler) now reports which variable could not be loaded (#222 (closed))
- plot `PlotBootstrapSkillScore` can now additionally highlight specific variables, but this is not yet included in postprocessing (#201 (closed))
- data handler `DataHandlerMixedSampling` now has reduced data loading (#221 (closed))
v1.1.0 - 2020-11-18 - hourly resolution support and new data handlers
general:
- MLAir can be used with 1H resolution data from JOIN
- new data handlers to use the Kolmogorov-Zurbenko filter and mixed sampling types
new features:
- new data handler `DataHandlerKzFilter` to use Kolmogorov-Zurbenko filter (kz filter) on inputs (#195 (closed))
- new data handler `DataHandlerMixedSampling` that can use mixed sampling types for input and target (#197 (closed))
- new data handler `DataHandlerMixedSamplingWithFilter` that uses kz filter and mixed sampling (#197 (closed))
- new data handler `DataHandlerSeparationOfScales` to use filter-dependent time step sizes on filtered inputs using mixed sampling (#196 (closed))
technical:
- bug fix for very short time series in TimeSeriesPlot (#215 (closed))
- bug fix for variable dictionary when using hourly resolution (#212 (closed))
- variable naming for data from JOIN interface harmonised (#206 (closed))
- transformation setup is now separated for inputs and targets (#202 (closed))
- bug fix in PlotClimatologicalSkillScore if only single station is used (#193 (closed))
- preprocessed data is now stored inside experiment and not in the data folder
v1.0.0 - 2020-10-08 - official release of new version 1.0.0
general:
- This is the first official release of MLAir ready for use
- updated license, installation instruction
technical:
- restructured order of packages in requirements
v0.12.2 - 2020-10-01 - HDFML support
general:
- HDFML support
technical:
- installation script for HDFML adjusted, #183 (closed)
v0.12.1 - 2020-09-28 - examples in notebook
general:
- introduced a notebook documentation for easy starting, #174 (closed)
- updated special installation instructions for the Juelich HPC systems, #172 (closed)
new features:
- names of input and output shape are renamed consistently to input_shape and output_shape, #175 (closed)
technical:
- it is possible to assign a custom name to a run module (e.g. used in logging), #173 (closed)
v0.12.0 - 2020-09-21 - Documentation and Bugfixes
general:
- improved documentation, including installation instructions and many examples from the paper, #153 (closed)
- bugfixes (see technical)
new features:
- `MyLittleModel` is now a pure feed-forward network (before it had a CNN part), #168 (closed)
technical:
- new compile options check to ensure its execution, #154 (closed)
- bugfix for key errors in time series plot, #169 (closed)
- bugfix for unused kwargs in `DefaultDataHandler`, #170 (closed)
- `trainable` parameter is renamed to `train_model` to prevent confusion with the tf trainable parameter, #162 (closed)
- fixed HPC installation failure, #159 (closed)
v0.11.0 - 2020-08-24 - Advanced Data Handling for MLAir
general
- Introduce advanced data handling with much more flexibility (independent of TOAR DB, custom data handling is pluggable), #144 (closed)
- default data handler is still using TOAR DB
new features
- default data handler using TOAR DB refactored according to advanced data handling, #140 (closed), #141 (closed), #152 (closed)
- data sets are handled as collections, #142 (closed), and are iterable in a standard way (StandardIterator) and optimised for keras (KerasIterator), #143 (closed)
- automatically moving station map plot, #136 (closed)
technical
- model modules available from package, #139 (closed)
- renaming of parameter time dimension, #151 (closed)
- refactoring of README.md, #138 (closed)
v0.10.0 - 2020-07-15 - MLAir is official name, Workflows, easy Model plug-in
general
- Official project name is released: MLAir (Machine Learning on Air data)
- a model class can now easily be plugged into MLAir, #121 (closed)
- introduced new concept of workflows, #134 (closed)
new features
- workflows are used to execute a sequence of run modules, #134 (closed)
- default workflows for standard and the Juelich HPC systems are available, custom workflows can be defined, #134 (closed)
- seasonal decomposition is available for conditional quantile plot, #112 (closed)
- map plot is created with coordinates, #108 (closed)
- `flatten_tails` are now more general and easier to customise, #114 (closed)
- model classes have custom compile options (replaces `set_loss`), #110 (closed)
- model can be set in ExperimentSetup from outside, #121 (closed)
- default experiment settings can be queried using `get_defaults()`, #123 (closed)
- training and model settings are reported as MarkDown and Tex tables, #145 (closed)
technical
- Juelich HPC systems are supported and installation scripts are available, #106 (closed)
- data store is tracked, I/O is saved and illustrated in a plot, #116 (closed)
- batch size, epoch parameter have to be defined in ExperimentSetup, #127 (closed), #122 (closed)
- automatic documentation with sphinx, #109 (closed)
- default experiment settings are updated, #123 (closed)
- refactoring of experiment path and its default naming, #124 (closed)
- refactoring of some parameter names, #146 (closed)
- preparation for package distribution with pip, #119 (closed)
- all run scripts are updated to run with workflows, #134 (closed)
- the experiment folder is restructured, #130 (closed)