    Sabine Schröder authored
    Max version 0.6.3
    
    See merge request !15
    28b69523

    toarstats

    This repository contains a collection of statistical tools for the analysis of time series data. It is split into two subpackages:

    • metrics: collection of statistics and metrics to calculate on hourly time series data (some specific to ozone measurements)
    • trends: calculate trends on time series data using ordinary least squares (OLS) or quantile regression

    Installation

    To install a specific version of the package together with all of its dependencies, run the following command from within the dist folder of this repository:

    python3 -m pip install toarstats-<version>-py3-none-any.whl

    It is advised to set up a virtual environment beforehand.

    metrics

    This subpackage contains a collection of statistics that can be calculated on hourly data. The statistics in the ozone_metrics.py file are specific to ozone data. The statistics in the stats.py file can be calculated for other variables as well.

    Import

    To use the subpackage, import calculate_statistics in one of the following ways:

    from toarstats.metrics import calculate_statistics # or
    from toarstats.metrics import * # or
    import toarstats.metrics

    Interface

    The calculate_statistics interface is defined like this:

    calculate_statistics(
        sampling=None, statistics=None, data=None, metadata=None, seasons=None,
        crops=None, min_data_capture=None, datetimes=None, values=None,
        station_lat=None, station_lon=None, station_climatic_zone=None
    )
        """Calculate the requested statistics.
    
        This function is the public interface for the ``toarstats`` package.
        It takes all the user inputs and returns the result of all requested
        statistics and metrics.
    
        :param sampling: temporal aggregation, one of ``daily``,
                         ``monthly``, ``seasonal``, ``vegseason``,
                         ``summer``, ``xsummer``, ``annual``, or ``custom``;
                         ``summer`` will pick the 6-month summer season in
                         the hemisphere where the station is located;
                         ``xsummer`` does the same for a 7-month summer
                         season;
                         ``vegseason`` also requires the ``crops`` argument
                         and will then determine the appropriate growing
                         seasons based on the ``station_climatic_zone``
                         metadata and crop type;
                         ``custom`` will create one aggregate value over the
                         entire time series
        :param statistics: a single statistic or metric or a list of
                           statistics and metrics to call, these must be
                           defined in ``stats.py`` or ``ozone_metrics.py``
        :param data: data containing a list of date time values and
                     associated parameter values on which to calculate the
                     statistics;
                     if not given, both ``datetimes`` and ``values`` must be
                     given instead
        :param metadata: metadata information about the station's latitude,
                         longitude and climatic zone (keys: ``station_lat``,
                         ``station_lon`` and ``station_climatic_zone``);
                         if not given and any requested statistic or metric
                         needs metadata information, ``station_lat``,
                         ``station_lon`` and ``station_climatic_zone`` must
                         be given instead
        :param seasons: a list of season names for seasonal statistics;
                        for a definition of seasons, see ``stats_utils.py``;
                        if ``None`` is passed, seasonal statistics will be
                        computed for the default seasons of the respective
                        metrics, normally, these are the four meteorological
                        seasons ``DJF``, ``MAM``, ``JJA`` and ``SON``;
                        if sampling is set to ``summer`` or ``xsummer``, the
                        correct season will be determined based on the
                        ``station_lat`` metadata;
                        if sampling is ``vegseason`` and the ``crops``
                        argument is given, the appropriate growing seasons
                        will be selected based on the crop type and
                        ``station_climatic_zone`` metadata;
                        the growing seasons for ``wheat`` and ``rice`` will
                        also be selected if sampling is ``seasonal`` and the
                        chosen metrics contain ``aot40`` or ``w126``
        :param crops: a single crop type or a list of crop types for
                      ``vegseason`` statistics;
                      default is ``["wheat", "rice"]``
        :param min_data_capture: a fractional value which will be used to
                                 identify valid data periods;
                                 the default is 0.75 for most statistics,
                                 meaning that 75% of hourly values must be
                                 present in a given interval in order to
                                 mark a result as valid;
                                 note that the ``count``, ``mean`` and
                                 ``stddev`` statistics do not use this
                                 capture criterion, ``count`` counts all
                                 values, ``mean`` and ``stddev`` are
                                 calculated when there are at least 10 valid
                                 hourly values in an interval;
                                 the fraction may not always be applied to
                                 original hourly values, but could for
                                 example also be used to count the number of
                                 valid days for a ``monthly``, ``seasonal``,
                                 or ``annual`` statistic
        :param datetimes: must be given with ``values`` if the ``data``
                          argument is missing
        :param values: must be given with ``datetimes`` if the ``data``
                       argument is missing
        :param station_lat: station's latitude, used if missing in the
                            ``metadata`` argument
        :param station_lon: station's longitude, used if missing in the
                            ``metadata`` argument
        :param station_climatic_zone: station's climatic zone, used if
                                      missing in the ``metadata`` argument
        """
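
The ``min_data_capture`` criterion described above can be illustrated with a small standalone sketch. It is not the toarstats implementation: the helper name ``daily_max_with_capture``, the choice of a daily maximum as the statistic, and the assumption of 24 expected values per day are all hypothetical and serve only to show how a fractional capture threshold marks an interval as valid or invalid.

```python
from datetime import datetime, timedelta


def daily_max_with_capture(datetimes, values, min_data_capture=0.75,
                           expected_per_day=24):
    """Hypothetical helper: daily maximum with a data capture check.

    Returns {date: (daily_max_or_None, captured_fraction)}. A day is
    treated as valid only if the fraction of non-missing hourly values
    reaches ``min_data_capture``.
    """
    per_day = {}
    for dt, val in zip(datetimes, values):
        day_values = per_day.setdefault(dt.date(), [])
        if val is not None:  # None stands for a missing measurement
            day_values.append(val)

    result = {}
    for day, day_values in sorted(per_day.items()):
        fraction = len(day_values) / expected_per_day
        is_valid = fraction >= min_data_capture
        result[day] = (max(day_values) if is_valid else None, fraction)
    return result


# Two days of synthetic hourly data; the second half of day 2 is missing.
start = datetime(2020, 1, 1)
datetimes = [start + timedelta(hours=h) for h in range(48)]
values = [float(h % 24) for h in range(48)]
for h in range(36, 48):
    values[h] = None

daily = daily_max_with_capture(datetimes, values)
for day, (day_max, fraction) in daily.items():
    print(day, day_max, fraction)
```

In this sketch the first day has all 24 values and yields a result, while the second day has only half of its values and is marked invalid. With toarstats itself, the threshold would be passed as the ``min_data_capture`` argument of ``calculate_statistics``.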

    trends

    This subpackage contains a collection of regression methods.

    Import

    To use the subpackage, import calculate_trend in one of the following ways:

    from toarstats.trends import calculate_trend # or
    from toarstats.trends import * # or
    import toarstats.trends

    Interface

    The calculate_trend interface is defined like this:

    calculate_trend(method, data, quantiles=None, num_samples=1000)
        """Calculate the trend using the requested method.
    
        This function is the public interface for the ``trends`` subpackage.
        It takes all the user inputs and returns the result of the requested
        trend analysis.
    
        The calculation follows "Guidance note on best statistical practices
        for TOAR analyses" (Chang et al. 2023,
        https://arxiv.org/pdf/2304.14236.pdf) Annex E.
    
        :param method: either ``"OLS"`` or ``"quant"``
        :param data: data containing a list of date time values and
                     associated parameter values on which to calculate the
                     trend
        :param quantiles: a single quantile or a list of quantiles to
                          calculate, these must be between 0 and 1; only
                          needed when ``method="quant"``
        :param num_samples: number of sampled trends in moving block
                            bootstrap
        """
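
To give an idea of what ``num_samples`` controls, the following is a minimal sketch of a moving block bootstrap around an OLS slope, written with the standard library only. It shows one common variant of the technique (resampling overlapping blocks of consecutive data pairs) and is not necessarily what toarstats or Chang et al. (2023) do in detail. The function names, the ``block_length`` and ``seed`` parameters, and the synthetic data are assumptions made for illustration.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope of the ordinary least squares line through (x, y)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def moving_block_bootstrap_slopes(x, y, num_samples=1000, block_length=6,
                                  seed=0):
    """Hypothetical helper: resample blocks of consecutive pairs.

    Each bootstrap sample is built from randomly chosen overlapping
    blocks of ``block_length`` consecutive points, which preserves
    short-range autocorrelation within each block.
    """
    rng = random.Random(seed)
    n = len(x)
    block_starts = range(n - block_length + 1)
    num_blocks = -(-n // block_length)  # ceiling division
    slopes = []
    for _ in range(num_samples):
        indices = []
        for _ in range(num_blocks):
            first = rng.choice(block_starts)
            indices.extend(range(first, first + block_length))
        indices = indices[:n]  # trim to the original series length
        slopes.append(ols_slope([x[i] for i in indices],
                                [y[i] for i in indices]))
    return slopes


# Synthetic monthly series: linear trend of 0.5 per step plus noise.
noise = random.Random(42)
x = list(range(120))
y = [0.5 * xi + noise.gauss(0.0, 1.0) for xi in x]

slopes = moving_block_bootstrap_slopes(x, y, num_samples=200)
std_error = statistics.stdev(slopes)
print("slope:", ols_slope(x, y), "bootstrap standard error:", std_error)
```

The spread of the resampled slopes serves as an uncertainty estimate for the fitted trend, and a larger number of samples gives a more stable estimate at a higher computational cost. With toarstats itself, a call following the signature above would take the form ``calculate_trend("quant", data, quantiles=[0.05, 0.5, 0.95])``.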