Niklas Selke authored
Added a 'CHANGELOG.md' and a 'LICENSE' file. Modified the 'README.md' and added files for packaging. Created package version 0.1.0.
toarstats
This repository contains a collection of statistics that can be
calculated on hourly data. The statistics in the ozone_metrics.py
file are specific to ozone data. The statistics in the stats.py
file can be calculated for other variables as well.
Installation
To install a specific version of the package together with all of its
dependencies, run the following command from within the dist folder of
this repository:
python3 -m pip install toarstats-<version>-py3-none-any.whl
It is advised to set up a virtual environment beforehand.
Usage
Import
To use the package, import toarstats in one of the following ways:
from toarstats import toarstats # or
from toarstats import * # or
import toarstats
Interface
The toarstats interface is defined as follows:
toarstats(sampling, statistics, data, metadata, seasons=None,
crops=None, data_capture=None)
"""Calculate the given statistics with the given sampling.
This function is the public interface for the toarstats package and
acts as a wrapper around all statistics and metrics included in the
package.
:param sampling: temporal aggregation, one of ``daily``,
``monthly``, ``seasonal``, ``vegseason``,
``summer``, ``xsummer``, or ``annual``;
``summer`` will pick the 6-month summer season in
the hemisphere where the station is located;
``xsummer`` does the same for a 7-month summer
season;
``vegseason`` also requires the crops argument and
will then determine the appropriate growing seasons
based on the ``climatic_zone`` metadata and crop
type
:param statistics: a list of statistics and metrics to call, these
must be defined in ``stats.py`` or
``ozone_metrics.py``;
a single string can also be given
:param data: a data frame with datetime values with hourly
resolution and a column with parameter values on which
to calculate the requested statistics and metrics
:param metadata: a named tuple with metadata information for
``station_lat``, ``station_lon``, and
``station_climatic_zone``
:param seasons: a list of season names for seasonal statistics;
for a definition of seasons, see ``stats_utils.py``;
if ``None`` is passed, seasonal statistics will be
computed for the default seasons of the respective
metrics, normally, these are the four meteorological
seasons ``DJF``, ``MAM``, ``JJA`` and ``SON``;
if sampling is set to ``summer`` or ``xsummer``, the
correct season will be determined based on the
``station_lat`` metadata;
if sampling is ``vegseason`` and the crops argument
is given, the appropriate growing seasons will be
selected based on the crop type and
``climatic_zone`` metadata;
the growing seasons for wheat and rice will also be
selected if sampling is ``seasonal`` and the chosen
metrics contain ``aot40`` or ``w126``
:param crops: a list of crop types for ``vegseason`` statistics;
default is ``["wheat", "rice"]``;
a single string can also be given
:param data_capture: a fractional value which will be used to
identify valid data periods;
the default is 0.75 for most statistics,
meaning that 75% of hourly values must be
present in a given interval in order to mark a
result as valid;
note that the ``value_count``, ``mean`` and
``standard_deviation`` statistics do not use
this capture criterion, ``value_count`` counts
all values, ``mean`` and ``standard_deviation``
are calculated when there are at least 10 valid
hourly values in an interval;
the fraction may not always be applied to
original hourly values, but could for example
also be used to count the number of valid days
for a ``monthly``, ``seasonal``, or ``annual``
statistic
"""
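The data_capture criterion described above can be illustrated with a small standalone sketch. This is not the package's implementation: the helper names capture_fraction and is_valid_interval are hypothetical, missing hours are assumed to be represented as None, and the comparison is assumed to be inclusive (a fraction of exactly 0.75 counts as valid). It only shows the idea of comparing the share of present hourly values in an interval against a threshold.

```python
from typing import Optional, Sequence


def capture_fraction(values: Sequence[Optional[float]]) -> float:
    """Return the share of entries in ``values`` that are not ``None``.

    Hypothetical helper for illustration; not part of toarstats.
    """
    if not values:
        return 0.0
    present = sum(1 for value in values if value is not None)
    return present / len(values)


def is_valid_interval(values: Sequence[Optional[float]],
                      data_capture: float = 0.75) -> bool:
    """Check whether enough hourly values are present in one interval.

    Assumption: the threshold is inclusive, so a capture fraction equal
    to ``data_capture`` is treated as valid.
    """
    return capture_fraction(values) >= data_capture


# One day of hourly values with 18 of 24 hours present and 6 missing.
day = [40.0] * 18 + [None] * 6

print(capture_fraction(day))                     # 0.75
print(is_valid_interval(day))                    # True
print(is_valid_interval(day, data_capture=0.9))  # False
```

As the docstring notes, the package may apply the fraction to units other than the original hourly values (for example, the number of valid days in a month), and value_count, mean, and standard_deviation follow their own rules, so this sketch covers only the simplest hourly case.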