Merge branch '23-pypi-release-and-extended-documentation-buildsystem' into...

Merge branch '23-pypi-release-and-extended-documentation-buildsystem' into '23-pypi-release-and-extended-documentation'

Updated version; we might declare the release as a new version.

See merge request !24

Commit 656b6842 authored 3 months ago by Carsten Hinz
--- a/.gitignore
+++ b/.gitignore
@@ -18,3 +18,4 @@ log/
 combined_contributors.txt
 testenv/
 test.combined.contributors
+dist
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
+# Changelog:
+Notable and major changes are recorded here.
+
+## 0.3.0 Preparation of PyPI release.
+
+### Project structure
+Updated project metadata and added license, changelog
+
+### general:
+* working version of toargridding 
+
+### security:
+* added option to pass access token to TOAR-II analysis service
\ No newline at end of file
--- a/CITATION.cff
+++ b/CITATION.cff
+cff-version: 1.2.0
+title: TOAR Gridding Tool
+message: If you use this software, please cite both the article from preferred-citation and the software itself.
+authors:
+  - family-names: Grasse
+    given-names: Simon
+    email: s.grasse@fz-juelich.de
+    affiliation: Jülich Supercomputing Centre - Forschungszentrum Jülich GmbH
+    orcid: 'https://orcid.org/0000-0002-9621-012X'
+  - family-names: Hinz
+    given-names: Carsten
+    email: c.hinz@fz-juelich.de
+    affiliation: Jülich Supercomputing Centre - Forschungszentrum Jülich GmbH
+    orcid: 'https://orcid.org/0000-0002-7402-8394'
+version: 0.3.0
+identifiers:
+  - type: url
+    value: https://juser.fz-juelich.de/record/1033661
+    description: First entry in publication database of Forschungszentrum Jülich
+repository-code: 'https://gitlab.jsc.fz-juelich.de/esde/toar-public/toargridding/'
+license: MIT
+date-released: '2025-01-21'
+preferred-citation:
+  authors:
+    - family-names: Grasse
+      given-names: Simon
+    - family-names: Hinz
+      given-names: Carsten
+  title: TOAR Gridding Tool
+  url: https://juser.fz-juelich.de/record/1033661
+  type: generic
+  year: '2024'
+  conference: {}
+  publisher: {}
\ No newline at end of file
--- a/LICENSE
+++ b/LICENSE
+MIT License
+
+Copyright (c) 2025 Forschungszentrum Jülich GmbH
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
\ No newline at end of file
--- a/README.md
+++ b/README.md
@@ -36,10 +36,36 @@ The visualization on a map in one example relies on [cartopy](https://scitools.o

 # Installation

+## 1) Install from PyPI
+
+We intend to provide this package on PyPI. This documentation is preliminary, as it was written before the actual release as part of its preparation.
+We intend to allow installation with
+```bash
+pip install toargridding
+```
+or 
+```bash
+python -m pip install toargridding
+```
+We suggest using the TOAR Gridding tool in a virtual environment when creating your own data analysis scripts.
+To do so, create a directory and navigate to it
+```bash
+# on Linux (-p also creates missing parent directories):
+mkdir -p /my/path/data/analysis
+cd /my/path/data/analysis
+```
+and create a virtual environment
+```bash
+python -m venv .venv
+source .venv/bin/activate
+pip install toargridding
+```
+
+## 2) Install from Source
 Move to the folder you want to download this project to.
 We now need to download the source code from the [repository](https://gitlab.jsc.fz-juelich.de/esde/toar-public/toargridding/) either as ZIP file or via git:

-## 1) Download with GIT
+### 2.1) Download with GIT
 Clone the project from its git repository:
 ```bash
 git clone https://gitlab.jsc.fz-juelich.de/esde/toar-public/toargridding.git 
@@ -49,7 +75,7 @@ With git we need to checkout the testing branch (testing), as the main branch is
 cd toargridding
 ```

-## 2) Installing Dependencies and Setting up Virtual Environment
+### 2.2) Installing Dependencies and Setting up Virtual Environment

 Setup a virtual environment for your code to avoid conflicts with other projects. Here, you can use your preferred tool or run:
 ```bash
@@ -87,7 +113,7 @@ Switching from `pip` to [conda](https://conda.io/projects/conda/en/latest/index.

 # Examples

-This package provides a number of examples as jupyter notebooks (https://jupyter.org/).
+The [Git repository of this package](https://gitlab.jsc.fz-juelich.de/esde/toar-public/toargridding/) includes a number of examples as Jupyter notebooks (https://jupyter.org/).
 Jupyter uses your web-browser to display results and the code blocks.
 As an alternative, visual studio code directly supports execution of jupyter notebooks.
 For VS Code, please ensure to select the kernel of the virtual environment [see](https://code.visualstudio.com/docs/datascience/jupyter-notebooks).
@@ -323,3 +349,7 @@ class dataClass:
    anStr : str
    secStr : str = "Default value"
 ```
+# Citation
+
+For citations in different formats, please refer to the entry in the publication database of Forschungszentrum Jülich:
+https://juser.fz-juelich.de/record/1033661
\ No newline at end of file
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -5,11 +5,11 @@ build-backend = "hatchling.build"
 [project]
 name = "toargridding"
 dynamic = ["version"]
-description = ""
+description = "Creation of gridded data sets from the TOAR-II database and TOAR-II analysis service."
 readme = "README.md"
 requires-python = ">=3.10,<3.12"
-license = "MIT"
-keywords = []
+license = { file="LICENSE"}
+keywords = ["TOAR-II", "TOAR", "gridded data", "ozone", "fzj", "jsc" ]
 authors = [
  { name = "Simon Grasse", email = "s.grasse@fz-juelich.de" },
  { name = "Carsten Hinz", email = "c.hinz@fz-juelich.de"},
@@ -19,6 +19,11 @@ classifiers = [
  "Programming Language :: Python",
  "Programming Language :: Python :: 3.10",
  "Programming Language :: Python :: 3.11",
+  "Framework :: Pytest",
+  "Framework :: Hatch",
+  "Framework :: Sphinx",
+  "Intended Audience :: Science/Research",
+  "License :: OSI Approved :: MIT License",
 ]
 dependencies = [
    "requests",
@@ -41,6 +46,9 @@ types = ["mypy>=1.0.0"]
 Issues = "https://gitlab.jsc.fz-juelich.de/esde/toar-public/toargridding/-/issues"
 Source = "https://gitlab.jsc.fz-juelich.de/esde/toar-public/toargridding"

+[project.scripts]
+combine_contributor_files = "tools.combine_contributor_files:main"
+
 [tool.hatch.version]
 path = "src/toargridding/__about__.py"

@@ -83,3 +91,10 @@ exclude_lines = [
  "if __name__ == .__main__.:",
  "if TYPE_CHECKING:",
 ]
+
+[tool.hatch.build]
+include = [
+  "/src",
+  "/tools"
+  ]
+skip-excluded-dirs = true
\ No newline at end of file
--- a/src/toargridding/__about__.py
+++ b/src/toargridding/__about__.py
-VERSION = "0.2.0"
+VERSION = "0.4.0"
--- a/src/toargridding/contributors.py
+++ b/src/toargridding/contributors.py
@@ -39,38 +39,23 @@ class contributions_manager_by_id(contributors_manager):
    """storing the IDs of all contributing timeseries
    The decoding into a list of names will be done by a new contributors endpoint as part of the REST API.
    In offline mode, a file can be created or a full request can be stored in the metadata.
-
-    The future plan is to run toargridding as a service. Then the request ID will be provided as a unique identify to retrieve the contributors from a database.
    """
    
-    def __init__(self, requestID, contributors_path : Path = None, endpoint="https://toar-data.fz-juelich.de/api/v2/", access_token: str = None):
+    def __init__(self, requestID, contributors_path : Path, endpoint="https://toar-data.fz-juelich.de/api/v2/"):
        super().__init__(requestID)
        self.endpoint = endpoint
-        self.runsAsService = True
        self.inline_mode = False
-        if contributors_path is not None:
-            self.runsAsService = False
-            self.contributors_path = contributors_path
+        self.contributors_path = contributors_path
        self.file_request_path = "request_contributors"
-        self.register_path = "register_timeseries_list_of_contributors"
-        self.database_request_path = "request_timeseries_list_of_contributors"
-        self.access_token = access_token
-        if self.access_token is not None:
-            self.headers = {"AccessToken": self.access_token}
-        else:
-            self.headers = None
    def setup_contributors_endpoint_for_metadata(self):
        """!create the contributors endpoint depending on the intended mode.

-        This either adds the timeseries IDs to the database and provides a get endpoint for the contributors or writes the timeseries IDs to a file and provides a curl command to upload the file to the database.
+        This writes the timeseries IDs to a file and provides a curl command to upload the file to the database.
        """
-        if self.runsAsService:
-            return self.setup_contributors_service()
+        if self.inline_mode:
+            return self.setup_contributors_inline()
        else:
-            if self.inline_mode:
-                return self.setup_contributors_inline()
-            else:
-                return self.setup_contributors_id_file()
+            return self.setup_contributors_id_file()
    def setup_contributors_inline(self) -> str:
        return f"{self.endpoint}{self.file_request_path}" + "/?timeseriesids=" + ",".join([str(id) for id in self.contributors])
    def setup_contributors_id_file(self) -> str:
@@ -80,7 +65,28 @@ class contributions_manager_by_id(contributors_manager):
            for id in self.contributors:
                f.write(f"{id}\n")
        return f'Endpoint: {self.endpoint}{self.file_request_path} with data file {file_name}'
-    def setup_contributors_service(self) -> str:
+
+
+class contributions_manager_as_service(contributors_manager):
+    """
+    The TOAR database supports a dedicated table for contributors. Here, we can register the timeseries IDs and use the request ID as key.
+    The access to this registration requires an access token.
+    In principle, this is the same operation as with a file, but we do not need to manage the file, and the metadata will contain the direct link to obtain the list of contributors.
+    """
+    def __init__(self, requestID, endpoint="https://toar-data.fz-juelich.de/api/v2/", access_token: str = None):
+        super().__init__(requestID)
+        self.endpoint = endpoint
+        self.access_token = access_token
+        self.register_path = "register_timeseries_list_of_contributors"
+        self.database_request_path = "request_timeseries_list_of_contributors"
+        if self.access_token is not None:
+            self.headers = {"AccessToken": self.access_token}
+        else:
+            self.headers = None
+    def setup_contributors_endpoint_for_metadata(self):
+        """
+        This adds the timeseries IDs to the database and provides a get endpoint for the contributors
+        """
        response = requests.post(  f"{self.endpoint}{self.register_path}/{self.requestID}", json={"timeseriesids": list(self.contributors)}, headers=self.headers )
        response.raise_for_status()
        return f"Call: {self.endpoint}{self.database_request_path}/{self.requestID} to obtain a list of contributors."
@@ -95,9 +101,14 @@ class contributions_manager_by_name(contributions_manager_by_id):
    The resulting string is typically shorter than the list of all timeseries IDs.
    """
    def __init__(self, requestID, contributors_path : Path = None, endpoint="https://toar-data.fz-juelich.de/api/v2/", access_token: str = None):
-        super().__init__(requestID, contributors_path, endpoint, access_token)
+        super().__init__(requestID, contributors_path, endpoint)
        self.metadata_endpoint = endpoint + "timeseries"
        self.inline_mode = False
+        self.access_token = access_token
+        if self.access_token is not None:
+            self.headers = {"AccessToken": self.access_token}
+        else:
+            self.headers = None
    def setup_contributors_endpoint_for_metadata(self):
        if self.inline_mode:
            return self.setup_contributors_inline()
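
The managers in `contributors.py` above build two small things: an inline request URL from the collected timeseries IDs, and an optional `AccessToken` header. The following is a minimal standalone sketch of that logic, not the package's code. The helper names `build_inline_url` and `build_headers` are hypothetical, and sorting the IDs is an assumption added here so the output is reproducible.

```python
# Standalone sketch; build_inline_url and build_headers are hypothetical
# helpers and are not part of toargridding.
ENDPOINT = "https://toar-data.fz-juelich.de/api/v2/"  # default endpoint from the diff


def build_inline_url(contributors, endpoint=ENDPOINT, request_path="request_contributors"):
    """Join timeseries IDs into a query string, similar to setup_contributors_inline."""
    # Sorting is an assumption made here to get a deterministic URL.
    ids = ",".join(str(ts_id) for ts_id in sorted(contributors))
    return f"{endpoint}{request_path}/?timeseriesids={ids}"


def build_headers(access_token=None):
    """Return the AccessToken header dict, or None when no token is given."""
    return {"AccessToken": access_token} if access_token is not None else None


url = build_inline_url({42, 7, 1001})
headers = build_headers("my-secret-token")
```

Returning `None` when no token is set mirrors the diff, where `self.headers` is passed directly to `requests.post`, which accepts `headers=None`.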

--- a/src/toargridding/gridding.py
+++ b/src/toargridding/gridding.py
@@ -8,7 +8,7 @@ from toargridding.grids import GridDefinition
 from toargridding.metadata import Metadata, TimeSample
 from toargridding.toar_rest_client import AnalysisService

-from toargridding.contributors import contributions_manager_by_id
+from toargridding.contributors import contributions_manager_by_id, contributors_manager

 GriddedResult = namedtuple("GriddedResult", ["dataset", "metadata"])

@@ -20,7 +20,7 @@ def get_gridded_toar_data(
    variables: list[str],
    stats: list[str],
    contributors_path : Path = None,
-    access_token: str = None,
+    contributors_manager : contributors_manager | None = None,
    **kwargs,
 ) -> tuple[list[xr.Dataset], list[Metadata]]:
    """API to download data as xarrays
@@ -60,7 +60,12 @@ def get_gridded_toar_data(
        data = analysis_service.get_data(metadata)
        #TODO add processing of contributors
        # create contributors endpoint and write result to metadata
-        contributors_field = contributions_manager_by_id(metadata.get_id(), contributors_path, access_token=access_token)
+        if contributors_manager:
+            contributors_field = contributors_manager
+        elif contributors_path:
+            contributors_field = contributions_manager_by_id(metadata.get_id(), contributors_path)
+        else:
+            raise ValueError("No contributors manager provided; please either provide a contributors path or create a contributors manager")
        contributors_field.extract_contributors_from_data_frame(data.stations_data)
        metadata.contributors_metadata_field = contributors_field.setup_contributors_endpoint_for_metadata()
        ds = grid.as_xarray(data)
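
The change to `get_gridded_toar_data` above selects the contributors manager in three steps: an explicitly passed manager wins, otherwise a file-based manager is created from `contributors_path`, otherwise a `ValueError` is raised. The sketch below isolates that decision under stated assumptions: `ContributorsManagerStub`, `ByIdStub`, and `select_manager` are hypothetical stand-ins, not classes or functions of toargridding, and the sketch tests with `is not None` where the diff uses truthiness.

```python
from pathlib import Path


class ContributorsManagerStub:
    """Hypothetical stand-in for the contributors_manager base class."""

    def __init__(self, request_id):
        self.request_id = request_id


class ByIdStub(ContributorsManagerStub):
    """Hypothetical stand-in for contributions_manager_by_id."""

    def __init__(self, request_id, contributors_path):
        super().__init__(request_id)
        self.contributors_path = contributors_path


def select_manager(request_id, contributors_path=None, manager=None):
    """Prefer an explicit manager, fall back to a file-based one, else fail."""
    if manager is not None:
        return manager
    if contributors_path is not None:
        return ByIdStub(request_id, contributors_path)
    raise ValueError("Provide either a contributors path or a contributors manager.")


# Usage: an explicit manager is returned unchanged; a path yields a file-based one.
explicit = ContributorsManagerStub("request-1")
chosen = select_manager("request-1", manager=explicit)
fallback = select_manager("request-1", contributors_path=Path("contributors"))
```

Using a distinct parameter name such as `manager` also avoids shadowing the imported `contributors_manager` class inside the function body.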