# TOAR DB REST API

## links to other REST APIs

*[0] OGC API Core definition: https://docs.opengeospatial.org/DRAFTS/17-069r4.html
*[1] https://www.ncdc.noaa.gov/cdo-web/webservices
[2] https://www.ncdc.noaa.gov/cdo-web/webservices/v2
[3] https://cds.climate.copernicus.eu/api/v2 (no REST API, but python API)
[4] www.star.nesdis.noaa.gov: under construction
[5] www.knmi.nl/omi: page not found
[6] https://github.com/nasa/apod-api : with documentation and examples
[7] https://scihub.copernicus.eu/userguide/ODataAPI
[8] https://climate.esa.int: ftp site ftp://cci_web@ftp-ae.oma.be/esacci (blank password needed for every subdirectory (interactively!) -- can get quite long: ftp://ftp-ae.oma.be/esacci/ozone/Tropospheric_Columns/L2/LNTOC/MIPAS_SCIAMACHY/)
[9] https://geo.woudc.org/csw?service=CSW&version=2.0.2&request=GetCapabilities (Catalog Service)
https://geo.woudc.org/ows?service=WMS&version=1.3.0&request=GetCapabilities (Map Service)
https://geo.woudc.org/ows?service=WFS&version=1.1.0&request=GetCapabilities (Feature Service)
https://geo.woudc.org/wps?service=WPS&version=1.0.0&request=GetCapabilities (Processing Service)
https://geo.woudc.org/def (RDF/SKOS definition service)
[10] https://confluence.ecmwf.int/display/WEBAPI/ECMWF+Web+API+Home
*[11] https://github.com/woudc/woudc-api
[12] https://eudat.eu/services/userdoc/b2share-http-rest-api

## Elements from OGC API core definition

By default, every API implementing this standard will provide access to a single dataset. Rather than sharing the data as a complete dataset, the OGC API Features standards offer direct, fine-grained access to the data at the feature (object) level. (standards)

A particular example is the use of the concepts of datasets and dataset distributions as defined in DCAT (https://docs.opengeospatial.org/DRAFTS/17-069r4.html#DCAT) and used in schema.org (https://docs.opengeospatial.org/DRAFTS/17-069r4.html#schema.org).
This standard specifies requirements and recommendations for APIs that share feature data and that want to follow a standard way of doing so. In general, APIs will go beyond the requirements and recommendations stated in this standard - or other parts of the OGC API family of standards - and will support additional operations, parameters, etc. that are specific to the API or the software tool used to implement the API.

The Landing page provides links to:
* the API definition (link relations service-desc and service-doc),
* the Conformance declaration (path /conformance, link relation conformance), and
* the Collections (path /collections, link relation data).

Feature and feature collection: Each feature in a dataset is part of exactly one collection.

The _bbox_ or _datetime_ parameter may be used to select only a subset of the features in the collection (the features that are in the bounding box or time interval). The _bbox_ parameter also matches all features in the collection that are not associated with a location. Likewise, the _datetime_ parameter also matches all features in the collection that are not associated with a time stamp or interval.

The bounding box is provided as four or six numbers, depending on whether the coordinate reference system includes a vertical axis:

```
Lower left corner, coordinate axis 1
Lower left corner, coordinate axis 2
Minimum value, coordinate axis 3 (optional)
Upper right corner, coordinate axis 1
Upper right corner, coordinate axis 2
Maximum value, coordinate axis 3 (optional)
```

If the bounding box consists of four numbers, the coordinate reference system of the values SHALL be interpreted as WGS 84 longitude/latitude (http://www.opengis.net/def/crs/OGC/1.3/CRS84) unless a different coordinate reference system is specified in a parameter bbox-crs.

For WGS 84 longitude/latitude the bounding box is in most cases the sequence of minimum longitude, minimum latitude, maximum longitude and maximum latitude. However, in cases where the box spans the anti-meridian, the first value (west-most box edge) is larger than the third value (east-most box edge).
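
The four-number bounding box rule above, including the anti-meridian case, can be sketched in Python as follows. This is a minimal illustration, not part of the OGC text: the function name `bbox_contains` is hypothetical, and it assumes CRS84 axis order (longitude, latitude) and no bbox-crs parameter.

```python
from typing import Sequence


def bbox_contains(bbox: Sequence[float], lon: float, lat: float) -> bool:
    """Return True if the point (lon, lat) lies inside a four-number bbox.

    The bbox is (west, south, east, north) in CRS84 longitude/latitude.
    """
    if len(bbox) != 4:
        raise ValueError("this sketch only handles four-number bounding boxes")
    west, south, east, north = bbox
    if not (south <= lat <= north):
        return False
    if west <= east:
        # Ordinary box: longitude must lie between the two edges.
        return west <= lon <= east
    # west > east: the box spans the anti-meridian, so a point matches
    # if it is east of the west edge OR west of the east edge.
    return lon >= west or lon <= east
```

With the illustrative box `[160.0, -30.0, -170.0, -10.0]`, which crosses the anti-meridian, both `lon=175.0` and `lon=-175.0` at `lat=-20.0` are inside, while `lon=0.0` is not.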
_datetime_: ISO 8601-2 distinguishes open start/end timestamps (double-dot) and unknown start/end timestamps (empty string). For queries, an unknown start/end has the same effect as an open start/end.

```
February 12, 2018, 00:00:00 UTC to March 18, 2018, 12:31:12 UTC:
datetime=2018-02-12T00%3A00%3A00Z%2F2018-03-18T12%3A31%3A12Z
February 12, 2018, 00:00:00 UTC or later:
datetime=2018-02-12T00%3A00%3A00Z%2F.. or datetime=2018-02-12T00%3A00%3A00Z%2F
March 18, 2018, 12:31:12 UTC or earlier:
datetime=..%2F2018-03-18T12%3A31%3A12Z or datetime=%2F2018-03-18T12%3A31%3A12Z
```

__Filtering:__ If features in the feature collection include a feature property that has a simple value (for example, a string or integer) that is expected to be useful for applications using the service to filter the features of the collection based on this property, a parameter with the name of the feature property and with the following characteristics (using an OpenAPI Specification 3.0 fragment) SHOULD be supported.

Example:

```
name: name
in: query
description: >-
  Only return buildings with a particular name. Use '*' as a wildcard.\
  Default = return all buildings.
required: false
schema:
  type: string
style: form
explode: false
example: 'name=A*'
```

Combination of filters (including bbox and datetime): the logical operator between the predicates is 'AND'.

The _limit_ parameter may be used to control the subset of the selected features that should be returned in the response, the page size.

```
schema:
  type: integer
  minimum: 1 (example value)
  maximum: 10000 (example value)
  default: 10 (example value)
```

Each page may include information about the number of selected and returned features (_numberMatched_ and _numberReturned_) as well as links to support paging (link relation _next_).
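
The paging behaviour described above could look roughly like the following Python sketch. It is an assumption-laden illustration: the function name `build_page`, the `offset` query parameter used in the `next` link, and the example URL are hypothetical and not prescribed by the OGC text, which only defines `limit`, `numberMatched`, `numberReturned` and the `next` link relation.

```python
from typing import Any, Dict, List
from urllib.parse import urlencode


def build_page(features: List[Any], base_url: str,
               limit: int = 10, offset: int = 0) -> Dict[str, Any]:
    """Return one page of the selected features plus paging metadata.

    `offset` is an assumption of this sketch; a server is free to build
    its `next` link in any other way.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    if offset < 0:
        raise ValueError("offset must not be negative")
    page = features[offset:offset + limit]
    links = []
    if offset + limit < len(features):
        # More features remain, so advertise the following page.
        query = urlencode({"limit": limit, "offset": offset + limit})
        links.append({"rel": "next", "href": f"{base_url}?{query}"})
    return {
        "numberMatched": len(features),   # all features selected by the filters
        "numberReturned": len(page),      # features contained in this page
        "features": page,
        "links": links,
    }
```

A client would keep following the `next` link until a page arrives without one.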
Encoding (HTML and/or GeoJSON): Two common approaches are:
* an additional path for each encoding of each resource (this can be expressed, for example, using format specific suffixes like ".html");
* an additional query parameter (for example, "accept" or "f") that overrides the Accept header of the HTTP request.

(see also "Testing" at about 75% of the Core API document)

## Elements from other API documentations

### url name

In [1] it is .../api/vN/{endpoint}
This allows for API versioning.
[7] also allows for versioning: https://scihub.copernicus.eu/dhus/odata/v1

### pagination

[1] uses `limit` (= Nrecords) and `offset` (= record number)
[7] uses `skip` and `top`

### endpoints

From [1]:

| Endpoint | Description |
| --- | --- |
| /datasets | A dataset is the primary grouping for data at NCDC. |
| /datacategories | A data category is a general type of data used to group similar data types. |
| /datatypes | A data type is a specific type of data that is often unique to a dataset. |
| /locationcategories | A location category is a grouping of similar locations. |
| /locations | A location is a geopolitical entity. |
| /stations | A station is any weather observing platform where data is recorded. |
| /data | A datum is an observed value along with any ancillary attributes at a specific place and time. |

From [7]:
* /Products
* /DeletedProducts
* /Collections

Download data via https://scihub.copernicus.eu/dhus/odata/v1/Products('2b17b57d-fff4-4645-b539-91f305c27c69')/$value

[10] comes with a Python package API client: https://pypi.org/project/ecmwf-api-client/
It is mostly a wrapper around MARS. Sparse documentation at https://confluence.ecmwf.int/display/WEBAPI/Brief+request+syntax

### query options

from [7]:
* `$format`: Specifies the HTTP response format of the record, e.g. XML or JSON; allows clients to request a response in a particular format
* `$filter`: Specifies an expression or function that must evaluate to true for a record to be returned in the collection
* `$orderby`: Determines what values are used to order a collection of records; allows clients to request resources in a particular order
* `$select`: Specifies a subset of properties to return; allows clients to request a specific set of properties for each entity
* `$skip`: Sets the number of records to skip before it retrieves records in a collection
* `$top`: Determines the maximum number of records to return
* `$count`: Allows clients to request a count of the matching resources identified by the Resource Path section of the URI
* `$inlinecount`: Specifies that the response to the request includes a count of the number of the matching resources
* `$expand`: Specifies the related resources to be included in line with the retrieved resources

### JSON responses

from [1]: "All responses are JSON and will be a single item or a collection of items with metadata." (looks similar to B2SHARE)

### Authentication

from [1]: "In order to access the CDO web services a token must first be obtained from the token request page."
from [3]: You need an API key (to be installed: `pip install cdsapi`)
from [7]: "Full authentication is required to access the API"
[10] requires an "ECMWF key"

### Stations

[1] distinguishes between `location` (e.g. station-id, city name, etc.) and `station`

### Provenance information

[7]: download manifest files via https://scihub.copernicus.eu/dhus/odata/v1/Products('2b17b57d-fff4-4645-b539-91f305c27c69')/Nodes('S1A_IW_SLC__1SDV_20160117T103451_20160117T103518_009533_00DD94_D46A.SAFE')/Nodes('manifest.safe')/$value

## Other stuff (useful for documentation)

from [7]: A URI used by an OData service has up to three significant parts: the Service Root URI, the Resource Path and the Query Options.
* the Service Root URI identifies the root of the OData service
* the Resource Path identifies the resource collection entity to be interacted with. The resource path enables any aspect of the data model (Data Hub Products, Data Hub Collections, etc.) exposed by the OData service to be addressed
* the system Query Options part refines the results

from [7]: In order to help clients discover the OData services, the Data Hub OData Service Metadata Document exposes the Entity Data Model of the service including, among others, the Entities and Properties that can be queried. This document can be queried with the following URL: https://scihub.copernicus.eu/dhus/odata/v1/$metadata

## Own definitions for TOAR

URIs and query keys: always lowercase, no '_', no '-'
values (i.e. controlled vocabulary): CamelCase

### URI

.../api/vN/{endpoint} for own service definition (follow [1] as much as possible)

for OGC compliant services (follow [11]):
* .../oapi/wcs/...
* .../oapi/sos/...

for query parameters that accept lists, see RFC 6570 (https://tools.ietf.org/html/rfc6570)

### general options

#### pagination

use `limit` (= Nrecords) and `offset` (= record number) as in [1]
add `page` (=offset/limit) as user-friendly alternative option

#### format

`format` options:
* application/json (see https://www.rfc-editor.org/rfc/rfc4627.html)
* text/csv (see https://tools.ietf.org/html/rfc7111)

#### sorting

`sortfield` [1] allowed values: ... (depends on endpoint)
`sortorder` [1] allowed values: asc, desc

#### date range

`datetime` [0]
Either a single value in ISO format or a datetime range:

```
startdate:enddate
..enddate
startdate..
```

Example: 2012-09-10
Accepts a valid ISO formatted date (YYYY-MM-DD) or date time (YYYY-MM-DDThh:mm:ss).

#### includemetadata

[1] has a query option `includemetadata` - we explicitly reject this as we want to keep data and metadata together as long as possible.
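
Returning to the pagination options defined above, the relation `page` = `offset`/`limit` could be resolved on the server side roughly as in this Python sketch. The function name `resolve_pagination` is hypothetical, and two points are assumptions rather than decisions taken in this note: pages are counted from zero (which follows from `page` = `offset`/`limit`), and supplying both `offset` and `page` is rejected.

```python
from typing import Optional, Tuple


def resolve_pagination(limit: int, offset: Optional[int] = None,
                       page: Optional[int] = None) -> Tuple[int, int]:
    """Return (limit, offset) derived from either `offset` or `page`.

    Assumes zero-based pages, i.e. page = offset / limit.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    if offset is not None and page is not None:
        # Assumption of this sketch: the two options are mutually exclusive.
        raise ValueError("give either offset or page, not both")
    if page is not None:
        if page < 0:
            raise ValueError("page must not be negative")
        offset = page * limit
    if offset is None:
        offset = 0
    if offset < 0:
        raise ValueError("offset must not be negative")
    return limit, offset
```

For example, `limit=10&page=3` and `limit=10&offset=30` would then select the same records.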
#### access token

add token to request header [1]:

`curl -H "token:<token>" "url"`

`$.ajax({ url:<url>, data:{<data>}, headers:{ token:<token> } })`

### End points

#### stations

--> 2022-08-12 rejected, because already included in endpoint search

following [0][1]; this is a collection in the OGC sense

`.../stations` or `.../stations/{id}` or `.../stations/id/{numid}`
* `{id}`: one element of station-code list
* `{numid}`: numerical id of a station (not managed)

query options:
* `parametername`: optional (one or several)
  example: `parametername=o3&parametername=no2` [case-insensitive; values may also be given as a comma-separated list]
* `location`: optional (one or several) - a valid nominatim query string
  example: `location=Köln&location='Virgin Islands'`
* `bbox`: optional. The desired geographical extent for search. Designed to take a parameter generated by Google Maps API V3 LatLngBounds.toUrlValue. Stations returned must be located within the extent specified.
  example: `47.5204,-122.2047,47.6139,-122.1065`
* `columns`: selection of output columns to be listed in the response
* standard options: datetime, sortfield, sortorder, limit, offset, page

#### variables

following [0]; this is a collection in the OGC sense

`.../variables` or `.../variables/{id}` or `.../variables/id/{numid}`
* `{id}`: one element of variablename list
* `{numid}`: numerical id of a variable (not managed)

query options:
* `station`: optional (one or several)
  example: `station={stationcode}` [case-insensitive; values may also be given as a comma-separated list]
* `location`: optional (one or several) - a valid nominatim query string
  example: `location=Köln&location='Virgin Islands'`
* `bbox`: optional. The desired geographical extent for search. Designed to take a parameter generated by Google Maps API V3 LatLngBounds.toUrlValue. Stations returned must be located within the extent specified.
  example: `47.5204,-122.2047,47.6139,-122.1065`
* standard options: datetime, sortfield, sortorder, limit, offset, page

This corresponds to a UNIQUE() query with ANY().
Example: `variables/?station=DEBW001&station=KOR01783` returns the union of all variables at both stations.

#### parameters

Synonym for _variables_

#### timeseries

following [0]; this is a collection in the OGC sense

`.../timeseries` or `.../timeseries/{id}` or `.../timeseries/id/{numid}`
* `{id}`: unique identifier of timeseries ("label")
* `{numid}`: numerical id of a timeseries (not managed)

query options:
* `variable`: optional (one or several)
* ?? `parametername`: optional (one or several), synonym for `variable`
* `station`: optional (one or several)
  example: `station={stationcode}` [case-insensitive; values may also be given as a comma-separated list]
* `location`: optional (one or several) - a valid nominatim query string
  example: `location=Köln&location='Virgin Islands'`
* `bbox`: optional. The desired geographical extent for search. Designed to take a parameter generated by Google Maps API V3 LatLngBounds.toUrlValue. Stations returned must be located within the extent specified.
  example: `47.5204,-122.2047,47.6139,-122.1065`
* `resourceprovider`: optional (one) --> has_role
  example: `resourceprovider=UBA`, `resourceprovider=Erika%20Musterfrau`
* `samplingfrequency`: optional (one), controlled vocabulary
  example: 3-hourly, irregular
* `version`: optional --> done with latest_version, provider_version
  default (no version argument): return latest version only
  special value: 'all' returns all versions
  example: 1.3.0 or 000001.000003.00000000000000
* `sourcetype`: optional (one or two)
  allowed values: measurement or model
* `sourcename`: optional (one or many), measurement method or model experiment identifier
  example: UVAbsorption, COSMOS-REA6, COSMO-EPS, ECMWF-ERA5
* `samplingheight`: optional (one)
  units: metre
* `datafilter`: optional (one)
  controlled vocabulary: None, CleanSector
* standard options: datetime, sortfield, sortorder, limit, offset, page

#### data

loosely following [1]

data access according to geometry of result:
* data/timeseries
* data/map
* data/compositetimeseries

__timeseries__

options: dataset (=o3 etc.)

(from https://fz-juelich.sciebo.de/apps/onlyoffice/891179382?filePath=%2FCTS%2FRepository%20TOAR%20DC%2FDocuments%2FTOAR_TG_Vol02_Data_Processing_2021-02.docx)
* Criterion 3.1: id of the corresponding station (see Steps 2a-2c)
* Criterion 3.2: dataset (formerly variable) id (see Step 1)
* Criterion 3.3: role: resource_provider (organisation)
* Criterion 3.4: sampling_frequency (!!!*** still needs to go into CTS!!! ***, sschr: done)
* Criterion 3.5: version number
* Criterion 3.6: data_source (measurement or model)
* Criterion 3.7: measurement method or model experiment identifier (e.g. COSMOS-REA6, COSMO-EPS, ECMWF-ERA5, etc.)
* Criterion 3.8: sampling height
* Criterion 3.9: data filtering procedures

standard options: startdate, enddate, sortfield, sortorder, limit, offset, page

## Open questions

* urls without '-' for better consistency with [1]!
* query options without '_' for consistency with [1]!
* how to bundle responses for several datasets in data queries?
(Test a couple of [1] examples)
* How do we distinguish between "raw" and "translated" controlled vocabulary? Query argument "raw"??? or "short"? "post(able)"??

See the complete YAML schema at roughly 40% down the page https://docs.opengeospatial.org/DRAFTS/17-069r4.html

## References

OGC API Core definition: https://docs.opengeospatial.org/DRAFTS/17-069r4.html

OGC API coordinate reference systems: OGC API - Features - Part 2: Coordinate Reference Systems by Reference https://docs.opengeospatial.org/DRAFTS/18-058.html
* see also http://www.opengis.net/def/crs/OGC/1.3/CRS84 (for coordinates without height) or http://www.opengis.net/def/crs/OGC/0/CRS84h (for coordinates with height)

OGC API: https://www.geoapi.org/snapshot/python/metadata.html#content

W3C Data on the Web Best Practices: https://docs.opengeospatial.org/DRAFTS/17-069r4.html#DWBP (referenced by OGC API core definition doc)

W3C/OGC Spatial Data on the Web Best Practices: https://docs.opengeospatial.org/DRAFTS/17-069r4.html#SDWBP (referenced by OGC API core definition doc)