apirequests.adoc

Requests documentation for YORC/TOSCA integration

    This document provides a list and the structure of requests to start a data transfer with DLS. The minimal workflow currently defined in the project assumes that the DLS part covers moving data from an EUDAT service (B2SHARE) to an HPC system through SSH.

    Prerequisites

    Host address of the DLS API, e.g., export DLS=http://HOST:PORT/api/v1/

    Also credentials USER:PASS (which are sent as HTTP basic authentication).

    A full description of the Airflow API can be found in the Airflow documentation.
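
    To verify that the host address and credentials work, one can list the DAGs registered in the service (a minimal check using the standard Airflow REST API):

    curl -X GET -u USER:PASS $DLS/dags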

    A connection to the B2SHARE instance needs to be set up. This can be done either manually or with the following request (please note that we use the testing instance of B2SHARE):

    curl -X POST -u USER:PASS -H "Content-Type: application/json" \
       --data '{"connection_id": "default_b2share", "conn_type": "https", "host": "b2share-testing.fz-juelich.de", "schema": "https"}' \
       $DLS/connections
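
    The created connection can then be inspected with a GET request on the same endpoint (a sketch using the standard Airflow connections API):

    curl -X GET -u USER:PASS $DLS/connections/default_b2share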

    There should be an object created in B2SHARE. Each object in B2SHARE is identified by an id, which needs to be passed to the DLS workflow as a parameter (see below).

    Credentials handling

    Credentials needed to access SSH-based storages should be passed over to the pipelines (both for upload and download) as pipeline parameters. Each DLS pipeline can be extended by tasks that handle those parameters and convert them into connections usable in the remainder of the pipeline. The code for that can be found in dags/conn_deco.py.

    Passing credentials directly

    The following parameters are expected:

       host = params.get('host')        # address of the SSH target host
       port = params.get('port', 2222)  # SSH port, defaults to 2222
       user = params.get('login')       # user name on the target host
       key = params.get('key')          # SSH key used for authentication

    Those will be used to create a temporary connection (with a randomized name). The connection will be deleted after the pipeline run.
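
    For example, a run of the taskflow_example pipeline (described below) could be started with the credentials passed directly in conf; this is a sketch in which HPC_HOST, HPC_USER, and KEY are placeholders for real values:

    curl -X POST -u USER:PASS -H "Content-Type: application/json" \
       --data '{"conf": {"oid": "ID", "target": "PATH", "host": "HPC_HOST", "port": 22, "login": "HPC_USER", "key": "KEY"}}' \
       $DLS/dags/taskflow_example/dagRuns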

    Passing vault connection id

    Credentials from vault can be used to access resources. This is achieved by passing the id of the credentials in vault to the respective pipeline. The parameter is vault_id, and the credentials are located in the vault defined in admin/connections/my_vault. The location of the credentials in vault is defined as follows: /secret/data/ssh-credentials/<vault-id-value>
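
    In that case the request carries the vault_id instead of the full credentials (again a sketch using the taskflow_example pipeline described below):

    curl -X POST -u USER:PASS -H "Content-Type: application/json" \
       --data '{"conf": {"oid": "ID", "target": "PATH", "vault_id": "VAULT_ID"}}' \
       $DLS/dags/taskflow_example/dagRuns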

    Starting data transfer

    To start a transfer, the following request needs to be sent; it includes the B2SHARE object id as a parameter. For testing purposes one can use b38609df2b334ea296ea1857e568dbea, which includes one 100MB file. The target parameter is optional, with default value /tmp/, and determines the location to which the files from the B2SHARE object will be uploaded.

    The parameters should be passed along with the credentials as described in Credentials handling.

    curl -X POST -u USER:PASS -H "Content-Type: application/json" \
       --data '{"conf": {"oid": ID, "target": PATH}}' \
       $DLS/dags/taskflow_example/dagRuns

    Checking status of a pipeline

    curl -X GET -u USER:PASS -H "Content-Type: application/json" $DLS/dags/taskflow_example/dagRuns
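
    The state of a single run can be queried via its run id, which is returned when the run is created (a sketch using the standard Airflow dagRuns endpoint; RUN_ID is a placeholder):

    curl -X GET -u USER:PASS $DLS/dags/taskflow_example/dagRuns/RUN_ID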

    Uploading example

    To upload to B2SHARE, use the dags/upload_example.py pipeline.

    Compared to the download case, for the upload the b2share connection must include the authentication token in the extra field:

    {"access_token": "foo"}

    For information on how to generate an access token, please refer to the B2SHARE documentation: https://eudat.eu/services/userdoc/b2share-http-rest-api

    Also a connection to the data catalog (with name datacat) needs to be defined in the DLS.
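
    This could be done analogously to the B2SHARE connection, for instance (a sketch; the host value here is an assumption based on the data catalog URLs used below):

    curl -X POST -u USER:PASS -H "Content-Type: application/json" \
       --data '{"connection_id": "datacat", "conn_type": "https", "host": "datacatalog.fz-juelich.de", "schema": "https"}' \
       $DLS/connections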

    The upload target is defined by the parameter mid, which points to a metadata template in the data catalog. The metadata template should comprise the following information:

    {
      "name": "registration",
      "url": "https://b2share-testing.fz-juelich.de/",
      "metadata": {
         "title": "Result of the computation",
         "creator_name": "eflows4HPC",
         "description": "Output registered by eflows4HPC DLS",
         "community": "a9217684-945b-4436-8632-cac271f894ed",
         "helmholtz centre": "Forschungszentrum Jülich",
         "open_access": "True"
      }
    }

    The values in the metadata field can be adjusted to the user's needs. Once such a record is created in the data catalog, its identifier can be used as the value for the mid parameter of the upload pipeline. To find out more about how to create records in the data catalog, check https://datacatalog.fz-juelich.de/docs.

    An example upload template can be found at https://datacatalog.fz-juelich.de/storage.html?type=storage_target&oid=71e863ac-aee6-4680-a57c-de318530b71e; thus mid = 71e863ac-aee6-4680-a57c-de318530b71e.

    The upload pipeline requires the input parameter source, which defines the input directory. Similarly to the download case, SSH credentials need to be passed to the execution (see Credentials handling for details; they are not included in the example below).

    curl -X POST -u USER:PASS -H "Content-Type: application/json" \
       --data '{"conf": {"mid": MID, "source": PATH}}' \
       $DLS/dags/upload_example/dagRuns
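
    A complete request including vault-based credentials could look as follows (a sketch combining the upload parameters with the vault_id parameter from Credentials handling):

    curl -X POST -u USER:PASS -H "Content-Type: application/json" \
       --data '{"conf": {"mid": "MID", "source": "PATH", "vault_id": "VAULT_ID"}}' \
       $DLS/dags/upload_example/dagRuns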

    Comments

    I could imagine that the name of a DLS pipeline (taskflow_example) can change and needs to be passed as a parameter to YORC.