maestro-core

    Description

    Maestro is a data- and memory-aware middleware framework that addresses the ubiquitous problems of data movement in complex memory hierarchies that exist at multiple levels of the HPC software stack.

    Maestro architecture overview image

    This repository contains the Maestro Core Library, as developed for D3.2. It features the Maestro Core API, used by example codes and an MVP demonstrator.

    Installation

    Please refer to INSTALL.md

    Usage

    Maestro can be executed on machines of various sizes and types, from a simple laptop to large HPC clusters. On Cray systems, please build all binaries on the service nodes (login nodes) and execute on the compute nodes.

    Access an installed Maestro version

    Please include the main Maestro header file in your code

    #include "maestro.h"

    Please add the include path and library path of Maestro to the compilation/linking command

    -I$(MAESTRO_PATH)/include/maestro -L$(MAESTRO_PATH)/lib -lmaestro

    Please export the path to the Maestro library before running

    export LD_LIBRARY_PATH=$(MAESTRO_PATH)/lib:$LD_LIBRARY_PATH

    where $(MAESTRO_PATH) is the Maestro install path specified during configuration with ./configure --prefix=$(MAESTRO_PATH)
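Putting the three steps above together, a minimal build-and-run session could look like the following sketch. MAESTRO_PATH and myapp.c are placeholders (your install prefix and your own source file); the compiler and linker flags are exactly the ones listed above.

```shell
# Sketch only: assumes Maestro was installed with
# ./configure --prefix=$MAESTRO_PATH and that myapp.c includes "maestro.h".
export MAESTRO_PATH=/opt/maestro      # placeholder install prefix

cc -I"$MAESTRO_PATH/include/maestro" myapp.c \
   -L"$MAESTRO_PATH/lib" -lmaestro -o myapp

export LD_LIBRARY_PATH="$MAESTRO_PATH/lib:$LD_LIBRARY_PATH"
./myapp
```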

    Unit tests

    Build the unit tests only

    make check TESTS=

    Run the unit tests only

    make check-TESTS

    Build and run the unit tests. This may take some time

    make check

    Limits

    Maestro requires quite a few file descriptors and also locks pages into memory for RDMA purposes. We try to emit a diagnostic message when errors may be due to these resource constraints. Still, we recommend

    ulimit -n 1024
    ulimit -l 256

    to allow at least 1024 open file descriptors and 256 KB of lockable memory for RDMA (ulimit -l is expressed in KB).

    When using a system running the PBSPro or ALPS workload manager, please export

    export APRUN_XFER_LIMITS=1

    before submitting your job, to ensure the limits are propagated to the compute nodes for Maestro.

    Fabric provider choice / high-performance interconnect usage

    Maestro isolates the user from the multitude of network provider choices by using libfabric and transparently choosing 'the best' connectivity between components. Unfortunately this functionality is not fully working, due to issues in the upstream libfabric code and to incomplete testing of our usage of it.

    Please use

    export FI_PROVIDER=provider

    to force a specific fabric provider.

    List of supported providers

    Provider  Networks supported
    sockets   TCP/IP networking
    verbs     InfiniBand and Slingshot networks
    gni       Cray Aries networks
    psm2      Intel Omni-Path networks

    You can execute

    ./Maestro-source-dir/deps/libfabric/util/fi_info

    to list all providers discovered by libfabric, where Maestro-source-dir is the directory containing the Maestro source.

    The safest (and lowest-performance) connectivity is provided by the sockets provider. It should work on almost any network that supports TCP/IP networking, including Ethernet, IB, and GNI (Aries).

    Usage of the tcp and tcp;ofi_rxm providers is currently broken; an upstream issue is open.

    On Cray XC systems the GNI (Aries) provider is supported. If you compile with the rdma-credentials and gni-headers modules loaded, the GNI provider should be autoselected if a GNI NIC is found at runtime.

    NOTE that GNI NICs on login nodes typically do not work, due to a limitation of the libfabric/gni driver, so you will have to run your application exclusively on compute nodes, or manually switch the components running on login nodes to the sockets provider.

    If you are using GNI you will implicitly be using Cray libdrc, a mechanism to obtain network authentication tokens. Maestro core requests workflow-level tokens that even support running multiple components of a workflow under different user IDs. In some cases the system may run out of tokens, and there is no user-level token inquiry tool available. If GNI startup fails, try running your application with

    DRC_DEBUG_LEVEL=DEBUG

    and look for an error message like

    LIBDRC:CORE:DEBUG        rdmacred.c:658 - finished acquire request, rc=-28 

    If you see this, contact your system admin to clear cached DRC credentials.

    Examples

    Examples can be found in the /examples directory. It currently contains one simple example, single_node_pool_op.c: a multi-threaded (pthread) application consisting of one producer thread and two consumer threads. single_node_pool_op.c is based on D3.2 of the Maestro project; more reading (D3.2) here.

    To build the example, please use

    make MAESTRO=$(MAESTRO_PATH)

    Example parameters, such as num_producers, num_consumers, num_archivers, cdo_size, and cdo_count, can be configured via the single_node_pool_op_config.yaml file.
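Based on the parameter names just listed, the configuration file has roughly this shape. The exact layout and the units are assumptions, and all values below are placeholders; consult the single_node_pool_op_config.yaml shipped in /examples for the authoritative format.

```yaml
# Hypothetical sketch of single_node_pool_op_config.yaml.
# Keys are taken from the parameter list above; values are placeholders.
num_producers: 1
num_consumers: 2
num_archivers: 0
cdo_size: 1048576   # size per CDO (unit assumed to be bytes)
cdo_count: 10
```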

    Before executing the binary, please export LD_LIBRARY_PATH=$(MAESTRO_PATH)/lib:$LD_LIBRARY_PATH and then run

    ./single_node_pool_op.o

    Demos

    Local multithreaded demo (MVP1)

    MVP1 is a local multithreaded demo application. More reading (D3.2) here.

    The reference version is tagged d3.2-draft, on the master branch.

    make check also builds the demo executable demo_mvp_d3_2 in addition to the examples, and runs it. ./run_demo.sh permits running the demo on its own.

    Adaptive Transport demo

    The pool manager interlock demo uses a three-application setup, comprising one pool manager process, and shows GFS and MIO transport. More reading (D5.5, to appear on BSCW) and information on how to set up a VM to run Mero here.

    The reference version is tagged d5.5-review, on the master branch.

    The pool manager interlock demonstration check_pm_interlock.sh is automatically launched with make check.

    Documentation

    Doxygen documentation is available and is built in the docs folder.

    Common issues/FAQs

    • If you have many network interfaces, or many addresses assigned to one interface (which can happen rather suddenly with IPv6), the libfabric setup of the pool manager may hit 'too many open files' (errno=-24) issues. Check ulimit -n and increase the limit.

    • If you see clients stuck at JOIN time while everything else looks good, there is a chance that your firewall is intercepting the packets.

    • Is it safe to invalidate the pointer to data wrapped by an offered CDO, given that ownership has passed to Maestro core? (We keep hold of the CDO handle itself, of course.)

      No: the allocation that was captured in the CDO handle must not be touched until after DISPOSE. You may of course forget the pointer you hold, but you must not re-use the allocation or free it.

    • Does a producer have to check whether an offered CDO has been consumed (DEMANDed), and wait until it is, before calling (WITHDRAW followed by) DISPOSE?

      A producer cannot directly determine that (short of complicated event operations). The idea is: the consumer must submit the REQUIRE before the WITHDRAW occurs. This can be accomplished (1) by pre-posting the REQUIRE, (2) by posting it after observing an OFFER:before or OFFER:after event (for safety, with a 'require-ack' flag, or any earlier event, like DECLARE or SEAL), or (3) by posting it in a WITHDRAW:before handler with require-ack set. In all these cases Maestro will ensure that the REQUIRE can be satisfied, either by taking a copy (more or less eagerly; this is to be tuned) or by blocking the WITHDRAW.

    • Should WITHDRAW be needed at all if an offered CDO is consumed by another application?

      Every OFFER must be followed by a WITHDRAW (and DISPOSE); every REQUIRE must be followed by a RETRACT or DEMAND (and DISPOSE). Remember that one OFFER can satisfy many REQUIREs for the same CDO; WITHDRAW indicates that you are no longer ready to do so (and Maestro needs to ensure that outstanding REQUIREs can still be satisfied if their DEMAND comes in).

    • Should DISPOSE block until an offered CDO is consumed by another application? Or will it only block if a REQUIRE has already been posted?

      WITHDRAW may block if Maestro decides that it cannot or does not want to take a copy and there is at least one outstanding REQUIRE for the CDO, or a DEMAND is still in progress. DISPOSE should never block (though it may take some time, unrelated to the pool protocol).

    Authors and acknowledgment

    The Data Orchestration in High Performance Computing project has received funding from the European Union's Horizon 2020 research and innovation programme through grant agreement 801101.

    License

    BSD 3-clause License