Hybrid API too simple for heterogeneous systems(?)
The hybrid API currently flattens the 2D identifiers (mpi_rank, omp_thread_num)
into a 1D space as follows:
sionlib_rank = mpi_rank * num_threads + omp_thread_num
This assumes a single num_threads value that is valid across all MPI processes. On heterogeneous systems (or in heterogeneous applications, think MPMD), this is not necessarily the case: if processes run different numbers of threads, the flattened ranks can collide or leave gaps.
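
To illustrate the problem, here is a minimal sketch that simulates the flattening in plain Python, without MPI or OpenMP. It assumes each process plugs its own local thread count into the formula; the helper names (`flat_rank`, `flatten_all`, `duplicates`) are made up for this example and are not part of SIONlib.

```python
from collections import Counter


def flat_rank(mpi_rank, num_threads, omp_thread_num):
    """Flattening from this report: mpi_rank * num_threads + omp_thread_num."""
    return mpi_rank * num_threads + omp_thread_num


def flatten_all(threads_per_process):
    """Flat ids of all (process, thread) pairs.

    threads_per_process[i] is the thread count of MPI process i.
    Assumption: each process uses its own local thread count in the formula.
    """
    return [
        flat_rank(mpi_rank, num_threads, thread)
        for mpi_rank, num_threads in enumerate(threads_per_process)
        for thread in range(num_threads)
    ]


def duplicates(ids):
    """Sorted list of ids that occur more than once."""
    return sorted(i for i, count in Counter(ids).items() if count > 1)


if __name__ == "__main__":
    # Homogeneous case: 2 processes with 2 threads each, ids are unique and dense.
    print(flatten_all([2, 2]))
    # Heterogeneous case: 4 threads on process 0, 2 threads on process 1.
    print(flatten_all([4, 2]), "duplicates:", duplicates(flatten_all([4, 2])))
    # Reversed: no collision, but ids 2 and 3 are never assigned.
    print(flatten_all([2, 4]))
```

With thread counts (4, 2), threads 0 and 1 of process 1 get the same flat ids as threads 2 and 3 of process 0. With (2, 4) there is no collision, but the id space has a hole.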
As a first step, this should be mentioned in the documentation. In the future, this limitation should be lifted by allowing each process to use its own number of threads.
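
One possible way to lift the limitation, sketched here only as a suggestion and not as the way SIONlib does or will do it: replace the product mpi_rank * num_threads by a per-process offset equal to the total number of threads on all lower-ranked processes (an exclusive prefix sum). In a real MPI program each process could obtain its offset with an exclusive scan such as MPI_Exscan over the local thread counts; the Python below only simulates that with a list, and the names `thread_offsets` and `flat_rank_with_offset` are hypothetical.

```python
from itertools import accumulate


def thread_offsets(threads_per_process):
    """Exclusive prefix sum: offsets[i] = number of threads on processes 0..i-1."""
    # accumulate(..., initial=0) requires Python 3.8 or newer.
    return list(accumulate(threads_per_process, initial=0))[:-1]


def flat_rank_with_offset(offsets, mpi_rank, omp_thread_num):
    """Flat id that does not depend on a global num_threads."""
    return offsets[mpi_rank] + omp_thread_num


if __name__ == "__main__":
    threads_per_process = [4, 2, 3]  # example values, one entry per MPI process
    offsets = thread_offsets(threads_per_process)
    ids = [
        flat_rank_with_offset(offsets, mpi_rank, thread)
        for mpi_rank, num_threads in enumerate(threads_per_process)
        for thread in range(num_threads)
    ]
    print(offsets, ids)
```

When all processes have the same thread count, the offset of process i is i * num_threads, so this scheme reduces to the current formula and existing homogeneous runs would keep their ids.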