|
|
LinkTest can test different process-process combinations in parallel. What connections are tested in parallel to other connections and when is defined by the communication pattern. Given an even number of processes, say 16, LinkTest, in its default configuration, can test 16 of the potential 240 connections by testing between 8 exclusive pairs forwards and backwards. To test the remaining 224 possible connections requires repeating this 14 additional times. Each time LinkTest can test connections in parallel is called a step, so to test the connections between 16 processes takes about 15 steps.
|
|
|
|
|
|
A communication pattern defines which connections are tested together in parallel. Continue with our example of 16 processes, the communication patterns defines which 8 connection pairs are tested in parallel. A common way to generate this pairing is the [1-Factor algorithm](https://en.wikipedia.org/wiki/All-to-all_(parallel_pattern)#1-factor_algorithm), however, given an even number $`n`$ processes there are $`P(n)`$ possible permutations, where $`P(n)`$ is:
|
|
|
```math
|
|
|
P=\prod\limits_{j=0}^{n-2}\prod\limits_{i=0}^{n/2-1}\bigg[\begin{pmatrix}n-2i\\2\end{pmatrix}-j\frac{n-2i}{2}\bigg].
|
|
|
```
|
|
|
For large $`n`$ this behaves like:
|
|
|
```math
|
|
|
P\approx(n!)^n.
|
|
|
```
|
|
|
Note that this is an upper bound on the possible number of permutations. For the derivation of these expressions see [here](#derivation:-number-of-permutation).
|
|
|
A communication pattern defines which connections are tested together in parallel. Continue with our example of 16 processes, the communication patterns defines which 8 connection pairs are tested in parallel. A common way to generate this pairing is the [1-Factor algorithm](https://en.wikipedia.org/wiki/All-to-all_(parallel_pattern)#1-factor_algorithm), however, given an even number $`n`$ processes there are $`(n-1)!!`$ possible pairings. For the derivation of these expressions see [here](derivation:-number-of-communication-patterns).
|
|
|
|
|
|
Furthermore these expression over exaggerate the possible space as two steps in which the same pair order is tested, but the ordering of the pairs is different, are considered to be different communication patterns, when in reality this is not the case. As a result $`P`$ should be divided by $`n!`$. We have also likely forgotten to include other effects, as such consider these numbers to be an upper bound.
|
|
|
|
... | ... | @@ -30,59 +22,13 @@ Differs from the previous in that the even number of groups is split into bisect |
|
|
|
|
|
The above figure compares the communication patterns for 16 processes split into four groups with a common hostname (red, green, blue and magenta) a possible 1-Factor communication scheme. The colored lines between the rounded boxes with numbers, which denote the group-local process number, a tested connection between two process. The color of the line indicates the step in which the connection was tested. Lines with a common color correspond to connections that are tested together in the same step. In the 1-Factor diagram we find connections between all possible nodes. In the hostname-grouping diagram we only find connections between the four differently colored groups. In a given step all processes from one group only communicate with the processes of one other group. In the bisection diagram we only find connections between the upper-left and lower-right set of hosts. The dashed line denotes the bisection split of the processes. The hostname bisection diagram is a combination of the hostname grouping and bisection. Like the bisection diagram communication only occurs across the dashed diagonal, but now in any step all processes from a given group only communicate with the processes of one other group.
|
|
|
|
|
|
## Derivation: Number of Permutations
|
|
|
## Derivation: Number of Communication Patterns
|
|
|
|
|
|
Given an even number of $`n`$ processes we wish to partition them into $`n/2`$ groups of size $`2`$, each group denotes a connection between a pair of processes here. From [partitioning](https://en.wikipedia.org/wiki/Partition\_(number_theory)) $`n`$ elements into $`k`$ groups we can write the number of possible partitions for the first group $`P_{0,0}`$ as:
|
|
|
```math
|
|
|
P_{0,0}=\begin{pmatrix}n\\2\end{pmatrix},
|
|
|
```
|
|
|
where the bracketed term denotes an entry in [Pascal's triangle](https://en.wikipedia.org/wiki/Pascal%27s_triangle). For the second group the number of possible partitions is smaller as two of the $`n`$ elements is already in the first group. For the second group the number of partitions $`P_1`$ is hence:
|
|
|
```math
|
|
|
P_{0,1}=\begin{pmatrix}n-2\\2\end{pmatrix}.
|
|
|
```
|
|
|
More generally the number of possible partitions for any group $`i`$ is:
|
|
|
```math
|
|
|
P_{0,i}=\begin{pmatrix}n-2i\\2\end{pmatrix}.
|
|
|
```
|
|
|
The total number of possible ways to partition the $`n`$ processes into $`n/2`$ groups $`P_{0}`$ is then:
|
|
|
```math
|
|
|
P_{0}=\prod\limits_{i=0}^{n/2-1}P_{0,i}=\prod\limits_{i=0}^{n/2-1}\begin{pmatrix}n-2i\\2\end{pmatrix}=\frac{n!}{(2!)^{n/2}}.
|
|
|
```
|
|
|
A Linktest communication pattern is defined by a partition of $n = 2*x$ processes into $x$ unordered pairs. There are $(n−1)!!=(n−1)(n−3) … 3*1$ such partitions.
|
|
|
|
|
|
We now wish to find an additional $`n-2`$ partitions without any repeats in them. For the second partition the first group now has $`P_{1,0}`$ possibilities:
|
|
|
```math
|
|
|
P_{1,0}=\begin{pmatrix}n\\2\end{pmatrix}-\frac{n}{2}.
|
|
|
```
|
|
|
The additional term here removes the groups from the first partition. For the second group $`P_{1,1}`$ is:
|
|
|
```math
|
|
|
P_{1,1}=\begin{pmatrix}n-2\\2\end{pmatrix}-\frac{n-2}{2}.
|
|
|
```
|
|
|
The additional term here because the potential groups from the first partition to remove are now two smaller because those possibilities have already been removed with the subtraction of two in the coordinate for Pascal's triangle. More generally the possible partitions for the i<sup>th</sup> group in the second partitioning are:
|
|
|
```math
|
|
|
P_{1,i}=\begin{pmatrix}n-2i\\2\end{pmatrix}-\frac{n-2i}{2}.
|
|
|
```
|
|
|
From this we can find that the number of possible partitions for the second partitioning is:
|
|
|
```math
|
|
|
P_{1}=\prod\limits_{i=0}^{n/2-1}P_{1,i}=\prod\limits_{i=0}^{n/2-1}\bigg[\begin{pmatrix}n-2i\\2\end{pmatrix}-\frac{n-2i}{2}\bigg].
|
|
|
```
|
|
|
|
|
|
More generally for the j<sup>th</sup> partitioning into groups we find:
|
|
|
```math
|
|
|
P_{j,i}=\begin{pmatrix}n-2i\\2\end{pmatrix}-j\frac{n-2i}{2}.
|
|
|
```
|
|
|
The number of possible partitions for the j<sup>th</sup> partitioning is then:
|
|
|
```math
|
|
|
P_{j}=\prod\limits_{i=0}^{n/2-1}P_{j,i}=\prod\limits_{i=0}^{n/2-1}\bigg[\begin{pmatrix}n-2i\\2\end{pmatrix}-j\frac{n-2i}{2}\bigg].
|
|
|
```
|
|
|
|
|
|
From this we can find the total number of $`n-1`$ partitions without repetition $`P`$ as:
|
|
|
```math
|
|
|
P=\prod\limits_{j=0}^{n-2}\prod\limits_{i=0}^{n/2-1}\bigg[\begin{pmatrix}n-2i\\2\end{pmatrix}-j\frac{n-2i}{2}\bigg].
|
|
|
```
|
|
|
|
|
|
In the limit as $`n`$ tends to infinity $`P`$ tends towards:
|
|
|
```math
|
|
|
\lim\limits_{n\to\infty}P=(n!)^n
|
|
|
```
|
|
|
|
|
|
Note that in this derivation two partitions are considered to be different even when the groups in the partition are the same. In this case the two partitions only differ by a permutation. |
|
|
\ No newline at end of file |
|
|
Proof via induction:
|
|
|
**Induction Start** x = 1, n = 2, obviously there is only 1 partition. $(n-1)!! = 1!! = 1$
|
|
|
**Induction Thesis** $a_x = (n-1)!!$ is the number of partitions of $n$ processes where $n = 2*x$ into $x$ unordered pairs.
|
|
|
**Induction Step** $n+2$ processes can be devided as follows.
|
|
|
There are $n+1$ possibilities to pair the first element with any other. After that there are n elements left. Per induction thesis there are $a_x$ possibilities to finish the partition.
|
|
|
$(n+1) * a_x = (n+1)(n-1)!! = (n+1)!! = a_{(x+1)}$ |