... | @@ -25,7 +25,7 @@ The results of this test for two different identically configured nodes can be s |
... | @@ -25,7 +25,7 @@ The results of this test for two different identically configured nodes can be s |
|
|
|
|
|
To understand these results we need to understand how AMD EPYC 7742 CPUs are internally built up. These CPUs are 64-bit 64-core x86 server microprocessors based on the ZEN-2 micro-architecture with logic fabricated using the TSMC 7 nm process and IO fabricated using GlobalFoundries 14nm process. They were first introduced in 2019. They have a base clock speed 2.25 GHz, which can boost up to 3.4 GHz on a single core. The processors support up to two-way simultaneous multi-threading, hence the need for pinning above.
|
|
To understand these results we need to understand how AMD EPYC 7742 CPUs are internally built up. These CPUs are 64-bit 64-core x86 server microprocessors based on the ZEN-2 micro-architecture with logic fabricated using the TSMC 7 nm process and IO fabricated using GlobalFoundries 14nm process. They were first introduced in 2019. They have a base clock speed 2.25 GHz, which can boost up to 3.4 GHz on a single core. The processors support up to two-way simultaneous multi-threading, hence the need for pinning above.
|
|
|
|
|
|
Each CPU is built up of 8 CPU chiplets, also known as a Core Complex Die (CCD), which each house 8 cores split into two groups of 4, which are known as a Core CompleX (CCX), which share their L3 Cache. 2 CCDs are further abstracted as a quadrant. Now this structure is very important as we see in the results.
|
|
Each CPU is built up of 8 CPU chiplets, also known as a Core Complex Die (CCD), which each house 8 cores split into two groups of 4, which are known as a Core CompleX (CCX), which share their 16 MiB (4 times 4 MiB) L3 Cache. 2 CCDs are further abstracted as a quadrant. Now this structure is very important as we see in the results.
|
|
|
|
|
|
Let us begin by looking at the large scale structures in the indexed image. Four blocks are easily identifiable, 2 blue blocks with some purple on the main diagonal and two red off-diagonal blocks. Recall that the nodes we are testing consist of two 64-core CPUs. The blue blocks show intra-CPU core timings, while the red blocks show the inter-CPU core timings. The inter-core CPU timings are slower as communication must occur between the CPUs via the motherboard.
|
|
Let us begin by looking at the large scale structures in the indexed image. Four blocks are easily identifiable, 2 blue blocks with some purple on the main diagonal and two red off-diagonal blocks. Recall that the nodes we are testing consist of two 64-core CPUs. The blue blocks show intra-CPU core timings, while the red blocks show the inter-CPU core timings. The inter-core CPU timings are slower as communication must occur between the CPUs via the motherboard.
|
|
|
|
|
... | | ... | |