@@ -25,7 +25,7 @@ The results of this test for two different identically configured nodes can be s
...
To interpret these results, we need to understand how AMD EPYC 7742 CPUs are built internally. These are 64-bit, 64-core x86 server microprocessors based on the Zen 2 microarchitecture, with the core logic fabricated on the TSMC 7 nm process and the I/O die fabricated on the GlobalFoundries 14 nm process. They were first introduced in 2019. They have a base clock speed of 2.25 GHz, which can boost up to 3.4 GHz on a single core. The processors support two-way simultaneous multithreading (SMT), hence the need for pinning above.
Each CPU is built from 8 CPU chiplets, each known as a Core Complex Die (CCD). Each CCD houses 8 cores split into two groups of 4, each known as a Core CompleX (CCX). The 4 cores of a CCX share a 16 MiB (4 times 4 MiB) L3 cache. Two CCDs are further grouped into a quadrant. This hierarchy is important because it shows up directly in the results.
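The hierarchy can be made concrete with a small sketch. It assumes that cores within one CPU are numbered linearly, so that consecutive indices fill a CCX, then a CCD, then a quadrant; the real numbering depends on the BIOS and operating system and should be checked on the node itself. The helper name `locate_core` is hypothetical.

```python
# Sizes taken from the description of the EPYC 7742.
CORES_PER_CCX = 4
CCX_PER_CCD = 2
CCD_PER_QUADRANT = 2
CCD_PER_CPU = 8
L3_PER_CCX_MIB = 16

CORES_PER_CCD = CORES_PER_CCX * CCX_PER_CCD            # 8
CORES_PER_QUADRANT = CORES_PER_CCD * CCD_PER_QUADRANT  # 16
CORES_PER_CPU = CORES_PER_CCD * CCD_PER_CPU            # 64

# Total L3 per CPU: one 16 MiB slice per CCX.
TOTAL_L3_MIB = L3_PER_CCX_MIB * CCX_PER_CCD * CCD_PER_CPU  # 256


def locate_core(core):
    """Return the CCX, CCD and quadrant index of a core within one CPU.

    ASSUMPTION: linear numbering, i.e. cores 0-3 form CCX 0, cores 0-7
    form CCD 0, and cores 0-15 form quadrant 0.
    """
    if not 0 <= core < CORES_PER_CPU:
        raise ValueError(f"core index {core} outside 0..{CORES_PER_CPU - 1}")
    return {
        "ccx": core // CORES_PER_CCX,
        "ccd": core // CORES_PER_CCD,
        "quadrant": core // CORES_PER_QUADRANT,
    }
```

Under this assumed numbering, `locate_core(5)` places core 5 in CCX 1, CCD 0 and quadrant 0, so it shares an L3 cache with cores 4 to 7 only.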
Let us begin with the large-scale structure in the indexed image. Four blocks are easily identifiable: two blue blocks, with some purple, on the main diagonal and two red off-diagonal blocks. Recall that the nodes we are testing each contain two 64-core CPUs. The blue blocks show intra-CPU core timings, while the red blocks show inter-CPU core timings. The inter-CPU timings are slower because communication must cross between the two CPUs via the motherboard.
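The block structure can be expressed as a rough classification of core pairs. The sketch below assumes that cores 0 to 63 sit on the first CPU and cores 64 to 127 on the second, with linear numbering inside each CPU; this numbering and the function name `shared_level` are illustrative assumptions, not something the measurements themselves confirm.

```python
CORES_PER_CPU = 64
NUM_CPUS = 2

# Group sizes from smallest to largest shared structure.
# ASSUMPTION: linear core numbering within and across CPUs.
LEVELS = [("ccx", 4), ("ccd", 8), ("quadrant", 16), ("cpu", CORES_PER_CPU)]


def shared_level(a, b):
    """Return the smallest structure that cores a and b have in common."""
    for core in (a, b):
        if not 0 <= core < NUM_CPUS * CORES_PER_CPU:
            raise ValueError(f"core index {core} out of range")
    if a == b:
        return "core"
    for name, size in LEVELS:
        # Two cores share a group when integer division by the group
        # size yields the same group index.
        if a // size == b // size:
            return name
    # No common group up to CPU level: the pair spans both CPUs.
    return "inter-cpu"
```

With this numbering, pairs classified as `"inter-cpu"` correspond to the red off-diagonal blocks, and every other pair falls inside one of the two blue diagonal blocks.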