Chapter 1

Energy Metrics for Performance Optimization

This document provides an analysis of energy consumption relative to various computational performance metrics. Specifically, we look at Energy (Joules) per Number of CPUs (NCPUs), Energy per Speedup, and Energy vs Nodes.

1. Energy (J) / NCPUs

Energy consumption is a critical factor in evaluating the efficiency of a computational task. It is common to measure energy usage per number of CPUs (NCPUs) to understand how the hardware contributes to the overall energy demand.

Formula:

\[ \text{Energy per CPU (J)} = \frac{\text{Total Energy Consumed (J)}}{\text{Number of CPUs (NCPUs)}} \]

Where:

  • Total Energy Consumed (J) is the total energy used during the computation.
  • Number of CPUs (NCPUs) is the total number of CPUs utilized in the computation.

Interpretation:

  • For the same workload, a lower value means each CPU accounts for a smaller share of the total energy, which suggests the system is using its CPU resources efficiently.
  • If energy per CPU increases, it may suggest that the system is not scaling well with additional CPUs, or that there is a disproportionate increase in power usage relative to computational throughput.
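As a minimal sketch of the formula above, the following Python snippet computes energy per CPU for a few runs. The function name `energy_per_cpu` and the measurement values are hypothetical and chosen only for illustration; they are not taken from any benchmark in this document.

```python
def energy_per_cpu(total_energy_j: float, ncpus: int) -> float:
    """Return total energy consumed (J) divided by the number of CPUs used."""
    if ncpus <= 0:
        raise ValueError("ncpus must be a positive integer")
    return total_energy_j / ncpus


# Hypothetical measurements: (NCPUs, total energy in joules) for one workload.
runs = [(1, 1200.0), (2, 1300.0), (4, 1500.0)]

# Map each CPU count to its energy-per-CPU value.
per_cpu = {n: energy_per_cpu(e, n) for n, e in runs}

for n, value in per_cpu.items():
    print(f"NCPUs={n}: {value:.1f} J per CPU")
```

In this made-up data the total energy rises with the CPU count while energy per CPU falls, which is the pattern the first interpretation bullet describes.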

2. Energy per Speedup

Energy per speedup is a metric that helps evaluate how the energy efficiency of a system changes with improvements in computational speed. Speedup is typically defined as the ratio of the time taken to solve a problem on a single CPU to the time taken when multiple CPUs or nodes are used.

Formula:

\[ \text{Energy per Speedup (J)} = \frac{\text{Total Energy Consumed (J)}}{\text{Speedup Factor}} \]

Where:

  • Total Energy Consumed (J) is the total energy consumed during the computation.
  • Speedup Factor is the ratio of the execution time with one CPU to the execution time with multiple CPUs/nodes.

Interpretation:

  • A high value of energy per speedup means that even though speedup is achieved, it comes at the cost of significantly increased energy consumption.
  • Optimizing the energy per speedup is key to improving the sustainability of high-performance computing systems.
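The two definitions in this section (speedup and energy per speedup) can be sketched in Python as follows. The function names and the timing and energy figures are assumptions made for illustration only.

```python
def speedup(t_single_s: float, t_parallel_s: float) -> float:
    """Speedup factor: single-CPU execution time over parallel execution time."""
    if t_single_s <= 0 or t_parallel_s <= 0:
        raise ValueError("execution times must be positive")
    return t_single_s / t_parallel_s


def energy_per_speedup(total_energy_j: float, speedup_factor: float) -> float:
    """Total energy consumed (J) divided by the speedup factor."""
    if speedup_factor <= 0:
        raise ValueError("speedup_factor must be positive")
    return total_energy_j / speedup_factor


# Hypothetical run: 100 s on one CPU, 25 s in parallel, 2000 J consumed in total.
s = speedup(100.0, 25.0)
eps = energy_per_speedup(2000.0, s)
print(f"speedup={s:.2f}, energy per speedup={eps:.1f} J")
```

Comparing `eps` across configurations of the same workload shows whether extra speedup is being bought with a disproportionate amount of energy.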

3. Energy vs Nodes

The relationship between energy consumption and the number of nodes used in a parallel computing system is an important indicator of how well the system scales. It shows how energy usage changes as more nodes are added.

Formula:

\[ \text{Energy per Node (J)} = \frac{\text{Total Energy Consumed (J)}}{\text{Number of Nodes}} \]

Where:

  • Total Energy Consumed (J) is the energy consumed during computation.
  • Number of Nodes refers to the total number of computing nodes used in the system.

Interpretation:

  • If energy per node decreases as more nodes are added, it indicates good scalability in terms of energy efficiency. This means that the system is effectively utilizing additional nodes without a disproportionate increase in energy consumption.
  • Conversely, if the energy per node increases with the addition of more nodes, it may suggest inefficiencies in parallelization or that the system is not optimally scaling with more hardware resources.
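The interpretation above can be illustrated with a short Python sketch that computes energy per node for several node counts and flags the counts at which it rises compared with the previous configuration. The `measurements` values are hypothetical, not results from any system discussed here, and `energy_per_node` is an illustrative helper name.

```python
def energy_per_node(total_energy_j: float, nodes: int) -> float:
    """Return total energy consumed (J) divided by the number of nodes."""
    if nodes <= 0:
        raise ValueError("nodes must be a positive integer")
    return total_energy_j / nodes


# Hypothetical measurements: node count -> total energy in joules.
measurements = {1: 1000.0, 2: 1100.0, 4: 1400.0, 8: 3600.0}

# Energy per node, ordered by increasing node count.
series = [(n, energy_per_node(e, n)) for n, e in sorted(measurements.items())]

# Node counts where energy per node increased relative to the previous count.
rising = [n for (_, prev), (n, cur) in zip(series, series[1:]) if cur > prev]

for n, value in series:
    print(f"nodes={n}: {value:.1f} J per node")
print("energy per node rises at node counts:", rising)
```

A non-empty `rising` list marks the point where adding nodes starts to cost more energy per node, which may be worth investigating for parallelization overhead.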

Conclusion

The energy metrics analyzed above—Energy (J) / NCPUs, Energy per Speedup, and Energy vs Nodes—are essential for optimizing the performance and sustainability of large-scale computational systems. Balancing performance gains with energy consumption is crucial to achieving both high throughput and energy efficiency in modern high-performance computing environments.

By carefully monitoring and optimizing these energy metrics, organizations can

Scalability test of arbor on JUWELS and JUSUF