Abstract

Energy consumption has become a limiting factor in the design and operation of high performance computing (HPC) systems, as evidenced by growing efforts in both academia and industry to reduce the energy consumption of those systems. Unlike hardware solutions, software initiatives for reducing the energy consumption of HPC systems, despite their effectiveness, are often limited for reasons including (1) the program-specific nature of the proposed solution; (2) the need for a deep understanding of the application task at hand; and (3) the fact that proposed solutions are often difficult for novices to use and/or are designed for single-task environments. This paper proposes a three-step, blind, system-wide, application-independent, fine-grained, and easy-to-use (user-friendly) methodology for improving the energy performance of HPC systems. The methodology breaks down into phase detection, phase characterization, and phase identification and system reconfiguration. It is blind in the sense that it does not require any knowledge from users. It relies on the reconfiguration capabilities offered by the majority of HPC subsystems, including the processor, storage, memory, and communication subsystems, to reduce the overall energy consumption of the system (excluding network equipment) at runtime. We also present an implementation of our methodology, through which we demonstrate its effectiveness via static analyses and experiments using benchmarks representative of HPC workloads. Because memory is becoming one of the most power-hungry HPC subsystems, this paper also introduces and investigates the relevance of a potential power-saving scheme that we refer to as memory size scaling, which is intended to scale the memory size to save energy in a fashion similar to CPU frequency scaling.

  • Publication date: 2014-10
  • Affiliation: INRIA

Full text