TY - GEN
T1 - Investigating the potential of application-centric aggressive power management for HPC workloads
AU - Rodero, I.
AU - Chandra, S.
AU - Parashar, M.
AU - Muralidhar, R.
AU - Seshadri, H.
AU - Poole, S.
PY - 2010
Y1 - 2010
N2 - Energy efficiency of large-scale data centers is becoming a major concern not only for reasons of energy conservation, failures, and cost reduction, but also because such systems are fast approaching the limits of the power available to them. Like High Performance Computing (HPC) systems, large-scale cluster-based data centers can consume megawatts of power, and only a fraction of that power is used for actual computation. In this paper, we study the potential of application-centric aggressive power management of data center resources for HPC workloads. Specifically, we consider whether power management mechanisms and controls (currently or soon to be) available at different levels and for different subsystems, together with several innovative approaches taken to tackle this problem in the last few years, can be used effectively in an application-aware manner for HPC workloads. To do this, we first profile standard HPC benchmarks with respect to their behavior, resource usage, and power impact on individual computing nodes. Based on a power and latency model and the workload profiles, we develop an algorithm that can improve energy efficiency with little or no performance loss. We then evaluate the proposed algorithm through simulations using empirical power characterization and quantification. Finally, we validate the simulation results with actual executions on real hardware. The results show that application-aware power management can reduce average energy consumption without a significant performance penalty. This motivates us to investigate autonomic approaches for application-aware aggressive power management and cross-layer, cross-function predictive subsystem-level power management for large-scale data centers.
AB - Energy efficiency of large-scale data centers is becoming a major concern not only for reasons of energy conservation, failures, and cost reduction, but also because such systems are fast approaching the limits of the power available to them. Like High Performance Computing (HPC) systems, large-scale cluster-based data centers can consume megawatts of power, and only a fraction of that power is used for actual computation. In this paper, we study the potential of application-centric aggressive power management of data center resources for HPC workloads. Specifically, we consider whether power management mechanisms and controls (currently or soon to be) available at different levels and for different subsystems, together with several innovative approaches taken to tackle this problem in the last few years, can be used effectively in an application-aware manner for HPC workloads. To do this, we first profile standard HPC benchmarks with respect to their behavior, resource usage, and power impact on individual computing nodes. Based on a power and latency model and the workload profiles, we develop an algorithm that can improve energy efficiency with little or no performance loss. We then evaluate the proposed algorithm through simulations using empirical power characterization and quantification. Finally, we validate the simulation results with actual executions on real hardware. The results show that application-aware power management can reduce average energy consumption without a significant performance penalty. This motivates us to investigate autonomic approaches for application-aware aggressive power management and cross-layer, cross-function predictive subsystem-level power management for large-scale data centers.
UR - http://www.scopus.com/inward/record.url?scp=79952804640&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79952804640&partnerID=8YFLogxK
U2 - 10.1109/HIPC.2010.5713196
DO - 10.1109/HIPC.2010.5713196
M3 - Conference contribution
AN - SCOPUS:79952804640
SN - 9781424485185
T3 - 17th International Conference on High Performance Computing, HiPC 2010
BT - 17th International Conference on High Performance Computing, HiPC 2010
T2 - 17th International Conference on High Performance Computing, HiPC 2010
Y2 - 19 December 2010 through 22 December 2010
ER -