A Survey of Methods For Analyzing and Improving GPU Energy Efficiency (1404.4629v2)

Published 17 Apr 2014 in cs.AR

Abstract: Recent years have witnessed phenomenal growth in the computational capabilities and applications of GPUs. However, this trend has also led to a dramatic increase in their power consumption. This paper surveys research works on analyzing and improving the energy efficiency of GPUs. It also provides a classification of these techniques on the basis of their main research idea. Further, it attempts to synthesize research works which compare the energy efficiency of GPUs with that of other computing systems, e.g. FPGAs and CPUs. The aim of this survey is to provide researchers with knowledge of the state of the art in GPU power management and motivate them to architect highly energy-efficient GPUs of tomorrow.

Authors (2)

Sparsh Mittal
Jeffrey S. Vetter

Citations (188)

Summary

The paper surveys and categorizes methods for analyzing and improving GPU energy efficiency, including DVFS, CPU-GPU workload division, and architectural enhancements.
It also reviews dynamic resource allocation and application-specific optimizations that balance performance against power consumption in high-performance computing.
The survey highlights the need for continuous innovation in GPU power management to ensure reliable and cost-effective operations in data centers and mobile devices.

Survey of Methods For Analyzing and Improving GPU Energy Efficiency

The paper "A Survey of Methods For Analyzing and Improving GPU Energy Efficiency" by Sparsh Mittal and Jeffrey S. Vetter provides a comprehensive overview of the techniques aimed at addressing the growing problem of GPU power consumption. As GPUs have become integral to high-performance computing (HPC) platforms, their elevated levels of power consumption have raised concerns about reliability, economic viability, and architectural feasibility.

The survey categorizes numerous methodologies for enhancing the energy efficiency of GPUs. This categorization includes techniques based on Dynamic Voltage and Frequency Scaling (DVFS), CPU-GPU workload division, architectural improvements in GPU components, dynamic resource allocation, and application-specific optimizations. Each category offers diverse insights into how GPU power consumption can be managed effectively without compromising the computational capabilities that GPUs are known for.

DVFS-based techniques exploit the dependence of power consumption on supply voltage and operating frequency (dynamic power scales roughly with the square of the voltage times the frequency) and adjust both at run time, thereby saving energy. These techniques matter because the computational demand on GPUs varies with workload characteristics. The paper reviews a range of DVFS algorithms that are applied in different scenarios to trade off power against performance.
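
As a rough illustration of the voltage-frequency trade-off, the following Python sketch picks the lowest-energy operating point that still meets a deadline. The power model (dynamic power C·V²·f plus a constant static term), the effective capacitance, and the three operating points are assumptions chosen for illustration; none of them come from the survey or from a specific GPU.

```python
# Illustrative DVFS model; every constant here is an assumption.

def dynamic_power(c_eff, voltage, freq_hz):
    """Dynamic CMOS power in watts: P = C_eff * V^2 * f."""
    return c_eff * voltage ** 2 * freq_hz


def task_energy(cycles, c_eff, voltage, freq_hz, static_power):
    """Energy in joules to run a fixed number of cycles at one operating point."""
    runtime_s = cycles / freq_hz
    return (dynamic_power(c_eff, voltage, freq_hz) + static_power) * runtime_s


# Hypothetical operating points: (supply voltage in V, clock frequency in Hz).
OPERATING_POINTS = [(1.10, 1.0e9), (1.00, 0.8e9), (0.90, 0.6e9)]


def best_operating_point(cycles, deadline_s, c_eff=1.0e-8, static_power=2.0):
    """Return (energy_j, voltage, freq_hz) for the lowest-energy point that
    finishes within the deadline, or None if no point is fast enough."""
    feasible = [
        (task_energy(cycles, c_eff, v, f, static_power), v, f)
        for v, f in OPERATING_POINTS
        if cycles / f <= deadline_s
    ]
    return min(feasible) if feasible else None


if __name__ == "__main__":
    for deadline in (0.5, 0.8, 1.0):
        print(deadline, best_operating_point(6e8, deadline))
```

Because a lower frequency lengthens the run time, static energy grows as the clock slows down. Under a different set of assumed constants, the slowest operating point is therefore not necessarily the most energy-efficient one.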

CPU-GPU workload division is another major approach discussed in the survey. By intelligently partitioning tasks between CPU and GPU, and even consolidating GPU workloads, systems can achieve significant power savings. This is vital given the varied characteristics of CPU and GPU tasks; matching the task type to the optimal processor can lead to both performance enhancements and reduced energy consumption.
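
To make the idea of workload division concrete, here is a minimal Python sketch that sweeps the fraction of work sent to the GPU and reports the split with the lowest total energy. The device model (a fixed throughput, an active power, and an idle power per device, with both devices running in parallel) and all numbers are hypothetical; they are not measurements or a method from the survey.

```python
# Illustrative CPU-GPU work-splitting model; all device numbers are assumed.

def split_energy(work_items, gpu_share, gpu, cpu):
    """Return (makespan_s, energy_j) when gpu_share of the work runs on the GPU
    and the rest runs on the CPU concurrently."""
    t_gpu = work_items * gpu_share / gpu["throughput"]
    t_cpu = work_items * (1.0 - gpu_share) / cpu["throughput"]
    makespan = max(t_gpu, t_cpu)
    energy = 0.0
    for device, busy in ((gpu, t_gpu), (cpu, t_cpu)):
        # A device draws active power while busy, then idle power until the
        # slower device finishes.
        energy += device["active_w"] * busy + device["idle_w"] * (makespan - busy)
    return makespan, energy


def best_split(work_items, gpu, cpu, steps=100):
    """Try gpu_share = 0, 1/steps, ..., 1 and return the lowest-energy share."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda s: split_energy(work_items, s, gpu, cpu)[1])


# Hypothetical devices: throughput in items/s, power in watts.
GPU = {"throughput": 400.0, "active_w": 200.0, "idle_w": 60.0}
CPU = {"throughput": 100.0, "active_w": 40.0, "idle_w": 30.0}

if __name__ == "__main__":
    share = best_split(1000, GPU, CPU)
    print(share, split_energy(1000, share, GPU, CPU))
```

With these assumed numbers the lowest-energy split is the one where both devices finish at the same time, because any imbalance leaves one device drawing idle power. With a lower idle power or a less efficient CPU, sending all work to the GPU can win instead.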

Architectural techniques focus on optimizing specific GPU components, such as caches, memory, shader units, and thread scheduling mechanisms. These optimizations aim to reduce leakage power and improve performance per watt, which is especially relevant for data-intensive applications. The survey highlights novel architectural proposals, including dynamic adaptation of core usage and improvements in memory fetch granularity, as promising ways to reduce energy overhead.

Dynamic resource allocation capitalizes on intra- and inter-application variability to adjust the activity levels of GPU components. The survey covers several power-gating strategies and predictive models that estimate resource requirements and manage power consumption by activating or deactivating computational units accordingly.
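
The interplay between prediction and power gating can be sketched as follows. This Python example assumes a simple break-even rule (gate a unit only if the predicted idle period is long enough for the leakage saved to exceed the energy cost of gating) and an exponential moving average as the idle-length predictor. Both choices, and all constants, are illustrative assumptions rather than a specific scheme from the survey.

```python
# Illustrative power-gating policy; the predictor and constants are assumed.

def break_even_us(gating_overhead_uj, leakage_saved_w):
    """Idle time in microseconds after which gating saves net energy
    (microjoules divided by watts gives microseconds)."""
    return gating_overhead_uj / leakage_saved_w


def gating_decisions(idle_periods_us, break_even, alpha=0.5):
    """For each idle period, decide whether to gate based on a moving-average
    prediction made before the period's true length is known."""
    estimate = 0.0
    decisions = []
    for observed in idle_periods_us:
        decisions.append(estimate > break_even)
        # Update the prediction with the length that was actually observed.
        estimate = alpha * observed + (1.0 - alpha) * estimate
    return decisions


def net_saving_uj(idle_periods_us, decisions, gating_overhead_uj, leakage_saved_w):
    """Net energy saved in microjoules: leakage avoided minus gating overhead,
    summed over the periods in which the unit was gated."""
    return sum(
        leakage_saved_w * idle - gating_overhead_uj
        for idle, gated in zip(idle_periods_us, decisions)
        if gated
    )


if __name__ == "__main__":
    trace = [10, 200, 220, 15, 300]  # hypothetical idle periods in microseconds
    threshold = break_even_us(2.0, 0.02)
    plan = gating_decisions(trace, threshold)
    print(plan, net_saving_uj(trace, plan, 2.0, 0.02))
```

In this trace the predictor gates the unit during one short idle period, which costs more energy than it saves. That is the kind of misprediction that more accurate predictive models aim to avoid.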

Application-specific and programming-level techniques focus on code optimizations to improve the efficiency of GPU utilization. These techniques, including kernel fusion and smart blocking strategies, are essential for aligning hardware resource use with application demands, thereby maximizing energy efficiency.
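
Kernel fusion can be illustrated with a small CPU-side model of memory traffic. The Python sketch below is not GPU code: it uses a hypothetical `CountingBuffer` class as a stand-in for global memory and counts element loads and stores for two separate element-wise "kernels" versus a single fused one. The premise that fewer global-memory accesses translate into lower energy is stated here as an assumption of the model.

```python
# CPU-side model of kernel fusion; CountingBuffer is a hypothetical helper
# that stands in for GPU global memory.

class CountingBuffer:
    """List wrapper that counts element loads and stores."""

    def __init__(self, data):
        self.data = list(data)
        self.loads = 0
        self.stores = 0

    def load(self, i):
        self.loads += 1
        return self.data[i]

    def store(self, i, value):
        self.stores += 1
        self.data[i] = value


def scale_kernel(src, dst, factor):
    for i in range(len(src.data)):
        dst.store(i, src.load(i) * factor)


def offset_kernel(src, dst, bias):
    for i in range(len(src.data)):
        dst.store(i, src.load(i) + bias)


def fused_kernel(src, dst, factor, bias):
    # Both operations run on one loaded element, so no intermediate buffer
    # is written and read back.
    for i in range(len(src.data)):
        dst.store(i, src.load(i) * factor + bias)


def run_unfused(values, factor, bias):
    src = CountingBuffer(values)
    tmp = CountingBuffer([0] * len(values))
    out = CountingBuffer([0] * len(values))
    scale_kernel(src, tmp, factor)
    offset_kernel(tmp, out, bias)
    traffic = sum(b.loads + b.stores for b in (src, tmp, out))
    return out.data, traffic


def run_fused(values, factor, bias):
    src = CountingBuffer(values)
    out = CountingBuffer([0] * len(values))
    fused_kernel(src, out, factor, bias)
    traffic = sum(b.loads + b.stores for b in (src, out))
    return out.data, traffic


if __name__ == "__main__":
    print(run_unfused([1, 2, 3, 4], 2, 1))
    print(run_fused([1, 2, 3, 4], 2, 1))
```

Both versions produce the same result, but the fused version performs half as many memory accesses in this model because the intermediate array is never materialized.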

The implications of this research are significant: they underscore the importance of continued innovation in GPU energy management as emerging technologies such as 3D die stacking and non-volatile memory are integrated. Furthermore, as GPUs are increasingly deployed in data centers, cloud platforms, and mobile devices, the need for sophisticated energy management solutions grows.

Looking ahead, research in this area must address a variety of challenges, from managing power consumption in heterogeneous systems to leveraging virtualization technologies to reduce GPU idle power. The survey consolidates existing strategies for GPU energy efficiency and gives researchers a foundation for future work in this field.
