TY - GEN
T1 - On-the-fly elimination of dynamic irregularities for GPU computing
AU - Zhang, Eddy Z.
AU - Jiang, Yunlian
AU - Guo, Ziyu
AU - Tian, Kai
AU - Shen, Xipeng
PY - 2011
Y1 - 2011
N2 - The power-efficient massively parallel Graphics Processing Units (GPUs) have become increasingly influential for general-purpose computing over the past few years. However, their efficiency is sensitive to dynamic irregular memory references and control flows in an application. Experiments have shown great performance gains when these irregularities are removed. But it remains an open question how to achieve those gains through software approaches on modern GPUs. This paper presents a systematic exploration to tackle dynamic irregularities in both control flows and memory references. It reveals some properties of dynamic irregularities in both control flows and memory references, their interactions, and their relations with program data and threads. It describes several heuristics-based algorithms and runtime adaptation techniques for effectively removing dynamic irregularities through data reordering and job swapping. It presents a framework, G-Streamline, as a unified software solution to dynamic irregularities in GPU computing. G-Streamline has several distinctive properties. It is a pure software solution and works on the fly, requiring no hardware extensions or offline profiling. It treats both types of irregularities at the same time in a holistic fashion, maximizing the whole-program performance by resolving conflicts among optimizations. Its optimization overhead is largely transparent to GPU kernel executions, jeopardizing no basic efficiency of the GPU application. Finally, it is robust to the presence of various complexities in GPU applications. Experiments show that G-Streamline is effective in reducing dynamic irregularities in GPU computing, producing speedups between 1.07 and 2.5 for a variety of applications.
AB - The power-efficient massively parallel Graphics Processing Units (GPUs) have become increasingly influential for general-purpose computing over the past few years. However, their efficiency is sensitive to dynamic irregular memory references and control flows in an application. Experiments have shown great performance gains when these irregularities are removed. But it remains an open question how to achieve those gains through software approaches on modern GPUs. This paper presents a systematic exploration to tackle dynamic irregularities in both control flows and memory references. It reveals some properties of dynamic irregularities in both control flows and memory references, their interactions, and their relations with program data and threads. It describes several heuristics-based algorithms and runtime adaptation techniques for effectively removing dynamic irregularities through data reordering and job swapping. It presents a framework, G-Streamline, as a unified software solution to dynamic irregularities in GPU computing. G-Streamline has several distinctive properties. It is a pure software solution and works on the fly, requiring no hardware extensions or offline profiling. It treats both types of irregularities at the same time in a holistic fashion, maximizing the whole-program performance by resolving conflicts among optimizations. Its optimization overhead is largely transparent to GPU kernel executions, jeopardizing no basic efficiency of the GPU application. Finally, it is robust to the presence of various complexities in GPU applications. Experiments show that G-Streamline is effective in reducing dynamic irregularities in GPU computing, producing speedups between 1.07 and 2.5 for a variety of applications.
KW - CPU-GPU pipelining
KW - Data transformation
KW - GPGPU
KW - Memory coalescing
KW - Thread divergence
KW - Thread-data remapping
UR - http://www.scopus.com/inward/record.url?scp=79953126288&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79953126288&partnerID=8YFLogxK
U2 - 10.1145/1950365.1950408
DO - 10.1145/1950365.1950408
M3 - Conference contribution
AN - SCOPUS:79953126288
SN - 9781450302661
T3 - International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
SP - 369
EP - 380
BT - ASPLOS XVI - 16th International Conference on Architectural Support for Programming Languages and Operating Systems
T2 - 16th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2011
Y2 - 5 March 2011 through 11 March 2011
ER -