Abstract
Wave propagation forward modeling is a widely used computational method in oil and gas exploration. The iterative stencil loops in such problems have broad applications in scientific computing. However, executing such loops can be highly time-consuming, which greatly limits their performance and power efficiency. In this paper, we accelerate the forward-modeling technique on the latest multi-core and many-core architectures such as Intel® Sandy Bridge CPUs, NVIDIA Fermi C2070 GPUs, NVIDIA Kepler K20× GPUs, and the Intel® Xeon Phi co-processor. For the GPU platforms, we propose two parallel strategies to explore the performance optimization opportunities for our stencil kernels. For Sandy Bridge CPUs and MIC, we also employ various optimization techniques in order to achieve the best performance. Although our stencil with 114 component variables poses several great challenges for performance optimization, and the low stencil ratio between computation and memory access is too inefficient to fully take advantage of our evaluated architectures, we manage to achieve performance efficiencies ranging from 4.730% to 20.02% of the theoretical peak. We also conduct cross-platform performance and power analysis (focusing on Kepler GPU and MIC) and the results could serve as insights for users selecting the most suitable accelerators for their targeted applications.
| Original language | English (US) |
|---|---|
| Pages (from-to) | 301-318 |
| Number of pages | 18 |
| Journal | International Journal of High Performance Computing Applications |
| Volume | 28 |
| Issue number | 3 |
| DOIs | |
| State | Published - Aug 1 2014 |
| Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Software
- Theoretical Computer Science
- Hardware and Architecture
Keywords
- 3D wave forward modeling
- Complex stencil
- Intel Xeon Phi
- Kepler GPU
- optimization techniques
- performance power analysis