Solving high-dimensional population balance models (PBMs) on CPUs is often computationally intractable. This study develops a scalable algorithm that parallelizes the nested loops inside the PBM on a GPU framework. The resulting PBM is unique in that it adapts to the size of the problem and uses the GPU cores accordingly. The algorithm is written in CUDA® and C/C++ and targets NVIDIA® GPUs. The major bottleneck of such algorithms is the communication time between the CPU and the GPU. In our studies, communication accounted for less than 1% of the total run time, and a maximum speedup of about 12 times over the serial CPU code was achieved. The GPU PBM also ran about twice as fast as the PBM's multi-core configuration on a desktop computer. Speed improvements are also reported for various CPU and GPU architectures and configurations.
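The idea of flattening a nested loop and sizing the launch to the problem can be illustrated with a minimal host-side C++ sketch. It is not the study's code: the names `LaunchConfig`, `configureLaunch`, `pairRate`, and `computeRates`, the 256-thread block size, and the constant-coefficient pair rate are all assumptions chosen for illustration. The two inner loops in `computeRates` stand in for what a CUDA launch would distribute across blocks and threads.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical launch configuration (names are illustrative, not from the study).
struct LaunchConfig {
    std::size_t blocks;
    std::size_t threadsPerBlock;
};

// Size the launch to the problem: enough blocks to cover all nBins * nBins
// bin pairs, using ceiling division. The default of 256 threads is an assumption.
LaunchConfig configureLaunch(std::size_t nBins, std::size_t threadsPerBlock = 256) {
    const std::size_t total = nBins * nBins;
    const std::size_t blocks = (total + threadsPerBlock - 1) / threadsPerBlock;
    return {blocks, threadsPerBlock};
}

// Work that a single GPU thread would do for flat index tid.
// The nested (i, j) loop is recovered from the flat index.
// The rate expression beta * N[i] * N[j] is a placeholder, not the study's kernel.
void pairRate(std::size_t tid, std::size_t nBins, double beta,
              const std::vector<double>& number, std::vector<double>& rate) {
    if (tid >= nBins * nBins) {
        return;  // padding threads in the last block do nothing
    }
    const std::size_t i = tid / nBins;
    const std::size_t j = tid % nBins;
    rate[tid] = beta * number[i] * number[j];
}

// CPU emulation of the launch: every (block, thread) pair runs pairRate once.
std::vector<double> computeRates(const std::vector<double>& number, double beta) {
    const std::size_t n = number.size();
    std::vector<double> rate(n * n, 0.0);
    const LaunchConfig cfg = configureLaunch(n);
    for (std::size_t b = 0; b < cfg.blocks; ++b) {
        for (std::size_t t = 0; t < cfg.threadsPerBlock; ++t) {
            pairRate(b * cfg.threadsPerBlock + t, n, beta, number, rate);
        }
    }
    return rate;
}
```

In a CUDA version of this sketch, `pairRate` would become a `__global__` kernel with `tid` computed from `blockIdx.x * blockDim.x + threadIdx.x`, and the block count from `configureLaunch` would go into the launch configuration. Keeping the input and output arrays on the device between steps is one common way to keep CPU-GPU communication small.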
All Science Journal Classification (ASJC) codes
- Chemical Engineering (all)
- Computer Science Applications
Keywords
- Parallel computing
- Population balance model