Abstract

It is well known that graphics processing units (GPUs) can accelerate highly parallelizable algorithms with a high speedup. However, for less-parallelizable algorithms such as the finite element method, novel schemes are needed to achieve a comparable speedup. In this paper, the dual-field domain decomposition (DFDD) method based on element-level decomposition (DFDD-ELD) is accelerated on a large GPU cluster. By using element-level subdomains, the DFDD-ELD computation can be easily mapped onto the GPU's granular processors and is thus highly parallelizable. Various electromagnetic problems are simulated to demonstrate the speedup and scalability of DFDD-ELD on a GPU cluster. With a careful GPU memory arrangement and thread allocation, we are able to achieve a significant speedup by utilizing GPUs in a message-passing interface (MPI)-based cluster environment. The same acceleration strategy can be applied to discontinuous Galerkin time-domain (DGTD) algorithms.

  • Publication date: 2014-9