摘要

The parallel higher-order method of moments (HoMoM) with a GPU accelerated out-of-core LU solver is presented for analysis of radiation characteristics of a 1000-element antenna array over a full-size airplane. A parallel framework involving MPI and CUDA is adopted to ensure that the procedures run on a hybrid CPU/GPU cluster. An efficient two-level out-of-core scheme is designed to break the bottleneck of both GPUmemory and physical memory when solving electrically large and complex problems. To hide communication time between CPU and GPU, asynchronous communications are chosen to enable overlapping between communication and computation. For large problems that cannot fit in GPU memory or physical memory, the two-level out-of-core LU solver is able to achieve a speedup of about 1.6x over the traditional out-of-core LU solver based on a highly optimized math library.