Abstract

The multifrontal method is an efficient direct method for solving large-scale sparse and unsymmetric linear systems. The method transforms a large sparse matrix factorization process into a sequence of factorizations involving smaller dense frontal matrices. Some of these dense operations can be accelerated by using a graphics processing unit (GPU). We analyze the unsymmetric multifrontal method from both algorithmic and implementation perspectives to see how a GPU, in particular the NVIDIA Tesla C2070, can be used to accelerate the computations. Our main acceleration strategies include (i) performing BLAS operations on both the CPU and the GPU, (ii) improving the communication efficiency between the CPU and the GPU by using page-locked memory, zero-copy memory, and asynchronous memory copy, and (iii) a modified algorithm that reuses memory between different GPU tasks and sets thresholds to determine whether certain tasks should be performed on the GPU. The proposed acceleration strategies are implemented by modifying UMFPACK, an unsymmetric multifrontal linear system solver. Numerical results show that the CPU-GPU hybrid approach can accelerate the unsymmetric multifrontal solver, especially for computationally expensive problems.

  • Publication date: 2011-12