Supernodal sparse Cholesky factorization on graphics processing units

Zou, Dan*; Dou, Yong; Guo, Song; Li, Rongchun; Deng, Lin

doi:10.1002/cpe.3158

Abstract

Sparse Cholesky factorization is the most computationally intensive component in solving large sparse linear systems and is the core algorithm of numerous scientific computing applications. A large number of sparse Cholesky factorization algorithms have previously emerged, exploiting architectural features of various computing platforms. The recent use of graphics processing units (GPUs) to accelerate structured parallel applications shows the potential to achieve significant acceleration relative to desktop performance. However, sparse Cholesky factorization on GPUs has not been explored sufficiently because of the complexity involved in its efficient implementation and concerns about low GPU utilization. In this paper, we present a new approach for sparse Cholesky factorization on GPUs. We present the organization of the sparse matrix supernode data structure for the GPU and propose a queue-based approach for the generation and scheduling of GPU tasks with dense linear algebraic operations. We also design a subtree-based parallel method for multi-GPU systems. These approaches increase GPU utilization, resulting in a substantial reduction in computation time. Comparisons are made with existing parallel solvers by using problems arising from practical applications. The experimental results show that the proposed approaches can substantially improve sparse Cholesky factorization performance on GPUs. Relative to a highly optimized parallel algorithm on a 12-core node, we obtained speedups in the range 1.59x to 2.31x by using one GPU and 1.80x to 3.21x by using two GPUs. Relative to a state-of-the-art solver based on the supernodal method for CPU-GPU heterogeneous platforms, we obtained speedups in the range 1.52x to 2.30x by using one GPU and 2.15x to 2.76x by using two GPUs. Concurrency and Computation: Practice and Experience, 2013.
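
The abstract names the supernode data structure but does not describe it. As background only, the following is a minimal CPU-side Python sketch of the textbook notion of a supernode: a run of consecutive columns of the Cholesky factor L that share the same below-diagonal nonzero pattern and can therefore be stored and updated as one dense block. It is not the authors' GPU implementation; the function names `symbolic_cholesky` and `find_supernodes` and the 5x5 example pattern are hypothetical and chosen purely for illustration.

```python
def symbolic_cholesky(a_cols):
    """Compute the nonzero pattern of L and the elimination tree.

    a_cols[j] is the set of row indices i > j with A[i, j] != 0
    (strictly lower-triangular pattern of a symmetric matrix A).
    Returns (l_cols, parent), where l_cols[j] is the below-diagonal
    pattern of column j of L and parent[j] is j's parent in the
    elimination tree (-1 for a root).
    """
    n = len(a_cols)
    l_cols = [set(rows) for rows in a_cols]
    parent = [-1] * n
    for j in range(n):
        if l_cols[j]:
            # The parent is the first off-diagonal nonzero row of column j.
            p = min(l_cols[j])
            parent[j] = p
            # Column j's remaining rows appear as fill in the parent column.
            l_cols[p] |= l_cols[j] - {p}
    return l_cols, parent


def find_supernodes(l_cols):
    """Group consecutive columns with nested identical patterns.

    Column j joins the supernode of column j - 1 when the pattern of
    column j - 1 equals {j} plus the pattern of column j.
    Returns a list of (first_column, last_column) pairs.
    """
    n = len(l_cols)
    supernodes = []
    start = 0
    for j in range(1, n + 1):
        if j == n or l_cols[j - 1] != l_cols[j] | {j}:
            supernodes.append((start, j - 1))
            start = j
    return supernodes


# Hypothetical 5x5 symmetric pattern, strictly lower triangle by column.
a_cols = [{2, 4}, {3}, {3}, {4}, set()]
l_cols, parent = symbolic_cholesky(a_cols)
supernodes = find_supernodes(l_cols)
print(l_cols)      # pattern of L, including one fill entry at (4, 2)
print(parent)      # elimination tree
print(supernodes)  # columns 2..4 form a single supernode
```

In this example, columns 2 to 4 end up with nested patterns and form one dense block, which is the kind of unit that dense linear algebra kernels can operate on. The elimination tree in `parent` is also the structure that subtree-based parallel schemes typically partition, although how the paper assigns subtrees to GPUs is not stated in the abstract.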

Publication date: 2014-11
Affiliation: National University of Defense Technology, Chinese People's Liberation Army

Last updated: 2019-08-15 08:50