Abstract

Parallel computing is an important technology for scientific and engineering applications. LU-SGS (Lower-Upper Symmetric Gauss-Seidel) is an efficient and robust scheme for CFD (computational fluid dynamics), but its computation has strong data dependence. In this paper, we present an efficient wavefront parallel algorithm for 3D (three-dimensional) LU-SGS on structured meshes. The corresponding data structure and memory access method are designed for better data locality and optimized communication. The performance of the presented parallel algorithm is reported for different problem sizes, and performance issues are discussed. The results show that the overall speedup on one Intel E5540 CPU (4 CPU cores) ranges from 2.23 to 2.95 compared with one E5540 core. On a distributed-memory cluster system, the parallel efficiency with 1024 and 128 processes reaches up to 35.68% and 72.69%, respectively, relative to 32 processes. A CFD simulation of the M6 wing model demonstrates the effectiveness of the presented parallel algorithm.
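As a rough illustration of the wavefront idea, the sketch below groups the cells of a structured 3D mesh by the hyperplane index i + j + k. It assumes the usual LU-SGS forward-sweep dependence, in which cell (i, j, k) needs the already updated values at (i-1, j, k), (i, j-1, k), and (i, j, k-1); under that assumption all cells on one hyperplane are mutually independent and could be updated concurrently. The function names `wavefronts` and `forward_sweep_order_is_valid` are hypothetical and are not taken from the paper, and the sketch does not reproduce the paper's data structure or communication scheme.

```python
from collections import defaultdict


def wavefronts(nx, ny, nz):
    """Group cell indices (i, j, k) of an nx*ny*nz mesh by hyperplane i + j + k."""
    planes = defaultdict(list)
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                planes[i + j + k].append((i, j, k))
    # Hyperplane indices run from 0 to (nx - 1) + (ny - 1) + (nz - 1).
    return [planes[s] for s in range(nx + ny + nz - 2)]


def forward_sweep_order_is_valid(fronts):
    """Check that each cell's lower neighbours lie in an earlier wavefront."""
    done = set()
    for front in fronts:
        for (i, j, k) in front:
            for nb in ((i - 1, j, k), (i, j - 1, k), (i, j, k - 1)):
                # Neighbours with a negative index fall outside the mesh.
                if min(nb) >= 0 and nb not in done:
                    return False
        # Cells of a front become available only after the whole front is done,
        # so no cell may depend on another cell of the same front.
        done.update(front)
    return True


if __name__ == "__main__":
    fronts = wavefronts(3, 4, 5)
    print(len(fronts), "wavefronts,", sum(len(f) for f in fronts), "cells")
    print("valid forward order:", forward_sweep_order_is_valid(fronts))
```

The backward (upper) sweep would traverse the same hyperplanes in reverse order. The available parallelism varies with the size of each hyperplane, which is small near the mesh corners and largest in the middle of the sweep.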