Accelerating 2D orthogonal matching pursuit algorithm on GPU

作者:Dai, Yuan; He, Dongjian*; Fang, Yong; Yang, Long
来源:Journal of Supercomputing, 2014, 69(3): 1363-1381.
DOI:10.1007/s11227-014-1188-8

摘要

Two-dimensional orthogonal matching pursuit (2D-OMP) algorithm is an extension of the one-dimensional OMP (1D-OMP), whose complexity and memory usage are lower than the 1D-OMP when they are applied to 2D sparse signal recovery. However, the major shortcoming of the 2D-OMP still resides in long computing time. To overcome this disadvantage, we develop a novel parallel design strategy of the 2D-OMP algorithm on a graphics processing unit (GPU) in this paper. We first analyze the complexity of the 2D-OMP and point out that the bottlenecks lie in matrix inverse and projection. After adopting the strategy of matrix inverse update whose performance is superior to traditional methods to reduce the complexity of original matrix inverse, projection becomes the most time-consuming module. Hence, a parallel matrix-matrix multiplication leveraging tiling algorithm strategy is launched to accelerate projection computation on GPU. Moreover, a fast matrix-vector multiplication, a parallel reduction algorithm, and some other parallel skills are also exploited to boost the performance of the 2D-OMP further on GPU. In the case of the sensing matrix of size 128 256 (176 256, resp.) for a 256 256 scale image, experimental results show that the parallel 2D-OMP achieves 17 to 41 (24 to 62, resp.) speedup over the original C code compiled with the O optimization option. Higher speedup would be further obtained with larger-size image recovery.