Accelerating pairwise statistical significance estimation for local alignment by harvesting GPU's power

Authors: Zhang, Yuhong*; Misra, Sanchit; Agrawal, Ankit; Patwary, Md Mostofa Ali; Liao, Wei-keng; Qin, Zhiguang; Choudhary, Alok
Source: BMC Bioinformatics, 2012, 13: S3.
DOI: 10.1186/1471-2105-13-S5-S3

Abstract

Background: Pairwise statistical significance has been shown to identify related sequences accurately, a cornerstone procedure in numerous bioinformatics applications. However, estimating it is both computationally intensive and data intensive, which poses a major challenge for performance and scalability.

Results: We present a GPU implementation that accelerates pairwise statistical significance estimation of local sequence alignment using standard substitution matrices. By carefully studying the algorithm's data access characteristics, we developed a tile-based scheme that produces contiguous data access in GPU global memory and sustains a large number of threads to achieve high GPU occupancy. We further extend the parallelization technique to estimate pairwise statistical significance using position-specific substitution matrices, an approach that has earlier demonstrated significantly better sequence comparison accuracy than using standard substitution matrices. The implementation is also extended to take advantage of dual GPUs. We observe end-to-end speedups of nearly 250x using a single Tesla C2050 GPU and nearly 370x using dual Tesla C2050 GPUs, relative to the CPU implementation on an Intel Core i7 920 processor.

Conclusions: Harvesting the high performance of modern GPUs is a promising approach to accelerate pairwise statistical significance estimation for local sequence alignment.