Advances in high throughput DNA sequence data compression

Authors: Sardaraz Muhammad; Tahir Muhammad; Ikram Ataul Aziz
Source: Journal of Bioinformatics and Computational Biology, 2016, 14(3): 1630002.
DOI: 10.1142/S0219720016300021

Abstract

Advances in high-throughput sequencing technologies and reductions in sequencing cost have led to exponential growth in high-throughput DNA sequence data. This growth poses challenges for the storage, retrieval, and transmission of sequencing data. Data compression is used to cope with these challenges, and various methods have been developed to compress genomic and sequencing data. In this article, we present a comprehensive review of methods for compressing genomes and sequencing reads. Algorithms are categorized as referential or reference-free. Experimental results and a comparative analysis of various data compression methods are presented. Finally, key challenges and research directions in DNA sequence data compression are highlighted.

  • Publication date: 2016-6