Determining the quality and complexity of next-generation sequencing data without a reference genome

Authors: Anvar Seyed Yahya*; Khachatryan Lusine; Vermaat Martijn; van Galen Michiel; Pulyakhina Irina; Ariyurek Yavuz; Kraaijeveld Ken; den Dunnen Johan T; de Knijff Peter; 't Hoen Peter AC; Laros Jeroen F J
Source: Genome Biology, 2014, 15(12): 555.
DOI: 10.1186/s13059-014-0555-3

Abstract

We describe an open-source kPAL package that facilitates an alignment-free assessment of the quality and comparability of sequencing datasets by analysing k-mer frequencies. We show that kPAL can detect technical artifacts such as high duplication rates, library chimeras, contamination, and differences in library preparation protocols. kPAL also successfully captures the complexity and diversity of microbiomes and provides a powerful means to study changes in microbial communities. Together, these features make kPAL an attractive and broadly applicable tool to determine the quality and comparability of sequence libraries even in the absence of a reference sequence. kPAL is freely available at https://github.com/LUMC/kPAL.
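
To make the idea of alignment-free comparison by k-mer frequencies concrete, the following is a minimal sketch in plain Python. It is not kPAL's API or its actual distance measure: the function names (`kmer_profile`, `normalised`, `profile_distance`), the choice of Euclidean distance, and the default `k=3` are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(reads, k):
    """Count every k-mer (over A, C, G, T) in a collection of reads."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def normalised(counts, k):
    """Turn raw counts into relative frequencies over all 4**k k-mers."""
    total = sum(counts.values())
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return [counts[km] / total if total else 0.0 for km in kmers]


def profile_distance(reads_a, reads_b, k=3):
    """Euclidean distance between two normalised k-mer profiles.

    Hypothetical measure chosen for simplicity; it is not claimed
    to be the one kPAL uses.
    """
    fa = normalised(kmer_profile(reads_a, k), k)
    fb = normalised(kmer_profile(reads_b, k), k)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))
```

Under these assumptions, two libraries with similar composition give a distance near zero, while a contaminated or heavily duplicated library would be expected to shift its k-mer frequencies and so move away from the others. No reference genome is involved at any point.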

  • Publication date: 2014