Abstract

Read mapping is a key task in next-generation sequencing (NGS) data analysis. To achieve an optimal combination of accuracy, speed, and low memory footprint, popular mapping tools often focus on identifying one or a few best mapping locations for each read. However, for many downstream analyses, such as prediction of genomic variants or protein binding motifs located in repeat regions, isoform expression quantification, and metagenomics analysis, it is more desirable to have a comprehensive set of all possible mapping locations of NGS reads. In this paper, we introduce AMAS, a read mapping tool that exhaustively searches for possible mapping locations of NGS reads in a reference sequence within a given edit distance. AMAS features improvements to the mapping, partition, and filtration of adaptive seeds to speed up read mapping. Performance results on simulated and real data sets show that AMAS runs several times faster than other state-of-the-art read mappers while achieving similar sensitivity and accuracy. AMAS is implemented in C++ and is freely available at https://sourceforge.net/projects/ngsamas/.
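
To make the all-mapping problem statement concrete, the following is a minimal sketch of reporting every location in a reference where a read aligns within a given edit distance. It is not the AMAS algorithm: it uses plain semi-global dynamic programming (Sellers' algorithm) in O(read length × reference length) time, with no seeding or filtration, and is written in Python for brevity. The function name `all_mapping_ends` and its parameters are hypothetical and chosen only for illustration.

```python
def all_mapping_ends(reference: str, read: str, max_edits: int) -> list[tuple[int, int]]:
    """Report every (end, distance) pair such that some substring of
    `reference` ending at 1-based position `end` is within `max_edits`
    edits (substitutions, insertions, deletions) of `read`.

    Illustrative brute-force baseline only; not the adaptive-seed
    method described in the abstract.
    """
    m = len(read)
    # col[i] = best edit distance between read[:i] and any substring of
    # the reference ending at the current reference position.
    col = list(range(m + 1))
    hits = []
    for end, ref_char in enumerate(reference, start=1):
        prev_diag = col[0]
        col[0] = 0  # an alignment may start anywhere in the reference
        for i in range(1, m + 1):
            cost = 0 if read[i - 1] == ref_char else 1
            best = min(
                prev_diag + cost,  # match or substitution
                col[i] + 1,        # extra reference character
                col[i - 1] + 1,    # extra read character
            )
            prev_diag = col[i]
            col[i] = best
        if col[m] <= max_edits:
            hits.append((end, col[m]))
    return hits
```

With `max_edits=0` on a repetitive reference such as `"AAAA"` and the read `"AA"`, the sketch reports all three end positions (2, 3, and 4) rather than a single best hit, which is the behavior an all-mapper aims for in repeat regions.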

  • Publication date: 2016-8
  • Affiliation: Nanyang Institute of Technology