Abstract

Apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3 (APOBEC3) is a well-characterized enzyme that attacks the genomes of viruses such as HIV, SIV, and HBV during their replication cycle, causing context-dependent G-to-A changes referred to as "hypermutation." Several methods analyze and describe these hypermutation sites by aligning affected sequences to a reference sequence. In our previous study, we demonstrated that indels (insertions/deletions) in the sequences lead to incorrect assignment of APOBEC3 target and non-target sites, which can result in misidentification of hypermutated sequences and in erroneous biological inferences based on hypermutation analysis. Although several approaches have been developed to analyze hypermutated sequences, no method has yet been developed to detect hypermutated reads in FASTQ and BAM files. In this study, we propose a method, based on our recently proposed method (G2A3), that can identify hypermutated reads in FASTQ and BAM datasets.

Full text