Abstract

Next-generation sequencing (NGS) technologies have boosted genomic and medical research, particularly the identification of disease-causing variants. Although most types of genetic variants can be identified through NGS data analysis, some remain difficult to detect, such as length variations of short tandem repeats (STRs). Many genetic diseases, especially neurological disorders such as Huntington disease, are known to be caused by STR expansions. However, almost no existing tool can detect, from NGS data, STRs that have expanded beyond the sequencing read length. To overcome this limitation, we developed a novel method for detecting length variations of STRs and estimating the length of expansions from paired-end NGS data. We applied our method in a clinical study of motor neuron disease using whole-exome sequencing and successfully identified a disease-causing STR expansion. Our method is the first to use special features of the depth of read coverage at STRs to address this variant calling problem. It has broad application value in human genetic disease research and may inspire the development of new NGS data processing tools.

Full text