A Short-Read Multiplex Sequencing Method for Reliable, Cost-Effective and High-Throughput Genotyping in Large-Scale Studies

Authors: Cao, Hongzhi; Wang, Yu; Zhang, Wei; Chai, Xianghua; Zhang, Xiandong; Chen, Shiping; Yang, Fan; Zhang, Caifen; Guo, Yulai; Liu, Ying; Tang, Zhoubiao; Chen, Caifen; Xue, Yaxin; Zhen, Hefu; Xu, Yinyin; Rao, Bin; Liu, Tao; Zhao, Meiru; Zhang, Wenwei; Li, Yingrui; Zhang, Xiuqing; Tellier, Laurent C. A. M.; Krogh, Anders; Kristiansen, Karsten; Wang, Jun; Li, Jian*
Source: Human Mutation, 2013, 34(12): 1715-1720.
DOI: 10.1002/humu.22439

Abstract

Accurate genotyping is important for genetic testing. Sanger sequencing-based typing is the gold standard for genotyping, but it has been underused because of its high cost and low throughput. In contrast, short-read sequencing is inexpensive and high-throughput, and holds great promise for cost-effective, high-throughput genotyping. However, the short read length and the paucity of appropriate genotyping methods pose a major challenge. Here, we present RCHSBT (reliable, cost-effective and high-throughput sequence-based typing), a pipeline that takes short sequence reads as input but, by using a unique variant-calling and haploid sequence assembly algorithm, can accurately genotype with a greater effective length per amplicon than even Sanger sequencing reads. The RCHSBT method was tested on the human MHC loci HLA-A, HLA-B, HLA-C, HLA-DQB1, and HLA-DRB1 in 96 samples using Illumina PE 150 (paired-end, 150 bp) reads. Amplicons as long as 950 bp were readily genotyped, achieving 100% typing concordance between RCHSBT-called genotypes and genotypes previously called by Sanger sequencing. Compared with Sanger sequencing-based genotyping, RCHSBT increased genotyping throughput more than 10-fold and reduced cost more than fivefold. We thus demonstrate that RCHSBT is a genotyping method comparable in quality to Sanger sequencing-based typing, while being more cost-effective and higher in throughput.