Abstract

Background/Aims: Current linkage studies detect and localize trait loci using genotypes sampled at hundreds of thousands of single nucleotide polymorphisms (SNPs). Such data should provide precise estimates of trait location once linkage has been established. However, correlations between nearby SNPs can distort the information about trait location. Three approaches have traditionally been used to address this problem: (1) ignore the correlation; (2) approximate the correlation; or (3) analyze a single, approximately uncorrelated subset of the original dense data. Methods: Here, we examine and test a simple and efficient estimator of trait location that averages location estimates across random subsamples of the original dense data. Based on pairwise estimates of correlation, we ensure that the SNPs within each subsample are approximately uncorrelated. In addition, we use the nonparametric bootstrap procedure to compute narrow, high-resolution candidate gene regions (i.e. confidence intervals for the true trait location). Results: Using simulated data, we show that the three existing approaches to dense SNP linkage analysis (described above) can yield biased and/or inefficient estimation, depending on the underlying correlation structure. With respect to mean squared error, our estimator outperforms the third approach and is at least as good as, and usually better than, the first and second approaches. In an analysis of 15 hypertension families genotyped at approximately 500,000 SNPs, our estimator reduced the candidate gene region length by 47.5% relative to the third approach. Conclusion: The method we developed will be an important tool for constructing high-resolution candidate gene regions that could ultimately aid in targeting regions for sequencing projects.
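The subsample-averaging estimator and bootstrap interval described in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`thin_snps`, `averaged_location`, `bootstrap_interval`) are hypothetical, as are the two callbacks: `r2(i, j)`, standing in for a pairwise correlation estimate between SNPs `i` and `j`, and `estimate_location(families, subset)`, standing in for whatever linkage analysis returns a trait location from a SNP subset. The greedy thinning rule, the threshold value, the resampling of whole families as the bootstrap unit, and the percentile form of the interval are also assumptions that the abstract does not specify.

```python
import random
from statistics import mean


def thin_snps(n_snps, r2, threshold, rng):
    """Draw one random subset of SNP indices with pairwise r2 below threshold.

    Assumption: greedy thinning in random order; the abstract only says the
    subsamples are approximately uncorrelated based on pairwise estimates.
    """
    order = list(range(n_snps))
    rng.shuffle(order)
    kept = []
    for i in order:
        # Keep SNP i only if it is weakly correlated with every SNP kept so far.
        if all(r2(i, j) < threshold for j in kept):
            kept.append(i)
    return sorted(kept)


def averaged_location(families, n_snps, r2, estimate_location,
                      n_subsamples=100, threshold=0.05, rng=None):
    """Average the location estimate over many random thinned subsamples."""
    rng = rng or random.Random()
    estimates = []
    for _ in range(n_subsamples):
        subset = thin_snps(n_snps, r2, threshold, rng)
        # estimate_location is a hypothetical user-supplied linkage analysis.
        estimates.append(estimate_location(families, subset))
    return mean(estimates)


def bootstrap_interval(families, statistic, n_boot=200, alpha=0.05, rng=None):
    """Percentile bootstrap interval for statistic(families).

    Assumption: families are resampled with replacement as whole units.
    """
    rng = rng or random.Random()
    boots = []
    for _ in range(n_boot):
        resample = [rng.choice(families) for _ in families]
        boots.append(statistic(resample))
    boots.sort()
    lower = boots[int((alpha / 2) * (n_boot - 1))]
    upper = boots[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper
```

To obtain a candidate region under these assumptions, one would pass a closure over `averaged_location` as the `statistic` argument of `bootstrap_interval`, so that each bootstrap replicate repeats the full subsample-averaging step on the resampled families.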

  • Publication date: 2010