Abstract

We experimentally investigated more than 1,200 dbSNP entries predicted to change amino acids (nsSNPs), using various subsets of DNA samples drawn from 18 global populations (~1,000 subjects in total). First, we mined the data for any SNP features that correlated with a high validation rate. Useful predictors of valid SNPs included multiple submissions to dbSNP, having a dbSNP validation statement, and being present in only a small number of ESTs. Together, these features improved validation rates by almost 10-fold. Higher-abundance SNPs (e.g., T/C variants) also validated more frequently. Second, we considered derived alleles and noted a considerably higher (~10%) average derived allele frequency (DAF) in Europeans than in Africans, plus a further increase in some other populations. This was not primarily due to SNP ascertainment bias, nor to the effects of natural selection. Instead, it can be explained as a drift-based, progressive increase in DAF that occurs over many generations and becomes exaggerated during population bottlenecks. This observation could be used as the basis for novel DAF-based tests for comparing demographic histories. Finally, we considered individual marker patterns and identified 37 SNPs with allele frequency variance or F_ST values consistent with the effects of population-specific natural selection. Four particularly striking clusters of these markers were apparent, and three of these coincide with genes or regions from among only several dozen such domains previously suggested by others to carry signatures of selection.
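For readers unfamiliar with the two summary statistics named above, the following is a minimal sketch of how an average DAF and a per-SNP F_ST might be computed from per-population allele frequencies. It is not the authors' pipeline: the function names `mean_daf` and `fst` are hypothetical, the F_ST shown is the simple heterozygosity-based form (H_T - H_S) / H_T with equal population weights and no sample-size correction, and the input frequencies are made-up toy values.

```python
from statistics import mean


def mean_daf(derived_freqs):
    """Average derived allele frequency (DAF) over a set of SNPs
    for a single population. Input: one derived-allele frequency per SNP."""
    return mean(derived_freqs)


def fst(pop_freqs):
    """Heterozygosity-based F_ST for one biallelic SNP.

    pop_freqs holds the frequency of the same allele in each population.
    Populations are weighted equally (an assumption of this sketch).
    """
    p_bar = mean(pop_freqs)
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # allele fixed or absent everywhere: no differentiation
    h_within = mean([2 * p * (1 - p) for p in pop_freqs])  # mean within-population
    return (h_total - h_within) / h_total


# Toy, made-up frequencies purely for illustration.
toy_population_a = [0.10, 0.30, 0.50]
toy_population_b = [0.20, 0.40, 0.60]
print(mean_daf(toy_population_a), mean_daf(toy_population_b))
print(fst([0.2, 0.8]))
```

An SNP with identical frequencies in all populations gives F_ST = 0 under this definition, while an allele fixed in one population and absent in another gives F_ST = 1; markers of the kind flagged in the abstract would sit toward the upper end of that range.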

  • Publication date: 2006-2