Assessing similarity of DNA profiles

Authors: Hepworth Graham*; Gordon Ian
Source: Journal of the Royal Statistical Society - Series C: Applied Statistics, 2011, 60: 125-133.
DOI: 10.1111/j.1467-9876.2010.00742.x

Abstract

The genetic similarity of strains of a pathogen can be assessed by using a matrix of dissimilarities that is derived from bands in their DNA profile which are present or absent. The dependence between elements of the dissimilarity matrix, if not accounted for, results in underestimation of the variance in comparisons between groups of strains which are differentiated according to the possession of an attribute. We examine a previously proposed statistic for determining whether a group of strains is more similar than expected. We show the limitations of this statistic and propose a new statistic which better addresses the hypotheses that are usually considered in this field of study. The statistic proposed is based on similarity between strains within the group of interest and with those outside. This statistic also needs to account for the dependence in the raw data, and we use the correlation between elements of the dissimilarity matrix to investigate how this dependence affects the underestimation of the variance. Using examples involving the pathogenic yeast Candida, we show how permutation tests can be applied to the differentiation of groups of strains.
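To make the idea of a permutation test on a dissimilarity matrix concrete, the following is a minimal sketch and not the authors' procedure. It assumes a Jaccard distance on band presence/absence and an illustrative statistic (mean dissimilarity within the group of interest minus mean dissimilarity between the group and the remaining strains); the exact statistic proposed in the paper is not given in this abstract. All function names and the toy profiles are hypothetical. Group labels are permuted across whole strains rather than across individual matrix elements, so the dependence between elements that share a strain is preserved under each permutation.

```python
import random
from itertools import combinations


def dissimilarity(a, b):
    """Jaccard distance between two presence/absence band profiles (assumed measure)."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - shared / either


def dissimilarity_matrix(profiles):
    """Symmetric matrix of pairwise dissimilarities between strains."""
    n = len(profiles)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = dissimilarity(profiles[i], profiles[j])
    return d


def group_statistic(d, in_group):
    """Illustrative statistic: mean within-group dissimilarity minus
    mean dissimilarity between group members and outside strains.
    Needs at least two strains in the group and one outside it."""
    inside = [i for i, g in enumerate(in_group) if g]
    outside = [i for i, g in enumerate(in_group) if not g]
    within = [d[i][j] for i, j in combinations(inside, 2)]
    between = [d[i][j] for i in inside for j in outside]
    return sum(within) / len(within) - sum(between) / len(between)


def permutation_test(profiles, in_group, n_perm=2000, seed=0):
    """One-sided test: is the group more similar than expected by chance?"""
    rng = random.Random(seed)
    d = dissimilarity_matrix(profiles)
    observed = group_statistic(d, in_group)
    labels = list(in_group)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # reassign the attribute to strains at random
        if group_statistic(d, labels) <= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: rows are strains, columns are bands (1 = present).
profiles = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 1, 0, 1, 0, 1],
]
in_group = [True, True, True, False, False, False, False]

observed, p_value = permutation_test(profiles, in_group)
print(f"statistic = {observed:.3f}, permutation p-value = {p_value:.3f}")
```

A strongly negative statistic with a small p-value would suggest that strains sharing the attribute are more alike than a random set of the same size. With only a handful of strains the number of distinct label assignments is small, which limits how small the p-value can be.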

  • Publication date: 2011