Abstract

We have examined the length distribution of perfect dimer repeats, where perfect means uninterrupted by any other base, using data from GenBank on primates and rodents. Virtually no lengths greater than 30 repeats are found, except for rodent AG repeats, which extend to 35. Comparable numbers of long AC and AG repeats suggest that they have not been selected for special functions or DNA structures. We have compared the data with predictions of two models: (1) a Bernoulli Model in which bases are assumed equally likely and distributed at random and (2) an Unbiased Random Walk Model (URWM) in which repeats are permitted to change length by plus or minus one unit, with equal probabilities, and in which base substitutions are allowed to destroy long perfect repeats, producing two shorter perfect repeats. The source of repeats is assumed to be single base substitutions in neighboring sequences, i.e., those differing from the perfect repeat by a single base. Mutation rates either independent of repeat length or proportional to length were considered. An upper limit to the lengths, L ≈ 30, is assumed and isolated dimers are assumed unable to expand, so that there are absorbing barriers to the random walk at lengths 1 and L + 1, and a steady state of lengths is reached. With these assumptions and estimated values for the rates of length mutation and base substitution, reasonable agreement is found with the data for lengths > 5 repeats. Shorter repeats, of lengths ≤ 3, are in general agreement with the Bernoulli Model. By reducing the rate of length mutations for n ≤ 5, it is possible to obtain reasonable agreement with the full range of data. For these reduced rates, the times between length mutations become comparable to those suggested for a bottleneck in the evolution of Homo sapiens, which may be the reason for low heterozygosity of short repeats.
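
The steady state of an unbiased random walk between absorbing barriers can be illustrated with a minimal sketch. It is not the model fitted in the paper. It assumes a length-independent mutation rate, omits the base substitutions that split long repeats, and feeds new repeats in at a single length. The function name and the parameters `source_len`, `source_rate`, `mutation_rate`, and `sweeps` are hypothetical and chosen only for illustration.

```python
def steady_state_lengths(max_len=30, source_len=2, source_rate=1.0,
                         mutation_rate=1.0, sweeps=20000):
    """Relative steady-state abundance of repeats of each length.

    Simplifying assumptions (not taken from the paper):
      * every length mutation changes the length by +1 or -1, each with
        probability 1/2, at a rate independent of length;
      * new repeats enter only at `source_len`, at rate `source_rate`;
      * base substitutions that split repeats are ignored.
    Lengths 1 and max_len + 1 are absorbing, so their abundance stays 0.
    """
    n = [0.0] * (max_len + 2)  # n[k] is the abundance of length k
    for _ in range(sweeps):
        # Gauss-Seidel relaxation of the balance condition
        # inflow + (rate / 2) * (n[k-1] + n[k+1]) = rate * n[k]
        for k in range(2, max_len + 1):
            inflow = source_rate if k == source_len else 0.0
            n[k] = inflow / mutation_rate + 0.5 * (n[k - 1] + n[k + 1])
    return n


if __name__ == "__main__":
    abundance = steady_state_lengths()
    for length in (2, 5, 10, 20, 30):
        print(length, round(abundance[length], 4))
```

Under these assumptions the abundance falls off linearly from the source length to zero at the upper barrier L + 1. Adding the substitution process or a length-proportional mutation rate, both of which the abstract considers, would change the shape of the distribution.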

  • Publication date: 1997-4
