Abstract

Analyzing polygenic traits to map quantitative trait loci (QTL) remains an important challenge. QTL analysis requires two or more strains of an organism that differ substantially in the polygenic trait of interest; crossing them yields heterozygous offspring. Offspring exhibiting the trait of interest are selected and then screened for molecular markers, such as single-nucleotide polymorphisms (SNPs), using next-generation sequencing. Gene mapping relies on the co-segregation of genes and/or markers: those linked to a QTL influencing the trait segregate with that locus more frequently. For each identified marker, the observed mismatch frequencies between the reads of the offspring and the parental reference strains can be modeled by a multinomial distribution whose probabilities depend on the state of an underlying, unobserved Markov process. The states indicate whether or not the SNP lies in, or in the vicinity of, a QTL. Consequently, genomic loci associated with the QTL can be discovered by analyzing the hidden states along the genome. This hidden Markov model assumes that the identified SNPs are evenly spaced along the chromosome and does not take the distance between neighboring SNPs into account, even though that distance could influence the chance of co-segregation between genes and markers. To address this issue, we propose a nonhomogeneous hidden Markov model whose transition matrix depends on a set of distance-varying observed covariates. The application of the model is illustrated with data from a study of ethanol tolerance in yeast.
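
The idea of a distance-dependent transition matrix can be illustrated with a minimal sketch. It is not the model fitted in the paper: the two-state layout, the logistic link on log-distance, the three read categories, and every parameter value and function name below are assumptions chosen only for illustration.

```python
import math

def transition_matrix(distance_bp, intercepts=(-4.0, -4.0), slopes=(0.5, 0.5)):
    """2x2 transition matrix for states 0 (no QTL nearby) and 1 (near a QTL).

    Assumed parameterization: the probability of switching state follows a
    logistic function of the log-distance to the previous SNP.
    """
    x = math.log(distance_bp)
    rows = []
    for state in (0, 1):
        p_switch = 1.0 / (1.0 + math.exp(-(intercepts[state] + slopes[state] * x)))
        row = [0.0, 0.0]
        row[state] = 1.0 - p_switch      # stay in the current state
        row[1 - state] = p_switch        # move to the other state
        rows.append(row)
    return rows

def multinomial_log_pmf(counts, probs):
    """Log-probability of read counts under a multinomial (all probs > 0)."""
    n = sum(counts)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_coef + sum(c * math.log(p) for c, p in zip(counts, probs))

def forward_log_likelihood(counts_per_snp, distances_bp, emission_probs,
                           initial=(0.5, 0.5)):
    """Scaled forward algorithm with a different transition matrix per SNP gap.

    distances_bp[t] is the gap between SNP t and SNP t + 1.
    Returns the log-likelihood and the filtered state probabilities at the
    last SNP. Intended for modest read counts; very deep coverage would need
    a fully log-space implementation to avoid underflow.
    """
    alpha = [initial[s] * math.exp(multinomial_log_pmf(counts_per_snp[0],
                                                       emission_probs[s]))
             for s in (0, 1)]
    scale = sum(alpha)
    log_lik = math.log(scale)
    alpha = [a / scale for a in alpha]
    for counts, dist in zip(counts_per_snp[1:], distances_bp):
        trans = transition_matrix(dist)  # nonhomogeneous: depends on the gap
        pred = [sum(alpha[i] * trans[i][j] for i in (0, 1)) for j in (0, 1)]
        alpha = [pred[j] * math.exp(multinomial_log_pmf(counts, emission_probs[j]))
                 for j in (0, 1)]
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik, alpha

# Hypothetical read categories per SNP: (matches parent 1, matches parent 2, neither).
counts = [(10, 9, 1), (18, 2, 0), (19, 1, 0)]
gaps = [500.0, 20000.0]                       # base pairs between neighboring SNPs
emissions = [(0.49, 0.49, 0.02),              # state 0: alleles roughly balanced
             (0.90, 0.08, 0.02)]              # state 1: skewed toward parent 1
log_lik, filtered = forward_log_likelihood(counts, gaps, emissions)
print(log_lik, filtered)
```

With positive slopes, a larger gap between neighboring SNPs makes a change of hidden state more likely, which is the qualitative behavior a homogeneous transition matrix cannot express.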

  • Publication date: 2015-02-01