Abstract

In mammalian DNA, the cytosines in CpG dinucleotides are very likely to be methylated, and deamination then converts the methylated C into T. Methylation is suppressed, however, in regions around genes called CpG islands, where CpG appears relatively frequently. Because CpG islands are known to occur in significant parts of the genome, the ability to identify them helps locate regions of interest along the genome. Locating CpG islands with deterministic algorithms is very costly. We propose a stochastic algorithm to recognize and locate CpG islands in a gene sequence. We first use a hidden Markov model (HMM) as a graph model to represent the CpG-island problem. Based on the HMM, we construct the corresponding CpG-Boltzmann model to recognize CpG islands. Since the learning algorithm of the Boltzmann model is guaranteed to reach the global minimum, the proposed model can recognize CpG islands with the least bias. Finally, we report experiments showing that the proposed model achieves a better false positive rate than deterministic algorithms.
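To make "CpG appears relatively frequently" concrete, the sketch below computes two window statistics often used in deterministic CpG-island screens: the GC fraction and the observed/expected CpG ratio. It is not the CpG-Boltzmann model proposed here. The function name `cpg_stats` is hypothetical, and the statistic is a commonly used convention rather than something specified in this abstract.

```python
def cpg_stats(window: str) -> tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for one window.

    Assumption: expected CpG count is (#C * #G) / window length, the
    convention commonly used in deterministic CpG-island screens.
    """
    seq = window.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0

    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so this is exact

    gc_fraction = (c_count + g_count) / n
    expected_cpg = c_count * g_count / n
    ratio = cpg_count / expected_cpg if expected_cpg > 0 else 0.0
    return gc_fraction, ratio


if __name__ == "__main__":
    # A CpG-rich toy window versus an AT-only one.
    print(cpg_stats("CGCGCGCG"))
    print(cpg_stats("ATATATAT"))
```

A deterministic screen would typically slide such a window along the sequence and threshold both values, which is the kind of exhaustive scan the stochastic approach aims to avoid.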

  • Publication date: 2008