摘要

Haplotypes include essential SNP information used for a variety of purposes such as investigating potential links between certain diseases and genetic variations. Given a set of genotypes, the haplotype inference problem based on pure parsimony is the problem of finding a minimum set of haplotypes that explains all the given genotypes. The problem is especially important because, while it is fairly inexpensive to obtain genotypes, other approaches to obtaining haplotypes are significantly expensive. There are two types of methods proposed for the problem, namely exact and inexact methods. Existing exact methods guarantee obtaining purely parsimonious solutions but have exponential time-complexities and are not practical for large number or length of genotypes. However, inexact methods are relatively fast but do not always obtain optimum solutions. In this paper, an improved heuristic is proposed, based on which new inexact and exact methods are provided. Experimental results indicate that the proposed methods replace the state-of-the-art inexact and exact methods for the problem.

  • 出版日期2012-10-10