摘要

Motivation: Copy Number Variants (CNVs) are important genetic factors for studying human diseases. While high-throughput whole genome re-sequencing provides multiple lines of evidence for detecting CNVs, computational algorithms need to be tailored for different type or size of CNVs under different experimental designs.
Results: To achieve optimal power and resolution of detecting CNVs at low depth of coverage, we implemented a Hidden Markov Model that integrates both depth of coverage and mate-pair relationship. The novelty of our algorithm is that we infer the likelihood of carrying a deletion jointly from multiple mate pairs in a region without the requirement of a single mate pairs being obvious outliers. By integrating all useful information in a comprehensive model, our method is able to detect medium-size deletions (200-2000bp) at low depth (<10x per sample). We applied the method to simulated data and demonstrate the power of detecting medium-size deletions is close to theoretical values.
Availability: A program implemented in Java, Zinfandel, is available at http://www.cs.columbia.edu/similar to itsik/zinfandel/

  • 出版日期2011-7-28