Abstract

In the biology of tissue development and diseases, DNA methylation plays an important role. For a deeper understanding, it is crucial to accurately compare DNA methylation patterns between groups of samples representing different conditions. A widely used method to investigate DNA methylation in the CpG context is bisulfite sequencing, which produces data on the single-nucleotide scale. While there are benefits to analyzing CpG sites on a base-pair level, there are both biological and statistical reasons to test entire genomic regions for differential methylation. However, the analysis of DNA methylation is hampered by the lack of best-practice standards. Here, we compared multiple approaches for testing predefined genomic regions for differential DNA methylation in bisulfite sequencing data. Nine methods were evaluated: BiSeq, COHCAP, Goeman's Global Test, Limma, methylKit/eDMR, RADMeth and three log-linear regression approaches with different distribution assumptions. We applied these methods to simulated data and determined their sensitivity and specificity. This revealed performance differences, which were also seen when the methods were applied to real data. Methods that first test single CpG sites and then test regions based on transformed CpG-wise P-values performed better than methods that summarize methylation levels or raw reads. Interestingly, smoothing of methylation levels had a negligible impact. In particular, Global Test, BiSeq and RADMeth/z-test outperformed the other methods we evaluated, providing valuable guidance for more accurate analysis of DNA methylation.

  • Publication date: 2016-9