Abstract

Gene expression microarrays are powerful tools for the global comparison and estimation of gene expression. Many microarray studies have reported biologically plausible results from only a few arrays, which has led to the misperception that a handful of hybridized arrays will always yield something meaningful. From a statistical point of view, it is important to estimate the required sample size prospectively, before undertaking a microarray experiment. However, every sample size calculation must directly or indirectly estimate the unknown distribution of the effect sizes of gene expression intensities. A parametric mixture model has been developed that relates the sample size directly to the false discovery rate (FDR), the most popular multiple-comparison control criterion. In this paper, we extend the parametric mixture model and propose a robust semiparametric Dirichlet process mixture model, in which the parametric distribution of gene expression is no longer specified. The analysis is performed in a Bayesian inference framework using Markov chain Monte Carlo steps. The usefulness of the method is illustrated by simulations and a real murine lung study.

  • Publication date: 2010
