Abstract

Extreme values in predictors often strongly affect the results of statistical analyses in high-dimensional settings. Although such values occur frequently with most high-throughput techniques, the problem is often ignored in the literature. We suggest using a very simple transformation, previously proposed in a different context by Royston and Sauerbrei, as an intermediate step between array preprocessing and high-level statistical analysis. This straightforward univariate transformation identifies extreme values in continuous features and can thus serve as a diagnostic tool for outliers. The use of the transformation and its effects are demonstrated for diverse univariate and multivariate statistical analyses using nine publicly available microarray data sets.