Abstract

While the greatest strength of systems biology may be its ability to measure tens of thousands of variables across different genotypes, this breadth simultaneously poses an enormous challenge for statistical analysis, one that conventional approaches that identify and rank differences cannot fully solve. Here we examine a diverse panel of conventional and transgenic, field-grown tomato fruits (Solanum lycopersicum L.) by liquid chromatography-mass spectrometry (LC-MS) metabolic fingerprinting. We used a progression of statistical methods to examine the observed phenotypic variation. While principal component analysis (PCA) revealed clear trends related to genetic background and ripeness, it could not detect differences between the transgenic genotypes and their nontransgenic parent variety. Partial least squares discriminant analysis (PLS-DA), a supervised method, identified 15 metabolic features of potential interest, but only five were significantly different between the transgenic lines and their nontransgenic parent. Weighted correlation network analysis (WGCNA) recognized relationships among these features and others, suggesting that a small suite of highly correlated compounds accumulated to significantly lower levels in the transgenic genotypes. We assert that metabolic fingerprinting combined with a series of statistical methods is an efficient and powerful approach for examining both large and small genetic effects on phenotypes of high value or interest.