Analysis of metabolomic PCA data using tree diagrams

作者:Werth, Mark T.; Halouska, Steven; Shortridge, Matthew D.; Zhang, Bo; Powers, Robert*
来源:Analytical Biochemistry, 2010, 399(1): 58-63.
DOI:10.1016/j.ab.2009.12.022

摘要

Large amounts of data from high-throughput metabolomic experiments are commonly visualized using a principal component analysis (PCA) two-dimensional scores plot. The question of the similarity or difference between multiple metabolic states then becomes a question of the degree of overlap between their respective data point clusters in principal component (PC) scores space. A qualitative visual inspection of the clustering pattern in PCA scores plots is a common protocol. This article describes the application of tree diagrams and bootstrapping techniques for an improved quantitative analysis of metabolic PCA data clustering. Our PCAtoTree program creates a distance matrix with 100 bootstrap steps that describes the separation of all clusters in a metabolic data set. Using accepted phylogenetic software, the distance matrix resulting from the various metabolic states is organized into a phylogenetic-like tree format, where bootstrap values >= 50 indicate a statistically relevant branch separation. PCAtoTree analysis of two previously published data sets demonstrates the improved resolution of metabolic state differences using tree diagrams. In addition, for metabolomic studies of large numbers of different metabolic states, the tree format provides a better description of similarities and differences between each metabolic state. The approach is also tolerant of sample size variations between different metabolic states.

  • 出版日期2010-4-1