Abstract

Principal component analysis (PCA), also known as empirical orthogonal function (EOF) analysis, is widely used to compress high-dimensional datasets in applications such as climate diagnostics and seasonal forecasting. A critical question when using this method is how many modes, representing meaningful signal, to retain. The resampling-based "Rule N" method attempts to address the question of PCA truncation in a statistically principled manner. However, it is valid only for the leading (largest) eigenvalue, because it fails to condition the hypothesis tests for subsequent (smaller) eigenvalues on the results of the preceding tests. This paper draws on several relatively recent statistical results to construct a hypothesis-test-based truncation rule that accounts, at each stage, for the magnitudes of the larger eigenvalues. The performance of the method is demonstrated in an artificial-data setting and illustrated with a real-data example.
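For orientation, the following is a minimal sketch of the conventional Rule N procedure that the abstract critiques. It is not the truncation rule proposed in the paper. It assumes a common textbook formulation: eigenvalues of the sample correlation matrix are compared, rank by rank, with a chosen percentile of eigenvalues obtained from Gaussian white-noise datasets of the same size. The function names, the use of the correlation matrix, and the default trial count and percentile are all illustrative assumptions.

```python
import numpy as np


def rule_n_thresholds(n_samples, n_vars, n_trials=200, percentile=95.0, seed=0):
    """Per-rank eigenvalue thresholds from white-noise resampling (assumed setup)."""
    rng = np.random.default_rng(seed)
    eigs = np.empty((n_trials, n_vars))
    for t in range(n_trials):
        # Uncorrelated Gaussian noise with the same shape as the data.
        noise = rng.standard_normal((n_samples, n_vars))
        corr = np.corrcoef(noise, rowvar=False)
        # eigvalsh returns ascending eigenvalues; reverse to descending rank order.
        eigs[t] = np.linalg.eigvalsh(corr)[::-1]
    # Percentile taken separately for each rank (column).
    return np.percentile(eigs, percentile, axis=0)


def rule_n_truncation(data, n_trials=200, percentile=95.0, seed=0):
    """Count leading modes whose eigenvalue exceeds the noise threshold of its rank."""
    n_samples, n_vars = data.shape
    corr = np.corrcoef(data, rowvar=False)
    observed = np.linalg.eigvalsh(corr)[::-1]
    thresholds = rule_n_thresholds(n_samples, n_vars, n_trials, percentile, seed)
    exceeds = observed > thresholds
    # Stop at the first rank that fails the comparison.
    n_retain = n_vars if exceeds.all() else int(np.argmin(exceeds))
    return n_retain, observed, thresholds


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Hypothetical test data: one shared signal plus independent noise.
    signal = rng.standard_normal((200, 1))
    data = 3.0 * signal + rng.standard_normal((200, 8))
    n_retain, observed, thresholds = rule_n_truncation(data)
    print("modes retained:", n_retain)
```

Note that the thresholds for ranks beyond the first come from pure-noise eigenvalues and take no account of how large the preceding observed eigenvalues are, which is the limitation the abstract describes.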

  • Publication date: 2016-4