Abstract

Let $X$ be a $d$-dimensional random vector and $X_\theta$ its projection onto the span of a set of orthonormal vectors $\{\theta_1, \ldots, \theta_k\}$. Conditions on the distribution of $X$ are given such that if $\theta$ is chosen according to Haar measure on the Stiefel manifold, the bounded-Lipschitz distance from $X_\theta$ to a Gaussian distribution is concentrated at its expectation; furthermore, an explicit bound is given for the expected distance, in terms of $d$, $k$, and the distribution of $X$, allowing consideration not just of fixed $k$ but of $k$ growing with $d$. The results are applied in the setting of projection pursuit, showing that most $k$-dimensional projections of $n$ data points in $\mathbb{R}^d$ are close to Gaussian, when $n$ and $d$ are large and $k = c\log(d)$ for a small constant $c$.
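The setup in the abstract can be illustrated with a small sketch. It is not taken from the paper: the function names `haar_stiefel_frame` and `project` are hypothetical, and the sampling method (Gram-Schmidt applied to independent standard Gaussian vectors, a standard way to draw a Haar-distributed orthonormal frame) is an assumption about how one might simulate the random projection. The sketch only draws a frame and projects data; it does not compute the bounded-Lipschitz distance.

```python
import math
import random


def haar_stiefel_frame(d, k, rng):
    """Draw k orthonormal vectors in R^d, Haar-distributed on the Stiefel manifold.

    Assumed method: Gram-Schmidt on independent standard Gaussian vectors.
    """
    frame = []
    while len(frame) < k:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Remove the components along the vectors already in the frame.
        for u in frame:
            dot = sum(a * b for a, b in zip(v, u))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-12:  # redraw in the (measure-zero) degenerate case
            frame.append([a / norm for a in v])
    return frame


def project(x, frame):
    """Coordinates of x in the frame: (<x, theta_1>, ..., <x, theta_k>)."""
    return [sum(a * b for a, b in zip(x, theta)) for theta in frame]


if __name__ == "__main__":
    rng = random.Random(0)
    n, d, k = 500, 200, 2  # illustrative sizes, not from the paper
    # Clearly non-Gaussian data: independent +/-1 coordinates.
    data = [[rng.choice((-1.0, 1.0)) for _ in range(d)] for _ in range(n)]
    frame = haar_stiefel_frame(d, k, rng)
    first = [project(x, frame)[0] for x in data]
    mean = sum(first) / n
    var = sum((t - mean) ** 2 for t in first) / n
    print("sample mean and variance of the first projected coordinate:", mean, var)
```

A histogram of the projected coordinates is one informal way to look at how close a given projection is to Gaussian; the paper's results concern the bounded-Lipschitz distance, which this sketch does not estimate.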

  • Publication date: 2012-6