Abstract

Distance covariance and distance correlation have been widely adopted for measuring the dependence between a pair of random variables or random vectors. If the computation of distance covariance and distance correlation is implemented directly according to the definition, its computational complexity is O(n^2), which is a disadvantage compared to other, faster methods. In this article we show that the distance covariance and distance correlation of real-valued random variables can be computed by an O(n log n) algorithm, which is comparable to other computationally efficient algorithms. The new formula we derive for an unbiased estimator of the squared distance covariance turns out to be a U-statistic. This fact implies some nice asymptotic properties that were previously derived via more complex methods. We apply the fast computing algorithm to some synthetic data. Our work will make distance correlation applicable to a much wider class of problems. A supplementary file to this article, available online, includes Matlab and C-based software that implements the proposed algorithm.
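To make the O(n^2) baseline concrete, the following is a minimal sketch of the direct, definition-based computation for real-valued samples. It is not the O(n log n) algorithm proposed in the article, and it is not the supplementary Matlab/C software. It uses the classical biased (V-statistic) sample estimator based on double-centered pairwise distance matrices, rather than the unbiased U-statistic estimator the abstract refers to. The function names `dcov_sq` and `dcor` and the helper `_double_centered` are illustrative assumptions.

```python
import math


def _double_centered(v):
    """Pairwise distances |v_i - v_j|, double-centered. O(n^2) time and memory."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row = [sum(r) / n for r in d]  # row means (equal to column means by symmetry)
    grand = sum(row) / n           # grand mean of the distance matrix
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)]
            for i in range(n)]


def dcov_sq(x, y):
    """Biased (V-statistic) sample squared distance covariance of two real samples."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n == 0:
        raise ValueError("samples must be non-empty")
    a = _double_centered(x)
    b = _double_centered(y)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2


def dcor(x, y):
    """Sample distance correlation; returns 0.0 if either sample is constant."""
    denom = math.sqrt(dcov_sq(x, x) * dcov_sq(y, y))
    if denom == 0.0:
        return 0.0
    # max(..., 0.0) guards against tiny negative values from rounding.
    return math.sqrt(max(dcov_sq(x, y), 0.0) / denom)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [2.0 * v + 1.0 for v in x]  # exact linear function of x
    print(dcor(x, y))
```

Both nested loops touch every pair of observations, which is where the quadratic cost comes from. The distance matrices also require O(n^2) memory in this sketch, so it is only practical for moderate sample sizes.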

  • Publication date: 2016-11
  • Affiliation: National Natural Science Foundation of China