摘要

We consider the problem of verifying the identity of a distribution: Given the description of a distribution over a discrete support p = (p(1), p(2), ... , pn), how many samples (independent draws) must one obtain from an unknown distribution, q, to distinguish, with high probability, the case that p = q from the case that the total variation distance (L-1 distance) parallel to p - q parallel to(1)>= epsilon? We resolve this question, up to constant factors, on an instance by instance basis: there exist universal constants c, c' and a function f(p, epsilon) on distributions and error parameters, such that our tester distinguishes p = q from parallel to p-q parallel to 1? ? using f(p, ?) samples with success probability > 2/3, but no tester can distinguish p = q from parallel to p - q parallel to(1)>= c center dot epsilon when given c' center dot f(p, c) samples. The function f(p, c) is upperbounded by a multiple of parallel to p parallel to 2/3/epsilon(2), but is more complicated, and is significantly smaller in some cases when p has many small domain elements, or a single large one. This result significantly generalizes and tightens previous results: since distributions of support at most n have L2/3 norm bounded by root n, this result immediately shows that for such distributions, O(root n/epsilon(2)) samples suffice, tightening the previous bound of O(root npolylog/c(4)) for this class of distributions, and matching the (tight) known results for the case that p is the uniform distribution over support n. The analysis of our very simple testing algorithm involves several hairy inequalities. To facilitate this analysis, we give a complete characterization of a general class of inequalities- generalizing Cauchy-Schwarz, Holder's inequality, and the monotonicity of Lp norms. Specifically, we characterize the set of sequences (a, b, c)(i) = a(1), b(1), c(1)), . . . , (a(r), b(r), c(r)) for which it holds that for all finite sequences of positive numbers (x)(j) = x(1), ... and (y)(j) = y(1),...,Pi(tau)(i=1) (Sigma(j)x(j)(ai)y(i)(bi))(ci)>= 1. For example, the standard Cauchy-Schwarz inequality corresponds to the sequences (a, b, c)(i) = (1, 0, 1/2), (0,1, 1/2), (1/2 , 1/2 , -1). Our characterization is of a non-traditional nature in that it uses linear programming to compute a derivation that may otherwise have to be sought throu.gh trial and error, by hand. We do not believe such a characterization has appeared in the literature, and hope its computational nature will be useful to others, and facilitate analyses like the one here.

  • 出版日期2017