A Lot of Data

Author: Johnson Kent*
Source: Philosophy of Science, 2011, 78(5): 788-799.
DOI: 10.1086/662256

Abstract

This article encourages the use of explicit methods in linguistics by attempting to estimate the size of a linguistic data set. Such estimations are difficult because redundant data can easily pad the data set. To address this, I offer some explicit operationalizations of the data and their features. For linguistic data, negative associations do not indicate true redundancy, and yet for many measures they can be mathematically impossible to ignore. It is proven that this troublesome phenomenon has positive Lebesgue measure and is monotonically increasing and that these two features hold robustly in four different ways.

  • Publication date: 2011-12