Abstract

Data shuffling is a recently proposed technique for masking numerical data in which the confidential values are shuffled between records while all monotonic relationships between the variables in the data set are maintained. Data shuffling is based on the multivariate normal copula, which assumes that there is no tail dependence in the data set. In many practical situations, however, tail dependence plays a crucial role in decision making. Hence, it is desirable that the data masking procedure be capable of preserving tail dependence when it is present. In this study, we provide a new data shuffling approach based on t copulas that is capable of maintaining tail dependence in the masked data in a large number of applications.
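
The contrast between the two copula families can be made concrete with the standard bivariate tail dependence coefficient. The block below is only a sketch of that textbook result, not a formula taken from the study: the symbols $\lambda$, $\rho$, $\nu$, $U_1$, $U_2$ and $t_{\nu+1}$ are notation assumed here, with $\rho$ the copula correlation, $\nu$ the degrees of freedom, and $t_{\nu+1}$ the distribution function of a univariate Student t with $\nu+1$ degrees of freedom.

```latex
% Lower tail dependence coefficient of a pair (U_1, U_2) with uniform margins
\lambda = \lim_{u \to 0^{+}} \Pr\left( U_2 \le u \mid U_1 \le u \right)

% Normal copula with correlation \rho < 1: no tail dependence
\lambda_{\text{normal}} = 0

% t copula with \nu degrees of freedom and correlation \rho > -1
\lambda_{t} = 2\, t_{\nu+1}\!\left( -\sqrt{\frac{(\nu+1)(1-\rho)}{1+\rho}} \right) > 0
```

By symmetry, the same value applies to the upper tail of the t copula. As $\nu$ grows, $\lambda_{t}$ shrinks toward zero and the t copula approaches the normal copula, which is consistent with the normal-copula shuffle being unable to reproduce joint extremes.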

  • Publication date: 2011-3