A hybrid approach to prevent composition attacks for independent data releases

Authors: Li, Jiuyong*; Baig, Muzammil M.; Sattar, A. H. M. Sarowar; Ding, Xiaofeng; Liu, Jixue; Vincent, Millist W.
Source: Information Sciences, 2016, 367: 324-336.
DOI: 10.1016/j.ins.2016.05.009

Abstract

Data anonymization is one of the main techniques used in privacy preserving data publishing, and many methods have been proposed to anonymize both individual data sets and multiple data sets. In real life, a data set is rarely isolated and two data sets published by different organizations may contain records pertaining to the same individual. For example, some patients might have visited two hospitals for the same disease, and their records are independently anonymized and published by the two hospitals. Although each published data set alone might pose a small privacy risk, the combination of two data sets may severely compromise the privacy of the individuals common to both data sets. An attack on individual privacy which uses independent data sets is called a composition attack. The topic of how to anonymize data sets to prevent a composition attack using independent data releases has not been widely investigated. In this paper, we propose a new principle to protect data sets from composition attacks. We propose a hybrid algorithm, which combines sampling, perturbation and generalization to protect data privacy from composition attacks. We experimentally demonstrate that the proposed anonymization technique significantly reduces the risk of composition attacks and also preserves good data utility.
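
To make the notion of a composition attack concrete, the following is a minimal sketch of the intersection reasoning an adversary could apply to two independently anonymized releases. It is not the paper's hybrid algorithm. The record layout (`age_range`, `zip_prefix`, `diseases`), the function names, and all data values are hypothetical and chosen only for illustration. The sketch assumes the adversary knows the victim's age and zip code and knows that the victim appears in both releases.

```python
def matching_sensitive_values(release, age, zipcode):
    """Collect the sensitive values of every generalized group that covers the victim."""
    values = set()
    for group in release:
        low, high = group["age_range"]
        if low <= age <= high and zipcode.startswith(group["zip_prefix"]):
            values.update(group["diseases"])
    return values


def composition_attack(release_a, release_b, age, zipcode):
    """Intersect the candidate sensitive values obtained from two independent releases."""
    from_a = matching_sensitive_values(release_a, age, zipcode)
    from_b = matching_sensitive_values(release_b, age, zipcode)
    return from_a & from_b


# Hypothetical generalized releases from two hospitals (illustrative data only).
hospital_a = [
    {"age_range": (30, 39), "zip_prefix": "123", "diseases": {"flu", "diabetes", "HIV"}},
    {"age_range": (40, 49), "zip_prefix": "123", "diseases": {"asthma", "flu", "cancer"}},
]
hospital_b = [
    {"age_range": (30, 35), "zip_prefix": "12", "diseases": {"HIV", "asthma", "cancer"}},
    {"age_range": (36, 45), "zip_prefix": "12", "diseases": {"flu", "diabetes", "ulcer"}},
]

# The adversary knows the victim is 34, lives in zip 12345, and visited both hospitals.
candidates = composition_attack(hospital_a, hospital_b, age=34, zipcode="12345")
print(candidates)
```

In this toy data each release on its own leaves three candidate diseases for the victim, yet the intersection leaves only one. This is the kind of risk that the abstract says arises when two releases are combined.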