摘要

Nowadays, more and more applications are dependent on storage and management of semi-structured information. For scientific research and knowledge-based decision-making, such data often needs to be published, e.g., medical data is released to implement a computer-assisted clinical decision support system. Since this data contains individuals' privacy, they must be appropriately anonymized before to be released. However, the existing anonymization method based on l-diversity for hierarchical data may cause serious similarity attacks, and cannot protect data privacy very well. In this paper, we utilize fuzzy sets to divide levels for sensitive numerical and categorical attribute values uniformly (a categorical attribute value can be converted into a numerical attribute value according to its frequency of occurrences), and then transform the value levels to sensitivity levels. The privacy model (alpha(h)(lev), k)-anonymity for hierarchical data with multi-level sensitivity is proposed. Furthermore, we design a privacy-preserving approach to achieve this privacy model. Experiment results demonstrate that our approach is obviously superior to existing anonymous approach in hierarchical data in terms of utility and security.