Abstract

In this paper, we propose a solution to the microaggregation problem based on the hierarchical tree equi-partition (HTEP) algorithm. Microaggregation is a family of methods for statistical disclosure control of microdata, that is, for masking microdata so that they can be released without disclosing private information on the underlying individuals. The microaggregation problem, which is known to be non-deterministic polynomial-time-hard (NP-hard), is to partition N given data items into groups of at least K items so that the sum of the within-partition squared errors is minimized. The proposed method is general and can be applied to any tree partition problem that aims to minimize a total score. The method is divisive: at each step, the tree with the highest 'score' is split into two trees, resulting in a hierarchical forest of trees with almost equal 'scores' (equipartition). We propose a version of HTEP for microaggregation (HTEPM) that is applied to the minimum spanning tree (MST) of the graph defined by the data. The merit of the HTEPM algorithm is that it optimally solves some instances of the multivariate microaggregation problem on the MST search space in [GRAPHICS]. Experimental results and comparisons with existing methods from the literature demonstrate the high performance and robustness of HTEPM.
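To make the objective concrete, the following is a minimal sketch of the quantity that microaggregation seeks to minimize: the sum of within-group squared errors of a partition, subject to every group holding at least K items. It is not the HTEPM algorithm itself; the function names `group_sse` and `microaggregation_sse` are hypothetical and chosen only for this illustration, and the data are assumed to be numeric tuples of equal dimension.

```python
from typing import List, Sequence

Point = Sequence[float]


def group_sse(points: List[Point]) -> float:
    """Sum of squared Euclidean distances from each point to the group centroid."""
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / n for d in range(dim)]
    return sum((p[d] - centroid[d]) ** 2 for p in points for d in range(dim))


def microaggregation_sse(groups: List[List[Point]], k: int) -> float:
    """Total within-group SSE of a partition (illustrative helper, not from the paper).

    Raises ValueError if any group violates the minimum size k.
    """
    for g in groups:
        if len(g) < k:
            raise ValueError(f"group of size {len(g)} is smaller than k={k}")
    return sum(group_sse(g) for g in groups)


if __name__ == "__main__":
    # Toy partition of five 2-D records into two groups, with K = 2.
    groups = [
        [(0.0, 0.0), (2.0, 0.0)],
        [(10.0, 10.0), (10.0, 12.0), (10.0, 14.0)],
    ]
    print(microaggregation_sse(groups, k=2))
```

Under these assumptions, a candidate partition with a lower value of `microaggregation_sse` loses less information when each record is replaced by its group centroid.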

  • Publication date: 2015-4-3