Differentially private high-dimensional data publication via grouping and truncating techniques

Wang, Ning; Gu, Yu; Xu, Jia; Li, Fangfang; Yu, Ge

doi:10.1007/s11704-017-6591-x

Abstract

The count of a column in a high-dimensional dataset, i.e., the number of records containing that column, is widely used in applications such as analyzing popular spots from check-in location information and mining valuable items from shopping records. However, publishing these counts directly poses a privacy threat. Differential privacy (DP), a notable paradigm that offers strong privacy guarantees, is therefore adopted to publish all column counts. Prior studies have verified that truncating records or grouping columns can effectively improve the accuracy of the published results. To leverage the advantages of both techniques, we combine them to further boost accuracy. However, the traditional penalty function, which measures the error introduced by a given pair of parameters (truncating length and group size), is so sensitive that the derived parameters deviate significantly from the optimal ones. To obtain better parameters, we first design a smart penalty function that is less sensitive than the traditional one. We then propose a two-phase selection method that computes these parameters efficiently while also improving accuracy. Extensive experiments on a broad spectrum of real-world datasets validate the effectiveness of our proposals.
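To make the two techniques named in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes records are lists of column indices, that truncation keeps a random subset of at most `max_len` columns per record, that groups are fixed blocks of consecutive column indices (chosen without looking at the data), and that neighboring datasets differ by adding or removing one record. All function and parameter names are hypothetical, and the penalty function and two-phase parameter selection described in the abstract are not modeled here.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def truncate(records, max_len, rng):
    # Keep at most `max_len` distinct columns per record, so that one
    # record can change at most `max_len` column counts.
    truncated = []
    for rec in records:
        cols = sorted(set(rec))
        if len(cols) > max_len:
            cols = rng.sample(cols, max_len)
        truncated.append(cols)
    return truncated


def column_counts(records, num_columns):
    # Exact (non-private) count of records containing each column.
    counts = [0] * num_columns
    for rec in records:
        for col in rec:
            counts[col] += 1
    return counts


def publish_counts(records, num_columns, max_len, group_size, epsilon, seed=None):
    rng = random.Random(seed)
    counts = column_counts(truncate(records, max_len, rng), num_columns)
    # Assumption: after truncation one record contributes at most
    # `max_len` to the vector of group sums, so Laplace noise with
    # scale max_len / epsilon is added to each group sum.
    scale = max_len / epsilon
    published = []
    for start in range(0, num_columns, group_size):
        group = counts[start:start + group_size]
        noisy_sum = sum(group) + laplace_noise(scale, rng)
        # Every column in the group is assigned the noisy group average.
        published.extend([noisy_sum / len(group)] * len(group))
    return published
```

In this sketch a smaller `max_len` lowers the noise scale but discards part of long records, and a larger `group_size` spreads each noise draw over more columns but replaces individual counts with a group average. That trade-off is what a penalty function over the pair (truncating length, group size) would have to measure.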

Publication date: 2019-4
Affiliations: Northeastern University; Guangxi University
