A Globally Optimal k-Anonymity Method for the De-Identification of Health Data

El Emam Khaled*; Dankar Fida Kamal; Issa Romeo; Jonker Elizabeth; Amyot Daniel; Cogo Elise; Corriveau Jean Pierre; Walker Mark; Chowdhury Sadrul; Vaillancourt Regis; Roffey Tyson; Bottomley Jim

doi:10.1197/jamia.M3144

Abstract

Background: Explicit patient consent requirements in privacy laws can have a negative impact on health research, leading to selection bias and reduced recruitment. Legislative requirements to obtain consent are often waived if the information collected or disclosed is de-identified. Objective: The authors developed and empirically evaluated a new globally optimal de-identification algorithm that satisfies the k-anonymity criterion and is suitable for health datasets. Design: The authors compared OLA (Optimal Lattice Anonymization) empirically to three existing k-anonymity algorithms (Datafly, Samarati, and Incognito) on six public, hospital, and registry datasets for different values of k and different suppression limits. Measurement: Three information loss metrics were used for the comparison: precision, the discernability metric, and non-uniform entropy. The speed of each algorithm was also evaluated. Results: The Datafly and Samarati algorithms had higher information loss than OLA and Incognito; OLA was consistently faster than Incognito in finding the globally optimal de-identification solution. Conclusions: For the de-identification of health datasets, OLA is an improvement on existing k-anonymity algorithms in terms of information loss and performance.
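As a rough illustration of the k-anonymity criterion and the suppression limit mentioned in the abstract, the following Python sketch checks whether a small table is k-anonymous and computes the discernability metric in its commonly cited form (squared size for each retained equivalence class, class size times dataset size for each suppressed one). The function names, the field names, and the sample records are hypothetical and are not taken from the paper; the exact metric definitions used by the authors may differ.

```python
from collections import Counter


def equivalence_class_sizes(records, quasi_identifiers):
    """Count records sharing each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def satisfies_k_anonymity(records, quasi_identifiers, k, max_suppression=0.0):
    """Return True if every equivalence class has at least k records,
    allowing up to `max_suppression` (a fraction of all records) to be
    suppressed. Assumption: suppression removes whole records."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    too_small = sum(n for n in sizes.values() if n < k)
    return too_small <= max_suppression * len(records)


def discernability_metric(records, quasi_identifiers, k):
    """Commonly cited form of the discernability metric (an assumption here):
    n^2 for each class of size n >= k, n * |D| for each suppressed class."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    total = len(records)
    return sum(n * n if n >= k else n * total for n in sizes.values())


# Hypothetical, already-generalized records (age in 10-year ranges).
records = [
    {"age": "30-39", "sex": "F"},
    {"age": "30-39", "sex": "F"},
    {"age": "40-49", "sex": "M"},
    {"age": "40-49", "sex": "M"},
    {"age": "50-59", "sex": "F"},
]
quasi_identifiers = ["age", "sex"]

strict = satisfies_k_anonymity(records, quasi_identifiers, k=2)
relaxed = satisfies_k_anonymity(records, quasi_identifiers, k=2, max_suppression=0.25)
dm = discernability_metric(records, quasi_identifiers, k=2)
print(strict, relaxed, dm)
```

With no suppression allowed, the single record in the "50-59, F" class violates 2-anonymity; allowing a quarter of the records to be suppressed makes the table acceptable. A globally optimal method in the sense of the abstract would search over all candidate generalizations for the one that passes this kind of check with the lowest information loss.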

Publication date: 2009-10
