Abstract

Wireless trace data play an important role in wireless network research. However, publishing raw WLAN traces poses privacy risks to network users. It is therefore necessary to sanitize users' sensitive information before the traces are published, while preserving high data utility for wireless network research. Although existing work based on various anonymization methods has begun to address the problem of sanitizing WLAN traces, our analysis of a real WLAN trace dataset shows that these anonymization techniques cannot provide a strong and provable privacy guarantee. Differential privacy is the only framework that can provide such a guarantee. However, our analysis also shows that existing differential privacy studies fail to provide effective data utility for query operations on multi-dimensional, large-scale datasets. Targeting WLAN trace datasets, which are characteristically multi-dimensional and large-scale, this paper proposes a privacy-preserving data publishing algorithm that both satisfies differential privacy and achieves high data utility for query operations. We prove that the proposed sanitization algorithm satisfies ε-differential privacy. Furthermore, our theoretical analysis shows that the noise variance of the sanitization algorithm is O(log^{o(1)} n / ε^2), which indicates that the algorithm can achieve high data utility on large-scale datasets. Finally, extensive experiments on an enterprise-scale WLAN trace dataset show that the sanitization algorithm provides high data utility for query operations.
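As background for the noise-variance claim, the following is a minimal sketch of the standard Laplace mechanism for a single counting query. It is not the sanitization algorithm proposed in this paper. The function names (`laplace_noise`, `noisy_count`, `noise_variance`) and the default sensitivity of 1 are assumptions made for illustration only. Under this baseline, one noisy count carries variance 2(Δ/ε)^2, and a range query answered by summing m independently perturbed counts accumulates variance 2m(Δ/ε)^2. That linear growth is the reason a bound that grows only slowly with n matters on large datasets.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with location 0 and that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Answer a counting query under epsilon-differential privacy.

    Assumption: adding or removing one record changes the count by at
    most `sensitivity` (1 for a plain count).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def noise_variance(epsilon, sensitivity=1.0):
    """Variance of the Laplace noise: 2 * (sensitivity / epsilon) ** 2."""
    return 2.0 * (sensitivity / epsilon) ** 2


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    print(noisy_count(100, epsilon=0.5, rng=rng))
    print(noise_variance(0.5))
```

A smaller ε gives stronger privacy but a larger variance, which is the privacy/utility trade-off that the abstract refers to.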

Full Text