Abstract

With the proliferation of smart grids, traditional utilities are struggling to handle the growing volume of metering data. Outsourcing metering data to heterogeneous distributed systems has the potential to provide efficient data access and processing. When such a system is untrusted, encrypting the data before outsourcing can be an effective way to preserve user privacy. However, efficiently querying encrypted multidimensional metering data stored in an untrusted heterogeneous distributed system remains a research challenge. In this paper, we propose a high-performance and privacy-preserving query (P2Q) scheme over encrypted multidimensional big metering data to address this challenge. In the proposed scheme, encrypted metering data are stored on a server of the untrusted heterogeneous distributed system, and a Locality Sensitive Hashing (LSH) based similarity search approach is used to realize the similarity query. To demonstrate the utility of the proposed LSH-based search approach, we implement a prototype using MapReduce in the Hadoop distributed environment. More specifically, for a given query, the proxy server returns the identifiers of the top-K most similar data objects. An enhanced Ciphertext-Policy Attribute-Based Encryption (CP-ABE) policy is then used to control access to the search results, so that only a requester with an authorized query attribute can obtain the correct secret keys to retrieve the metering data. We prove that the P2Q scheme achieves data confidentiality and preserves the data owner's privacy in a semi-trusted cloud. In addition, our evaluations demonstrate that the P2Q scheme can significantly reduce response time and provide high search efficiency without compromising search quality, which makes it suitable for multidimensional big data search in heterogeneous distributed systems such as cloud storage systems.
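To make the LSH-based top-K similarity query more concrete, the following is a minimal single-machine sketch. It is not the P2Q implementation: it works on plaintext vectors, uses random-hyperplane LSH for cosine similarity as one common LSH family (the abstract does not say which family the scheme uses), and does not model encryption, CP-ABE access control, or MapReduce. The names `LshIndex`, `make_hyperplanes`, `signature`, and the parameters `num_tables` and `num_bits` are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, rng):
    """Draw num_bits random Gaussian hyperplanes in dim dimensions."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vector, hyperplanes):
    """Hash a vector to a bit tuple: one sign bit per hyperplane."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class LshIndex:
    """Hypothetical plaintext LSH index (assumption: cosine similarity)."""

    def __init__(self, dim, num_tables=8, num_bits=6, seed=0):
        rng = random.Random(seed)
        # Each table pairs its own hyperplanes with a bucket map.
        self.tables = [
            (make_hyperplanes(num_bits, dim, rng), defaultdict(list))
            for _ in range(num_tables)
        ]
        self.vectors = {}

    def add(self, object_id, vector):
        """Store the vector and register its id in one bucket per table."""
        self.vectors[object_id] = vector
        for planes, buckets in self.tables:
            buckets[signature(vector, planes)].append(object_id)

    def query(self, vector, k):
        """Return up to k identifiers of the most similar candidates."""
        candidates = set()
        for planes, buckets in self.tables:
            candidates.update(buckets.get(signature(vector, planes), []))
        # Rank only the colliding candidates, not the whole data set.
        ranked = sorted(
            candidates,
            key=lambda oid: cosine(vector, self.vectors[oid]),
            reverse=True,
        )
        return ranked[:k]
```

A query only scores objects that share a bucket with it in at least one table, which is where the reduction in response time comes from; the trade-off is that a true neighbor can be missed if it never collides. Raising `num_tables` improves recall at the cost of more memory, while raising `num_bits` makes buckets smaller and more selective.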