A parallel computing framework for big data

Authors: Chen, Guoliang; Mao, Rui*; Lu, Kezhong
Source: Frontiers of Computer Science, 2017, 11(4): 608-621.
DOI: 10.1007/s11704-016-5003-y

Abstract

Big data has received great attention in both research and applications. However, most current efforts focus on systems and applications that address the challenges of "volume" and "velocity"; comparatively little work has addressed the theoretical foundations or the challenge of "variety". Building on metric-space indexing and computational complexity theory, we propose a parallel computing framework for big data. The framework consists of three components: a universal representation of big data that abstracts various data types into a metric space, partitioning of big data based on pairwise distances in the metric space, and parallel computation over big data guided by NC-class computing theory.
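
As a rough illustration of how the three components might fit together, the following Python sketch treats strings as points in a metric space under edit distance, assigns each object to its nearest pivot using only pairwise distances, and answers a range query on each partition concurrently. It is not the algorithm from the paper: the function names, the choice of edit distance, the fixed pivots, and the thread pool are all assumptions made for illustration, and each partition is scanned exhaustively with no triangle-inequality pruning and no NC-class analysis.

```python
from concurrent.futures import ThreadPoolExecutor


def edit_distance(a, b):
    """Levenshtein distance, one example of a metric on strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def partition_by_pivots(objects, pivots, dist):
    """Assign each object to its nearest pivot; only distances are used."""
    parts = [[] for _ in pivots]
    for obj in objects:
        nearest = min(range(len(pivots)), key=lambda k: dist(obj, pivots[k]))
        parts[nearest].append(obj)
    return parts


def range_query(part, query, radius, dist):
    """Exhaustive scan of one partition for objects within the radius."""
    return [obj for obj in part if dist(obj, query) <= radius]


def parallel_range_query(parts, query, radius, dist):
    """Run the same range query on every partition and merge the hits."""
    # Threads keep the sketch short; real speedup in CPython would need
    # processes or a distributed runtime (an assumption, not from the paper).
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda p: range_query(p, query, radius, dist), parts)
    return sorted(hit for hits in results for hit in hits)


if __name__ == "__main__":
    words = ["cat", "cart", "card", "dog", "dig", "dug"]
    parts = partition_by_pivots(words, ["cat", "dog"], edit_distance)
    print(parts)
    print(parallel_range_query(parts, "cot", 2, edit_distance))
```

Because the partitioning and query code touch the data only through `dist`, the same functions would apply to any other data type for which a metric is supplied, which is the point of abstracting data into a metric space.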