An Optimized Iterative Semantic Compression Algorithm And Parallel Processing for Large Scale Data

Authors: Jin, Ran*; Chen, Gang; Tung, Anthony K. H.; Shou, Lidan; Ooi, Beng Chin
Source: KSII Transactions on Internet and Information Systems, 2018, 12(6): 2761-2781.
DOI: 10.3837/tiis.2018.06.018

Abstract

As data volumes continue to grow and compression technology becomes more widely used, data reduction has considerable research value and practical significance. To address the shortcomings of existing semantic compression algorithms, this paper analyzes the ItCompress algorithm and designs a bidirectional order selection method based on interval partitioning, named the Optimized Iterative Semantic Compression Algorithm (Optimized ItCompress Algorithm). To further improve the speed of the algorithm, we propose a parallel optimized iterative semantic compression algorithm using GPU (POICAG) and an optimized iterative semantic compression algorithm using Spark (DOICAS). Extensive experiments on four kinds of datasets verify the efficiency of the proposed algorithms.