A General SIMD-Based Approach to Accelerating Compression Algorithms

Authors: Zhao, Wayne Xin*; Zhang, Xudong; Lemire, Daniel; Shan, Dongdong; Nie, Jian Yun; Yan, Hongfei; Wen, Ji Rong
Source: ACM Transactions on Information Systems, 2015, 33(3): 15.
DOI: 10.1145/2735629

Abstract

Compression algorithms are important for data-oriented tasks, especially in the era of "Big Data." Modern processors equipped with powerful SIMD instruction sets provide us with an opportunity for achieving better compression performance. Previous research has shown that SIMD-based optimizations can multiply decoding speeds. Following these pioneering studies, we propose a general approach to accelerate compression algorithms. By instantiating the approach, we have developed several novel integer compression algorithms, called Group-Simple, Group-Scheme, Group-AFOR, and Group-PFD, and implemented their corresponding vectorized versions. We evaluate the proposed algorithms on two public TREC datasets, a Wikipedia dataset, and a Twitter dataset. With competitive compression ratios and encoding speeds, our SIMD-based algorithms outperform state-of-the-art nonvectorized algorithms with respect to decoding speeds.