Distribution Sensitive Product Quantization

作者:Li, Linhao; Hu, Qinghua*; Han, Yahong; Li, Xin
来源:IEEE Transactions on Circuits and Systems for Video Technology, 2018, 28(12): 3504-3515.
DOI:10.1109/TCSVT.2017.2759277

摘要

Product quantization (PQ) seems to have become the most efficient framework of performing approximate nearest neighbor (ANN) search for high-dimensional data. However, almost all existing PQ-based ANN techniques uniformly allocate precious bit budget to each subspace. This is not optimal, because data are often not evenly distributed among different subspaces. A better strategy is to achieve an improved balance between data distribution and bit budget within each subspace. Motivated by this observation, we propose to develop an optimized PQ (OPQ) technique, named distribution sensitive PQ (DSPQ) in this paper. The DSPQ dynamically analyzes and compares the data distribution based on a newly defined aggregate degree for high-dimensional data; whenever further optimization is feasible, resources such as memory and bits can be dynamically rearranged from one subspace to another. Our experimental results have shown that the strategy of bit rearrangement based on aggregate degree achieves modest improvements on most datasets. Moreover, our approach is orthogonal to the existing optimization strategy for PQ; therefore, it has been found that distribution sensitive OPQ can even outperform previous OPQ in the literature.