Deep sentiment hashing for text retrieval in social CIoT

Authors: Zhou, Ke; Zeng, Jiangfeng*; Liu, Yu; Zou, Fuhao
Source: Future Generation Computer Systems-The International Journal of eScience, 2018, 86: 362-371.
DOI: 10.1016/j.future.2018.03.047

Abstract

Sentiment-based text retrieval is an urgent and valuable task due to the explosive growth of sentiment-expressed reviews on social networks such as Twitter, Facebook, and Instagram. Social networks within the domain of the Cognitive Internet of Things (CIoT) make it much easier to dynamically discover desirable services and valuable information. Information retrieval in social media is a daunting task that requires considerable technical insight. As a powerful tool for large-scale information retrieval, hashing techniques have also been extensively employed for text retrieval. However, most existing text hashing methods are impractical for sentiment-expressed text retrieval, mainly for three reasons: (1) the text representations are captured by shallow machine learning algorithms; (2) sentiment is rarely considered when measuring the similarity of two documents; and (3) unsupervised learning of hash functions is employed due to the lack of hash labels. To address these problems, we put forward a general deep sentiment hashing model composed of three steps. First, a hierarchical attention-based Long Short-Term Memory (LSTM) network is trained to obtain sentiment-specific document representations. Second, given the document embeddings, the k-Nearest Neighbor (kNN) algorithm is used to construct a Laplacian matrix, which is then projected into hash labels via Laplacian Eigenmaps (LapEig). Third, we build a deep model for learning hash functions, supervised by both the generated hash labels and the original sentiment labels. This joint supervision ensures that the final hash codes produced by the learned hash functions preserve sentiment-level similarity. Experimental results show that the proposed approach achieves strong retrieval performance.
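
The second step in the abstract (kNN graph, Laplacian matrix, Laplacian Eigenmaps, hash labels) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `knn_affinity` and `lapeig_hash_labels`, the binary kNN weights, the unnormalized Laplacian, and the per-bit median threshold are all assumptions chosen for brevity, and random vectors stand in for the LSTM document embeddings.

```python
import numpy as np


def knn_affinity(X, k):
    """Symmetric 0/1 kNN adjacency matrix from Euclidean distances.

    Binary weights are an assumption; a heat-kernel weighting is also common.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # a point is not its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]               # k nearest neighbors per row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)                         # symmetrize the graph


def lapeig_hash_labels(X, k=5, n_bits=8):
    """Turn embeddings X (n_docs x dim) into n_bits-bit hash labels.

    Assumes a connected kNN graph, so that only the first eigenvector
    (eigenvalue 0, constant) is trivial and can be skipped.
    """
    W = knn_affinity(X, k)
    L = np.diag(W.sum(axis=1)) - W        # unnormalized graph Laplacian L = D - W
    _, vecs = np.linalg.eigh(L)           # eigenvalues returned in ascending order
    Y = vecs[:, 1:n_bits + 1]             # smoothest non-trivial eigenvectors
    # Assumed binarization: threshold each bit at its median for balanced codes.
    return (Y > np.median(Y, axis=0)).astype(np.uint8)


# Hypothetical stand-in for sentiment-specific document embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(40, 16))
labels = lapeig_hash_labels(embeddings, k=5, n_bits=8)
print(labels.shape)
```

In the pipeline the abstract describes, labels of this kind would serve as training targets for the deep hash function in the third step, alongside the original sentiment labels.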