Abstract

Mining features and opinion words is essential for fine-grained opinion analysis of customer reviews. Semantic dependencies naturally exist between features and opinion words, and even among features or among opinion words themselves. In this article, we employ a corpus statistics association measure to quantify pairwise word dependencies and propose a generalized association-based unified framework to identify features, both explicit and implicit, and opinion words from reviews. We first extract explicit features and opinion words via an association-based bootstrapping method (ABOOT). ABOOT starts with a small list of annotated feature seeds and then iteratively recognizes a large number of domain-specific features and opinion words by discovering the corpus statistics association between each pair of words in a given review domain. Two instances of ABOOT are evaluated, based on two particular association models: likelihood ratio tests (LRT) and latent semantic analysis (LSA). Next, we introduce a natural extension that identifies implicit features by exploiting the semantic correlations already recognized between features and opinion words. Experimental results illustrate the benefits of the proposed association-based methods for identifying features and opinion words over benchmark methods.
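
To make the bootstrapping idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes sentence-level co-occurrence counts, uses Dunning's log-likelihood ratio statistic as the LRT association score, and assumes that candidate feature and opinion words are supplied in advance (for example, nouns and adjectives). The names `llr`, `pair_scores`, `bootstrap`, `feature_cands`, `opinion_cands`, and `threshold` are hypothetical and do not come from the article.

```python
import math
from collections import Counter
from itertools import combinations


def _xlogx(*counts):
    # Sum of k * ln(k) over the given counts, treating 0 * ln(0) as 0.
    return sum(k * math.log(k) for k in counts if k > 0)


def llr(k11, k12, k21, k22):
    """Dunning's log-likelihood ratio (G^2) for a 2x2 contingency table."""
    n = k11 + k12 + k21 + k22
    cells = _xlogx(k11, k12, k21, k22)
    rows = _xlogx(k11 + k12, k21 + k22)
    cols = _xlogx(k11 + k21, k12 + k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (cells - rows - cols + _xlogx(n)))


def pair_scores(sentences):
    """Score word pairs by sentence-level co-occurrence (an assumption)."""
    n = len(sentences)
    docs = [set(s) for s in sentences]
    df = Counter(w for d in docs for w in d)
    co = Counter(frozenset(p) for d in docs for p in combinations(sorted(d), 2))
    scores = {}
    for pair, k11 in co.items():
        a, b = tuple(pair)
        k12, k21 = df[a] - k11, df[b] - k11
        k22 = n - k11 - k12 - k21
        # Keep only pairs that co-occur more often than chance predicts.
        if k11 * n > df[a] * df[b]:
            scores[pair] = llr(k11, k12, k21, k22)
    return scores


def bootstrap(sentences, seeds, feature_cands, opinion_cands,
              threshold, max_rounds=10):
    """Grow feature and opinion sets from seed features (illustrative only)."""
    scores = pair_scores(sentences)
    features, opinions = set(seeds), set()
    for _ in range(max_rounds):
        known = features | opinions
        added = False
        for pair, score in scores.items():
            # Expand only through pairs linking one known and one new word.
            if score < threshold or len(pair & known) != 1:
                continue
            (word,) = pair - known
            if word in feature_cands:
                features.add(word)
                added = True
            elif word in opinion_cands:
                opinions.add(word)
                added = True
        if not added:
            break
    return features, opinions


# Toy corpus, invented purely for illustration.
corpus = ([["screen", "bright"]] * 3
          + [["display", "bright"]] * 3
          + [["battery", "weak"]] * 3)
features, opinions = bootstrap(
    corpus,
    seeds={"screen"},
    feature_cands={"screen", "display", "battery"},
    opinion_cands={"bright", "weak"},
    threshold=3.0,
)
```

In this sketch a word joins the lexicon only when it is strongly associated with an already recognized word, so the result depends heavily on the seed list and on the threshold. The LSA-based variant mentioned in the abstract would replace `pair_scores` with a similarity computed in a latent semantic space.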

  • Publication date: 2015-5
  • Affiliation: Nanyang Institute of Technology