Abstract

With the rapid development of e-commerce, text mining of online reviews has become a prevalent research topic. When making a purchasing decision, an end user is interested not only in whether a product is recommended overall but also in the sentiment orientation toward each of the product's specific features. This paper therefore addresses the problem of automatically extracting product features from online reviews. We choose the N-grams that match the BNP (base noun phrase) pattern as candidate feature items. We then use the boundary average entropy of the N-grams and the substring dependency relationships among the items to filter the candidates. The experimental results show that this filtering improves accuracy compared with the baseline method, which directly designates the BNPs as feature items. The method does not rely on an external domain corpus for training and requires no manual intervention. In addition, the output is presented hierarchically as a tree, which can serve as a reference data structure for further research on constructing domain knowledge ontologies.
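
The boundary-entropy filter can be illustrated with a minimal sketch. It assumes that "boundary average entropy" means the mean of the Shannon entropies of the tokens seen immediately to the left and to the right of an N-gram; the abstract does not give the exact definition. The function names, the `<s>`/`</s>` padding tokens, the toy reviews, and the threshold value are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def _entropy(counts):
    """Shannon entropy (base 2) of a Counter of neighbour tokens."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def boundary_average_entropy(sentences, n):
    """Map each n-gram to the mean of its left and right boundary entropy.

    Assumption: the average is the plain arithmetic mean of the two sides.
    """
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    for tokens in sentences:
        # Hypothetical padding tokens so sentence edges count as neighbours.
        padded = ["<s>"] + list(tokens) + ["</s>"]
        for i in range(1, len(padded) - n):
            gram = tuple(padded[i:i + n])
            left[gram][padded[i - 1]] += 1
            right[gram][padded[i + n]] += 1
    return {g: (_entropy(left[g]) + _entropy(right[g])) / 2 for g in left}


def filter_candidates(candidates, scores, threshold):
    """Keep candidates whose boundary average entropy reaches the threshold."""
    return [g for g in candidates if scores.get(g, 0.0) >= threshold]


# Toy, made-up review snippets (already tokenised).
reviews = [
    ["battery", "life", "is", "good"],
    ["battery", "life", "lasts", "long"],
    ["the", "battery", "life", "is", "short"],
    ["great", "battery", "life"],
]
scores = boundary_average_entropy(reviews, 2)
kept = filter_candidates(
    [("battery", "life"), ("life", "is")], scores, threshold=1.0
)
```

In this toy data, "battery life" appears in varied contexts on both sides and so scores higher than "life is", whose left neighbour is always "battery". A high boundary entropy suggests that an N-gram behaves as an independent unit, which is the intuition behind using it to filter BNP candidates; the substring dependency step described above is not covered by this sketch.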

Full text