Abstract

With the growing popularity of e-commerce, text mining technologies for e-commerce information processing have become increasingly important. Product named entity normalization plays a vital role in the performance of e-commerce information processing because it resolves the ambiguities of product named entities, which are caused by the rich aliases and complex structures of product names. This work proposed a relation-based method for product named entity normalization. The proposed method first detected the relations between entities and then used those relations to infer the full form of an entity. Next, the similarities between the full-form target entity and the entries in a dictionary were calculated, taking the structures of the two entities being compared into account. The identifier of the most similar dictionary entry was chosen as the normalization result for the target entity. Experiments on an annotated corpus of web documents related to electronic products showed promising results, with the proposed method achieving an accuracy of 88.09%.
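The pipeline described above (infer the full form from a related entity, compute a structure-aware similarity against dictionary entries, and pick the best identifier) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three-field structure (brand, series, model), the field weights, the use of Python's `difflib.SequenceMatcher` as the string similarity, and the names `Product`, `infer_full_form`, `similarity`, and `normalize`. Relation detection is assumed to have already produced the related entity.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Dict, Optional


@dataclass(frozen=True)
class Product:
    """Hypothetical structured product name; the paper's actual structure may differ."""
    brand: Optional[str] = None
    series: Optional[str] = None
    model: Optional[str] = None


# Illustrative field weights (assumed, not from the paper).
WEIGHTS = {"brand": 0.3, "series": 0.2, "model": 0.5}


def infer_full_form(target: Product, related: Product) -> Product:
    """Fill the fields missing from `target` with those of a related entity."""
    return Product(
        brand=target.brand or related.brand,
        series=target.series or related.series,
        model=target.model or related.model,
    )


def field_similarity(a: Optional[str], b: Optional[str]) -> float:
    """String similarity in [0, 1]; a missing field contributes 0."""
    if a is None or b is None:
        return 0.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similarity(a: Product, b: Product) -> float:
    """Weighted sum of per-field similarities, so structure is respected."""
    return sum(
        weight * field_similarity(getattr(a, name), getattr(b, name))
        for name, weight in WEIGHTS.items()
    )


def normalize(target: Product, dictionary: Dict[str, Product]) -> str:
    """Return the identifier of the dictionary entry most similar to `target`."""
    return max(dictionary, key=lambda pid: similarity(target, dictionary[pid]))
```

For example, with a toy dictionary containing `"P001": Product("Nokia", "N", "N95")` and `"P002": Product("Sony", "Xperia", "Z1")`, the abbreviated mention `Product(model="N95")` combined with a related entity `Product(brand="Nokia", series="N")` expands to the full form and normalizes to `"P001"`. Comparing field by field, rather than on the flat string, is one simple way to keep a matching model number from being outweighed by an unrelated brand.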

  • Publication date: 2012
