Structural analysis of chat messages for topic detection

Authors: Dong Haichao; Hui Siu Cheung; He Yulan*
Source: Online Information Review, 2006, 30(5): 496-516.
DOI: 10.1108/14684520610706398

Abstract

Purpose - This research studies the characteristics of chat messages by analysing a collection of 33,121 sample messages gathered from 1,700 conversation sessions between 72 pairs of MSN Messenger users over a four-month period from June to September 2005. The primary objective of chat message characterization is to understand the properties of chat messages for effective message analysis, such as message topic detection.

Design/methodology/approach - Based on the study of chat message characteristics, an indicative term-based categorization approach for chat topic detection is proposed. The approach incorporates techniques such as sessionalisation of chat messages and extraction of features from icon texts and URLs for message pre-processing. Naive Bayes, Associative Classification, and Support Vector Machine are employed as classifiers for categorizing topics from chat sessions.

Findings - The indicative term-based approach is superior to the traditional document frequency-based approach for feature selection in chat topic categorization.

Originality/value - This paper studies the characteristics of chat messages and proposes an indicative term-based categorization approach for chat topic detection.

  • Publication date: 2006
  • Affiliation: Nanyang Institute of Technology (南阳理工学院)