Abstract

Purpose - Identifying topic changes within a user search session is a key issue in the content analysis of search engine user queries. Recently, various studies have focused on new topic identification/session identification in search engine transaction logs, and these studies observed several problems in estimating topic shifts and continuations. This study aims to analyze the reasons for the problems encountered when applying automatic new topic identification. Design/methodology/approach - Measures such as cleaning the data of coalition words and analyzing the errors of automatic new topic identification are applied to eliminate the problems in estimating topic shifts and continuations. Findings - The findings show that the errors of automatic new topic identification follow a pattern, and that further research is required to improve its performance. Originality/value - Improving the performance of automatic new topic identification would be valuable to search engine designers, who could then develop new recommendation algorithms, as well as custom-tailored graphical user interfaces, for search engine users.

  • Publication date: 2008