Abstract

Online social media conveys rich and timely information about real-world events. Uncovering events on social media and sensing the topics within them yields valuable information, and this task has attracted significant research effort. However, because of the large scale of the data, detecting events or topics in real time remains a challenging problem. In this paper, we propose a Pattern-based Topic Detection and Analysis System (PTDAS) for Weibo, a Twitter-like platform in China. As a key component of the system, an FP-growth-like algorithm mines cosine interesting patterns from a set of tweets, and these patterns are then summarized as topics. In particular, to discover topics in real time, we parallelize the algorithm on Spark for efficient mining. Alongside pattern-based topic detection, we also present analytic techniques for topic evolution analysis and sentiment analysis. Extensive experiments on a real-world data set demonstrate the effectiveness and efficiency of PTDAS.