Abstract
In recent years, the popularity of social networks has grown dramatically. Revealing the underlying community structure of these complex networks is an area of great interest with many applications. In this paper, we present a methodology for identifying user communities on Twitter. First, Twitter features such as shared content, user interactions, and following relationships are used to define a number of similarity metrics. These metrics are then used to compute the similarity between each pair of users in a set of Twitter users and, by extension, to group these users into communities. Next, we propose a novel method based on latent Dirichlet allocation to extract the topics discussed in each community and to eliminate those that consist of everyday words. Additionally, we introduce a method for automatically generating labels for the non-trivial topics. The methodology is evaluated on a real-world dataset created using the Twitter Search API.
- Publication date: 2017