Abstract

Microblogs have become an important platform for people to publish and disseminate information and to acquire knowledge. This paper addresses the problem of discovering user interest in microblogs. We propose a topic mining model based on Latent Dirichlet Allocation (LDA), named the user-topic model. For each user, interest is divided into two parts according to how the microblogs are generated: original interest and retweet interest. We present a Gibbs sampling implementation for inferring the parameters of our model, which discovers not only a user's original interest but also the user's retweet interest. We then combine original interest and retweet interest to compute interest words for each user. Experiments on a dataset of Sina microblogs demonstrate that our model discovers user interest effectively and outperforms existing topic models on this task. We also find that a user's original interest and retweet interest are similar, and that the topics of interest contain the user's labels. The interest words discovered by our model reflect user labels but cover a much broader range.