摘要

Users of the Twitter microblogging platform share a considerable amount of information through short messages on a daily basis. Some of these so-called tweets discuss issues related to software and could include information that is relevant to the companies developing these applications. Such tweets have the potential to help requirements engineers better understand user needs and therefore provide important information for software evolution. However, little is known about the nature of tweets discussing software-related issues. In this paper, we report on the usage characteristics, content and automatic classification potential of tweets about software applications. Our results are based on an exploratory study in which we used descriptive statistics, content analysis, machine learning and lexical sentiment analysis to explore a dataset of 10,986,495 tweets about 30 different software applications. Our results show that searching for relevant information on software applications within the vast stream of tweets can be compared to looking for a needle in a haystack. However, this relevant information can provide valuable input for software companies and support the continuous evolution of the applications discussed in these tweets. Furthermore, our results show that it is possible to use machine learning and lexical sentiment analysis techniques to automatically extract information about the tweets regarding their relevance, authors and sentiment polarity.

  • 出版日期2017-9