摘要

Social networks such as Twitter are used by millions of people who express their opinions on a variety of topics. Consequently, these media are constantly being examined by sentiment analysis systems which aim at classifying the posts as positive or negative. Given the variety of topics discussed and the short length of the posts, the standard approach of using the words as features for machine learning algorithms results in sparse vectors. In this work, we propose using features derived from the ranking generated by an Information Retrieval System in response to a query consisting of the post that needs to be classified. Our system can be fully automatic, has only 24 features, and does not depend on expensive resources. Experiments on real datasets have shown that a classifier that relies solely on these features outperforms established baselines and can reach accuracies comparable to the state-of-the-art approaches which are more costly.

  • 出版日期2016-11-1