Abstract

Crowdsourcing has emerged as a viable platform for relevance assessment in information retrieval. However, because crowdsourcing relies on independent and anonymous workers, quality control of assessment results has become a focus of attention in both academia and industry. To address this problem, we propose a two-stage iterative approach that integrates an ensemble classifier with expert guidance. Specifically, in the first stage an ensemble classifier is employed to select unreliable assessment objects for experts to validate. In the second stage, expectation maximization is used to update all assessment results based on the validation feedback. This loop continues until the cost limit is reached. Simulation experiments demonstrate that, compared with existing solutions, our approach eliminates more noise and thereby achieves higher accuracy, while maintaining an acceptable running time and low labor cost.
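
The sketch below is a minimal, illustrative rendering of the two-stage loop described above, not the paper's actual implementation: a simple disagreement score stands in for the ensemble classifier in stage one, a one-coin worker-accuracy EM stands in for the expectation-maximization update in stage two, and all names (select_unreliable, em_update, the toy simulation parameters) are assumptions introduced here for illustration.

```python
# Minimal sketch of the two-stage iterative loop (assumed helpers, toy data).
import numpy as np

rng = np.random.default_rng(0)

def select_unreliable(labels, estimates, k):
    """Stage 1 (assumed): rank items by disagreement between the current
    estimate and the raw worker labels; return the k most unreliable."""
    disagreement = (labels != estimates[:, None]).mean(axis=1)
    return np.argsort(disagreement)[::-1][:k]

def em_update(labels, fixed, n_iter=20):
    """Stage 2 (assumed): one-coin EM that re-estimates worker accuracy and
    item truth, keeping expert-validated items ('fixed') clamped."""
    truth = np.round(labels.mean(axis=1))           # init: majority vote
    truth[list(fixed)] = [fixed[i] for i in fixed]  # clamp validated items
    for _ in range(n_iter):
        acc = (labels == truth[:, None]).mean(axis=0)          # M-step
        weight = np.log(np.clip(acc, 1e-6, 1 - 1e-6) /
                        np.clip(1 - acc, 1e-6, 1 - 1e-6))
        score = (labels * 2 - 1) @ weight                      # E-step
        truth = (score > 0).astype(float)
        truth[list(fixed)] = [fixed[i] for i in fixed]
    return truth

# Toy simulation: binary relevance labels from workers of varying accuracy.
n_items, n_workers = 200, 8
true_labels = rng.integers(0, 2, n_items).astype(float)
worker_acc = rng.uniform(0.55, 0.9, n_workers)
labels = np.where(rng.random((n_items, n_workers)) < worker_acc,
                  true_labels[:, None], 1 - true_labels[:, None])

budget, batch = 40, 10        # expert-validation cost limit and batch size
fixed = {}                    # item index -> expert-validated label
estimates = np.round(labels.mean(axis=1))
spent = 0
while spent < budget:
    picked = select_unreliable(labels, estimates, batch)   # stage 1
    for i in picked:
        fixed[int(i)] = true_labels[i]   # experts return the true label here
    spent += batch
    estimates = em_update(labels, fixed)                    # stage 2

print("accuracy after iterative refinement:",
      (estimates == true_labels).mean())
```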