Limitations of Majority Agreement in Crowdsourced Image Interpretation

Authors: Carl F. Salk*; Tobias Sturn; Linda See; Steffen Fritz
Source: Transactions in GIS, 2017, 21(2): 207-223.
DOI:10.1111/tgis.12194

Abstract

Crowdsourcing can efficiently complete tasks that are difficult to automate, but the quality of crowdsourced data is tricky to evaluate. Algorithms that grade volunteer work often assume that all tasks are similarly difficult, an assumption that is frequently false. We use a cropland identification game with over 2,600 participants and 165,000 unique tasks to investigate how best to evaluate the difficulty of crowdsourced tasks, and to what extent this is possible based on volunteer responses alone. Inter-volunteer agreement exceeded 90% for about 80% of the images and was negatively correlated with volunteer-expressed uncertainty about image classification. A total of 343 relatively difficult images were independently classified as cropland, non-cropland or impossible by two experts. The experts disagreed weakly (one rated an image impossible while the other chose cropland or non-cropland) on 27% of the images, but disagreed strongly (cropland vs. non-cropland) on only 7%. Inter-volunteer disagreement increased significantly with inter-expert disagreement. While volunteers agreed with expert classifications for most images, over 20% would have been miscategorized if only the volunteers' majority vote were used. We end with a series of recommendations for managing the challenges posed by heterogeneous tasks in crowdsourcing campaigns.
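
As an illustration of the aggregation step the abstract critiques, the Python sketch below computes each image's majority-vote label and its inter-volunteer agreement rate, flagging low-agreement images of the kind the authors referred to experts. The image IDs, labels, and the 90% threshold used for flagging are hypothetical stand-ins, not data or code from the study.

    # Minimal sketch of majority-vote aggregation over volunteer labels.
    # All image IDs and responses below are hypothetical examples.
    from collections import Counter

    # Hypothetical volunteer responses: image ID -> list of labels.
    responses = {
        "img_001": ["cropland", "cropland", "cropland", "non-cropland"],
        "img_002": ["cropland", "non-cropland", "non-cropland", "non-cropland"],
        "img_003": ["non-cropland"] * 5,
    }

    for image_id, labels in responses.items():
        counts = Counter(labels)
        majority_label, majority_count = counts.most_common(1)[0]
        # Share of volunteers agreeing with the majority label.
        agreement = majority_count / len(labels)
        # Images below ~90% agreement resemble those the paper sent to experts.
        flag = "review" if agreement < 0.9 else "accept"
        print(f"{image_id}: {majority_label} (agreement={agreement:.0%}, {flag})")

Note that a bare majority vote discards exactly the per-task difficulty signal the paper highlights: two images can receive the same label at 55% and 100% agreement, which is why the authors argue agreement rates should inform quality control rather than the vote alone.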

  • Publication date: April 2017
  • Affiliation: International Institute for Applied Systems Analysis (IIASA)