Limitations of Majority Agreement in Crowdsourced Image Interpretation

Authors: Carl F. Salk*; Tobias Sturn; Linda See; Steffen Fritz
Source: Transactions in GIS, 2017, 21(2): 207-223.
DOI:10.1111/tgis.12194

Abstract

Crowdsourcing can efficiently complete tasks that are difficult to automate, but the quality of crowdsourced data is tricky to evaluate. Algorithms that grade volunteer work often assume that all tasks are similarly difficult, an assumption that is frequently false. We use a cropland identification game with over 2,600 participants and 165,000 unique tasks to investigate how best to evaluate the difficulty of crowdsourced tasks, and to what extent this is possible based on volunteer responses alone. Inter-volunteer agreement exceeded 90% for about 80% of the images and was negatively correlated with volunteer-expressed uncertainty about image classification. A total of 343 relatively difficult images were independently classified as cropland, non-cropland or impossible by two experts. The experts disagreed weakly (one rated an image impossible while the other chose cropland or non-cropland) on 27% of the images, but disagreed strongly (cropland vs. non-cropland) on only 7%. Inter-volunteer disagreement increased significantly with inter-expert disagreement. While volunteers agreed with expert classifications for most images, over 20% would have been miscategorized if only the volunteers' majority vote were used. We end with a series of recommendations for managing the challenges posed by heterogeneous tasks in crowdsourcing campaigns.
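
As an illustration of the aggregation step the abstract critiques, the Python sketch below computes each image's majority-vote label and its inter-volunteer agreement rate, flagging low-agreement images of the kind the authors referred to experts. The image IDs, labels, and the 90% threshold used for flagging are hypothetical stand-ins, not data or code from the study.

    # Minimal sketch of majority-vote aggregation over volunteer labels.
    # All image IDs and responses below are hypothetical examples.
    from collections import Counter

    # Hypothetical volunteer responses: image ID -> list of labels.
    responses = {
        "img_001": ["cropland", "cropland", "cropland", "non-cropland"],
        "img_002": ["cropland", "non-cropland", "non-cropland", "non-cropland"],
        "img_003": ["non-cropland"] * 5,
    }

    for image_id, labels in responses.items():
        counts = Counter(labels)
        majority_label, majority_count = counts.most_common(1)[0]
        # Share of volunteers agreeing with the majority label.
        agreement = majority_count / len(labels)
        # Images below ~90% agreement resemble those the paper sent to experts.
        flag = "review" if agreement < 0.9 else "accept"
        print(f"{image_id}: {majority_label} (agreement={agreement:.0%}, {flag})")

Note that a bare majority vote discards exactly the per-task difficulty signal the paper highlights: two images can receive the same label at 55% and 100% agreement, which is why the authors argue agreement rates should inform quality control rather than the vote alone.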

  • Publication date: April 2017
  • Affiliation: International Institute for Applied Systems Analysis (IIASA)