Abstract

Subjective image quality assessment (IQA) based on pairwise comparison (PC) overcomes the shortcomings of IQA based on category rating, such as an ambiguous scale definition. However, PC tests can become very large, because the number of image pairs to compare grows quadratically with the number of images. Conducting PC tests on a large-scale image set with a limited budget therefore requires an active sampling strategy that reduces the number of comparisons. Conventional active sampling strategies usually select the most informative sample and assume that the correct label of any image pair can be obtained from any attentive subject. This assumption does not hold for IQA because of the limitations of the human visual system: if two images are similar, their difference can be too subtle for some subjects to perceive. Subjects therefore need more effort to give correct preference labels for two similar images, and it is even impossible to obtain correct preference labels for two images that are too similar. To address this issue, we study the reliability of preference labels and design a new active sampling framework that combines reliability with informativeness. The framework not only considers informativeness but also adjusts the effort spent on an image pair according to its ambiguity. Experiments show that this adjustment effectively improves the performance of sampling strategies based on informativeness alone. In addition, the proposed method is expected to be applicable to more general PC-based subjective tests beyond IQA.