Abstract

Some video quality assessment (VQA) metrics evaluate spatial and temporal features separately, even though video distortions depend on spatial and temporal content simultaneously. Treating the two aspects separately can reduce the accuracy of VQA, because the degradation of a video results from the joint effect of spatial and temporal distortions. The 3D gradient captures spatiotemporal structure and extracts spatial and temporal features at the same time. This paper therefore proposes a novel VQA metric based on spatiotemporal 3D gradient differencing to describe the degradation of video quality effectively. First, the 3D gradient is computed to extract spatiotemporal features of the video sequences. Then the difference between the 3D gradients of the reference and distorted videos is used to predict the perceptual local quality of each frame. Finally, a sequence pooling strategy combines the frame quality indices into an overall video quality score. Experimental results show that the proposed metric agrees well with subjective perception and performs better than traditional metrics.
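
The three steps described above (3D gradient extraction, gradient differencing, sequence pooling) can be illustrated with a minimal sketch. It is not the exact formulation of the proposed metric: the abstract does not specify the gradient operator, the form of the difference, or the pooling rule, so the central-difference gradient from `numpy.gradient`, the absolute difference of gradient magnitudes, and plain mean pooling are all assumptions, and the function names are hypothetical.

```python
import numpy as np


def gradient_3d_magnitude(video):
    """Magnitude of the spatiotemporal gradient of a grayscale video.

    video: array of shape (T, H, W) with at least 2 samples per axis.
    Assumption: central differences via numpy.gradient stand in for
    whatever 3D gradient operator the metric actually uses.
    """
    gt, gy, gx = np.gradient(video.astype(np.float64))
    return np.sqrt(gx ** 2 + gy ** 2 + gt ** 2)


def local_difference(ref, dist):
    """Per-pixel absolute difference of 3D gradient magnitudes (assumed form)."""
    return np.abs(gradient_3d_magnitude(ref) - gradient_3d_magnitude(dist))


def video_score(ref, dist):
    """Pool local differences into one score; lower means less distortion.

    Assumption: spatial mean per frame, then temporal mean over frames.
    """
    diff = local_difference(ref, dist)        # shape (T, H, W)
    frame_scores = diff.mean(axis=(1, 2))     # one index per frame, shape (T,)
    return float(frame_scores.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.random((5, 8, 8))                           # synthetic reference
    dist = ref + 0.1 * rng.standard_normal(ref.shape)     # noisy copy
    print(video_score(ref, ref), video_score(ref, dist))
```

In this sketch a reference compared with itself scores exactly zero, and any change in spatiotemporal structure raises the score. A perceptually tuned metric would likely replace the plain means with a weighted or percentile-based pooling.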