Abstract

This paper considers the problem of video streaming over low-bandwidth networks and presents a complete framework inspired by the fovea-periphery distinction of biological vision systems. First, an application-specific attention function, which finds the small important regions in a given frame, is constructed a priori using a back-propagation neural network that is optimized combinatorially. For a given application, the attention function partitions each frame into foveal and peripheral regions. A spatio-temporal pre-processing algorithm then encodes the foveal regions with high spatial resolution, while the peripheral regions are encoded with lower spatial and temporal resolution. Finally, the pre-processed video sequence is streamed using a standard streaming server. As an application, we consider the transmission of human face videos. Our experimental results indicate that, even with a limited amount of training, the constructed attention function is able to determine the foveal regions, which are transmitted with improved quality, while the peripheral regions undergo an acceptable degradation.
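The pre-processing step described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the foveal mask is already available (in the paper it would come from the trained attention function), it uses simple block averaging as a stand-in for lower spatial resolution, and it refreshes the periphery only every few frames as a stand-in for lower temporal resolution. The names `block_average` and `foveate_sequence` and the parameters `block` and `period` are hypothetical.

```python
from typing import List, Optional

Frame = List[List[float]]  # grayscale frame as rows of pixel values
Mask = List[List[bool]]    # True marks a foveal pixel (assumed given)


def block_average(frame: Frame, block: int) -> Frame:
    """Lower the spatial resolution by replacing each block x block tile with its mean."""
    h, w = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for y0 in range(0, h, block):
        for x0 in range(0, w, block):
            ys = range(y0, min(y0 + block, h))
            xs = range(x0, min(x0 + block, w))
            mean = sum(frame[y][x] for y in ys for x in xs) / (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


def foveate_sequence(frames: List[Frame], masks: List[Mask],
                     block: int = 4, period: int = 3) -> List[Frame]:
    """Keep foveal pixels at full resolution in every frame.

    Peripheral pixels are block-averaged (assumed spatial reduction) and
    refreshed only every `period` frames (assumed temporal reduction).
    """
    result: List[Frame] = []
    periphery: Optional[Frame] = None
    for i, (frame, mask) in enumerate(zip(frames, masks)):
        if periphery is None or i % period == 0:
            periphery = block_average(frame, block)
        result.append([
            [frame[y][x] if mask[y][x] else periphery[y][x]
             for x in range(len(frame[0]))]
            for y in range(len(frame))
        ])
    return result
```

The output of such a step would then be handed to an ordinary encoder and streaming server; the bandwidth saving comes from the periphery carrying less spatial detail and changing less often.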

  • Publication date: 2010-11