Abstract

Interest in multiview video systems has increased in recent years. In these systems, a typical predictive coding approach exploits the inter-view correlation at a joint encoder, which requires the cameras to communicate with one another. However, many applications call for simple sensing systems in which the cameras cannot communicate with one another, which rules out a predictive coding approach. Wyner-Ziv (WZ) video coding is a promising solution for those applications, since it is the task of the WZ decoder to (fully or partly) exploit the video redundancy. The rate-distortion (RD) performance of WZ video coding strongly depends on the quality of the so-called side information (SI), which is a decoder estimate of the original frame to be coded. In multiview WZ (MV-WZ) video coding, the target is to make the best use of the available correlation not only in time, as in the monoview case, but also between views. Thus, the multiview SI results from the fusion of a temporally created SI and an inter-view created SI. In this context, the main objective of this paper is to propose a classification taxonomy that organizes the many inter-view SI creation and SI fusion techniques available in the literature, and to review the most relevant techniques in each class. The inter-view SI creation techniques are classified into two classes, matching based and scene geometry based, while the SI fusion techniques are classified into three classes: time driven, view driven and time-view driven. After reviewing the most relevant inter-view SI creation and SI fusion techniques under the proposed taxonomy, conclusions are drawn about the current state of the field, making it easier to identify the next research challenges in the multiview WZ video coding paradigm.

  • Publication date: 2015-1