Fully Scalable Multiview Wavelet Video Coding


Fully Scalable Multiview Wavelet Video Coding
Yu Liu and King Ngi Ngan
Department of Electronic Engineering, The Chinese University of Hong Kong
Email: yliu@ieee.org, knngan@ee.cuhk.edu.hk
ISCAS 2009, Taipei, Taiwan, May 24-27, 2009
Y. Liu and K.N. Ngan, Fully Scalable Multiview Wavelet Video Coding

Outline
- Introduction
- Proposed Method
- Experimental Results
- Conclusion

Introduction: Multiview Video

- Multiview video is captured simultaneously by several cameras from different viewpoints.
- Straightforward solution: simulcast coding of each view with H.264/AVC [1] or the wavelet-based VIDWAV [2].
- Inter-view statistical dependencies motivate multiview video coding (MVC).
- Full scalability in multiview video coding covers the quality, spatial, temporal and view dimensions.
- Scalable coding: the H.264/AVC-based JSVM [3] and the wavelet-based SVC [2] meet part of the requirement, but view scalability is missing.
- The H.264/AVC-based JMVM [4] offers some degree of temporal and view scalability, but spatial or quality scalability is not supported.

Notes: Multiview video, captured simultaneously by several cameras from different viewpoints, enables a wide variety of future multimedia applications. However, it also produces huge amounts of data to be stored or transmitted. The straightforward solution is to encode each view independently using a state-of-the-art video codec. However, multiview video contains a large amount of inter-view statistical dependency, which multiview video coding (MVC) should exploit. In addition, an important research topic in multiview video coding is achieving full scalability in all four dimensions: quality, spatial, temporal and view. Most scalable video codecs meet part of this requirement, but view scalability is missing. Although the JMVM offers some degree of temporal and view scalability, it does not support spatial or quality scalability at all.

Related Work: Schemes for Wavelet-Based MVC

- Inherent scalability due to subband/wavelet decomposition:
  - 2-D wavelet transform (DWT) in the spatial domain
  - Motion compensated temporal filtering (MCTF) in the temporal domain
  - Disparity compensated view filtering (DCVF) in the view domain
- But no view scalability [5] or only limited view scalability [6] is supported, due to global MCTF/DCVF selection on a group-of-GOPs (GoGOP) basis.
- Garbas et al. [7] attempt to provide full view scalability, but the coding efficiency is not satisfactory.

Notes: Several schemes for wavelet-based MVC have been devised. They share a common principle: the correlation in multiview video is exploited by MCTF in the temporal domain, DCVF in the view domain and a 2-D DWT in the spatial domain. The inherent scalability of such a wavelet decomposition is appealing. Unfortunately, these schemes support no view scalability or only limited view scalability, because MCTF/DCVF selection is made globally on a GoGOP basis.

Related Work: Wavelet-Based MVC Frameworks

Different wavelet decomposition structures along the temporal and view axes:
- (a) Simulcast scheme [2]: only MCTF, no DCVF.
- (b) Regular scheme [5]: multilevel MCTF for the temporal subbands; DCVF only in the lowest temporal subbands.
- (c) Adaptive scheme [5,6]: MCTF and DCVF interleaved, with global MCTF/DCVF selection on a GoGOP basis.

Notes: Wavelet-based video coding provides an elegant and flexible MVC framework through a high-dimensional wavelet transform. Figure (a) illustrates the simulcast-based wavelet decomposition structure along the temporal and view axes: there is only MCTF along the temporal axis and no DCVF along the view axis. The regular scheme in Figure (b) assumes that the temporal correlation is always stronger than the inter-view correlation, so DCVF is performed only in the lowest temporal subbands. However, this assumption does not always hold. A reasonable way to achieve better coding efficiency is to derive the optimal decomposition structure based on the coding cost. Therefore, an adaptive scheme was further proposed: instead of fixing the order of MCTF and DCVF, the two are interleaved based on a global correlation analysis on a GoGOP basis.

Proposed Method: Locally Adaptive Inter-View-Temporal Structure

Analysis of temporal and inter-view correlation [8]:
- The temporal prediction mode is dominant for all sequences.
- But for a number of blocks, inter-view prediction is sometimes more efficient than temporal prediction.

Figure: (a) Prediction modes with first-order inter-view and temporal neighbor pictures. (b) Probabilities of prediction modes on the macroblock level (T: temporal mode, V: inter-view mode) [8].

Notes: The statistical dependencies between the temporal and inter-view pictures, analyzed on the macroblock level, indicate that a local correlation analysis model could further improve the coding efficiency.
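A minimal sketch of what a per-macroblock choice between the temporal (T) and inter-view (V) modes could look like. It is an illustration only: the sum of absolute differences (SAD) cost, the absence of any motion or disparity search, and the names `sad` and `choose_mode` are assumptions, not the analysis method of [8].

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks (lists of rows)."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def choose_mode(current, temporal_ref, view_ref):
    """Return 'T' or 'V', whichever reference block gives the lower SAD.

    Assumption: the co-located block is used as reference, with no motion
    or disparity compensation and no rate term. Ties go to 'T'.
    """
    cost_t = sad(current, temporal_ref)
    cost_v = sad(current, view_ref)
    return "T" if cost_t <= cost_v else "V"


# Toy 2x2 "macroblocks" (hypothetical data).
current = [[10, 12], [14, 16]]
temporal_ref = [[10, 12], [14, 17]]   # SAD = 1
view_ref = [[9, 12], [14, 14]]        # SAD = 3
mode = choose_mode(current, temporal_ref, view_ref)
```

Here the temporal reference wins, matching the observation that the T mode dominates while V can still win for individual blocks.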

Proposed Method: Locally Adaptive Inter-View-Temporal Decomposition

- The proposed wavelet decomposition structure adopts a locally adaptive inter-view-temporal correlation analysis on the macroblock level, instead of the global temporal and inter-view correlation analysis based on the whole picture.
- Problem: how to implement the lifting steps of MCTF and DCVF on the macroblock level within the same framework.
- Solution: the weighting lifting scheme [9] is employed for MCTF or DCVF.
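To illustrate why lifting suits local adaptation, the sketch below runs one predict/update lifting level on a 1-D signal in which every odd sample carries its own pair of prediction weights. Whatever weights are chosen, the inverse recovers the input exactly, so the weights can change from block to block. This is a generic weighted lifting step under simplifying assumptions (scalar samples, no motion or disparity compensation, hypothetical function names); it is not the specific weighting lifting scheme of [9].

```python
def _update_term(high, weights, i):
    """Update contribution for even sample i from its neighbouring high-pass samples.

    Each high-pass sample is fed back with half the weight it used for this
    even sample in the predict step. Mirrored boundary use is not fed back.
    """
    term = weights[i][0] * high[i]               # odd sample i used even[i] with w_prev
    if i > 0:
        term += weights[i - 1][1] * high[i - 1]  # odd sample i-1 used even[i] with w_next
    return 0.5 * term


def forward(signal, weights):
    """One lifting level; len(signal) must be even, weights[i] = (w_prev, w_next)."""
    even, odd = signal[0::2], signal[1::2]
    high = []
    for i, (w_prev, w_next) in enumerate(weights):
        nxt = even[i + 1] if i + 1 < len(even) else even[i]  # mirror at the end
        high.append(odd[i] - (w_prev * even[i] + w_next * nxt))
    low = [even[i] + _update_term(high, weights, i) for i in range(len(even))]
    return low, high


def inverse(low, high, weights):
    """Undo `forward` by running the update and predict steps in reverse order."""
    even = [low[i] - _update_term(high, weights, i) for i in range(len(low))]
    signal = []
    for i, (w_prev, w_next) in enumerate(weights):
        nxt = even[i + 1] if i + 1 < len(even) else even[i]
        signal += [even[i], high[i] + (w_prev * even[i] + w_next * nxt)]
    return signal


signal = [4, 6, 8, 10, 12, 20]
# Bidirectional, previous-only, and next-only prediction for the three odd samples.
weights = [(0.5, 0.5), (1.0, 0.0), (0.0, 1.0)]
low, high = forward(signal, weights)
restored = inverse(low, high, weights)
```

With weights (0.5, 0.5) the step reduces to the 5/3 lifting filter, and with (1.0, 0.0) it reduces to a Haar-like step, so a single code path covers both cases.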

Proposed Method: Locally Adaptive Inter-View-Temporal Wavelet Decomposition Structure

Figure: Illustration of the proposed locally adaptive inter-view-temporal wavelet decomposition structure.

Notes: The major difference between the proposed structure and the conventional structure is that the inter-view transform is fully integrated with the temporal transform on the macroblock level within the same framework. The number of levels of inter-view transform at each temporal subband is related to that of the temporal transform, but does not exceed the predetermined maximum number of inter-view transform levels.

Proposed Method: 2-D Pipeline-Based Lifting Scheme

Selection of the optimal decomposition structure:
- The adaptive wavelet decomposition scheme is based on the hierarchical wavelet structure and selects the optimal decomposition structure on a GoGOP (Group of GOP) basis.
- This significantly increases the memory requirement and computational complexity.
- It also suffers from temporal boundary effects across GOPs, so its coding performance depends on the temporal GOP size.

To solve these problems, a new 2-D pipeline-based lifting scheme is proposed:
- It extends the 1-D pipeline-based lifting scheme [10] and operates on the macroblock level.
- It does not physically break the multiview sequence into GoGOPs but processes it without intermission.
- At most 4x4 frames are involved in a one-level transform.

Proposed Method: Pipeline Processing-Based Lifting Schemes

Figure: (a), (b) 1-D case of the forward and inverse temporal transform. (c), (d) 2-D case of the forward and inverse inter-view-temporal transform.

Notes: Figures (a) and (b) show the pipeline-based lifting scheme for the 1-D case of the forward and inverse temporal transform. Here, the scheme is extended from the 1-D temporal transform to the 2-D inter-view-temporal transform, as shown in Figures (c) and (d). The 2-D scheme buffers at most 4x4 frames. The video frames along the temporal axis are first pushed into the 4x4 buffering window sequentially, followed by those along the view axis. In other words, the 2-D pipeline-based lifting scheme consists of two separable 1-D pipeline processes, one along the temporal axis and one along the view axis. Therefore, the multiview sequence can be processed continuously without being broken into GoGOPs.
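The idea of processing a sequence continuously, rather than in fixed groups, can be sketched for a single axis as a generator that consumes frames one at a time and emits low-pass (L) and high-pass (H) outputs as soon as their inputs are available. The sketch assumes a 5/3 lifting filter, scalar values standing in for whole frames, symmetric extension only at the two ends of the stream, and no motion or disparity compensation. The name `pipeline_lifting` is hypothetical, and this is not the buffering layout of [10] or of the 4x4 window described above.

```python
def pipeline_lifting(frames):
    """Stream one level of 5/3 lifting over an iterable of frames.

    Yields ('L', value) and ('H', value) pairs in order. Only the current
    even/odd pair, the next even frame and the previous high-pass value are
    held at any time, so no group boundary is ever introduced.
    """
    it = iter(frames)
    even = next(it, None)
    if even is None:
        return
    prev_high = None
    while True:
        odd = next(it, None)
        if odd is None:
            # Trailing even frame: mirror the last high-pass value.
            h = 0.0 if prev_high is None else prev_high
            yield ("L", even + 0.5 * h)
            return
        next_even = next(it, None)
        right = even if next_even is None else next_even  # mirror at stream end
        high = odd - 0.5 * (even + right)                  # predict step
        left_high = high if prev_high is None else prev_high  # mirror at stream start
        yield ("L", even + 0.25 * (left_high + high))      # update step
        yield ("H", high)
        if next_even is None:
            return
        even, prev_high = next_even, high


# A linear ramp is predicted perfectly except at the mirrored end of the stream.
outputs = list(pipeline_lifting(range(8)))
```

Because the generator never resets its state, the only boundary effects are at the true start and end of the stream, which is the property the pipeline-based scheme relies on.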

Experimental Results

- Two multiview QVGA (320x240) sequences: (a) Ballroom and (b) Race1.
- Eight views with a frame rate of 25 Hz or 30 Hz, captured by a parallel camera array with 20 cm spacing.
- The first 128 frames of each sequence are encoded to generate a single bitstream per multiview video (including all eight views).
- PSNR is averaged over all views and the bit rate is given per view.
- Temporal/inter-view wavelet decomposition schemes compared: simulcast, regular, adaptive and proposed. All are built upon the wavelet-based SVC.
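As a worked illustration of the reporting convention (PSNR averaged over views, bit rate per view), the sketch below computes both numbers from per-view mean squared errors and the size of the single bitstream. The details are assumptions, since the slides do not state them: 8-bit samples (peak 255), averaging of per-view PSNR values in dB, and the hypothetical function names and input numbers.

```python
import math


def psnr(mse, peak=255.0):
    """PSNR in dB for a given mean squared error (infinite for a perfect match)."""
    return float("inf") if mse == 0 else 10.0 * math.log10(peak * peak / mse)


def summarize(view_mse, total_bits, num_frames, frame_rate):
    """Return (PSNR averaged over views in dB, bit rate per view in kbit/s).

    view_mse:   one MSE value per view (assumed already averaged over frames)
    total_bits: size of the single bitstream that contains all views
    """
    num_views = len(view_mse)
    avg_psnr = sum(psnr(m) for m in view_mse) / num_views
    duration_s = num_frames / frame_rate
    kbps_per_view = total_bits / duration_s / num_views / 1000.0
    return avg_psnr, kbps_per_view


# Hypothetical numbers: two views, 128 frames at 25 Hz.
avg_psnr, kbps = summarize(
    view_mse=[65.025, 6.5025],   # 30 dB and 40 dB for 8-bit video
    total_bits=2_621_440,
    num_frames=128,
    frame_rate=25.0,
)
```

With these inputs the average PSNR is about 35 dB and the rate is about 256 kbit/s per view.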

Experimental Results

Figure: Performance comparison of different wavelet decomposition schemes for the tested multiview sequences, Ballroom and Race1.

Notes: The figure shows the R-D curves for the coding results on the two multiview videos with eight views. The proposed scheme achieves the best coding performance among the tested schemes throughout.

Experimental Results

Figure: Performance comparison between the proposed scheme and the simulcast scheme with different quality, temporal and view scalability for the tested multiview sequences, Ballroom and Race1.

Notes: The figure compares the proposed scheme and the simulcast scheme for the two multiview videos at different bit rates, frame rates and view rates. As the R-D curves at the different test points show, the proposed scheme consistently outperforms the simulcast scheme, even in the two-view case.

Experimental Results: Coding Performance Comparison

- The PSNR gain is up to 1.18 dB over the regular scheme and up to 0.88 dB over the adaptive scheme at low bit rates.
- Compared with the simulcast scheme, a coding gain of up to 2.49 dB is observed at low bit rates.

Notes: We do not show coding results for spatial scalability, because all of these schemes naturally inherit it from the wavelet-based SVC.

Experimental Results: Memory and Complexity Comparison

- Simulcast scheme: always the lowest memory and computation requirements, because it exploits only the temporal correlation.
- Regular scheme: slightly higher memory and computation requirements than the simulcast scheme, because it exploits the inter-view correlation only at the lowest temporal subbands.
- Adaptive scheme: significantly higher memory and computational complexity, due to the selection on a GoGOP basis.
- Proposed scheme: moderate memory and complexity, due to the 2-D pipeline-based inter-view-temporal lifting on a macroblock basis.

Conclusion

An inter-view-temporal lifting-based wavelet coding technique is proposed for fully scalable multiview video coding. It provides the following advantages:
- View scalability, an important scalability feature, is supported in addition to the other three scalabilities.
- A 2-D pipeline-based lifting scheme removes both temporal and view boundary effects.
- A locally adaptive inter-view-temporal structure exploits the local inter-view-temporal correlation on the macroblock level.

Thank You! Q&A