Spatial Sparsity Induced Temporal Prediction for Hybrid Video Compression Gang Hua and Onur G. Guleryuz Rice University, Houston, TX DoCoMo.

Presentation transcript:

Spatial Sparsity Induced Temporal Prediction for Hybrid Video Compression Gang Hua and Onur G. Guleryuz Rice University, Houston, TX DoCoMo USA Labs, Palo Alto, CA (Please view in full screen presentation mode to see the animations)

2 Outline. Problem statement: quick intro to hybrid video compression; example difficult video; problems in temporal prediction; quick results showing what the proposed work can do. Our solution, Spatial Sparsity Induced Temporal Prediction for Hybrid Video Compression: model; what we do, how we do it, and why it works. Simulation results showing prediction examples, with discussion. Compression results. Conclusion and future work. I will show results on video, but I will also use the classical images peppers and barbara to make intuitive points.

3 Quick Digression: The Set of “Natural” Images. The set of natural/interesting images is a non-convex, star-shaped set. Figure annotation: far from both barbara and peppers, but still very useful in compressing barbara or peppers.

4 Setup: Hybrid Video Compression. The current frame is predicted from a reference video frame, and the prediction error is compressed. We will propose a new prediction technique: Spatial Sparsity Induced Temporal Prediction (SIP).
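The predict-then-compress loop on this slide can be illustrated with a minimal sketch. It assumes a plain uniform scalar quantizer as a stand-in for the transform coder, and the names `encode_frame`, `decode_frame`, and `step` are hypothetical; the slides do not specify any of these.

```python
import numpy as np

def encode_frame(current, prediction, step=8.0):
    """Quantize the prediction error (stand-in for a transform coder)."""
    residual = current - prediction
    return np.round(residual / step).astype(int)

def decode_frame(levels, prediction, step=8.0):
    """Decoder side: add the dequantized residual back to the same prediction."""
    return prediction + levels * step

def nonzero_levels(levels):
    """Rough proxy for coding cost: number of nonzero quantized values."""
    return int(np.count_nonzero(levels))

rng = np.random.default_rng(1)
current = rng.uniform(0, 255, size=(16, 16))
good_prediction = current + rng.normal(scale=2.0, size=current.shape)
bad_prediction = np.zeros_like(current)  # no useful temporal prediction

good_levels = encode_frame(current, good_prediction)
bad_levels = encode_frame(current, bad_prediction)
reconstruction = decode_frame(good_levels, good_prediction)
```

With a better predictor the residual is small, so far fewer quantized values are nonzero; this is the sense in which a better prediction saves bits in a hybrid coder.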

5 Current State of Prediction. Diagram: the current frame is predicted from the reference frame, and the difference (+ -) is fed to the transform coder. Current prediction techniques work well only when current-frame blocks are simple translations of past-frame blocks (sufficient for most simple video). In this work, we will assume translations are accounted for.

6 Example Difficult Video (please note the differences from traditional sequences like foreman). Clips shown: (Commercial), (Trailer).

7 Some example temporal evolutions that are problematic for current techniques (reference frame vs. current frame): temporally decorrelated noise; a simple fade from a blend of two scenes; special effects. These lead to INTRA (non-differential) encoding (many bits).

8 SIP inside a generic hybrid coder. Block diagram labels: frame to be coded; motion compensation; sparsity induced prediction (causal information); transform coder; coded differential; transform decoder; z^{-1} delay; previously decoded frame, to be used as reference. The objective is to generate better motion compensated predictors.

9 Examples (columns: MC reference frame, current frame, MC reference after SIP): noise (denoised); lightning (removed!); cross-fade (fading scenes reduced and amplified as needed!); clutter (removed).

10 Loose Model (after translations are accounted for). [Equations not recoverable from the transcript. Surviving labels: reference (ith pixel), current, relevant, noise, structured noise, brightness change, smooth light-map; one surviving fragment has the form 0.5(· + ·) plus a further term; vectors are Nx1.] Straightforward with today's know-how: use an overcomplete set of transforms, threshold coefficients, … With structured noise, the denoising recipe will not work: can we somehow optimize transforms? Use index sets of coefficients? … We must find a common formulation for both of these cases (turns out to be very easy!). I will show more complicated variations as well.

11 How We Do It. Look at all images in terms of their “frame coefficients” (translation-invariant decompositions generated with a 4x4 block DCT, a poor person's frame; M times expansive, M=16). The prediction is a causal, least-squares, per-coefficient estimate of the frame coefficients. Mini FAQ. Why a frame? Separating r and s becomes straightforward, i.e., easy rejection of s. Inverting overcomplete decompositions? Easy. Why DCT 4x4? Because it is fast. Can one use *lets? Subject to some caveats, yes.
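As an illustration of what a translation-invariant 4x4 block DCT decomposition looks like, here is a minimal sketch. It assumes cyclic boundary handling and an orthonormal DCT-II, and the function names (`dct_matrix`, `frame_analysis`, `frame_synthesis`) are hypothetical; the slides do not give these details.

```python
import numpy as np

def dct_matrix(n=4):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)
    return d

def frame_analysis(img, n=4):
    """DCT of the n x n block anchored at every pixel (cyclic boundaries assumed).

    Returns an array of shape (H, W, n, n): n*n coefficients per pixel,
    i.e. an expansion factor of M = n*n (16 for n = 4).
    """
    D = dct_matrix(n)
    H, W = img.shape
    blocks = np.empty((H, W, n, n))
    for dy in range(n):
        for dx in range(n):
            # blocks[h, w, dy, dx] = img[h + dy, w + dx] with wrap-around
            blocks[:, :, dy, dx] = np.roll(img, (-dy, -dx), axis=(0, 1))
    # 2-D DCT of every block: D @ B @ D.T
    return np.einsum('ij,hwjk,lk->hwil', D, blocks, D)

def frame_synthesis(coeffs, n=4):
    """Invert the decomposition: inverse DCT per block, then average overlaps."""
    D = dct_matrix(n)
    blocks = np.einsum('ji,hwjk,kl->hwil', D, coeffs, D)  # D.T @ C @ D
    H, W = blocks.shape[:2]
    out = np.zeros((H, W))
    for dy in range(n):
        for dx in range(n):
            out += np.roll(blocks[:, :, dy, dx], (dy, dx), axis=(0, 1))
    return out / (n * n)  # each pixel is covered by n*n shifted blocks
```

Because the shifted block DCTs together form a tight frame, inversion is just an inverse DCT per block followed by averaging, which is one way to read the slide's remark that inverting the overcomplete decomposition is easy.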

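12 Views of a DCT(4x4) frame. Figure: frame coefficients for an image of the form 0.5(· + ·) (operand images not recoverable from the transcript). I rearranged the d to make nice pictures out of the coefficients.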

13 Automatic separation of relevant and irrelevant. (In the pictures, gray = 0.) The two components are conveniently separated in the frame/overcomplete domain, except for overlaps (usually there are few overlaps, and for fast processing in this version we will ignore them). One can predict s in other ways and improve its rejection.

14 Real Example. (Blue = r is significant, red = s is significant.) Overlaps are few. However, it is clear that the prediction must suppress/amplify the same frequencies in a spatially adaptive fashion. Approaches that use filter dictionaries (e.g., Wiener interpolation filters) require very big dictionaries.

15 Causal Prediction of Frame Coefficients. Figure labels: previously encoded blocks; block to be encoded; neighborhood; available coefficients; coefficients associated with the block to be coded. Fill all frame coefficients of the orange block and invert (encoder/decoder). Send/receive the residual for the red block, … (Prediction is less accurate at singularity overlaps.)
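One way to picture a causal, least-squares, per-coefficient estimate is the sketch below. It is an assumption about the form of the estimator, not the authors' exact formulation: for each coefficient index, a single scalar weight is fitted on already-decoded neighborhood positions and then applied to the reference coefficient of the block to be predicted. The names `ls_weight` and `predict_coeffs` are hypothetical.

```python
import numpy as np

def ls_weight(ref_nb, cur_nb, eps=1e-12):
    """Scalar w minimizing sum((cur_nb - w * ref_nb)**2) over the neighborhood."""
    return float(np.dot(ref_nb, cur_nb) / (np.dot(ref_nb, ref_nb) + eps))

def predict_coeffs(ref_nb, cur_nb, ref_target):
    """Per-coefficient prediction from causal data only.

    ref_nb, cur_nb: arrays of shape (K, P), K coefficient indices observed at
                    P causal (already decoded) neighborhood positions.
    ref_target:     shape (K,), reference coefficients at the block to predict.
    """
    weights = np.array([ls_weight(ref_nb[k], cur_nb[k])
                        for k in range(ref_nb.shape[0])])
    return weights * ref_target, weights

# Toy data (assumed): the reference is 0.5 * (r + s); coefficient 0 carries only
# the relevant scene r, coefficient 1 carries only the unwanted scene s.
rng = np.random.default_rng(0)
r = rng.normal(size=50)
s = rng.normal(size=50)
ref_nb = np.stack([0.5 * r, 0.5 * s])
cur_nb = np.stack([r, np.zeros(50)])  # current frame contains r only
prediction, weights = predict_coeffs(ref_nb, cur_nb, np.array([1.5, -1.0]))
```

In this toy case the fitted weight is about 2 where the relevant scene lives (amplify) and about 0 where only the unwanted scene lives (reject), and nothing about the block being predicted is used.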

16 Some Prediction Examples. Simulation results that show the efficacy of causal predictions (compression results come later). This is a showcase of the proposed work on standard test images, to give an idea of the temporal evolutions that it can deal with. Evolutions are frame-wide for ease of demonstration; otherwise, the proposed algorithm is local and can easily take advantage of localized evolutions in an adaptive fashion. All frames have additive Gaussian noise ( ) for added challenge and demonstration of noise robustness. (The algorithm exploits the underlying non-convexity of the set of natural images.)

17 Problem: noisy video. Frame shown: peppers + noise. Required processing for each predicted block (without looking at the predicted block!): denoise. Prediction accuracy (PSNR): [value missing] dB (BLS-GSM: 37.12 dB). Completely causal, no side information sent.
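Prediction accuracy on these slides is reported as PSNR. For reference, a minimal sketch of the usual PSNR computation follows; the peak value of 255 (8-bit images) is an assumption, since the slides do not state it.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit images."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))
```

Here `reference` would be the clean target frame and `estimate` the causal prediction; higher values mean a more accurate prediction.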

18 Problem: scene transition from a blend of two scenes. Past frame = (peppers + barbara)/2 + noise (SNR = 0 dB! Must catch the red fish). Current frame = peppers + noise. Required processing for each predicted block (without looking at the predicted block!): denoise, find peppers (!) out of the blend of peppers & barbara, amplify peppers. Prediction (de-Barbara-d) accuracy (PSNR): [value missing] dB. Completely causal, no side information sent.

19 Problem: scene transition from a blend of three scenes. Past frame = (peppers + barbara + boat)/3 + noise (must catch the red fish). Required processing for each predicted block (without looking at the predicted block!): denoise, find peppers out of the blend of peppers, barbara & boat, amplify peppers. Prediction accuracy (PSNR): [value missing] dB.

20 Problem: scene transition with a cross-fade (one scene fades out, the other fades in). Past frame = 0.3*peppers + 0.7*barbara + noise. Current frame = 0.7*peppers + 0.3*barbara + noise. Required processing for each predicted block (without looking at the predicted block!): denoise, find barbara, reduce barbara, find peppers, amplify peppers. Prediction accuracy (PSNR): [value missing] dB.

21 Problem: scene transition from a blend with a brightness change. Required processing for each predicted block (without looking at the predicted block!): denoise, find the lightmap, invert the lightmap, find peppers out of the blend of peppers & barbara, amplify peppers. Prediction accuracy (PSNR): [value missing] dB.

22 Q: Does it work in practice? A: Yes. Setup: JM 10.2, IPP…, (MB level switch, no other overhead); QCIF video; ¼ pixel motion; adaptive rounding on. [Rate-distortion plots, panels (a)-(d); annotated gains in rate: ~20%, ~10%, ~25%, ~10%.]

23 Movie trailer: ~18%. Our gains are reduced at lower bitrates because the compression process tends to remove the effect of some of the problems we can deal with.

24 Properties. Decoder complexity: translation-invariant decomposition; per pixel, 3*4*4 multiplies, 4*4 divides, 4*4*4 additions (to compute ). Complexity can be reduced by reducing the causal neighborhood, using less expansive decompositions, running only on high-error blocks, etc. Encoder complexity = decoder complexity + motion search (fast search, run only on high-error blocks, etc.). Other work. Brightness compensation methods: work only for brightness changes. Wiener-filter-based sub-pixel interpolation: filters have low-pass characteristics only; many filters are needed in the dictionary (too much overhead). Weighted prediction: scene-wide; works on blends only if the blending frames are in the reference frame buffer. Our work is more of an “all purpose cleaner” compared to earlier work.
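To put the per-pixel operation counts in perspective, the short calculation below scales them to one QCIF frame. It assumes luma only (176 x 144 pixels) and that every pixel is processed; both are assumptions for illustration, and `ops_per_frame` is a hypothetical helper.

```python
# Per-pixel counts as stated on the slide.
OPS_PER_PIXEL = {
    "multiplies": 3 * 4 * 4,
    "divides": 4 * 4,
    "additions": 4 * 4 * 4,
}

QCIF_WIDTH, QCIF_HEIGHT = 176, 144  # luma resolution of QCIF (assumed luma only)

def ops_per_frame(width, height, ops_per_pixel=OPS_PER_PIXEL):
    """Total operations if every pixel of a width x height frame is processed."""
    pixels = width * height
    return {name: count * pixels for name, count in ops_per_pixel.items()}

qcif_ops = ops_per_frame(QCIF_WIDTH, QCIF_HEIGHT)
```

Under these assumptions a QCIF frame costs about 1.2 million multiplies, 0.4 million divides, and 1.6 million additions, which is why the slide suggests running only on high-error blocks or using less expansive decompositions.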

25 Conclusion. Images depicted in video are sparse, and this can be exploited to generate very interesting prediction results. The proposed work goes beyond earlier prediction solutions and adds new capabilities to the prediction. Many types of temporal evolutions in video can be easily managed: denoising accomplished, lightning removed, complicated fades handled, focus changes deblurred, … It is a showcase of the power of sparse decompositions and of how the underlying non-convexity can be utilized. Future work: manage overhead better, improve performance, reduce complexity.