Multi-Frame Motion Estimation and Mode Decision in H.264 Codec
Shauli Rozen, Amit Yedidia
Supervised by Dr. Shlomo Greenberg
Communication Systems Engineering Department, Ben-Gurion University of the Negev, Beer-Sheva, Israel
Presentation Content
- H.264 brief overview.
- Introduction to Multi-Frame Reference Motion Estimation & Mode Decision.
- Complexity Analysis.
- The proposed Algorithm.
- Status & Current results.
- Future work.
Video Standard
[Figure: video coding standards. ITU-T standards: H.261, H.263, H.263+. Joint ITU-T & ISO/MPEG standards: H.262 / MPEG2, H.264 / MPEG4 AVC. ISO/MPEG standards: MPEG1, MPEG4.]
New Features in H.264
- Multi-frame reference motion estimation.
- 7 partitioning modes in Inter frames.
- Multi-mode intra-prediction.
- Motion vectors can point outside the image border.
- 1/4-, 1/8-pixel motion vector precision.
- B-frame prediction weighting.
- 4x4 integer transform.
- UVLC (Universal Variable Length Coding).
- NAL (Network Abstraction Layer).
- SP-slices.
H.264 Encoder
[Figure: encoder block diagram. The input video signal is split into macroblocks of 16x16 pixels. Blocks: Coder Control, Transform / Scaling / Quantization, Scaling & Inverse Transform, De-blocking Filter, Intra-frame Prediction, Motion Compensation, Motion Estimation, Entropy Coding, with the Decoder embedded in the loop. Signals: Control Data, Quantized Transform Coefficients, Motion Data, Intra/Inter switch, Output Video Signal.]
Motion Estimation in H.264
- Various block sizes and shapes.
[Figure: MB types 16x16, 16x8, 8x16, 8x8; 8x8 sub-block types 8x8, 8x4, 4x8, 4x4.]
- Multiple reference frames for motion compensation.
[Figure: reference frames [t-1], [t-2], [t-3], [t-4], [t-5].]
- Each 16x16 MB can be partitioned in 259 different modes.
- Each block can be searched within the 16 preceding frames.
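The figure of 259 partitionings can be checked with a short count. The sketch below assumes the usual reading of the block-size tree: the macroblock is either kept whole or split in two (16x16, 16x8, 8x16), or split into four 8x8 sub-blocks, each of which independently picks one of four sub-block types. The function name is illustrative only.

```python
# Count the ways one 16x16 macroblock can be partitioned.
MB_MODES = ["16x16", "16x8", "8x16", "8x8"]
SUB_MODES = ["8x8", "8x4", "4x8", "4x4"]  # choices for each 8x8 sub-block


def count_partition_options():
    # 16x16, 16x8 and 8x16 each give exactly one partitioning.
    unsplit = len(MB_MODES) - 1
    # In 8x8 mode, each of the four sub-blocks chooses its type independently.
    split = len(SUB_MODES) ** 4
    return unsplit + split


print(count_partition_options())  # 259
```

The count is 3 + 4^4 = 259, which matches the number quoted on the slide.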
Why Multi-Frame Reference Motion Estimation?
- Hidden objects: after an object is revealed again, the best match might be found in a frame from before it was hidden.
- Periodic movements: when an object moves but returns to its original position every few frames, the best match might be found in the last frame in which the object was in the same position.
Why Variable Block Sizes? Increased spatial & temporal correlation.
Complexity Analysis
- Each 16x16 MB can be coded as Inter, Intra or Skip.
- Intra: 2 modes, 4x4 or 16x16.
  - Intra4x4: 9 prediction modes (16x9 calculations of 16 pixels).
  - Intra16x16: 4 prediction modes (4 calculations of 256 pixels).
- Inter: 7 modes, 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4 (259 partition options, 41 searches, 5 frames).
- A search of a [W x L] mode in a window of size (2W+1)(2L+1) requires (W+1)(L+1) calculations of W x L pixels.
  - With a fixed window size of 33x33: 5,190,400 MACs.
  - With a relative window size of (2W+1)(2L+1): 1,012,480 MACs.
- The motivation for complexity reduction is obvious!
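The inter-mode totals above can be reproduced with a few lines of arithmetic. This is a sketch, not the authors' derivation: the relative-window count uses the slide's own (W+1)(L+1) positions per block, while the fixed-window count assumes (33-W)(33-L) positions per block, an inferred formula chosen because it reproduces the quoted 5,190,400. All names are illustrative.

```python
# Inter partition modes as (width, height, blocks per macroblock).
MODES = [(16, 16, 1), (16, 8, 2), (8, 16, 2), (8, 8, 4),
         (8, 4, 8), (4, 8, 8), (4, 4, 16)]
REF_FRAMES = 5


def searches_per_frame():
    """Number of block searches needed to cover all seven modes once."""
    return sum(n for _, _, n in MODES)


def total_macs(positions):
    """MACs per macroblock; positions(w, h) gives candidate positions per block."""
    total = 0
    for w, h, n in MODES:
        # Every mode covers the whole macroblock: w * h * n == 256 pixels.
        total += positions(w, h) * w * h * n
    return total * REF_FRAMES


# Relative window: (W+1)(L+1) positions, as stated on the slide.
relative = total_macs(lambda w, h: (w + 1) * (h + 1))
# Fixed 33x33 window: (33-W)(33-L) positions (assumption, see the text above).
fixed = total_macs(lambda w, h: (33 - w) * (33 - h))

print(searches_per_frame())  # 41
print(relative)              # 1012480
print(fixed)                 # 5190400
```

Under these assumptions the relative window needs roughly a fifth of the operations of the fixed window, and both remain far above the single-mode cost of one 16x16 search.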
Rate Distortion
- A new decision criterion introduced by the standard.
- Rate distortion takes into account the prediction error (Diff) and the length of the needed bit-stream ( ).
- It is now hard to tell in advance which mode or reference frame will be selected.
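The exact cost expression is not given here, so the following is only a sketch that assumes the commonly used Lagrangian form J = D + lambda * R, where D is the prediction error and R the number of bits. The function names, the candidate list and the lambda values are hypothetical, and the numbers are toy values rather than measurements.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian cost J = D + lambda * R (assumed form)."""
    return distortion + lam * rate_bits


def choose_mode(candidates, lam):
    """candidates: list of (label, distortion, rate_bits); returns the best label."""
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))[0]


# Toy candidates: (mode and reference frame, prediction error, bits needed).
candidates = [
    ("16x16, ref t-1", 1200.0, 20),
    ("8x8, ref t-1", 900.0, 70),
    ("16x16, ref t-3", 1000.0, 28),
]

print(choose_mode(candidates, lam=0.0))   # 8x8, ref t-1
print(choose_mode(candidates, lam=10.0))  # 16x16, ref t-3
```

With lambda = 0 the lowest prediction error wins; once the bit cost is weighted in, a different mode and reference frame are selected, which is why the winner is hard to predict in advance.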
Rate Distortion in H.264
Multi-Frame Reference Full Search
- Brute force.
- MB mode: 16x16.
- Search window size: (2*16+1)(2*16+1) = 33x33.
- Number of reference frames: 5.
- Error criterion: MSE.
- Complexity (for one MB): (16+1)(16+1)*16*16*5 = 369,920 MACs.
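A brute-force multi-frame search along these lines can be sketched in plain Python. This is a minimal illustration under stated assumptions: frames are lists of rows of luma values, candidates that would leave the picture are skipped (H.264 itself allows vectors across the border), and the function names, the small 4x4 block and the synthetic test frames are hypothetical.

```python
def block_mse(cur, ref, bx, by, dx, dy, size):
    """MSE between the current block and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            d = cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
            total += d * d
    return total / (size * size)


def full_search(cur, refs, bx, by, size, rng):
    """Try every displacement in [-rng, rng]^2 in every reference frame."""
    h, w = len(cur), len(cur[0])
    best = None
    for r, ref in enumerate(refs):
        for dy in range(-rng, rng + 1):
            for dx in range(-rng, rng + 1):
                inside = (0 <= bx + dx and bx + dx + size <= w and
                          0 <= by + dy and by + dy + size <= h)
                if not inside:
                    continue  # simplification: stay inside the picture
                err = block_mse(cur, ref, bx, by, dx, dy, size)
                if best is None or err < best[0]:
                    best = (err, (dx, dy), r)
    return best  # (lowest MSE, motion vector, reference frame index)


def pattern(x, y):
    return x + 16 * y


# Synthetic frames: the current frame equals reference 1 shifted by (2, 1).
cur = [[pattern(x + 2, y + 1) for x in range(12)] for y in range(12)]
refs = [[[0] * 12 for _ in range(12)],
        [[pattern(x, y) for x in range(12)] for y in range(12)]]

result = full_search(cur, refs, 4, 4, 4, 2)
print(result)  # (0.0, (2, 1), 1)

# Cost quoted on the slide for a 16x16 block and 5 reference frames.
print((16 + 1) * (16 + 1) * 16 * 16 * 5)  # 369920
```

The three nested loops over reference frames and displacements are what make the cost scale linearly with the number of reference frames.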
Multi-Frame Reference Usage
PSNR Gain of MFS
[Table: columns Gain, FS, MFS; rows Bus, Akiyo, Coastguard, Foreman, Mother, News, Average.]
Note: this gain is achieved by using the multi-frame reference feature only (without the different partition modes).
Why Adaptive?
In non-adaptive Block Matching Algorithms:
- The search area is constant in place and size.
- The number of searches made for each macroblock is constant.
- If there is fast motion in the scene and the search area is too small, the object moves out of the search window and the match is poor.
Why Adaptive? (cont'd)
In adaptive Block Matching Algorithms:
- The search area is not constant in size.
- The search method and the location of the search window can both be changed.
- This leads to fewer searches and better PSNR results, even in scenes with fast motion.
Motivation
[Figure: Frame [t-1] and Frame [t].]
- Obvious temporal and spatial correlation of MVs and partition modes.
Adaptive Multi-Frame Block Matching Algorithm
- Step 1 - Predictors selection.
- Step 2 - Thresholds setting.
- Step 3 - Applying predictors.
- Step 4 - Decision & refinements.
Step 1 – Predictors selection
[Figure: temporal neighbor predictors in the reference frames [t-1] to [t-5]; spatial neighbor predictors in the current frame [t].]
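Which neighbors supply predictors is not spelled out, so the sketch below is only one plausible reading. It assumes motion fields stored as dictionaries from macroblock coordinates to (dx, dy, ref) triples, takes the left, top and top-right macroblocks of the current frame as spatial neighbors, takes the co-located macroblock of each earlier frame as temporal neighbor, and adds a zero-motion candidate. All of these choices and names are hypothetical.

```python
def select_predictors(cur_field, prev_fields, mbx, mby):
    """Collect candidate (dx, dy, ref) predictors for macroblock (mbx, mby)."""
    candidates = [(0, 0, 0)]  # zero motion in the nearest reference (assumption)

    # Spatial neighbors in the current frame: left, top, top-right (assumption).
    for nx, ny in [(mbx - 1, mby), (mbx, mby - 1), (mbx + 1, mby - 1)]:
        if (nx, ny) in cur_field:
            candidates.append(cur_field[(nx, ny)])

    # Temporal neighbors: co-located macroblock in each earlier motion field.
    for field in prev_fields:
        if (mbx, mby) in field:
            candidates.append(field[(mbx, mby)])

    # Drop duplicates while keeping the original order.
    seen, unique = set(), []
    for cand in candidates:
        if cand not in seen:
            seen.add(cand)
            unique.append(cand)
    return unique


cur_field = {(0, 1): (1, 0, 0), (1, 0): (1, 0, 0)}
prev_fields = [{(1, 1): (2, -1, 2)}]
predictors = select_predictors(cur_field, prev_fields, 1, 1)
print(predictors)  # [(0, 0, 0), (1, 0, 0), (2, -1, 2)]
```

Removing duplicates matters because highly correlated neighbors often carry the same vector, and each distinct predictor costs one block comparison in Step 3.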
Step 2 – Threshold setting
Step 3 – Applying Predictors
[Figure: predictors applied in the reference frames [t-1] to [t-5] for the current frame [t]; MVo marks the match with the minimum MSE.]
- Calculate the MSE of every predictor and set MVo to the MV that achieves the lowest prediction error (MSEmin).
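Applying the predictors amounts to one block comparison per candidate followed by a minimum. The sketch below assumes predictors are (dx, dy, ref) triples that stay inside the picture, and uses small synthetic frames; the function names and data are hypothetical.

```python
def block_mse(cur, ref, bx, by, dx, dy, size):
    """MSE between the current block and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            d = cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
            total += d * d
    return total / (size * size)


def apply_predictors(cur, refs, bx, by, size, predictors):
    """Score every predictor; the first entry of the result is (MSEmin, MVo)."""
    scored = []
    for dx, dy, r in predictors:
        err = block_mse(cur, refs[r], bx, by, dx, dy, size)
        scored.append((err, (dx, dy, r)))
    scored.sort(key=lambda s: s[0])  # lowest prediction error first
    return scored


def pattern(x, y):
    return x + 16 * y


# Synthetic frames: the current frame equals reference 1 shifted by (2, 1).
cur = [[pattern(x + 2, y + 1) for x in range(12)] for y in range(12)]
refs = [[[0] * 12 for _ in range(12)],
        [[pattern(x, y) for x in range(12)] for y in range(12)]]

scored = apply_predictors(cur, refs, 4, 4, 4, [(0, 0, 0), (2, 1, 1), (1, 1, 1)])
mse_min, mv_o = scored[0]
print(mse_min, mv_o)  # 0.0 (2, 1, 1)
```

Keeping the full sorted list, rather than only the winner, is a design choice made here so that the runners-up remain available for the refinement step.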
Step 4 - Decision & Refinement
[Figure: MVo pointing from the current frame [t] into reference frame [t-3].]
- MSEmin < MSEpmin: the search is stopped (early termination).
Step 4 (cont'd)
[Figure: MVo pointing from the current frame [t] into reference frame [t-3].]
- MSEpmin < MSEmin < MSEpavg: a refinement search is applied in a [3x3] search window around MVo.
Step 4 (cont'd)
[Figure: current frame [t] with MVo pointing into reference frame [t-3], and MV1 and MV2 pointing into reference frames [t-1] and [t-4].]
- MSEpavg < MSEmin: refinement with distinct window sizes (3x3, 4x4, 5x5) is done around the three predictors from the initial set that gave the lowest MSEs.
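The three cases of Step 4 can be sketched as a single decision function. Several details are assumptions: the thresholds MSEpmin and MSEpavg are passed in as given numbers, the smallest window is assigned to the best predictor, an even-sized window is laid out off-centre, and the cost function in the example is a synthetic stand-in for a block MSE. All names are hypothetical.

```python
def window_offsets(n):
    """Offsets of an n x n window (assumed layout; an even n is off-centre)."""
    lo = -(n // 2)
    return [(dx, dy) for dy in range(lo, lo + n) for dx in range(lo, lo + n)]


def refine(cost, center, n):
    """Best (cost, vector) in an n x n window around center, same reference frame."""
    dx0, dy0, r = center
    best = (cost(center), center)
    for dx, dy in window_offsets(n):
        cand = (dx0 + dx, dy0 + dy, r)
        c = cost(cand)
        if c < best[0]:
            best = (c, cand)
    return best


def decide_and_refine(scored, cost, mse_pmin, mse_pavg):
    """scored: (mse, (dx, dy, ref)) pairs sorted by mse, best first."""
    mse_min, mv_o = scored[0]
    if mse_min < mse_pmin:
        return (mse_min, mv_o)            # early termination
    if mse_min < mse_pavg:
        return refine(cost, mv_o, 3)      # 3x3 refinement around MVo
    # Otherwise refine around the three best predictors with growing windows.
    results = [refine(cost, mv, n) for (_, mv), n in zip(scored[:3], (3, 4, 5))]
    return min(results, key=lambda s: s[0])


def cost(mv):
    """Synthetic error surface with its minimum at (3, -1) in reference 1."""
    dx, dy, r = mv
    return (dx - 3) ** 2 + (dy + 1) ** 2 + 10 * abs(r - 1)


scored = sorted(((cost(p), p) for p in [(2, -1, 1), (0, 0, 0), (5, 2, 2)]),
                key=lambda s: s[0])

print(decide_and_refine(scored, cost, 2, 10))     # (1, (2, -1, 1))
print(decide_and_refine(scored, cost, 0.5, 10))   # (0, (3, -1, 1))
print(decide_and_refine(scored, cost, 0.5, 0.8))  # (0, (3, -1, 1))
```

The first call stops early and accepts the predictor as is; the other two spend extra comparisons and reach the true minimum of the toy surface.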
PSNR Gain
[Table: columns AMFBMA, FS, MFS; rows Bus, Akiyo, Coastguard, Foreman, Mother, News, Average.]
Computational Complexity Reduction (compared to MFS)
Sequence      Value
Bus           9.31%
Akiyo         5.42%
Coastguard    9.05%
Foreman       9.42%
Mother        11.13%
News          7.65%
Average       8.67%
Our Goal
- Integration into the open-source software (JM).
  - Matlab can't provide rate-distortion statistics.
- Improve the proposed methods.
  - Predictors early elimination.
  - Predictors priority mechanism.
  - Improved refinement search pattern: fewer searches in the refinement step.
Questions?