Low complexity H.264 Encoder using machine learning.


1 Low complexity H.264 Encoder using machine learning

2 H.264 encoder (block diagram). Blocks: Transform & Quantization, Motion Estimation, Motion Compensation, Picture Buffering, Entropy Coding, Intra Prediction, Intra/Inter Mode Decision, Inverse Quantization & Inverse Transform, Deblocking Filter. Input: video; output: bitstream.

3 Block diagram for H.264 decoder. Blocks: Entropy Decoding, Inverse Quantization & Inverse Transform, Intra Prediction, Motion Compensation, Intra/Inter Mode Selection, Deblocking Filter, Picture Buffering. Input: bitstream; output: video.

4 H.264 achieves considerably higher coding efficiency than earlier standards such as MPEG-2. This efficiency comes at the cost of considerably increased encoder complexity, mainly due to motion estimation and mode decision. The aim is to reduce the complexity of the H.264 encoder using machine learning techniques. The idea behind using machine learning is to exploit structural similarities in video.

5 In the H.264 standard, the MB mode decision in Inter frames is the most computationally expensive process. Variable block sizes, motion estimation, quarter-pixel motion compensation, and similar tools account for this complexity.

6 Inter-prediction modes in H.264

7 It is important to emphasize that the most computationally expensive process is ME. For example, assuming full search (FS) with M block types, N reference frames, and a search range of +/- W for each reference frame and block type, we need to examine N x M x (2W + 1)^2 positions, compared with only (2W + 1)^2 positions for a single reference frame and block type.
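
The position count above can be checked with a short calculation. This is a minimal sketch: the function name and the example values (N = 5 reference frames, M = 7 block types, W = 16) are illustrative assumptions, not figures from the slides.

```python
def full_search_positions(num_refs: int, num_block_types: int, search_range: int) -> int:
    """Number of candidate positions examined by full search.

    Implements N x M x (2W + 1)^2 from the slide, where N is the number of
    reference frames, M the number of block types, and W the search range.
    """
    window = (2 * search_range + 1) ** 2  # positions in one +/- W search window
    return num_refs * num_block_types * window


# Illustrative values only (assumed, not from the slides).
single = full_search_positions(num_refs=1, num_block_types=1, search_range=16)
multi = full_search_positions(num_refs=5, num_block_types=7, search_range=16)
print(single, multi, multi // single)  # 1089 38115 35
```

With these assumed values the full search examines 35 times as many positions as the single-reference, single-block-type case, which is simply the factor N x M.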

8 Machine learning. Machine learning is a subfield of artificial intelligence. The major focus of machine learning research is to extract information from data automatically, by computational and statistical methods. Beware of over-fitting: fitting the model to noise in the training data.

9 C4.5 classifier. C4.5 (known as J48 in Weka) is a system that constructs classifiers. Given training data, a classifier predicts the class to which a new case belongs. C4.5 first grows an initial tree using divide-and-conquer. Basic idea: grow a tree and reduce entropy in the subtrees.
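
The "reduce entropy in the subtrees" idea can be sketched as follows. C4.5 scores candidate splits by gain ratio (information gain divided by split information); the code below is a simplified illustration for a single numeric attribute and a binary threshold split. It omits pruning and missing-value handling, and the toy data and mode labels are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of splitting on `value <= threshold` versus `value > threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    total = len(labels)
    weights = [len(left) / total, len(right) / total]
    # Information gain: parent entropy minus weighted entropy of the subtrees.
    gain = entropy(labels) - (weights[0] * entropy(left) + weights[1] * entropy(right))
    split_info = -sum(w * math.log2(w) for w in weights)
    return gain / split_info


# Toy data (assumed): MB variance and the mode a reference encoder chose.
variances = [10, 20, 30, 200, 250, 300]
modes = ["SKIP", "SKIP", "SKIP", "INTER_8x8", "INTER_8x8", "INTER_8x8"]

best = max(variances[:-1], key=lambda t: gain_ratio(variances, modes, t))
print(best)  # 30: the threshold that separates the two classes cleanly
```

Growing a tree repeats this selection recursively on each subset until the subsets are pure or too small to split.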

10 Decisions are made on the basis of metrics. For each frame and each MB of pixels, the following metrics are calculated: MB mean, MB variance, and edge detection.
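
A minimal sketch of these per-macroblock metrics is shown below. The slides do not specify how edges are detected, so the edge measure here (sum of absolute differences between neighboring pixels) is an assumption chosen for simplicity; the function names are likewise hypothetical.

```python
def mb_mean(mb):
    """Mean luma value of a macroblock given as a list of pixel rows."""
    pixels = [p for row in mb for p in row]
    return sum(pixels) / len(pixels)


def mb_variance(mb):
    """Population variance of the macroblock pixels."""
    pixels = [p for row in mb for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def mb_edge_strength(mb):
    """Assumed edge measure: sum of absolute horizontal and vertical
    differences between neighboring pixels."""
    rows, cols = len(mb), len(mb[0])
    horiz = sum(abs(mb[y][x + 1] - mb[y][x]) for y in range(rows) for x in range(cols - 1))
    vert = sum(abs(mb[y + 1][x] - mb[y][x]) for y in range(rows - 1) for x in range(cols))
    return horiz + vert


# Two 16x16 test blocks: a flat one and one with a vertical edge in the middle.
flat_mb = [[128] * 16 for _ in range(16)]
edge_mb = [[0] * 8 + [255] * 8 for _ in range(16)]

print(mb_mean(flat_mb), mb_variance(flat_mb), mb_edge_strength(flat_mb))
print(mb_mean(edge_mb), mb_variance(edge_mb), mb_edge_strength(edge_mb))
```

The flat block yields zero variance and zero edge strength, while the block with an edge scores high on both, which is the kind of separation a classifier can exploit.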

11 Training methods. The training data is obtained offline. In this supervised learning approach, we used the data of the first four frames of the video. The residual and current MB metrics, together with the MB mode selected by the standard Intel® IPP H.264 encoder, are saved in a file. Trees are then built with the C4.5 (J48) classifier algorithm.
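
One way to save such training data is in ARFF, the text format read by Weka (whose J48 is a C4.5 implementation). The slides do not say which file format was used, so this is a sketch under that assumption; the attribute names, mode labels, and sample values are all hypothetical.

```python
def to_arff(samples, modes):
    """Serialize (residual_mean, residual_variance, mb_mean, mb_variance, mode)
    tuples as ARFF text. Attribute names are hypothetical."""
    lines = [
        "@relation mb_mode_decision",
        "@attribute residual_mean numeric",
        "@attribute residual_variance numeric",
        "@attribute mb_mean numeric",
        "@attribute mb_variance numeric",
        "@attribute mode {" + ",".join(modes) + "}",  # class label to predict
        "@data",
    ]
    for *metrics, mode in samples:
        if mode not in modes:
            raise ValueError(f"unknown mode: {mode}")
        lines.append(",".join(str(m) for m in metrics) + "," + mode)
    return "\n".join(lines) + "\n"


# Made-up samples standing in for MBs from the first four frames.
MODES = ["SKIP", "INTER_16x16", "INTER_8x8"]
samples = [
    (0.4, 3.1, 120.0, 15.2, "SKIP"),
    (5.8, 410.7, 98.5, 820.3, "INTER_8x8"),
]
arff_text = to_arff(samples, MODES)
print(arff_text)
```

In practice the samples would be logged from inside the reference encoder, one row per macroblock, with the mode it selected as the class label.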

12 These trees are then implemented as if-else statements in the Intel® H.264 encoder. The purpose of these trees is to replace the original, complex Inter mode decision.
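
A learned tree translated into if-else statements might look like the sketch below. The thresholds, the metric names, and the set of modes are invented for illustration and are not the trees obtained in this work; a real tree would come from the C4.5 training output and would be written in the encoder's own language.

```python
def inter_mode_decision(residual_mean: float, residual_variance: float) -> str:
    """Hypothetical decision tree for the Inter MB mode.

    Every threshold below is a placeholder, not a trained value.
    """
    if residual_variance <= 50.0:
        # Low-energy residual: large partitions are likely sufficient.
        if residual_mean <= 2.0:
            return "SKIP"
        return "INTER_16x16"
    if residual_variance <= 400.0:
        return "INTER_16x8"
    # High-energy residual: fall back to small partitions.
    return "INTER_8x8"


print(inter_mode_decision(1.0, 10.0))    # SKIP
print(inter_mode_decision(6.0, 900.0))   # INTER_8x8
```

Each call costs only a few comparisons, which is what makes the tree cheaper than evaluating every candidate mode.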


14 The C4.5 system consists of four principal programs: 1) decision tree generator; 2) production rule generator: forms production rules from the unpruned tree; 3) decision tree interpreter: classifies items using a decision tree; 4) production rule interpreter: classifies items using a rule set.

15 C4.5 algorithm demo: training data

16 Decision tree: can be implemented using if-else statements.

17 Next step: while evaluating the machine learning algorithm, also check results for the following schemes: 1. Intra directional mask approach. 2. Intra-only spatial-temporal prediction scheme. 3. Intra mode selection using edges. 4. Inter spatial-temporal prediction scheme.


