MPEG-2 to H.264/AVC Transcoding Techniques Jun Xin Xilient Inc. Cupertino, CA.


Digital Video Transcoding November 07

Digital Video Transcoder
[Diagram: coded digital video bit-stream "A" -> Transcoder -> coded digital video bit-stream "B"]
"A" and "B" may differ in many aspects:
- coding formats: e.g. MPEG-2 to H.264/AVC
- bit-rate, frame rate, resolution, ...
- features: error resilience features
- contents: e.g. logo insertion

Applications
Media Storage
- Transcode broadcast MPEG-2 video to H.264/AVC format to enable long-time recording
- Effective for multi-channel recording
Home Gateway
- Provide connection to IPTV set-top box
  - Box only supports H.264/AVC
  - Over wireless network with bandwidth limitation
Other potential uses:
- Export to mobile
- Internet streaming
- ...

Goals and Challenges
H.264/AVC: latest video compression standard
- Promises same quality as MPEG-2 at half the bit-rate
- Is being widely adopted
  - HD consumer storage, e.g., HD-DVD and Blu-ray
  - Mobile devices, e.g., Apple iPod, iPhone, Sony PSP
Convert MPEG-2 video to H.264/AVC format
- More efficient storage, export to mobile devices, etc.
Challenges
- Yield quality similar to full re-encoding, but at much lower cost
- Key to lower cost and high quality: intelligently reuse the information available in the incoming bitstream
- May be loosely considered a "two-pass coder"
  - Could achieve better quality than full re-encoding at the same complexity

Outline
Intra-only transcoding techniques
- Efficient compressed domain processing
Inter transcoding techniques
- Motion mapping / motion reuse

Intra Transcoding Techniques

Intra Transcoder – Pixel Domain
[Block diagram: Input MPEG-2 Bitstream, VLD/IQ, IDCT, Intra Prediction (Pixel-domain), Mode decision, HT, Q, H.264 Entropy Coding, Inverse Q, Inverse HT, Pixel Buffer]
VLD: variable length decoding
(I)Q: (inverse) quantization
IDCT: inverse discrete cosine transform
HT: H.264/AVC 4x4 transform

Compressed Domain Processing?
[Block diagram: Input MPEG-2 Bitstream, VLD/IQ, Intra Prediction (Comp-domain), Mode decision, Q, H.264 Entropy Coding, Inverse Q, Coeff Buffer]
VLD: variable length decoding
(I)Q: (inverse) quantization
IDCT: inverse discrete cosine transform
HT: H.264/AVC 4x4 transform

AVC 4x4 Transform
Motivation:
- DCT requires real-number operations, which may cause inaccuracies in inversion
- Better prediction means less spatial correlation, so there is no strong need for real-number operations
H.264 uses a simple integer 4x4 transform
- Approximation to 4x4 DCT
- Transform and inverse transform
  - Note: the ½ in the inverse transform represents a right shift, so it is non-linear
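The slide's transform equations did not survive in this transcript. As a minimal sketch, the forward 4x4 core transform of H.264/AVC can be written as Y = Cf X Cf^T with the integer matrix below. The helper names (`matmul`, `transpose`, `forward_ht`) are illustrative only, and the scaling that the standard folds into quantization is left out.

```python
# Forward 4x4 core transform matrix of H.264/AVC (integer approximation of the DCT).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def forward_ht(block):
    """Return CF * block * CF^T using integer arithmetic only."""
    return matmul(matmul(CF, block), transpose(CF))

# A flat block has energy only in the DC coefficient.
flat = [[10] * 4 for _ in range(4)]
coeffs = forward_ht(flat)
print(coeffs[0][0])  # 160

# The rows of CF are orthogonal but not unit length: CF * CF^T = diag(4, 10, 4, 10).
print(matmul(CF, transpose(CF)))
```

Because the rows have unequal norms, a codec has to compensate with per-coefficient scaling, which H.264/AVC merges into the quantizer.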

Intra Prediction in H.264/AVC
Motivation: intra-frames are natural images, so they exhibit strong spatial correlation
Pixels in intra-coded frames are predicted based on previously-coded ones
- Prediction can be based on 4x4 blocks or 16x16 macroblocks (or 8x8 blocks for high profile)
An encoded mode specifies which neighbor pixels should be used to predict, and how

4x4 Intra Prediction Example
[Figure: current block and its prediction blocks for the Vertical, Horizontal and Diagonal_Down_Right modes]
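To make the idea of a prediction mode concrete, here is a small sketch of three of the 4x4 luma intra modes (vertical, horizontal and DC). It assumes that all neighbouring pixels are available. The function names and sample values are illustrative, and Diagonal_Down_Right is omitted for brevity.

```python
def predict_vertical(top):
    """Every row repeats the four pixels directly above the block."""
    return [list(top) for _ in range(4)]

def predict_horizontal(left):
    """Every column repeats the four pixels directly left of the block."""
    return [[left[r]] * 4 for r in range(4)]

def predict_dc(top, left):
    """All 16 pixels take the rounded mean of the eight neighbours."""
    dc = (sum(top) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]

top = [10, 20, 30, 40]    # reconstructed pixels above the current block
left = [12, 12, 12, 12]   # reconstructed pixels left of the current block

print(predict_vertical(top)[3])      # [10, 20, 30, 40]
print(predict_horizontal(left)[0])   # [12, 12, 12, 12]
print(predict_dc(top, left)[0])      # [19, 19, 19, 19]
```

The encoder signals only the chosen mode and the residual, so the decoder can rebuild the same prediction from its own reconstructed neighbours.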

Compressed Domain Processing?
Challenges
- Different transforms
  - MPEG-2 uses DCT, floating point
  - H.264/AVC uses an integer transform
- New prediction modes in H.264/AVC
  - Can prediction be performed in compressed domain?
Goals
- Simpler computation and architecture


Intra Transcoder – Proposed
[Block diagram: Input MPEG-2 Bitstream, VLD/IQ, DCT-to-HT conversion (S-Transform), Intra Prediction (HT-domain), Mode decision (HT-domain), Q, Entropy Coding, Inverse Q, Inverse HT, Pixel Buffer]
VLD: variable length decoding
(I)Q: (inverse) quantization
IDCT: inverse discrete cosine transform
HT: H.264/AVC 4x4 transform

Techniques
- DCT-to-HT conversion
- Compressed (HT) domain prediction
  - Very simple for some prediction modes
- Compressed domain distortion calculation in mode decision
Advantages
- Lower computational complexity
- No quality loss

DCT-to-HT Conversion

DCT-to-HT Conversion: Transform Kernel Matrix
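The kernel matrix itself is not preserved in this transcript. One way to derive such a matrix, offered here as a sketch rather than as the slide's exact formulation, is to fold the 8-point inverse DCT and two 4-point HTs into a single 8x8 matrix S = diag(Cf, Cf) · T8^T. An 8x8 DCT block X then maps to four 4x4 HT blocks as Y = S X S^T. All function names below are illustrative.

```python
import math

CF = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct8_matrix():
    """Orthonormal 8-point DCT-II matrix T8 (forward DCT is T8 * x * T8^T)."""
    rows = []
    for k in range(8):
        scale = math.sqrt(1 / 8) if k == 0 else math.sqrt(2 / 8)
        rows.append([scale * math.cos((2 * n + 1) * k * math.pi / 16)
                     for n in range(8)])
    return rows

def block_diag_ht():
    """8x8 matrix that applies CF to the first and to the last four samples."""
    h = [[0] * 8 for _ in range(8)]
    for i in range(4):
        for j in range(4):
            h[i][j] = CF[i][j]
            h[i + 4][j + 4] = CF[i][j]
    return h

def kernel_matrix():
    """S = diag(CF, CF) * T8^T: inverse DCT and HT folded into one matrix."""
    return matmul(block_diag_ht(), transpose(dct8_matrix()))

def dct_to_ht(dct_block):
    """Map an 8x8 DCT block to four 4x4 HT blocks (one per 4x4 quadrant)."""
    s = kernel_matrix()
    return matmul(matmul(s, dct_block), transpose(s))
```

The result matches the pixel-domain path (inverse DCT followed by the HT of each 4x4 sub-block) up to floating-point error, without ever materialising the pixels.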

Fast Algorithm (1D)

Complexity Analysis
Transform-domain DCT-to-HT (S-Transform): 704 operations
- 352 multiplications
- 352 additions
Pixel-domain mapping (IDCT* followed by HT): 992 operations
- 256 multiplications
- 64 shifts
- 672 additions
Advantage
- 29% saving in total operations
- Two-stage vs. six-stage implementation
- Better performance: no intermediate rounding
* W.H. Chen, C.H. Smith, and S.C. Fralick, "A Fast Computational Algorithm for the Discrete Cosine Transform," IEEE Trans. on Communications, Vol. COM-25, pp. 1004-1009, 1977
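The quoted saving can be checked directly from the operation counts on the slide. The variable names are illustrative.

```python
# Operation counts as given on the slide.
s_transform_ops = 352 + 352          # multiplications + additions
pixel_domain_ops = 256 + 64 + 672    # multiplications + shifts + additions

saving = 1 - s_transform_ops / pixel_domain_ops
print(s_transform_ops, pixel_domain_ops)  # 704 992
print(f"{saving:.1%}")                    # 29.0%
```

This counts every operation equally. Multiplications are typically more expensive than shifts or additions, so the count is a rough guide rather than a run-time prediction.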


Conventional Mode Decisions
Given all possible prediction modes, the encoder needs to decide which one to use
- Low-complexity mode decision rule (RDO_Off): [cost equations, labeled "SATD Cost"; not preserved in the transcript]
- High-complexity mode decision rule with rate-distortion optimization (RDO_On): [cost equation, labeled "RD Cost"; not preserved in the transcript]
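The slide's equations are missing from this transcript. For orientation only, Lagrangian costs of this kind are commonly written in the generic form below. The symbols are assumptions, not the slide's notation: s is the original block, p_m the prediction for mode m, \hat{s}_m the reconstruction after coding with mode m, R the bit count, and \lambda a Lagrange multiplier.

```latex
% Low-complexity rule: residual measured after a Hadamard transform, plus mode bits
J_{\mathrm{SATD}}(m) = \mathrm{SATD}(s - p_m) + \lambda_{\mathrm{pred}} \, R_{\mathrm{mode}}(m)

% High-complexity rule: true reconstruction error plus total bits for the block
J_{\mathrm{RD}}(m) = \mathrm{SSD}(s, \hat{s}_m) + \lambda_{\mathrm{mode}} \, R(m)
```

The second cost is more accurate because it measures the actual distortion and rate, but it requires coding and reconstructing the block for each candidate mode.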

Conventional RD Cost Computation
The entire encoding/decoding process needs to be performed for every mode

Motivation & Previous Approaches
RD_Cost based mode decision gives the best performance, but is very expensive to compute
Previous efforts in fast intra mode decisions
- Directional field
- Edge histogram
- Other pixel-domain approaches
- They all lead to lower coding performance
Our approach is based on transform domain processing, with no loss in coding performance

Transform Domain RD Cost Computation
- No inverse transform
- Transformations of some prediction signals are easy to compute
- Distortion calculated in transform domain

HT of DC Prediction
- No HT needs to be performed
- P_dc has only one non-zero element

HT of Horizontal Prediction
- Only one 1-D HT is needed
- P_h has only four non-zero elements (the first column)

HT of Vertical Prediction
- Only one 1-D HT is needed
- P_v has only four non-zero elements (the first row)
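The sparsity claims for the DC, horizontal and vertical predictions can be verified with a short sketch. It assumes the standard forward core transform Y = Cf X Cf^T. The shortcut functions are illustrative names, not taken from the slides.

```python
CF = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def forward_ht(block):
    cf_t = [list(row) for row in zip(*CF)]
    return matmul(matmul(CF, block), cf_t)

def ht_1d(vec):
    """1-D HT of a length-4 vector."""
    return [sum(CF[i][k] * vec[k] for k in range(4)) for i in range(4)]

def ht_of_dc(dc):
    """No transform needed: only the DC coefficient is non-zero."""
    out = [[0] * 4 for _ in range(4)]
    out[0][0] = 16 * dc
    return out

def ht_of_horizontal(left):
    """One 1-D HT of the left neighbours fills the first column."""
    col = ht_1d(left)
    return [[4 * col[r] if c == 0 else 0 for c in range(4)] for r in range(4)]

def ht_of_vertical(top):
    """One 1-D HT of the top neighbours fills the first row."""
    row = ht_1d(top)
    return [[4 * row[c] if r == 0 else 0 for c in range(4)] for r in range(4)]

top, left, dc = [10, 20, 30, 40], [12, 14, 16, 18], 19
full_dc = forward_ht([[dc] * 4 for _ in range(4)])
full_h = forward_ht([[left[r]] * 4 for r in range(4)])
full_v = forward_ht([list(top) for _ in range(4)])
print(full_dc == ht_of_dc(dc), full_h == ht_of_horizontal(left),
      full_v == ht_of_vertical(top))  # True True True
```

The shortcuts work because a constant row or column transforms to a single non-zero entry, so the 2-D transform collapses to one 1-D transform times a factor of 4.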

Calculate Distortion in Transform Domain
- Distortion in pixel domain: [equation not preserved in the transcript]
- Distortion in transform domain: [equation not preserved in the transcript]
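Since the slide's equations are missing, the following is only a sketch of why a transform-domain distortion is possible at all. The rows of Cf are orthogonal with squared norms (4, 10, 4, 10), so the pixel-domain sum of squared differences equals a weighted sum of squared coefficient differences. This covers forward-transform coefficients only. The slide's actual formula may differ, for example by accounting for quantization scaling and inverse-transform rounding.

```python
from fractions import Fraction

CF = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]
ROW_ENERGY = [4, 10, 4, 10]  # squared norms of the rows of CF

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def forward_ht(block):
    cf_t = [list(row) for row in zip(*CF)]
    return matmul(matmul(CF, block), cf_t)

def ssd_pixel(a, b):
    """Sum of squared differences between two 4x4 pixel blocks."""
    return sum((a[r][c] - b[r][c]) ** 2 for r in range(4) for c in range(4))

def ssd_transform(ya, yb):
    """Same distortion computed from HT coefficients with per-position weights."""
    return sum(Fraction((ya[r][c] - yb[r][c]) ** 2, ROW_ENERGY[r] * ROW_ENERGY[c])
               for r in range(4) for c in range(4))

orig = [[(3 * r + 5 * c) % 11 for c in range(4)] for r in range(4)]
recon = [[(2 * r + 7 * c) % 13 for c in range(4)] for r in range(4)]
print(ssd_pixel(orig, recon) == ssd_transform(forward_ht(orig), forward_ht(recon)))  # True
```

Exact fractions are used so the two results can be compared without floating-point tolerance. A practical implementation would more likely use fixed-point weights.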

Ranking-based Fast Mode Decision
Two cost functions: SATD_Cost & RD_Cost
Observation: the best mode according to RD_Cost usually has a small SATD_Cost
Proposed algorithm (mode reduction): rank the modes by SATD_Cost, then calculate RD_Cost only for the top few modes
- Algorithm can be conducted in transform domain

Verification Experiment
Count the percentage of times the best mode according to RD_Cost is within the best k modes ranked by SATD_Cost
- k fixed as 3 in all simulations
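The ranking-based reduction can be sketched in a few lines. The two cost callables are placeholders for whatever SATD_Cost and RD_Cost implementations the transcoder uses, and the mode names and cost values are made up for illustration.

```python
def ranked_mode_decision(modes, satd_cost, rd_cost, k=3):
    """Rank modes by the cheap cost, then run the expensive cost on the top k only."""
    candidates = sorted(modes, key=satd_cost)[:k]
    return min(candidates, key=rd_cost)

# Hypothetical per-mode costs for one 4x4 block.
satd = {"V": 50, "H": 40, "DC": 45, "DDR": 80}
rd = {"V": 100, "H": 120, "DC": 110, "DDR": 90}

best = ranked_mode_decision(list(satd), satd.get, rd.get, k=3)
print(best)  # V
```

In this made-up example the globally best RD mode (DDR) is pruned by the SATD ranking. That is the case the verification experiment is designed to count.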

Simulation Conditions
Three transcoders
- PDT: reference pixel domain transcoder, with fast IDCT implemented
- TDT: transform domain transcoder
- TDT-R: transform domain transcoder with ranking-based mode decision
Test sequences
- 100 frames, CIF size, 30 fps
- Input: MPEG-2 all-I at 6 Mbps

Simulation – "Mobile"

Simulation – "Stefan"

Complexity: Run-time Results

Summary of Intra Transcoding
Efficient transcoder architecture
Efficient mode decision
- Transform domain distortion calculation
- Ranking-based mode decision
Achieved virtually the same quality as the reference transcoder with significantly lower complexity

Inter Transcoding Techniques

Transcoder Architecture
[Block diagram: MPEG-2 decoder, Motion/mode mapping, Prediction, HT/Q, entropy coding, Inverse Q/Inverse HT, Deblocking filter, Pixel buffers; signals: decoded picture and macroblock data, motion and modes]

Assumptions
Input
- MPEG-2 frame pictures
Output
- H.264/AVC baseline profile (no B slices) and main profile
- Frame pictures, MBAFF not considered
- Block partition sizes considered for motion compensation: 16x16, 16x8, 8x16 and 8x8

Motion Mapping: Problems
MPEG-2                            | H.264/AVC
Frame/field motion vector         | Frame motion vector
B, P pictures                     | Baseline profile has no B picture support
One motion vector per macroblock  | Motion vectors for different partition sizes: 16x16, 16x8, 8x16, 8x8

Motion Mapping Algorithm
1. Field-to-frame mapping: convert MPEG-2 field motion vectors (if any) to frame vectors
2. Reference picture mapping: for B to P frame type conversion
3. Block size mapping: map the MPEG-2 motion vectors to target H.264/AVC motion vectors of different block sizes
   - Algorithm: distance weighted average (DWA)
4. Motion refinement: (1+1/2+1/4) around estimated motion vectors for all block partitions
Note: for B slice output, the above mapping is performed for motion vectors of both directions

Field-to-frame Conversion

Reference Picture Mapping
[Figure: input IBBP pictures mapped to output IPPP pictures; labels: t_i = 3, t_o = 1, MV_i,forw, MV_i,back, MV_col, MV_o]
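The figure is not preserved, but its labels (t_i = 3, t_o = 1) suggest that an input vector spanning t_i picture intervals is mapped onto an output reference t_o intervals away. A common way to do this, assumed here rather than taken from the slide, is linear temporal scaling under a constant-motion assumption. The function name is illustrative.

```python
def scale_motion_vector(mv, t_in, t_out):
    """Scale (mvx, mvy) from a temporal distance of t_in pictures to t_out pictures.

    Assumes roughly constant motion between pictures; results are rounded to
    the nearest integer vector unit.
    """
    return tuple(round(component * t_out / t_in) for component in mv)

# An MPEG-2 vector that points three pictures back, re-targeted to the
# immediately preceding picture of an IPPP output structure.
print(scale_motion_vector((12, -6), t_in=3, t_out=1))  # (4, -2)
```

Scaled vectors are only estimates, which is one reason a refinement search around the mapped vector is useful.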

Block Size Mapping: 16x8, 8x16

Block Size Mapping: 8x8
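The block-size mapping figures are missing, so the exact weighting used by the distance weighted average (DWA) is unknown. The sketch below is one plausible reading, with everything in it an assumption: each candidate MPEG-2 macroblock vector is weighted by the inverse of the distance between that macroblock's centre and the centre of the target partition.

```python
import math

def distance_weighted_average(block_center, candidates):
    """Estimate a partition's motion vector from nearby macroblock vectors.

    candidates is a list of ((cx, cy), (mvx, mvy)) pairs: the centre of a
    source macroblock and its motion vector. Weights are 1 / distance
    (hypothetical choice; the slides do not specify the weighting).
    """
    weights = []
    for center, mv in candidates:
        d = math.dist(block_center, center)
        if d == 0:
            return mv  # the partition is centred on this macroblock
        weights.append(1.0 / d)
    total = sum(weights)
    mvx = sum(w * mv[0] for w, (_, mv) in zip(weights, candidates)) / total
    mvy = sum(w * mv[1] for w, (_, mv) in zip(weights, candidates)) / total
    return (mvx, mvy)

# Two equally distant macroblocks contribute equally.
estimate = distance_weighted_average(
    (16, 8), [((8, 8), (4, 0)), ((24, 8), (8, 2))])
print(estimate)  # (6.0, 1.0)
```

The estimate would still need rounding to the target vector precision before use as the starting point for refinement.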

Simulation Conditions
Test sequences:
- 1920x1080i, 30 fps, 450 frames
MPEG-2 input:
- 30 Mbps, (30,3)
H.264/AVC output:
- UVLC, output bit-rate of interest ~10 Mbps
- Baseline profile (needs to convert B pictures to P slices) & Main profile
Comparison points
- Mapping algorithm
- B slices
- RD optimization

Baseline output: no B slices

Main Output: with B slices

Complexity: Run-time Results

Conclusions
- Efficient motion mapping schemes that directly map MPEG-2 motion vectors to H.264/AVC motion vectors
- Evaluated the complexity-performance tradeoff of B-slices and RD optimization
- Achieved good rate-distortion performance with low complexity

Thank you
