Published by Anaya Bennitt. Modified over 10 years ago.
1
MPEG-2 to H.264/AVC Transcoding Techniques
Jun Xin
Xilient Inc., Cupertino, CA
2
Digital Video Transcoding November 07
Digital Video Transcoder
Coded digital video bit-stream “A” -> Transcoder -> Coded digital video bit-stream “B”
“A” and “B” may differ in many aspects:
- coding formats: e.g., MPEG-2 to H.264/AVC
- bit-rate, frame rate, resolution, …
- features: e.g., error resilience features
- contents: e.g., logo insertion
3
Applications
- Media storage
  - Transcode broadcast MPEG-2 video to H.264/AVC format to enable long-time recording
  - Effective for multi-channel recording
- Home gateway
  - Provides the connection to an IPTV set-top box
  - The box only supports H.264/AVC
  - Runs over a wireless network with limited bandwidth
- Other potential uses: export to mobile devices, Internet streaming, …
4
Goals and Challenges
- H.264/AVC: latest video compression standard
  - Promises the same quality as MPEG-2 at half the bit-rate
  - Is being widely adopted
    - HD consumer storage, e.g., HD DVD and Blu-ray
    - Mobile devices, e.g., Apple iPod, iPhone, Sony PSP
- Goal: convert MPEG-2 video to H.264/AVC format
  - More efficient storage, export to mobile devices, etc.
- Challenges
  - Yield quality similar to full re-encoding, but at much lower cost
  - Key to low cost and high quality: intelligently reuse the information available in the incoming bitstream
  - The transcoder may be loosely considered a “two-pass coder”
  - It could achieve better quality than full re-encoding at the same complexity
5
Outline
- Intra-only transcoding techniques
  - Efficient compressed domain processing
- Inter transcoding techniques
  - Motion mapping / motion reuse
6
Intra Transcoding Techniques
7
Intra Transcoder – Pixel Domain
[Block diagram] Blocks: Input MPEG-2 Bitstream, VLD/IQ, IDCT, Intra Prediction (Pixel-domain), Mode decision, HT, Q, Inverse Q, Inverse HT, Pixel Buffer, H.264 Entropy Coding
Legend:
- VLD: variable length decoding
- (I)Q: (inverse) quantization
- IDCT: inverse discrete cosine transform
- HT: H.264/AVC 4x4 transform
8
Compressed Domain Processing?
[Block diagram] Blocks: Input MPEG-2 Bitstream, VLD/IQ, Intra Prediction (Comp-domain), Mode decision, Q, Inverse Q, Coeff Buffer, H.264 Entropy Coding
Legend:
- VLD: variable length decoding
- (I)Q: (inverse) quantization
- IDCT: inverse discrete cosine transform
- HT: H.264/AVC 4x4 transform
9
AVC 4x4 Transform
- Motivation:
  - The DCT requires real-number operations, which may cause inaccuracies when the transform is inverted
  - Better prediction leaves less spatial correlation, so there is no strong need for real-number operations
- H.264 uses a simple integer 4x4 transform
  - An approximation to the 4x4 DCT
- Transform and inverse transform (matrices shown on the slide)
- Note: the ½ in the inverse transform represents a right shift, so the inverse transform is non-linear
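The transform matrices did not survive in this transcript. As a minimal sketch, the widely documented H.264/AVC 4x4 forward core transform can be written with integer arithmetic only, as below. The names `CF`, `matmul`, `transpose` and `forward_ht` are illustrative, and the scaling that H.264 folds into quantization is left out.

```python
# Forward 4x4 core transform of H.264/AVC (scaling is folded into quantization
# in a real codec and is omitted here).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_ht(block):
    """Return CF * block * CF^T using integer operations only."""
    return matmul(matmul(CF, block), transpose(CF))


flat = [[3] * 4 for _ in range(4)]
coeffs = forward_ht(flat)  # a flat block has energy only in the DC coefficient
```

Because every entry of `CF` is a small integer, the forward transform needs only additions and shifts in practice, which is the property the slide contrasts with the real-valued DCT.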
10
Intra Prediction in H.264/AVC
- Motivation: intra-frames are natural images, so they exhibit strong spatial correlation
- Pixels in intra-coded frames are predicted from previously coded ones
- Prediction can be based on 4x4 blocks or 16x16 macroblocks (or 8x8 blocks for High profile)
- An encoded mode specifies which neighboring pixels are used for prediction, and how
11
4x4 Intra Prediction Example
[Figure: the current block and its prediction blocks for the Vertical, Horizontal, and Diagonal_Down_Right modes]
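A short sketch of how three of the 4x4 prediction blocks are formed from neighboring pixels may help here. It covers Vertical, Horizontal and DC only (the Diagonal_Down_Right mode from the figure is omitted), assumes both the top and left neighbors are available, and uses illustrative function names.

```python
def predict_vertical(top):
    """Each column repeats the pixel directly above the block."""
    return [list(top) for _ in range(4)]


def predict_horizontal(left):
    """Each row repeats the pixel directly to the left of the block."""
    return [[left[i]] * 4 for i in range(4)]


def predict_dc(top, left):
    """All 16 pixels take the rounded mean of the 8 neighboring pixels."""
    dc = (sum(top) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]


top = [10, 20, 30, 40]   # reconstructed pixels above the current block
left = [1, 2, 3, 4]      # reconstructed pixels to the left of the current block
vertical = predict_vertical(top)
horizontal = predict_horizontal(left)
dc_block = predict_dc(top, left)
```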
12
Compressed Domain Processing?
- Challenges
  - Different transforms
    - MPEG-2 uses the DCT (floating point)
    - H.264/AVC uses an integer transform
  - New prediction modes in H.264/AVC
    - Can prediction be performed in the compressed domain?
- Goals
  - Simpler computation and architecture
13
Compressed Domain Processing?
[Block diagram repeated from slide 8]
14
Intra Transcoder – Proposed
[Block diagram] Blocks: Input MPEG-2 Bitstream, VLD/IQ, DCT-to-HT conversion (S-Transform), Intra Prediction (HT-domain), Mode decision (HT-domain), Q, Inverse Q, Inverse HT, Pixel Buffer, Entropy Coding
Legend:
- VLD: variable length decoding
- (I)Q: (inverse) quantization
- IDCT: inverse discrete cosine transform
- HT: H.264/AVC 4x4 transform
15
Techniques
- DCT-to-HT conversion
- Compressed (HT) domain prediction
  - Very simple for some prediction modes
- Compressed domain distortion calculation in mode decision
Advantages
- Lower computational complexity
- No quality loss
16
DCT-to-HT Conversion
17
DCT-to-HT Conversion: Transform Kernel Matrix
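The kernel matrix itself is not preserved in this transcript. The sketch below builds one numerically, on the assumption that the conversion kernel is the product of a block-diagonal matrix holding two 4x4 HT matrices and the inverse of the orthonormal 8x8 DCT matrix. Scaling and rounding are ignored, and all names (`T8`, `H8`, `S`, `dct_to_ht`, `idct_then_ht`) are illustrative. It then checks that applying the kernel directly to an 8x8 block of DCT coefficients gives the same four 4x4 HT blocks as IDCT followed by HT.

```python
import math

CF = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]


def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct8_matrix():
    """Orthonormal 8x8 DCT-II matrix T, so that X = T x T^T and x = T^T X T."""
    rows = []
    for k in range(8):
        c = math.sqrt(1 / 8) if k == 0 else math.sqrt(2 / 8)
        rows.append([c * math.cos((2 * n + 1) * k * math.pi / 16) for n in range(8)])
    return rows


def block_diag_ht():
    """8x8 matrix with the 4x4 HT matrix repeated on the diagonal."""
    h = [[0] * 8 for _ in range(8)]
    for i in range(4):
        for j in range(4):
            h[i][j] = CF[i][j]
            h[i + 4][j + 4] = CF[i][j]
    return h


T8 = dct8_matrix()
H8 = block_diag_ht()
S = matmul(H8, transpose(T8))  # assumed form of the conversion kernel


def dct_to_ht(dct_block):
    """Transform-domain path: Y = S X S^T, with no pixel-domain intermediate."""
    return matmul(matmul(S, dct_block), transpose(S))


def idct_then_ht(dct_block):
    """Pixel-domain reference path: inverse DCT, then HT of each 4x4 sub-block."""
    pixels = matmul(matmul(transpose(T8), dct_block), T8)
    return matmul(matmul(H8, pixels), transpose(H8))


X = [[(3 * i + 5 * j) % 11 - 5 for j in range(8)] for i in range(8)]
direct = dct_to_ht(X)
reference = idct_then_ht(X)
```

In the result, each 4x4 quadrant of the 8x8 output is the HT of the corresponding 4x4 pixel quadrant.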
18
Fast Algorithm (1D)
19
Complexity Analysis
- Transform-domain DCT-to-HT (S-Transform): 704 operations
  - 352 multiplications
  - 352 additions
- Pixel-domain mapping (IDCT* followed by HT): 992 operations
  - 256 multiplications
  - 64 shifts
  - 672 additions
- Advantages
  - 29% saving in total operations
  - Two-stage vs. six-stage implementation
  - Better performance: no intermediate rounding
* W.H. Chen, C.H. Smith, and S.C. Fralick, "A Fast Computational Algorithm for the Discrete Cosine Transform," IEEE Trans. on Communications, Vol. COM-25, pp. 1004-1009, 1977
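The 29% figure follows directly from the operation counts on this slide. The arithmetic can be reproduced as below; the variable names are illustrative.

```python
# Operation counts taken from the slide.
transform_domain_ops = 352 + 352      # multiplications + additions
pixel_domain_ops = 256 + 64 + 672     # multiplications + shifts + additions

saving = (pixel_domain_ops - transform_domain_ops) / pixel_domain_ops
saving_percent = round(saving * 100)
```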
20
Intra Transcoder – Proposed
[Block diagram repeated from slide 14]
21
Conventional Mode Decisions
- Given all possible prediction modes, the encoder needs to decide which one to use
- Low-complexity mode decision rule (RDO_Off): choose the mode with the lowest SATD cost
- High-complexity mode decision rule with rate-distortion optimization (RDO_On): choose the mode with the lowest RD cost
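The two cost formulas are not preserved in this transcript. As a hedged illustration only, the sketch below uses common textbook forms: SATD as the sum of absolute values of the Hadamard-transformed prediction residual, and a Lagrangian RD cost J = D + λ·R. The normalization and the rate term used by the author may differ, and the names `satd_4x4` and `rd_cost` are illustrative.

```python
HADAMARD = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def satd_4x4(original, prediction):
    """Sum of absolute transformed differences (no normalization applied)."""
    residual = [
        [original[i][j] - prediction[i][j] for j in range(4)] for i in range(4)
    ]
    transformed = matmul(matmul(HADAMARD, residual), transpose(HADAMARD))
    return sum(abs(v) for row in transformed for v in row)


def rd_cost(distortion, rate_bits, lagrange_multiplier):
    """Generic Lagrangian cost J = D + lambda * R."""
    return distortion + lagrange_multiplier * rate_bits


orig_block = [[5] * 4 for _ in range(4)]
pred_block = [[4] * 4 for _ in range(4)]
cost_low_complexity = satd_4x4(orig_block, pred_block)
cost_high_complexity = rd_cost(100, 10, 2.0)
```

The SATD cost needs only the prediction, whereas the distortion and rate fed into `rd_cost` require the block to be transformed, quantized, entropy coded and reconstructed for each mode, which is what makes RDO_On expensive.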
22
Conventional RD Cost Computation
- The entire encoding and decoding process needs to be performed for every mode
23
Motivation & Previous Approaches
- RD_Cost-based mode decision gives the best performance, but is very expensive to compute
- Previous efforts in fast intra mode decision
  - Directional field
  - Edge histogram
  - Other pixel-domain approaches
  - They all lead to lower coding performance
- Our approach is based on transform domain processing, with no loss in coding performance
24
Transform Domain RD Cost Computation
- No inverse transform
- Transforms of some prediction signals are easy to compute
- Distortion is calculated in the transform domain
25
HT of DC Prediction
- No HT needs to be performed
- The HT of P_dc has only one non-zero element
26
HT of Horizontal Prediction
- Only one 1-D HT is needed
- The HT of P_h has only four non-zero elements (the first column)
27
HT of Vertical Prediction
- Only one 1-D HT is needed
- The HT of P_v has only four non-zero elements (the first row)
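The sparsity claims for the DC, horizontal and vertical predictions can be checked numerically. The sketch below assumes the HT is the 4x4 core transform matrix `CF` without scaling, and compares shortcut versions against the full 2-D transform. All function names are illustrative.

```python
CF = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]


def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_ht(block):
    """Full 2-D transform, used here only as the reference."""
    return matmul(matmul(CF, block), transpose(CF))


def ht_1d(vector):
    return [sum(CF[i][k] * vector[k] for k in range(4)) for i in range(4)]


def ht_of_dc_prediction(dc):
    """No transform needed: only the DC coefficient is non-zero."""
    out = [[0] * 4 for _ in range(4)]
    out[0][0] = 16 * dc
    return out


def ht_of_horizontal_prediction(left):
    """One 1-D HT of the left neighbors fills the first column."""
    column = ht_1d(left)
    out = [[0] * 4 for _ in range(4)]
    for i in range(4):
        out[i][0] = 4 * column[i]
    return out


def ht_of_vertical_prediction(top):
    """One 1-D HT of the top neighbors fills the first row."""
    out = [[0] * 4 for _ in range(4)]
    out[0] = [4 * v for v in ht_1d(top)]
    return out


left = [1, 2, 3, 4]
top = [10, 20, 30, 40]
p_h = [[left[i]] * 4 for i in range(4)]   # horizontal prediction block
p_v = [list(top) for _ in range(4)]       # vertical prediction block
p_dc = [[7] * 4 for _ in range(4)]        # DC prediction block
```

The factor 4 in the horizontal and vertical cases comes from the all-ones first row of `CF`, which sums the four identical samples along the constant direction.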
28
Calculate Distortion in Transform Domain
- Distortion in pixel domain: (equation shown on the slide)
- Distortion in transform domain: (equation shown on the slide)
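The two distortion equations are missing from this transcript. One way to see why a transform-domain distortion can match the pixel-domain one is that the rows of the 4x4 HT matrix are orthogonal with squared norms 4, 10, 4, 10. The sum of squared differences can therefore be computed from HT coefficients by weighting coefficient (i, j) with 1/(n_i·n_j). The sketch below is derived from that property and is not necessarily the slide's formula. It assumes an unscaled, unrounded transform, and the names are illustrative.

```python
CF = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]
ROW_NORMS_SQ = [4, 10, 4, 10]  # squared norms of the rows of CF


def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_ht(block):
    return matmul(matmul(CF, block), transpose(CF))


def ssd_pixel(a, b):
    """Sum of squared differences between two 4x4 pixel blocks."""
    return sum((a[i][j] - b[i][j]) ** 2 for i in range(4) for j in range(4))


def ssd_transform(ya, yb):
    """Same distortion computed from HT coefficients, with per-coefficient weights."""
    return sum(
        (ya[i][j] - yb[i][j]) ** 2 / (ROW_NORMS_SQ[i] * ROW_NORMS_SQ[j])
        for i in range(4)
        for j in range(4)
    )


block_a = [[52, 55, 61, 66], [70, 61, 64, 73], [63, 59, 55, 90], [67, 61, 68, 104]]
block_b = [[50, 56, 60, 70], [68, 60, 66, 70], [65, 58, 50, 88], [66, 64, 70, 100]]
d_pixel = ssd_pixel(block_a, block_b)
d_transform = ssd_transform(forward_ht(block_a), forward_ht(block_b))
```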
29
Ranking-based Fast Mode Decision
- Two cost functions: SATD_Cost and RD_Cost
- Observation: the best mode according to RD_Cost usually also has a small SATD_Cost
- Proposed algorithm (mode reduction): rank the modes by SATD_Cost, then calculate RD_Cost only for the top few modes
- The algorithm can be carried out in the transform domain
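The mode-reduction idea can be sketched in a few lines. The cost functions are passed in as callables, and the mode names and cost values below are made up purely for illustration. The default `k=3` matches the value reported in the verification experiment.

```python
def select_mode(modes, satd_cost, rd_cost, k=3):
    """Rank modes by the cheap SATD cost, then run RD cost on the best k only."""
    ranked = sorted(modes, key=satd_cost)
    candidates = ranked[:k]
    return min(candidates, key=rd_cost)


# Hypothetical costs for four 4x4 intra modes of one block.
satd_table = {"Vertical": 10, "Horizontal": 12, "DC": 11, "Diagonal_Down_Right": 30}
rd_table = {"Vertical": 50, "Horizontal": 40, "DC": 45, "Diagonal_Down_Right": 60}

rd_calls = []


def counted_rd_cost(mode):
    rd_calls.append(mode)  # track how many expensive evaluations were needed
    return rd_table[mode]


best = select_mode(list(satd_table), satd_table.get, counted_rd_cost, k=3)
```

With `k` smaller than the number of modes, the expensive RD cost is evaluated for only `k` candidates. The result matches full RDO whenever the RD-optimal mode is among the top `k` by SATD.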
30
Verification Experiment
- Count the percentage of times the best mode according to RD_Cost is within the best k modes ranked by SATD_Cost
- k is fixed at 3 in all simulations
31
Simulation Conditions
- Three transcoders
  - PDT: reference pixel domain transcoder, with fast IDCT implemented
  - TDT: transform domain transcoder
  - TDT-R: transform domain transcoder with ranking-based mode decision
- Test sequences
  - 100 frames, CIF size, 30 fps
  - Input: MPEG-2 all-I at 6 Mbps
32
Simulation – “Mobile”
33
Simulation – “Stefan”
34
Complexity: Run-time Results
35
Summary of Intra Transcoding
- Efficient transcoder architecture
- Efficient mode decision
  - Transform domain distortion calculation
  - Ranking-based mode decision
- Achieved virtually the same quality as the reference transcoder with significantly lower complexity
36
Inter Transcoding Techniques
37
Transcoder Architecture
[Block diagram] Blocks: MPEG-2 decoder, Motion/mode mapping, Prediction, HT/Q, Inverse Q/Inverse HT, Deblocking filter, Pixel buffers, Entropy coding
Signals: decoded picture and macroblock data; motion and modes
38
Assumptions
- Input: MPEG-2 frame pictures
- Output: H.264/AVC Baseline profile (no B slices) and Main profile
  - Frame pictures; MBAFF not considered
  - Block partition sizes considered for motion compensation: 16x16, 16x8, 8x16 and 8x8
39
Motion Mapping: Problems
MPEG-2                           | H.264/AVC
Frame/field motion vector        | Frame motion vector
B, P pictures                    | Baseline profile has no B picture support
One motion vector per macroblock | Motion vectors for different partition sizes: 16x16, 16x8, 8x16, 8x8
40
Motion Mapping Algorithm
1. Field-to-frame mapping: convert MPEG-2 field motion vectors (if any) to frame vectors
2. Reference picture mapping: for B-to-P frame type conversion
3. Block size mapping: map the MPEG-2 motion vectors to target H.264/AVC motion vectors of different block sizes
   Algorithm: distance weighted average (DWA)
4. Motion refinement: (1 + 1/2 + 1/4) around the estimated motion vectors for all block partitions
Note: for B slice output, the above mapping is performed for the motion vectors of both directions
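The details of the distance weighted average in step 3 are on slides whose figures are not preserved here. The following is therefore only a guess at its general shape: the motion vector of a target partition is estimated as an average of candidate MPEG-2 macroblock motion vectors, each weighted by the inverse of the distance between the partition center and the macroblock center. The choice of candidates, the weighting function and the name `dwa_motion_vector` are all assumptions.

```python
import math


def dwa_motion_vector(target_center, candidates):
    """Inverse-distance weighted average of candidate motion vectors.

    candidates: list of ((center_x, center_y), (mv_x, mv_y)) pairs, one per
    MPEG-2 macroblock considered for this partition (assumed input layout).
    """
    sum_x = sum_y = weight_total = 0.0
    for (cx, cy), (mv_x, mv_y) in candidates:
        distance = math.hypot(cx - target_center[0], cy - target_center[1])
        if distance == 0:
            # Partition center coincides with a macroblock center: reuse its vector.
            return (float(mv_x), float(mv_y))
        weight = 1.0 / distance
        sum_x += weight * mv_x
        sum_y += weight * mv_y
        weight_total += weight
    return (sum_x / weight_total, sum_y / weight_total)


# Upper 16x8 partition of a macroblock whose top-left corner is at (0, 0):
# its center is (8, 4); candidates are the macroblock itself and the one above.
mv_top_partition = dwa_motion_vector(
    (8, 4),
    [((8, 8), (4, 0)), ((8, -8), (8, 0))],
)
```

In this example the co-located macroblock is three times closer than the one above it, so its vector gets three times the weight. The estimate would then serve as the starting point for the refinement in step 4.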
41
Field-to-frame Conversion
42
Reference Picture Mapping
[Figure: two examples mapping an input IBBP picture structure to an output IPPP structure, with temporal distances t_i = 3 and t_o = 1 and motion vectors MV_i,forw, MV_i,back, MV_col and MV_o]
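The figure for this slide is lost, but the labels t_i = 3 and t_o = 1 suggest scaling an input motion vector by the ratio of temporal distances when the reference picture changes. The sketch below shows that scaling under an assumption of constant linear motion. It is an interpretation, not the author's confirmed rule; it does not cover the backward or co-located cases, and the name `scale_motion_vector` is illustrative.

```python
def scale_motion_vector(mv_in, t_in, t_out):
    """Scale a motion vector from temporal distance t_in to t_out (linear motion assumed)."""
    factor = t_out / t_in
    return (mv_in[0] * factor, mv_in[1] * factor)


# Input vector spans 3 picture intervals; the output P picture references
# the immediately preceding picture (1 interval).
mv_out = scale_motion_vector((12, -6), t_in=3, t_out=1)
```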
43
Block Size Mapping: 16x8, 8x16
44
Block Size Mapping: 8x8
45
Simulation Conditions
- Test sequences: 1920x1080i, 30 fps, 450 frames
- MPEG-2 input: 30 Mbps, (30,3)
- H.264/AVC output: UVLC, output bit-rate of interest ~10 Mbps
  - Baseline profile (needs to convert B pictures to P slices) and Main profile
- Comparison points
  - Mapping algorithm
  - B slices
  - RD optimization
46
Baseline Output: no B slices
47
Baseline Output: no B slices
48
Main Output: with B slices
49
Main Output: with B slices
50
Complexity: Run-time Results
51
Conclusions
- Efficient motion mapping schemes that directly map MPEG-2 motion vectors to H.264/AVC motion vectors
- Evaluated the complexity-performance tradeoff of B slices and RD optimization
- Achieved good rate-distortion performance with low complexity
52
Thank you