A Brief Introduction to Image/Video Coding Standards and FGS for MPEG-4 卓傳育
Video Compression Standards. ITU-T: International Telecommunication Union, Telecommunication Standardization Sector. MPEG: Moving Picture Experts Group
ITU-T (International Telecommunication Union, Telecommunication Standardization Sector). CCITT H.261: ITU-T Study Group 15; videophone and video conferencing; p x 64 kbps (p = 1 … 30). ITU-T H.263: PSTN and mobile networks, 10 to 24 kbps; 1994: H.263, H.263+ … ITU-T H.26L: merged into the JVT effort (MPEG-4 Part 10)
MPEG (Moving Picture Experts Group): Coding of Moving Video and Audio. MPEG-1: CD-I, for digital storage, … MPEG-2: … + TV, HDTV, for broadcast (1994). MPEG-3: HDTV, merged into MPEG-2. MPEG-4: Coding of Audiovisual Objects (V.1: 1998; V.2: 1999; extensions ongoing). MPEG-7: Multimedia Description Interface (Fall 2001), 'describing' audiovisual material. MPEG-21: Digital Multimedia Framework (first parts early 2002), 'the big picture and the glue'
Block-Based Coding: Why divide into blocks? Image -> Blocks
H.261 Video Formats. [Table: video format (CIF, QCIF) against luminance (Y) and chrominance (Cb, Cr) resolution in pixels/line and lines/frame.] [Figure: Y pixel and Cb, Cr pixel positions relative to the block boundary.]
Arrangement of H.261 pictures (figure): QCIF, CIF
Arrangement of the data structure in H.261: QCIF picture -> GOB (Group of Blocks) -> MB (Macroblock), which holds the blocks Y1, Y2, Y3, Y4, U, V
Transform coding. Encoder: image block -> T (transform) -> transform coefficients -> Q (quantization) -> zigzag scan (2D->1D) -> entropy coding -> bitstream. Decoder: bitstream -> entropy decoding -> inverse zigzag scan (1D->2D) -> Q^-1 -> reconstructed transform coefficients -> T^-1 -> reconstructed image block
Transform: the same vector expressed in two bases, {(1,0),(0,1)} and {(1,1),(-1,1)}: (0.2,1.8) = 0.2(1,0)+1.8(0,1) = 1(1,1)+0.8(-1,1)
Basis of Transform. Basis vectors {v1, v2, …, vn}. Orthogonal: (vi) · (vj) = 0 if i != j. Normalized: (vi) · (vi) = 1. Orthonormal: orthogonal and normalized. E.g. orthonormal: {(0,1),(1,0)}; orthogonal (but not normalized): {(1,1),(-1,1)}
Why the DCT is used for image compression. KLT (Karhunen-Loeve transform): statistically optimal transform (minimal MSE for any specific bandwidth reduction); depends on the signal statistics; no fast algorithm. The DCT approaches the KLT for highly correlated signals: sample values typically vary slowly from point to point across an image => highly correlated signals; it has a fast algorithm (but is not optimal)
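The change of basis on the Transform slide can be checked numerically. The following is a minimal sketch, assuming an orthogonal basis so that each coefficient is a simple projection; the helper names `dot`, `is_orthogonal` and `coefficients` are illustrative and do not come from the slides.

```python
def dot(u, v):
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def is_orthogonal(basis, tol=1e-12):
    """True if every pair of distinct basis vectors has a zero inner product."""
    return all(abs(dot(basis[i], basis[j])) <= tol
               for i in range(len(basis))
               for j in range(i + 1, len(basis)))

def coefficients(x, basis):
    """Coefficients of x in an orthogonal basis: c_i = (x . v_i) / (v_i . v_i)."""
    return [dot(x, v) / dot(v, v) for v in basis]

x = (0.2, 1.8)
standard = [(1, 0), (0, 1)]   # orthonormal
rotated = [(1, 1), (-1, 1)]   # orthogonal, not normalized

c_standard = coefficients(x, standard)
c_rotated = coefficients(x, rotated)
```

With the rotated basis the coefficients come out as approximately 1 and 0.8, matching the decomposition on the slide.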
DCT-basis
DCT: Discrete Cosine Transform. Spatial domain -> DCT -> frequency domain: [8,8,8,8,8,8,8,8] -> [44,0,0,0,0,0,0,0]; [8,8,8,8,8,8,8,9] -> [44,-2,0,-2,0,-2,0,-2]; [8,8,10,9,7,8,8,9] -> [46,-2,-2,-4,-2,2,0,-2]; [8,90,-100,3,4,-10,2,80] -> [48,-56,146,6,74,-148,-158,-136]
Example of DCT
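To make the spatial-to-frequency mapping concrete, here is a minimal sketch of a 1-D DCT (type II) with orthonormal scaling. The scaling is an assumption: the slide does not state which normalization produced its numbers, so the values computed here need not match them. The qualitative behaviour is the same, though: a constant input yields only a DC term.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a sequence x (straightforward O(n^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        # Scale factors that make the transform orthonormal.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        out.append(scale * s)
    return out

flat = dct_1d([8] * 8)                           # constant block
busy = dct_1d([8, 90, -100, 3, 4, -10, 2, 80])   # rapidly varying block
```

For the constant block all AC coefficients vanish, which is why smooth image regions compress well after the transform.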
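Quantization. Purpose: increase the compression ratio. Drawback: the reconstructed values contain errors. Principle: the reconstructed values should differ from the originals as little as possible. [Table rows: original value; after Q (integer division by 3); after multiplying directly by 3 (ordinary IQ); after a better IQ]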
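The quantization table on this slide can be sketched as follows. The quantizer is integer division by 3, as on the slide. The "better IQ" is assumed here to reconstruct at the midpoint of each quantization interval; that choice is an illustration, since the slide does not say which improved inverse quantizer it used.

```python
STEP = 3  # quantizer step size from the slide

def quantize(v):
    """Q: integer (floor) division by the step size."""
    return v // STEP

def dequantize_plain(q):
    """Ordinary IQ: multiply the index back by the step size."""
    return q * STEP

def dequantize_midpoint(q):
    """Assumed 'better' IQ: reconstruct at the middle of the interval."""
    return q * STEP + STEP // 2

values = list(range(9))  # non-negative sample values 0..8
plain_err = [abs(v - dequantize_plain(quantize(v))) for v in values]
mid_err = [abs(v - dequantize_midpoint(quantize(v))) for v in values]
```

Over these samples the plain reconstruction is off by up to 2, while the midpoint reconstruction is off by at most 1.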
Example of JPEG Coding (Encoder): quantization, … / 16 = -26
Example of JPEG Coding(Encoder)
Zigzag Scan 2D->1D: DC term first, then the AC terms
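The zigzag scan can be sketched by sorting block positions along anti-diagonals and alternating the direction on each one. This is a minimal sketch for square blocks; the function names are illustrative.

```python
def zigzag_order(n):
    """(row, col) positions of an n x n block in zigzag order, DC term first."""
    def key(rc):
        r, c = rc
        d = r + c  # index of the anti-diagonal
        # Odd diagonals run top-right to bottom-left, even ones the other way.
        return (d, r if d % 2 else c)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def zigzag_scan(block):
    """Flatten a square 2-D block into a 1-D list in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

# A 4 x 4 block whose entries are their own raster index, so the
# scanned output shows the visiting order directly.
block = [[r * 4 + c for c in range(4)] for r in range(4)]
scanned = zigzag_scan(block)
```

The low-frequency coefficients near the top-left corner come out first, so the trailing part of the 1-D sequence tends to be zeros after quantization.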
Example of JPEG Coding (Encoder): Transform coding (DCT) -> Quantization -> Zigzag Scan (2D->1D) -> Entropy Coding (bit stream). [Figure: the zigzag-scanned coefficient sequence, terminated by EOB]
Entropy Coding (Variable-Length Coding) Huffman coding Run-length coding Arithmetic coding
Huffman Coding: give the most frequently occurring word the shortest code. Example: words 'A', 'B', 'C', 'D' with probabilities 3/4, 1/6, 1/24, 1/24. Average length with the variable-length code: 1*(3/4)+2*(1/6)+3*(1/24)+3*(1/24) = 4/3; with the fixed-length code: 2*(3/4)+2*(1/6)+2*(1/24)+2*(1/24) = 2
DPCM: Differential PCM. If repeated or similar words are very likely to occur consecutively, coding the differences works better than coding each word individually. For example 'AAFFFFFCCC': PCM => '65,65,70,70,70,70,70,67,67,67' or '0,0,5,5,5,5,5,2,2,2'; DPCM => '0,0,5,0,0,0,0,-3,0,0'
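The average code length in this example can be reproduced with a small Huffman construction. This is a minimal sketch that computes only the code lengths, using exact fractions; the symbol probabilities are the ones from the slide, and the function name is illustrative.

```python
import heapq
from fractions import Fraction
from itertools import count

def huffman_lengths(probs):
    """Return {symbol: code length} for a Huffman code over probs."""
    tie = count()  # tie-breaker so the heap never compares symbol lists
    heap = [(p, next(tie), [sym]) for sym, p in probs.items()]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in probs}
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for sym in s1 + s2:
            lengths[sym] += 1
        heapq.heappush(heap, (p1 + p2, next(tie), s1 + s2))
    return lengths

probs = {"A": Fraction(3, 4), "B": Fraction(1, 6),
         "C": Fraction(1, 24), "D": Fraction(1, 24)}
lengths = huffman_lengths(probs)
average = sum(probs[s] * lengths[s] for s in probs)
```

The resulting lengths are 1, 2, 3 and 3 bits, for an average of 4/3 bits per word against 2 bits for the fixed-length code.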
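The DPCM example can be sketched in a few lines. The predictor is assumed to be simply the previous sample, starting from 0; the function names are illustrative.

```python
def dpcm_encode(samples, start=0):
    """Replace each sample by its difference from the previous one."""
    diffs = []
    prev = start
    for s in samples:
        diffs.append(s - prev)
        prev = s
    return diffs

def dpcm_decode(diffs, start=0):
    """Rebuild the samples by accumulating the differences."""
    samples = []
    prev = start
    for d in diffs:
        prev += d
        samples.append(prev)
    return samples

# 'AAFFFFFCCC' expressed as offsets from 'A'
pcm = [ord(ch) - ord("A") for ch in "AAFFFFFCCC"]
dpcm = dpcm_encode(pcm)
```

Most of the differences are zero, which is what makes the following entropy coding stage effective.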
Run-Length Coding: the scanned sequence is coded as pairs, e.g. (3,1) (0,6) (1,3), and terminated by EOB. [Figure: zero runs marked under the coefficient sequence]
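A minimal sketch of run-length coding follows. It assumes each pair means (number of zeros before the coefficient, coefficient value), which is one common convention. The input sequence is made up so that it produces the pairs shown on the slide; it is not taken from the slide's figure.

```python
def run_length_encode(coeffs):
    """Code a 1-D coefficient list as (run, level) pairs followed by 'EOB'.

    run   = number of zeros preceding a nonzero coefficient
    level = the nonzero coefficient itself
    Trailing zeros are dropped and represented by the EOB marker.
    """
    pairs = []
    run = 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    pairs.append("EOB")
    return pairs

# Illustrative sequence (an assumption, chosen to match the slide's pairs)
encoded = run_length_encode([0, 0, 0, 1, 6, 0, 3, 0, 0])
```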
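Video Compression. Encoder for still image: image block -> T -> transform coefficients -> Q -> zigzag scan (2D->1D) -> entropy coding -> bitstream. Encoder for video sequence: the same path plus a reconstruction loop (Q^-1 -> reconstructed transform coefficients -> T^-1 -> reconstructed image block -> MC), whose motion-compensated prediction is subtracted (-) from the input
H.261. Intra frame: transmits the information of the whole frame. Inter frame: references the previous frame; transmits the motion vectors and the difference values
H.261 Coder (block diagram): Video in, DCT, Q, Inverse Q, Inverse DCT, Motion Compensation, Loop Filter
Motion Estimation: a 16*16 macroblock at (32,16) in the current frame is matched, with motion vector (-10,4), to the block at (22,20) in the referenced frame; the search covers 31*31 candidate positions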
Full-search algorithm Current original frame Current referenced frame Maximum check : 31*31=961
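Full search tests every candidate displacement in the window and keeps the one with the smallest matching error. Below is a minimal sketch using the sum of absolute differences (SAD) as the error measure. SAD, the tiny synthetic frames, and the small block and window sizes in the demo are assumptions for illustration; with a 16*16 block and a range of +/-15 the same loop visits 31*31 = 961 positions.

```python
import random

def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between an n x n block of cur and of ref."""
    return sum(abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))

def full_search(cur, ref, bx, by, n=16, r=15):
    """Exhaustive block matching; returns ((dx, dy), best SAD, positions checked)."""
    h, w = len(ref), len(ref[0])
    best = None
    checks = 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            rx, ry = bx + dx, by + dy
            if rx < 0 or ry < 0 or rx + n > w or ry + n > h:
                continue  # candidate block falls outside the reference frame
            checks += 1
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return (best[1], best[2]), best[0], checks

# Synthetic demo: the current frame is the reference shifted by (-2, 1).
random.seed(0)
H = W = 20
ref = [[random.randrange(256) for _ in range(W)] for _ in range(H)]
true_dx, true_dy = -2, 1
cur = [[ref[(y + true_dy) % H][(x + true_dx) % W] for x in range(W)]
       for y in range(H)]
mv, cost, checks = full_search(cur, ref, bx=8, by=8, n=4, r=3)
```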
3-step search algorithm: Current original frame, Current referenced frame; step size 8->4->2->1; maximum checks: 33
NTSS (new 3-step search) algorithm
FSS (4-step search) algorithm
BBGDS (block-based gradient descent search) algorithm
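The coarse-to-fine search with the step sizes listed on this slide can be sketched as follows. The centre is evaluated once and each step size then tests the 8 neighbours around the current best position, giving 1 + 4*8 = 33 checks. The cost function here is a made-up smooth error surface with its minimum at (-10, 4); a real coder would evaluate a block-matching error instead.

```python
def step_search(cost, steps=(8, 4, 2, 1)):
    """Coarse-to-fine motion search; returns (best (dx, dy), positions checked)."""
    best = (0, 0)
    best_cost = cost(0, 0)
    checks = 1
    for s in steps:
        cx, cy = best  # centre for this step size
        for dy in (-s, 0, s):
            for dx in (-s, 0, s):
                if dx == 0 and dy == 0:
                    continue  # centre was already evaluated
                c = cost(cx + dx, cy + dy)
                checks += 1
                if c < best_cost:
                    best, best_cost = (cx + dx, cy + dy), c
    return best, checks

# Hypothetical error surface with a single minimum at (-10, 4)
def toy_cost(dx, dy):
    return abs(dx + 10) + abs(dy - 4)

mv, checks = step_search(toy_cost)
```

Unlike full search, this method can settle on a local minimum when the error surface is not smooth, which is the trade-off for checking 33 positions instead of 961.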
Overview of Fine Granularity Scalability in MPEG-4 Video Standard Weiping Li, Fellow, IEEE
Illustration of video coding performance
Multi-layer Coding
SNR scalability decoder defined in MPEG-2
Layered scalable coding techniques: temporal scalability
Layered scalable coding techniques: spatial scalability
BIT-PLANE CODING OF THE DCT COEFFICIENTS
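As a rough illustration of bit-plane coding, the sketch below splits the absolute values of a block's coefficients into bit-planes, most significant first, and turns each plane into (RUN, EOP) symbols: RUN counts the zeros before a 1, and EOP (end of plane) is 1 for the last 1 in the plane. The coefficient values and function names are illustrative assumptions. Sign bits and the handling of all-zero planes are left out.

```python
def bit_planes(values):
    """Bit-planes of non-negative integers, most significant plane first."""
    n_planes = max(values).bit_length()
    return [[(v >> b) & 1 for v in values]
            for b in range(n_planes - 1, -1, -1)]

def run_eop_symbols(plane):
    """(RUN, EOP) symbols for one bit-plane."""
    ones = [i for i, bit in enumerate(plane) if bit]
    symbols = []
    prev = -1
    for k, idx in enumerate(ones):
        run = idx - prev - 1                  # zeros since the previous 1
        eop = 1 if k == len(ones) - 1 else 0  # last 1 of this plane?
        symbols.append((run, eop))
        prev = idx
    return symbols

# Illustrative absolute coefficient values in zigzag order
abs_coeffs = [10, 0, 6, 0, 0, 3, 0, 2, 2, 0, 0, 2, 0, 0, 1]
planes = bit_planes(abs_coeffs)
symbols = [run_eop_symbols(p) for p in planes]
```

Because the planes are sent most significant first, cutting the stream after any plane still leaves a coarser but usable approximation of every coefficient.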
FGS USING BIT-PLANE CODING OF DCT COEFFICIENTS: Overall Coding Structure of FGS; Some Details of FGS Coding; Profile Definitions in the Amendment of MPEG-4
Overall Coding Structure of FGS FGS encoder structure
Overall Coding Structure of FGS FGS decoder structure
Some Details of FGS Coding: 1) Different Numbers of Bit-Planes for Individual Color Components; 2) Variable-Length Codes; 3) Decoding Truncated Bitstreams
Different Numbers of Bit-Planes for Individual Color Components
Variable-Length Codes Statistics of the (RUN, EOP) symbols in the four VLC tables
Coding patterns for syntax element fgs_cbp
Decoding Truncated Bitstreams. Decoding of a truncated bitstream is not standardized in MPEG-4. One possible method: look ahead 32 bits at every byte-aligned position in the bitstream. If the 32 bits are not the fgs vop start code, the first 8 of those bits are information bits of the FGS frame being decoded; the decoder then advances the bitstream pointer by one byte and looks ahead another 32 bits to check for the fgs vop start code. When the start code is found, the current FGS frame ends and the next one begins.
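The look-ahead method can be sketched as a byte-wise scan. This is only an illustration of the idea: the 4-byte `PLACEHOLDER_START_CODE` below is a made-up value, not the real fgs vop start code, and the function returns the raw payload bytes of each frame rather than decoding them.

```python
def split_fgs_frames(stream, start_code):
    """Split a byte stream into frame payloads at each 32-bit start code.

    At every byte-aligned position the next 4 bytes are compared with
    start_code. On a match a new frame begins; otherwise the first of
    those bytes belongs to the current frame and the pointer moves on
    by one byte.
    """
    frames = []
    current = None  # payload of the frame being collected
    pos = 0
    while pos < len(stream):
        if stream[pos:pos + 4] == start_code:
            if current is not None:
                frames.append(bytes(current))
            current = bytearray()
            pos += 4
        else:
            if current is not None:
                current.append(stream[pos])
            pos += 1
    if current is not None:
        frames.append(bytes(current))
    return frames

# Hypothetical start code, for illustration only
PLACEHOLDER_START_CODE = b"\x00\x00\x01\xff"
stream = (PLACEHOLDER_START_CODE + b"\x11\x22" +
          PLACEHOLDER_START_CODE + b"\x33")
frames = split_fgs_frames(stream, PLACEHOLDER_START_CODE)
```

Each payload may be shorter than the encoder originally produced; the decoder simply uses however many bytes arrived before the next start code.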