Case Study: ARM Platform-based JPEG Codec HW/SW Co-design
Teaching Assistant: Yu-Ju Cho
Advisor: Prof. An-Yeu Wu
Outline
- Introduction to JPEG Codec
- Lab ─ Case study
- Reference
ISO/IEC 10918-1 JPEG
- JPEG: Joint Photographic Experts Group
- JPEG was voted an international standard in 1994
- The JPEG standard has four compression methods:
  - Baseline sequential DCT-based coding
  - Progressive DCT-based coding
  - Lossless coding (sampling and quantization are not used in the lossless coding scheme)
  - Hierarchical coding
Baseline Sequential vs. Progressive DCT-based Coding
[Figure: compression methods, baseline sequential versus progressive DCT-based coding]
Block Diagram of JPEG Encoder
[Figure: encoder block diagram; R, G, B input is converted to Y, Cb, Cr and encoded into a bitstream (01001011101…)]
- DPCM: Differential Pulse Code Modulation
- RLC: Run-Length Coding
Color Model in Video ─ YCbCr
- Y: luminance
- Cb, Cr: chrominance
- The YCbCr color model is used in JPEG and MPEG
Color Model in Video ─ YCbCr
- CCIR-601 transform formula
- The color space transform is lossless, apart from rounding error in integer arithmetic
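The transform formula itself did not survive in the slide text. As a minimal sketch, the conversion below uses the full-range JFIF variant of the CCIR-601 coefficients; that choice, and the function names `rgb_to_ycbcr` and `ycbcr_to_rgb`, are assumptions made for illustration rather than the formula shown on the original slide.

```python
def _clamp8(v):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return min(255, max(0, round(v)))

def rgb_to_ycbcr(r, g, b):
    # Assumed coefficients: JFIF full-range variant of CCIR-601.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return _clamp8(y), _clamp8(cb), _clamp8(cr)

def ycbcr_to_rgb(y, cb, cr):
    # Inverse of the transform above.
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return _clamp8(r), _clamp8(g), _clamp8(b)

print(rgb_to_ycbcr(255, 255, 255))                # white: full luminance, neutral chroma
print(ycbcr_to_rgb(*rgb_to_ycbcr(10, 200, 90)))   # round trip, up to rounding
```

Rounding and clamping to 8 bits are where the small round-trip error comes from; the real-valued transform is invertible.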
Chroma Sub-sampling
- 4:1:1 and 4:2:0 are the formats most commonly used in JPEG and MPEG
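To make 4:2:0 concrete: each chroma plane (Cb or Cr) is reduced to half resolution both horizontally and vertically, while Y keeps full resolution. The sketch below averages each 2x2 neighbourhood; averaging is only one possible filter, and the helper name `subsample_420` is hypothetical.

```python
def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 blocks.

    Assumes the plane is a list of equal-length rows with even width and height.
    """
    height, width = len(plane), len(plane[0])
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            total = (plane[y][x] + plane[y][x + 1] +
                     plane[y + 1][x] + plane[y + 1][x + 1])
            row.append((total + 2) // 4)  # +2 rounds to nearest
        out.append(row)
    return out

cb = [[10, 20, 30, 40],
      [10, 20, 30, 40]]
print(subsample_420(cb))
```

With this scheme a 16x16 pixel area carries 256 Y samples but only 64 Cb and 64 Cr samples.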
2-D DCT (Discrete Cosine Transform)
[Figure: a block in the space domain and its transform in the frequency domain]
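The slide's equation is not present in the text, so the following is a sketch of the usual 8x8 forward DCT definition used in JPEG, F(u,v) = 1/4 C(u) C(v) ΣΣ f(x,y) cos((2x+1)uπ/16) cos((2y+1)vπ/16) with C(0) = 1/√2 and C(k) = 1 otherwise. The function name `dct2_direct` is hypothetical, and the code favours clarity over speed.

```python
import math

def dct2_direct(block):
    """Forward 8x8 2-D DCT computed straight from the definition (O(N^4))."""
    n = 8

    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0.0
            for x in range(n):
                for y in range(n):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / 16)
                            * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * acc
    return out

flat = [[16] * 8 for _ in range(8)]
coeffs = dct2_direct(flat)
print(round(coeffs[0][0], 6))  # a flat block has only a DC term
```

A constant block puts all of its energy in the DC coefficient F(0,0), which is why smooth image areas compress well.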
Basis Images of 2-D DCT
[Figure: the 2-D DCT basis images; horizontal frequency and vertical frequency each run from low to high]
Frequency Distribution of 2-D DCT
[Figure: 2-D DCT coefficients grouped by frequency and by direction]
8-point 1-D DCT Algorithm (1/2)
- Better suited to VLSI implementation
8-point 1-D DCT Algorithm (2/2)
Implementation of 2-D DCT
- Example: row-column decomposition (the 2-D DCT is separable)
- [Figure: X -> 1-D DCT unit -> transpose memory (Y) -> 1-D DCT unit -> Z]
- Y = AX, Z = YA^T
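The two matrix products Y = AX and Z = YA^T can be sketched in a few lines. Here A is assumed to be the orthonormal 8-point DCT-II matrix, A[k][n] = c(k) cos((2n+1)kπ/16) with c(0) = √(1/8) and c(k) = √(2/8) otherwise; the helper names are hypothetical, and the intermediate Y plays the role of the transpose memory contents.

```python
import math

N = 8
# Assumed 1-D DCT matrix (orthonormal DCT-II).
A = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos((2 * n + 1) * k * math.pi / (2 * N))
      for n in range(N)] for k in range(N)]

def matmul(p, q):
    """Plain matrix product of two lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct2_row_column(x):
    y = matmul(A, x)                 # first 1-D pass: Y = A X
    return matmul(y, transpose(A))   # second 1-D pass: Z = Y A^T

flat = [[16] * N for _ in range(N)]
z = dct2_row_column(flat)
print(round(z[0][0], 6))
```

Two passes of an 8-point 1-D transform cost far fewer multiplications than the direct four-fold sum, which is the motivation for reusing a single 1-D DCT unit design in hardware.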
Quantization Table for Luminance
16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99
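Quantization Table for Chrominance
17  18  24  47  99  99  99  99
18  21  26  66  99  99  99  99
24  26  56  99  99  99  99  99
47  66  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99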
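Quantization divides each DCT coefficient by the matching table entry and rounds to the nearest integer; the decoder multiplies back. The sketch below uses the example luminance table from Annex K of the JPEG standard; the function names and the sample coefficient values are assumptions chosen for illustration.

```python
import math

# Example luminance quantization table (ISO/IEC 10918-1, Annex K).
Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize(coeffs, table):
    """Divide by the table entry and round to nearest (ties away from zero)."""
    return [[int(math.copysign(math.floor(abs(c) / q + 0.5), c))
             for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]

def dequantize(levels, table):
    return [[lv * q for lv, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]

coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[7][7] = 980.0, -35.0, 40.0  # made-up values
levels = quantize(coeffs, Q_LUMA)
print(levels[0][0], levels[0][1], levels[7][7])
print(dequantize(levels, Q_LUMA)[0][0])
```

The large entries toward the bottom right push most high-frequency coefficients to zero, and the difference between 980 and the reconstructed value is the information lost at this stage.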
Predictive Coding of DC Coefficients
- Differential Pulse Code Modulation (DPCM)
- Storing the difference between the DC value and that of the previous block takes fewer bits than storing the exact value
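A minimal sketch of DC prediction, assuming the predictor starts at 0 for the first block and that `dc_values` holds the quantized DC coefficients of consecutive blocks (both names and the sample values are illustrative):

```python
def dpcm_encode(dc_values):
    """Replace each DC value by its difference from the previous block's DC."""
    diffs, prev = [], 0
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs

def dpcm_decode(diffs):
    """Rebuild the DC values by accumulating the differences."""
    values, prev = [], 0
    for d in diffs:
        prev += d
        values.append(prev)
    return values

dcs = [61, 58, 60, 60]
print(dpcm_encode(dcs))
print(dpcm_decode(dpcm_encode(dcs)))
```

Neighbouring blocks usually have similar average brightness, so the differences cluster around zero and fall into small Huffman categories.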
Zig-zag Scan (AC Coefficients)
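The zig-zag order visits the 8x8 block along anti-diagonals, alternating direction, so that low-frequency coefficients come first. One way to generate it, sketched here with hypothetical helper names, is to sort the (row, column) positions by diagonal index and then by direction:

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    def key(pos):
        row, col = pos
        diag = row + col
        # Odd diagonals run top-right to bottom-left (row increasing),
        # even diagonals run bottom-left to top-right (col increasing).
        return (diag, row if diag % 2 else col)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def zigzag_scan(block):
    """Flatten a square block into a list in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

print(zigzag_order()[:10])
```

After quantization the nonzero values tend to sit near the top-left corner, so this ordering groups the zeros into long runs at the end of the list.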
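Run-Length Coding (RLC)
- [Figure: quantized coefficients in zig-zag order; the first entry is the DC coefficient]
- (R, L) = (run of preceding zeros, nonzero level)
- (R, L) => (0,-3)(0,-2)(0,-1)(0,-2)(0,-1)(2,-1)(EOB)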
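The (R, L) pairs above can be reproduced with a short sketch. It assumes the input is the list of 63 AC coefficients in zig-zag order, always closes the block with an EOB marker, and ignores the baseline rule that runs longer than 15 need a special ZRL symbol; the function name is hypothetical.

```python
def run_length_encode(ac):
    """Turn zig-zag ordered AC coefficients into (run, level) pairs plus EOB."""
    pairs, run = [], 0
    for value in ac:
        if value == 0:
            run += 1          # count zeros preceding the next nonzero level
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")       # trailing zeros are implied by end-of-block
    return pairs

ac = [-3, -2, -1, -2, -1, 0, 0, -1] + [0] * 55
print(run_length_encode(ac))
```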
Huffman Coding for DC and AC Coefficients
(R, L) => (0,-3)(0,-2)(0,-1)(0,-2)(0,-1)(2,-1)(EOB)

Category   AC coefficient range
1          -1, 1
2          -3, -2, 2, 3
3          -7,…,-4, 4,…,7
4          -15,…,-8, 8,…,15
5          -31,…,-16, 16,…,31
6          -63,…,-32, 32,…,63
7          -127,…,-64, 64,…,127
8          -255,…,-128, 128,…,255
9          -511,…,-256, 256,…,511
10         -1023,…,-512, 512,…,1023
11         -2047,…,-1024, 1024,…,2047

Each level is rewritten as (Run, SSSS/Category)(amplitude):
(0,2)(-3),(0,2)(-2),(0,1)(-1),(0,2)(-2),…(0,0)
The (Run, SSSS/Category) pair is then looked up in the Huffman table.
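The category (SSSS) is the number of bits needed for the magnitude of the level, and the amplitude is sent as that many extra bits. A sketch, assuming the usual JPEG convention that a negative level is sent as the low-order bits of (level - 1); the function names are hypothetical:

```python
def category(level):
    """SSSS: number of bits needed to represent abs(level); 0 for level 0."""
    return abs(level).bit_length()

def amplitude_bits(level):
    """Extra bits appended after the Huffman code for the (Run, SSSS) symbol."""
    size = category(level)
    if size == 0:
        return ""
    if level < 0:
        level += (1 << size) - 1   # same as the low bits of (level - 1)
    return format(level, "0{}b".format(size))

for v in (-3, -2, -1, 3, 61):
    print(v, category(v), amplitude_bits(v))
```

A positive level always starts with a 1 bit and a negative level with a 0 bit, which lets the decoder recover the sign without a separate flag.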
Huffman Coding for DC and AC Coefficients

Table for luminance DC coefficient differences
Category   Code length   Code word
0          2             00
1          3             010
2          3             011
3          3             100
4          3             101
5          3             110
6          4             1110
7          5             11110
8          6             111110
9          7             1111110
10         8             11111110
11         9             111111110

Table for luminance AC coefficients
Run/Size     Code length   Code word
0/0 (EOB)    4             1010
0/1          2             00
0/2          2             01
0/3          3             100
0/4          4             1011
0/5          5             11010
0/6          7             1111000
0/7          8             11111000
0/8          10            1111110110
0/9          16            1111111110000010
0/A          16            1111111110000011
1/1          4             1100
1/2          5             11011
1/3          7             1111001
1/4          9             111110110

(0,2)(3),(0,2)(-2),(0,1)(-1),(0,2)(-2),…(0,0)
=> (01)(11)(01)(01)……(1010)
An Example of Baseline DCT-based Coding
- Input: one 8x8 block of Y (8*8 pixels * 8 bits/pixel = 512 bits)
- Steps: level shift (-128), FDCT, quantization (Q, using the Q table), zig-zag scan, run-length coding, Huffman coding
- Run-length symbols:
  (6)(61), (0,2)(-3), (0,3)(4), (0,1)(-1), (0,3)(-4), (0,2)(2), (1,2)(2), (0,2)(-2), (0,2)(-2), (5,2)(2), (3,1)(1), (6,1)(-1), (2,1)(-1), (4,1)(-1), (7,1)(-1), (0,0)
- Huffman-coded output:
  (1110)(111101) (01)(00) (100)(100) (00)(0) (100)(011) (01)(10) (11011)(10) (01)(01) (01)(01) (11111110111)(10) (111010)(1) (1111011)(0) (11100)(0) (111011)(0) (11111010)(0) (1010)
- Total: 102 bits
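The 102-bit total can be checked with a small sketch that encodes the symbols of this example. Only the Huffman entries this block needs are included, taken from the standard's example luminance tables; the DC value 61 is treated as the DC difference (that is, a previous DC of 0 is assumed), and the function names are hypothetical.

```python
# Subset of the example luminance Huffman tables, enough for this block.
DC_LUMA = {6: "1110"}
AC_LUMA = {
    (0, 0): "1010",  # EOB
    (0, 1): "00", (0, 2): "01", (0, 3): "100",
    (1, 2): "11011", (2, 1): "11100", (3, 1): "111010",
    (4, 1): "111011", (5, 2): "11111110111",
    (6, 1): "1111011", (7, 1): "11111010",
}

def size_of(level):
    return abs(level).bit_length()

def amplitude_bits(level):
    size = size_of(level)
    if size == 0:
        return ""
    if level < 0:
        level += (1 << size) - 1   # negative levels: low bits of (level - 1)
    return format(level, "0{}b".format(size))

def encode_block(dc_diff, ac_pairs):
    """Concatenate Huffman codes and amplitude bits for one block."""
    bits = DC_LUMA[size_of(dc_diff)] + amplitude_bits(dc_diff)
    for run, level in ac_pairs:
        bits += AC_LUMA[(run, size_of(level))] + amplitude_bits(level)
    return bits + AC_LUMA[(0, 0)]

AC_PAIRS = [(0, -3), (0, 4), (0, -1), (0, -4), (0, 2), (1, 2), (0, -2),
            (0, -2), (5, 2), (3, 1), (6, -1), (2, -1), (4, -1), (7, -1)]

bitstream = encode_block(61, AC_PAIRS)
print(len(bitstream))  # 102
```

Compared with the 512 bits of the raw 8x8 block, 102 bits is roughly a 5:1 reduction for this block. Under the amplitude rule used here, the level -4 is sent as 011.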
Block Diagram of JPEG Decoder
[Figure: decoder block diagram; the input is the bitstream 01001011101…]
JPEG Bitstream
File Structure
Read & Write Address

Write_head (FDCT)   Write_head (IDCT)
0xcc000000          0xcc000040
0xcc000004          0xcc000044
0xcc000008          0xcc000048
0xcc00000c          0xcc00004c
0xcc000010          0xcc000050
0xcc000014          0xcc000054
0xcc000018          0xcc000058
0xcc00001c          0xcc00005c

Read_head (FDCT)    Read_head (IDCT)
0xcc000020          0xcc000060
0xcc000024          0xcc000064
0xcc000028          0xcc000068
0xcc00002c          0xcc00006c
0xcc000030          0xcc000070
0xcc000034          0xcc000074
0xcc000038          0xcc000078
0xcc00003c          0xcc00007c
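The address map looks like four consecutive regions of eight 32-bit words starting at 0xcc000000. That layout is inferred from the listed addresses rather than stated in the slides, and the names below are hypothetical; the sketch only regenerates the address lists and does not touch hardware.

```python
BASE = 0xCC000000      # first listed address
WORDS_PER_REGION = 8   # assumption: eight words per region
WORD_BYTES = 4         # assumption: 32-bit registers, 4-byte stride

# Region order inferred from the address ranges on the slide.
REGIONS = ["FDCT write", "FDCT read", "IDCT write", "IDCT read"]

def region_addresses(name):
    """List the word addresses belonging to one region."""
    start = BASE + REGIONS.index(name) * WORDS_PER_REGION * WORD_BYTES
    return [start + i * WORD_BYTES for i in range(WORDS_PER_REGION)]

for region in REGIONS:
    addrs = region_addresses(region)
    print(region, hex(addrs[0]), "to", hex(addrs[-1]))
```

On the target, software would write input words to a write region and read results back from the matching read region, typically through volatile pointers in C.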
Result for SW Simulation Original Encoder Decoder
Result for HW Simulation Original Encoder Decoder
Profiling Result of SW Simulation
Lab ─ Case Study
- Goal: Implement the JPEG codec system using the ARM platform
- Principles: Implement the ARM platform-based JPEG codec with HW/SW co-design
- Requirement:
  - Analyze the profiling result of the pure software simulation
  - Explain how to partition the JPEG codec between HW and SW
  - Implement the JPEG codec with HW/SW co-design
- Discussion: Explain where the stack and heap are located, and who initializes them
Reference
- Wen-Hsiung Chen, C. Harrison Smith, and S. C. Fralick, "A Fast Computational Algorithm for the Discrete Cosine Transform," IEEE Trans. Commun., vol. COM-25, pp. 1004-1009, Sept. 1977.
- William B. Pennebaker and Joan L. Mitchell, JPEG: Still Image Data Compression Standard, Kluwer Academic Publishers, ISBN: 0442012721.