Multimedia Communications EG 371 and EG 348
Dr Matthew Roach
Lecture 4: Compression and formats cont.

About the lecturer: I did my PhD in the Speech and Image group at Swansea, where I also held a post as a senior research assistant on a BT-sponsored project looking at the classification of video; some of that work is included in this course. For the last 2 years I have worked for a small systems engineering consultancy, full time for the last year. In that time I have co-delivered training courses to a diverse range of organisations, including Motorola, the MoD, the IEE and BSI. Get in contact with me via email.

Multimedia communications EG-371 Dr Matt Roach
Lossless: Huffman compression
- Reduces the average code length used to represent the symbols of an alphabet.
- Symbols that occur frequently are given short codes.
- The code is built by constructing a binary tree: arrange the symbols in order of probability, then repeatedly add the two lowest probabilities to form a new node.
- The sum of the last two nodes is 1.
- Code words are formed by tracing the tree path from the root, assigning 0s and 1s to the branches.
Huffman coding
Determine the Huffman code for the following set of symbols:

Symbol       m0    m1    m2    m3    m4
Probability  0.10  0.36  0.15  0.20  0.19

Step 1 – List the symbols in order of decreasing probability:

Symbol       m1    m3    m4    m2    m0
Probability  0.36  0.20  0.19  0.15  0.10
Step 2 – Take the 2 symbols with the lowest probability and give the combined symbol a new name:

m2 (0.15) + m0 (0.10) -> A (0.25)

Step 3 – Create a new list and repeat the process:

Symbol       m1    A     m3    m4
Probability  0.36  0.25  0.20  0.19

m3 (0.20) + m4 (0.19) -> B (0.39)

Symbol       B     m1    A
Probability  0.39  0.36  0.25

m1 (0.36) + A (0.25) -> C (0.61)

Symbol       C     B
Probability  0.61  0.39

C (0.61) + B (0.39) -> D (1.0)

Symbol       D
Probability  1.0
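The repeated "take the two lowest, combine, re-list" process in Steps 2 and 3 can be sketched with a priority queue. This is a minimal illustration, not part of the original slides: the function name huffman_merges is hypothetical, and the combined symbols are simply named A, B, C, ... in the order they are created, as on the slide.

```python
import heapq

def huffman_merges(probs):
    """Repeatedly merge the two least probable entries (Steps 2 and 3).

    Returns a list of (first, second, new_name, new_probability) tuples.
    """
    heap = [(p, name) for name, p in probs.items()]
    heapq.heapify(heap)
    new_names = iter("ABCDEFGHIJKLMNOPQRSTUVWXYZ")  # assumed naming scheme
    merges = []
    while len(heap) > 1:
        p1, n1 = heapq.heappop(heap)   # lowest probability
        p2, n2 = heapq.heappop(heap)   # second lowest
        name = next(new_names)
        p = round(p1 + p2, 10)         # rounding hides float noise
        merges.append((n1, n2, name, p))
        heapq.heappush(heap, (p, name))
    return merges

probs = {"m0": 0.10, "m1": 0.36, "m2": 0.15, "m3": 0.20, "m4": 0.19}
merges = huffman_merges(probs)
for first, second, name, p in merges:
    print(f"{first} + {second} -> {name} ({p:.2f})")
```

A heap avoids re-sorting the whole list after every merge, but the sequence of merges is the same as in the hand-worked lists.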
An alternative approach is to construct this tree:

D 1.0
├── C 0.61
│   ├── m1 0.36
│   └── A 0.25
│       ├── m2 0.15
│       └── m0 0.10
└── B 0.39
    ├── m3 0.20
    └── m4 0.19
Assign bits (0, 1) to the tree branches. Codewords are determined by tracing the path from the root node to each symbol leaf:

Root
├─0─ C 0.61
│    ├─0─ m1 0.36
│    └─1─ A 0.25
│         ├─0─ m2 0.15
│         └─1─ m0 0.10
└─1─ B 0.39
     ├─0─ m3 0.20
     └─1─ m4 0.19

Symbol  Probability  Codeword
m0      0.10         011
m1      0.36         00
m2      0.15         010
m3      0.20         10
m4      0.19         11

Compression is achieved by allocating shorter codewords to frequently occurring symbols.
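Tracing the path from the root to each leaf can be sketched as a short recursive walk. The nested-tuple representation of the tree and the function name codewords are assumptions made for this illustration; the convention used here (0 for the first branch, 1 for the second) is the one that reproduces the codeword table on the slide.

```python
def codewords(node, prefix=""):
    """Return {symbol: codeword} by tracing every root-to-leaf path.

    A leaf is a symbol name (str); an internal node is a (zero, one) pair.
    """
    if isinstance(node, str):
        return {node: prefix}
    zero_branch, one_branch = node
    codes = codewords(zero_branch, prefix + "0")
    codes.update(codewords(one_branch, prefix + "1"))
    return codes

# Root = (C, B), C = (m1, A), A = (m2, m0), B = (m3, m4)
tree = (("m1", ("m2", "m0")), ("m3", "m4"))
codes = codewords(tree)
for symbol in sorted(codes):
    print(symbol, codes[symbol])
```

Because every symbol sits at a leaf, no codeword is a prefix of another, which is what allows the bit stream to be decoded without separators.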
How much compression?
- With 5 symbols, a fixed-length code needs 3 bits for each symbol.
- The message [m0 m1 m2 m3 m4] therefore requires 5 x 3 = 15 bits.
- Huffman coding requires 3 + 2 + 3 + 2 + 2 = 12 bits.
- This is a compression of 15:12, i.e. 5:4.
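The 15:12 figure is for a message in which each symbol appears exactly once. As a rough check, the sketch below also computes the average codeword length weighted by the stated probabilities, which is what a long message following those probabilities would be expected to need. The variable names are illustrative only.

```python
probs = {"m0": 0.10, "m1": 0.36, "m2": 0.15, "m3": 0.20, "m4": 0.19}
codes = {"m0": "011", "m1": "00", "m2": "010", "m3": "10", "m4": "11"}

FIXED_BITS = 3  # bits per symbol for a fixed-length code over 5 symbols
message = ["m0", "m1", "m2", "m3", "m4"]

fixed_total = FIXED_BITS * len(message)                 # 5 x 3
huffman_total = sum(len(codes[s]) for s in message)     # 3 + 2 + 3 + 2 + 2

# Probability-weighted average codeword length in bits per symbol
avg_len = sum(p * len(codes[s]) for s, p in probs.items())

print(f"message: {fixed_total} bits fixed, {huffman_total} bits Huffman")
print(f"average length: {avg_len:.2f} bits/symbol vs {FIXED_BITS} fixed")
```

With these probabilities the weighted average works out to 2.25 bits per symbol, so over a long message the saving would be closer to 3:2.25, i.e. 4:3.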
Example
Consider the message: babbage babble baggage label bagel
Construct a Huffman code and determine the compression ratio.
Solution
Construct a probability table by counting the occurrences of each letter (spaces are not counted):

Letter  Occurrences  Probability
b       9            0.300
a       7            0.233
g       5            0.167
e       5            0.167
l       4            0.133
Total   30           1
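The letter counts can be checked mechanically. This is a small sketch using Python's standard collections.Counter; it assumes, as the total of 30 implies, that the spaces between words are not treated as symbols.

```python
from collections import Counter

message = "babbage babble baggage label bagel"

# Count letters only; spaces are dropped before counting
counts = Counter(message.replace(" ", ""))
total = sum(counts.values())

for letter, n in counts.most_common():
    print(f"{letter}  {n:2d}  {n / total:.3f}")
print(f"Total: {total}")
```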
Solution
Repeatedly combine the two lowest probabilities and re-sort the list (A = e + l, B = a + g, C = b + A, D = C + B):

Start      Step 1    Step 2   Step 3   Step 4
b 0.3      b 0.3     B 0.4    C 0.6    D 1.0
a 0.233    A 0.3     b 0.3    B 0.4
g 0.167    a 0.233   A 0.3
e 0.167    g 0.167
l 0.133
Solution

Root
├─0─ C 0.6
│    ├─0─ b 0.3
│    └─1─ A 0.3
│         ├─0─ e 0.167
│         └─1─ l 0.133
└─1─ B 0.4
     ├─0─ a 0.233
     └─1─ g 0.167

Symbol  Probability  Codeword
b       0.3          00
a       0.233        10
g       0.167        11
e       0.167        010
l       0.133        011
Solution
Message: babbage babble baggage label bagel
- 5 symbols (b, a, g, e, l), 30 characters.
- Uncompressed: 3 bits/symbol, so 30 x 3 = 90 bits.

Letter  Occurrences  Codeword  No. bits
b       9            00        18
a       7            10        14
g       5            11        10
e       5            010       15
l       4            011       12
Total bits:                    69

Huffman gives a compression of 90:69, approx. 9:7.
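As a check on the 69-bit total, the message can be encoded with the codewords from the table and decoded again. This is only a sketch: the decode helper is hypothetical and relies on the code being prefix-free, so a codeword is emitted as soon as the accumulated bits match one.

```python
codes = {"b": "00", "a": "10", "g": "11", "e": "010", "l": "011"}
message = "babbage babble baggage label bagel".replace(" ", "")

# Encode: concatenate the codeword of each letter
encoded = "".join(codes[ch] for ch in message)

def decode(bits, codes):
    """Decode a bit string by matching prefix-free codewords left to right."""
    inverse = {cw: sym for sym, cw in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:      # a complete codeword has been read
            out.append(inverse[current])
            current = ""
    return "".join(out)

fixed_bits = 3 * len(message)       # 3 bits per letter, fixed-length code
print(f"fixed: {fixed_bits} bits, Huffman: {len(encoded)} bits")
print(f"ratio: {fixed_bits / len(encoded):.2f}:1")
print("round trip ok:", decode(encoded, codes) == message)
```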
Lecture 4: Formats cont.
Common Container Formats
- AVI (.avi): M-JPEG, DivX, nearly any format (not Sorenson).
- QuickTime: Apple's locked (proprietary) Sorenson codec, or Cinepak (free); also MJPEG.
- WMV (.wmv): MPEG-4; nearly any codec, including Microsoft spin-offs of MPEG-4.
- ASF ("Advanced Streaming Format", .asf): a subset of WMV, intended primarily for streaming; an early Microsoft implementation of an MPEG-4 codec.
Common Codecs
MPEG-1
- Old, supported by everything (at least up to 352x240), reasonably efficient.
- A good format for the web.
- Video quality is not as crisp as MPEG-2.
- Small file size; good picture quality; compressed format.
- Requires a special playback program; cannot be edited.
MPEG-2
- A souped-up version of MPEG-1, with better compression.
- 720x480. Used in HDTV, DVD, and SVCD.
- Good quality; can be burned onto a DVD disc.
- Large file size: 4.7 GB for 2 hours of video.
MPEG-4
- A family of codecs, some of which are open, others Microsoft proprietary.
MJPEG ("Motion JPEG")
- A codec consisting of a stream of JPEG images.
- Common in video from digital cameras, but it doesn't compress well, so it's not good for web distribution.
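The MPEG-2 figure of 4.7 GB for 2 hours of video implies an average bit rate, which is often the easier number to compare between codecs. The sketch below is a back-of-envelope calculation only: the function name is hypothetical, 1 GB is assumed to mean 10^9 bytes, and the file is assumed to contain everything on the disc (video plus audio).

```python
def average_bitrate_mbps(size_gb, hours):
    """Average bit rate in Mbit/s for a file of size_gb (10**9 bytes)."""
    bits = size_gb * 1e9 * 8        # bytes -> bits
    seconds = hours * 3600
    return bits / seconds / 1e6     # bits/s -> Mbit/s

dvd_rate = average_bitrate_mbps(4.7, 2)
print(f"4.7 GB over 2 hours is about {dvd_rate:.1f} Mbit/s on average")
```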
Common Codecs cont.
WMV ("Windows Media Video")
- A collection of Microsoft proprietary video codecs. Since version 7, it has used a special version of MPEG-4.
- Small file size; good picture quality; ideal for web transmission.
- Compressed format; cannot be edited.
RM ("Real Media")
- A closed codec developed by Real Networks for streaming video and audio. Maybe also a container?
DivX
- An incomplete early MPEG-4 codec inside an AVI container; DivX 4 and later are a fuller MPEG-4 codec.
- No resolution limit.
- Requires more horsepower to play than MPEG-1, but less than MPEG-2.
- Hard to find Mac and Windows players.
- Good quality with reasonably small file size.
- Not a standard video format; cannot be used to produce video on DVD or CD.
Common Codecs cont.
DV ("Digital Video")
- Usually used for video grabbed via FireWire off a video camera.
- Fixed at 720x480 @ 29.97 FPS, or 720x576 @ 25 FPS.
- Not very highly compressed.
- Superb quality; can be recorded back to DV tape.
- Large file size: about 13 GB for 60 min of video (25 Mbit/s video plus audio and overhead).
Sorenson 3
- Apple's proprietary codec, commonly used for distributing movie trailers (inside a QuickTime container).
QuickTime 6
- Apple's implementation of an MPEG-4 codec.
- Good picture quality; ideal for web transmission.
- Larger file size (compared to other streamable formats); cannot be edited.