SWE 423: Multimedia Systems

SWE 423: Multimedia Systems
Chapter 7: Data Compression (1)

References Chapter 7 from our Textbook: “Multimedia Fundamentals: Media Coding and Content Processing” Slides from the reference book: Fundamentals of Multimedia

Outline Introduction Motivation for compression Coding requirements
Compression types General Data Compression Scheme Compression Techniques Entropy Encoding Run Length Encoding Huffman Coding

Introduction Video and audio have much higher storage requirements than text Data transmission rates (in terms of bandwidth requirements) for sending continuous media are considerably higher than text Efficient compression of audio and video data, including some compression standards, will be considered in this chapter

Motivation for Compression
Terminology 1 kbit = 1000 bit 1 Kbit = 1024 bit (= 210) 1 Mbit = 1024 x 1024 bit (= 210 * 210 = 220) Discrete Data: Considering a small window of 640 x 480 pixels on a display Text Vector Image Bitmap Image Continuous Data: Required storage space per second Uncompressed speech of telephone quality Uncompressed stereo audio signal of CD quality Video sequence Text: (assuming 8 x 8 pixels per character) and (2 bytes per character) Number of characters: (640)(480)/(8*8) = 4800 Storage Required for Screen: 4800 * 2 = 9600 bytes Vector images: (Assuming 500 lines, each line is defined by its coordinates in the x and y direction and by an 8-bit attribute field. Coordinates in the x direction require Ceil(log 640) = 10 bits , where as in the y direction 9 bits. Therefore, # bits per line = = 46 bits. Therefore, Storage per screen = 500 * 46/8 = 2.8 Kbytes

Motivation for Compression: Discrete Data
Text Assuming 2 bytes are used for every 8 x 8 pixel character, Character per screen page = ... Storage required per screen page = ... Vector Image Assuming that a typical image consists of 500 lines, each of which is defined by its coordinates in the x direction and the y direction, and an 8-bit attribute field Coordinates in the x direction require ... Coordinates in the y direction require ... Bits per line = ... Storage required per screen page Bitmap Image Assuming using 256 colors requiring a single byte per pixel

Motivation for Compression: Continuous Data
Uncompressed speech of telephone quality Assuming being sampled at 8 kHz and quantized using 8 bit per sample yielding a data stream of 64 Kbit/second Storage space required per second = ... Uncompressed stereo audio signal of CD quality Assuming being sampled at 44.1 kHz and quantized using 16 bits Data rate = ...

Video sequence Assuming 25 full frames per second, luminance and chrominance of each pixel are coded using 3 bytes, luminance sampled at 13.5 MHz while chrominance (R-Y and B-Y) is sampled at 6.75 MHz, each, and samples are uniformly coded using 8 bits. Bandwidth = ... Data Rate = ... Storage space required per second = ...

Processing uncompressed video data streams requires Storage space in the gigabyte Buffer space in the megabyte Data transfer rates of 140 Mbit/s [per unidirectional connection] These requirements can be considerably lowered by employing compression

Can Multimedia Data be Significantly Compressed?
Redundancy can be exploited to do compression Spatial redundancy correlation between neighboring pixels in image/video Spectral redundancy correlation among colors Psycho-visual redundancy Perceptual properties of human visual system

What Makes “Good” Compression
Quality of compressed and decompressed data should be as good as possible Compression/decompression process should be as simple as possible Decompression time must not exceed certain thresholds [De]/Compression requirements can be divided into Dialogue mode (video conferencing) Retrieval mode (digital libraries) Both

Coding Requirements: Dialogue Mode
End-to-end delay does not exceed 150 ms for compression and decompression alone. Ideally, compression and decompression should not exceed 50ms in order to ensure natural dialogue. In addition delay in the network, communications protocol processing in the end system, data transfer to and from the respective input and output devices.

Coding Requirements: Retrieval Mode
Fast forward and fast rewind with simultaneous display (or playback) of the data should be possible Random access to single images or audio passages in a data stream should be possible in less than 0.5 s. Maintains interaction aspects in retrieval systems Decompression of images, video or audio passages should be possible without interpreting all preceding data. Allows random access and editing

Coding Requirements: Both Modes
Support display of the same data in different systems Formats have to be independent of frame size and video frame rate Audio and video compression should support different data rates at different qualities Precisely synchronize audio and video Support for economical solution Software Few VLSI chips Enable cooperation of different systems Data generated on a multimedia system can be reproduced on another system (e.g. course materials).

Compression Types Physical versus logical Compression
Performed on data regardless of what information it contains Translates a series of bits to another series of bits Logical Knowledge-based e.g. United Kingdom to UK Spatial Compression – 2D or single image Temporal Compression – 3D or video Codec – Compression / Decompression Color / intensity … same thing

Compression Types Symmetric Asymmetric
Compression and decompression roughly use the same techniques and take just as long Data transmission which requires compression and decompression on-the-fly will require these types of algorithms Asymmetric Most common is where compression takes a lot more time than decompression In an image database, each image will be compressed once and decompressed many times Less common is where decompression takes a lot more time than compression Creating many backup files which will hardly ever be read

Compression Types Non-adaptive Adaptive
Contain a static dictionary of predefined substrings to encode which are known to occur with high frequency Adaptive Dictionary is built from scratch

Compression Types Lossless Lossy decompress(compress(data)) = data
Used for computer data, medical images, etc. Lossy decompress(compress(data))  data Some distortion A small change in pixel values may be invisible Suited for audio and video

General Data Compression Scheme
Encoder (compression) Input Data Codes / Codewords Storage or Networks Codes / Codewords Decoder (decompression) B0 = # bits required before compression B1 = # bits required after compression Compression Ratio = B0 / B1. Output Data

Compression Techniques
Coding Type Basis Technique Entropy Encoding Run-length Coding Huffman Coding Arithmetic Coding Source Coding Prediction DPCM DM Transformation FFT DCT Layered Coding Bit Position Subsampling Sub-band Coding Vector Quantization Hybrid Coding JPEG MPEG H.263 Many Proprietary Systems

Compression Techniques
Entropy Coding Semantics of the information to be encoded are ignored Lossless compression technique Can be used for different media regardless of their characteristics Source Coding Takes into account the semantics of the information to be encoded. Often lossy compression technique Characteristics of medium are exploited Hybrid Coding Most multimedia compression algorithms are hybrid techniques

Entropy Encoding Information theory is a discipline in applied mathematics involving the quantification of data with the goal of enabling as much data as possible to be reliably stored on a medium and/or communicated over a channel. According to Claude E. Shannon, the entropy  (eta) of an information source with alphabet S = {s1, s2, ..., sn} is defined as where pi is the probability that symbol si in S will occur.

Entropy Encoding In science, entropy is a measure of the disorder of a system. More entropy means more disorder Negative entropy is added to a system when more order is given to the system. The measure of data, known as information entropy, is usually expressed by the average number of bits needed for storage or communication. The Shannon Coding Theorem states that the entropy is the best we can do (under certain conditions). i.e., for the average length of the codewords produced by the encoder, l’,  l’ Entropy H of {pj: 1<= j <= M} satisfies 0 <= H <= log M.

Entropy Encoding Example 1: What is the entropy of an image with uniform distributions of gray-level intensities (i.e. pi = 1/256 for all i)? Example 2: What is the entropy of an image whose histogram shows that one third of the pixels are dark and two thirds are bright?

Entropy Encoding: Run-Length
Data often contains sequences of identical bytes. Replacing these repeated byte sequences with the number of occurrences reduces considerably the overall data size. Many variations of RLE One form of RLE is to use a special marker M-byte that will indicate the number of occurrences of a character “c”!# How many bytes are used above? When do you think the M-byte should be used? ABCCCCCCCCDEFGGG is encoded as ABC!8DEFGGG What if the string contains the “!” character? How much is the compression ratio for this example Note: This encoding is DIFFERENT from what is mentioned in your book

Many variations of RLE : Zero-suppression: In this case, one character that is repeated very often is the only character used in the RLE. In this case, the M-byte and the number of additional occurrences are stored. When do you think the M-byte should be used, as opposed to using the regular representation without any encoding?

Many variations of RLE : If we are encoding black and white images (e.g. Faxes), one such version is as follows: (row#, col# run1 begin, col# run1 end, col# run2 begin, col# run2 end, ... , col# runk begin, col# runk end) (row#, col# run1 begin, col# run1 end, col# run2 begin, col# run2 end, ... , col# runr begin, col# runr end) ... (row#, col# run1 begin, col# run1 end, col# run2 begin, col# run2 end, ... , col# runs begin, col# runs end)

Entropy Encoding: Huffman Coding
One form of variable length coding Greedy algorithm Has been used in fax machines, JPEG and MPEG

Algorithm huffman Input: A set C = {c1 , c2 , ... , cn} of n characters and their frequencies {f(c1) , f(c2 ) , ... , f(cn )} Output: A Huffman tree (V, T) for C. 1. Insert all characters into a min-heap H according to their frequencies. 2. V = C; T = {} 3. for j = 1 to n – 1 4. c = deletemin(H) 5. c’ = deletemin(H) f(v) = f(c) + f(c’) // v is a new node Insert v into the minheap H Add (v,c) and (v,c’) to tree T making c and c’ children of v in T 9. end for

Example

Most important properties of Huffman Coding Unique Prefix Property: No Huffman code is a prefix of any other Huffman code For example, 101 and 1010 cannot be Huffman codes. Why? Optimality: The Huffman code is a minimum-redundancy code (given an accurate data model) The two least frequent symbols will have the same length for their Huffman code, whereas symbols occurring more frequently will have shorter Huffman codes It has been shown that the average code length of an information source S is strictly less than  + 1, i.e.  l’ <  + 1

Entropy Encoding: Adaptive Huffman Coding
The Huffman method assumes that the frequencies of occurrence of all the symbols of the alphabet are known apriori. This is rarely the case in practice Semi-adaptive Huffman coding has been employed where data is read twice, the first pass being to determine the frequencies Disadvantage: Too slow for real-time applications Another solution is Adaptive Huffman Coding Employed by Unix’s “compact” program.

Decoder “mirrors” the operations of the encoder, as they both may occur at different times Main idea of the algorithm is as follows Encoder and Decoder both start with an empty Huffman Coding Tree No symbol is assigned codes yet. First symbol read is written on the output stream in its uncompressed form In fact, each uncompressed character being read for the first time is read this way That is why we need an escape character to determine when we read an uncompressed character for the first time. This escape character is denoted by NEW and given frequency 0 all the time This symbol is then added to the tree and a code is assigned to it.

Next time this symbol is encountered, its code will be written in the output screen and its frequency is increased by 1. Since this modifies the tree, it is checked whether it is a Huffman tree or not If not, it will be rearranged, through swaps, and new codes will be assigned Sibling Property Must be preserved during swaps All nodes are arranged in the order of increasing counts, left to right and bottom to top. During a swap, the farthest node with count N is swapped with the node whose count has just been increased to N + 1.

Example

SWE 423: Multimedia Systems

Similar presentations

Presentation on theme: "SWE 423: Multimedia Systems"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

SWE 423: Multimedia Systems

Similar presentations

Presentation on theme: "SWE 423: Multimedia Systems"— Presentation transcript:

Similar presentations

About project

Feedback