CS644 Advanced Topics in Networking Multimedia Mon, 2004/9/13
Outline: Video basics. Compression: lossless compression; MPEG-1, 2, 4, and 7. Performance issues: old & new. Available tools.
Video Basics: Representations. RGB: each pixel is stored as red, green, and blue components. YUV: Y is luminance, U and V are chrominance. YUV better matches the human visual system, which distinguishes luminance (brightness) better than hue (color).
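The relationship between the two representations can be sketched with a per-pixel conversion. The slide does not give a formula, so the weights below are an assumption: they are the commonly used ITU-R BT.601 luminance coefficients, and the function name `rgb_to_yuv` is illustrative only.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel (components in 0..255) to YUV.

    Coefficients are the BT.601 values (an assumption; not stated on the slide).
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance: weighted sum, green weighs most
    u = 0.492 * (b - y)                    # blue-difference chrominance
    v = 0.877 * (r - y)                    # red-difference chrominance
    return y, u, v


# A gray pixel carries brightness only, so both chrominance values are (nearly) zero.
print(rgb_to_yuv(128, 128, 128))
# A saturated red pixel has positive V and negative U.
print(rgb_to_yuv(255, 0, 0))
```

Because the eye is less sensitive to U and V, a codec can store them at lower resolution than Y without much visible loss.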
Compression Techniques: Lossless; MPEG-1, 2, 4, and 7
Lossless Compression. Run Length Encoding: AAABBCDDDD => 3A2B1C4D. Differential Pulse Code Modulation: AAABBCDDDD => A0001123333 (each symbol coded as its offset from the reference symbol A). Dictionary-based: Lempel-Ziv (LZ); GIF: uses standardized colormap. Huffman Coding.
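A minimal sketch of the first two techniques, applied to the ten-symbol string AAABBCDDDD. The function names are hypothetical, and the offset coder assumes every offset from the first symbol fits in a single decimal digit, which holds for this toy input but not in general.

```python
from itertools import groupby


def rle_encode(text):
    """Run-length encode: each run becomes <count><symbol>."""
    return "".join(f"{len(list(run))}{symbol}" for symbol, run in groupby(text))


def rle_decode(encoded):
    """Invert rle_encode; assumes symbols are single non-digit characters."""
    out, count = [], ""
    for ch in encoded:
        if ch.isdigit():
            count += ch                    # accumulate a (possibly multi-digit) run length
        else:
            out.append(ch * int(count))    # emit the run
            count = ""
    return "".join(out)


def offset_encode(text):
    """Code every symbol as its offset from the first symbol (toy DPCM-style coder)."""
    base = text[0]
    return base + "".join(str(ord(ch) - ord(base)) for ch in text)


print(rle_encode("AAABBCDDDD"))     # 3A2B1C4D
print(rle_decode("3A2B1C4D"))       # AAABBCDDDD
print(offset_encode("AAABBCDDDD"))  # A0001123333
```

Neither output is shorter than the input here; both transforms pay off on longer runs, or when the small offsets are passed on to an entropy coder.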
Huffman Coding. [Figure: Huffman tree, codeword table (0 => 01, 1 => 001, 3 => 000, 4 => 1), and the resulting output bit stream.]

To perform Huffman coding, the first step is to use the frequency statistics to generate a prefix code representation of the indices, such as the one shown here. Notice how the more frequently appearing symbols are represented with shorter codes. All that is left then is to output the bit stream corresponding to the codeword representations of the indices, as illustrated here. The compression pipeline is considerably more complicated than a simpler encoder, but the effort spent in modeling the data means that the input redundancies are reduced as far as possible.

We can perform a simple analysis of the efficiency of this scheme. The input string consists of 6 characters, each 8 bits long, so 48 bits in total. The final representation of this information is only 13 bits, roughly a quarter of the original size. Even with this trivial example, the reduction is remarkable.

To summarize, the Burrows-Wheeler compression procedure consists of four stages of transformation. The first stage reduces simple repetitions of input symbols, and the second stage transforms the data by grouping symbols with similar context. The third stage exploits the locality of symbols in the sorted block, and finally the entropy coding stage outputs an optimal representation of the original data. The reverse transform unravels the entropy-coded information in a similar manner and restores the original data without any loss.

Input = 6 x 8 = 48 bits; Output = 13 bits; Compression = 73 %
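The prefix-code construction described above can be sketched as follows. This is an illustrative implementation, not the one behind the slide: the function names and the sample string "abracadabra" are assumptions, and ties between equal weights are broken arbitrarily, so the exact codewords may differ from a hand-drawn tree while the total length stays the same.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code {symbol: bitstring} from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:                     # one distinct symbol still needs one bit
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: partial codeword}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)  # the two lightest subtrees
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, tie, merged))
        tie += 1
    return heap[0][2]


def encode(symbols, code):
    return "".join(code[s] for s in symbols)


def decode(bits, code):
    """Decode greedily; this works because no codeword is a prefix of another."""
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


message = list("abracadabra")              # hypothetical input
code = huffman_code(message)
bits = encode(message, code)
print(code, bits, len(bits))               # 23 bits, versus 11 * 8 = 88 uncoded
```

The most frequent symbol ("a") receives a one-bit codeword, which is where most of the saving comes from.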
Wavelets “Mathematical functions that cut up data into different frequency components and study each component with a resolution matched to its scale”
Haar transform: decomposes a discrete signal into two subsignals of half its length. $f = (4, 6, 10, 12, 8, 6, 5, 5)$; $(a \mid d) = (5\sqrt{2},\ 11\sqrt{2},\ 7\sqrt{2},\ 5\sqrt{2} \mid -\sqrt{2},\ -\sqrt{2},\ \sqrt{2},\ 0)$. Energy: $E = f_1^2 + f_2^2 + f_3^2 + \dots + f_8^2 = 446$, with $E(a) = 440$ and $E(d) = 6$.
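The numbers on this slide can be reproduced with a one-level Haar step: each pair of neighboring samples is replaced by a scaled sum (the trend a) and a scaled difference (the fluctuation d). The sketch below assumes the usual orthonormal scaling by the square root of 2; the function names are illustrative.

```python
import math


def haar_step(f):
    """One level of the Haar transform: returns (trend a, fluctuation d)."""
    if len(f) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2)
    a = [(f[i] + f[i + 1]) / s for i in range(0, len(f), 2)]
    d = [(f[i] - f[i + 1]) / s for i in range(0, len(f), 2)]
    return a, d


def haar_inverse(a, d):
    """Rebuild the signal from its trend and fluctuation subsignals."""
    s = math.sqrt(2)
    f = []
    for am, dm in zip(a, d):
        f.extend([(am + dm) / s, (am - dm) / s])
    return f


def energy(x):
    return sum(v * v for v in x)


f = [4, 6, 10, 12, 8, 6, 5, 5]
a, d = haar_step(f)
print(energy(f), energy(a), energy(d))  # approximately 446, 440, 6
print(haar_inverse(a, d))               # recovers f up to rounding
```

Almost all of the energy ends up in the half-length trend signal, which is why the fluctuation can be coded coarsely or discarded in lossy schemes.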
Haar transform
MPEG Moving Picture Experts Group in ISO/IEC specifies:
Multimedia Technology Standards. [Figure: evolution of the MPEG standards.] MPEG-1: basic set (source, delivery, demultiplexer, audio/video). MPEG-2: basic set plus support for high-quality and interlaced video. MPEG-4 (1998-2001): basic set plus animation/graphics/text; audio/visual objects composed in 3-D space; allows generic management of delivery systems. MPEG-7: current challenge is to specify a standard set of descriptors for various types of multimedia information.
MPEG-1: started in 1988; A/V
MPEG-2: A/V for broadcast; 4~15 Mbps for digital TV; targets digital TV, HDTV, cable, satellite
MPEG-4 Goals: coding for natural and synthetic video; object-oriented coding; low bit rate for wireless; MPEG-2 backward compatibility. Pipeline: DCT -> Quantization -> Encoding
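The first two stages of the DCT -> Quantization -> Encoding pipeline can be sketched on a single row of samples. This is a simplified 1-D illustration under stated assumptions: real codecs apply a 2-D DCT to blocks and use per-coefficient quantization tables, and the sample values and step size below are hypothetical. The final encoding stage (entropy coding of the quantized levels) is omitted.

```python
import math


def dct(block):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(block)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(block)))
    return out


def idct(coeffs):
    """Inverse of dct (a DCT-III with matching scaling)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


def quantize(coeffs, step):
    """Uniform quantization: the lossy step, small coefficients become 0."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    return [q * step for q in levels]


block = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical pixel row
step = 4                                  # hypothetical quantizer step
levels = quantize(dct(block), step)
print(levels)                             # mostly small integers, cheap to entropy-code
print(idct(dequantize(levels, step)))     # close to, but not equal to, the input
```

A larger step zeroes more coefficients and lowers the bit rate at the cost of a larger reconstruction error.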
Compression: Temporal Redundancy. [Figure: frame sequences without B-frames and with B-frames]
MPEG-7 Goals. Officially the “Multimedia Content Description Interface”. Allows efficient search for multimedia content of interest to the user, using standardized descriptions. MPEG-1, 2, 4: representation of the content itself (the bits). MPEG-7: representation of information about the content (the bits about the bits).
What Exactly Is Included? Not the analysis; not the search engine; just the description. [Figure: Feature Extraction -> Description -> Search Engine, with only the Description stage inside the scope of MPEG-7]
MPEG-7 Components. Descriptors (Ds): representations of features; define the syntax and semantics. Description Schemes (DSs): specify the structure and semantics of the relationships between components (Ds and DSs). Description Definition Language (DDL): allows creation of new DSs and Ds; allows extension and modification of DSs.
Example of a client-server architecture in an MPEG-7 based data search. [Figure: block diagram. Labels: initial descriptor table; feature selection; text link; interfaces; search engine; MPEG-7 stream; MPEG-7 decoder; MPEG-7 database; query result (link address); presentation engine; AV data decoder; AV stream; AV content]
Networking Issues. Different from VoIP: higher bandwidth requirements; no silence periods; complex codecs; more loss tolerant
Video Streaming. Old issues: variable bit rate; server-side transmission scheduling; network provisioning. Current issues: multimedia in “wireless” environments: battery-power limited, diverse technologies, low-layer dependent, application-dependent, etc.
Acknowledgements Kave Salamtian’s “Multimedia Storage and Compression”