[Week 10: Multimedia Management] IN350: Document Handling and Information Management
  Judith Molka-Danielsen, Oct. 29, 2001
  11/15/2018
Multimedia Services: Stages of Processing
  1. Capture: synchronize audio, video, and other media streams; compression in real time.
  2. Presentation: decompression in real time; low end-to-end delay (< 150 ms).
  3. Storage: quick access for forward and backward retrieval; random access (< 1/2 s).
Multimedia Services need Compression: Typical Example of a Video Sequence
  24 image frames/sec; 24 bits/pixel = 3 bytes/pixel; image resolution 1280 x 800 pixels.
  One frame: 24 x 1280 x 800 = 24,576,000 bits = 3,072,000 bytes.
  At 24 frames/sec the data rate is about 73.7 Mbytes/sec, so a 600 Mbyte CD-ROM could store about 8 seconds of a movie, or about 200 images.
  Therefore, compression is required.
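The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch that uses the slide's own figures; the one assumption is that 1 Mbyte is taken as 10**6 bytes.

```python
# Uncompressed video arithmetic, using the figures from the slide.
BITS_PER_PIXEL = 24
WIDTH, HEIGHT = 1280, 800
FRAMES_PER_SECOND = 24
CD_CAPACITY_BYTES = 600 * 10**6  # assumption: 1 Mbyte = 10**6 bytes

frame_bits = BITS_PER_PIXEL * WIDTH * HEIGHT
frame_bytes = frame_bits // 8
bytes_per_second = frame_bytes * FRAMES_PER_SECOND

images_on_cd = CD_CAPACITY_BYTES // frame_bytes
seconds_on_cd = CD_CAPACITY_BYTES / bytes_per_second

print(frame_bits, frame_bytes)                 # 24576000 3072000
print(bytes_per_second)                        # 73728000
print(images_on_cd, round(seconds_on_cd, 1))   # 195 8.1
```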
Multimedia Services need Compression: Transmission
  At 56 kbps access to the Internet, it would take about 439 seconds (just over 7 minutes) to download one uncompressed image from the previous example.
  Both fast access to the software and fast transmission are needed.
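The download time follows directly from the frame size of the previous slide. A minimal sketch, assuming the full 56 kbps is available to the transfer (real modem throughput is usually lower):

```python
FRAME_BITS = 24 * 1280 * 800     # one uncompressed frame from the previous slide
LINK_BITS_PER_SECOND = 56_000    # assumption: the full 56 kbps is usable

download_seconds = FRAME_BITS / LINK_BITS_PER_SECOND
print(round(download_seconds), round(download_seconds / 60, 1))  # 439 7.3
```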
General Requirements for Real Time Multimedia Services
  Low delay (end-to-end).
  Low complexity of the compression algorithm.
  Efficient implementation of the compression algorithm (in software, on hardware).
  High-quality output after decompression.
What allows compression of Images?
  Redundancy: unlike text, two adjacent rows of a picture (image) are almost identical.
  Tolerance: the human eye is tolerant of approximation errors in an image, so some information loss can be irrelevant. (Not true for text! You cannot have missing information in financial data.)
Compression Coding Categories
  1. Entropy and Universal Coding (lossless)
    Run Length Coding: the code represents a string of identical pixels as one compressed symbol. In a fax, a line of white pixels is represented as one symbol. (Show example)
    Huffman Coding and Adaptive Algorithms: use the probability of occurrence of a symbol to assign its code.
    Arithmetic Coding: also driven by symbol probabilities.
    Universal Coding (Lempel-Ziv): assumes the stream of bits in a file is a typical string and codes its recurring prefix strings.
  2. Source Coding (can be lossy)
    Prediction, DCT (Discrete Cosine Transform, a Fourier-related frequency transform), subband coding, quantization.
  File formats that use both: JPEG, MPEG, H.261, DVI RTV, DVI PLV.
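As an example of run length coding, the sketch below encodes one scan line as (value, run length) pairs. The scan line is made up for illustration, and the pair representation is an assumption; real fax standards use their own code tables.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse each run of identical pixels into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original pixel list."""
    return [value for value, count in pairs for _ in range(count)]

# Hypothetical fax scan line: 0 = white, 1 = black.
line = [0] * 20 + [1] * 3 + [0] * 9
encoded = rle_encode(line)
print(encoded)                      # [(0, 20), (1, 3), (0, 9)]
print(rle_decode(encoded) == line)  # True: nothing is lost
```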
Why not use only Lossless compression on Images (as we do with text)?
  Image files are bigger than text files, so they need to be compressed more.
  The human eye cannot detect small losses in resolution.
  We can get smaller files by using multiple compression techniques (both lossy and lossless on the same file).
What is Lossy Compression?
  Simplification: in a string of pixels, neighboring pixels might increase or decrease in brightness by one level. This is hard to see. So if the changes in brightness are coded as a sequence such as (1, -1), replace them by (0), no change. Then you have longer strings of 0s. Use RLE next.
  Median filters: replace each pixel by the median value of its neighborhood (a mean filter uses the average instead). This reduces detail. Then use RLE. (Most changes are at the edges of objects.)
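The simplification step can be sketched as follows: code a row as brightness differences, set differences of one level to zero, then run-length code the result. The pixel row, the threshold of 1, and the function names are assumptions made for illustration.

```python
from itertools import groupby

def differences(pixels):
    """Keep the first value, then the brightness change between neighbors."""
    return [pixels[0]] + [b - a for a, b in zip(pixels, pixels[1:])]

def simplify(diffs, threshold=1):
    """Lossy step: treat changes of at most `threshold` levels as no change."""
    return [diffs[0]] + [0 if abs(d) <= threshold else d for d in diffs[1:]]

def run_lengths(values):
    """Run-length code a list as (value, run_length) pairs."""
    return [(v, len(list(run))) for v, run in groupby(values)]

def reconstruct(diffs):
    """Rebuild pixel values by summing the differences."""
    out = [diffs[0]]
    for d in diffs[1:]:
        out.append(out[-1] + d)
    return out

row = [100, 101, 100, 101, 101, 100, 140, 141, 140, 140]  # hypothetical pixels
exact = differences(row)
lossy = simplify(exact)
print(len(run_lengths(exact)))  # 10 runs: RLE gains nothing
print(run_lengths(lossy))       # [(100, 1), (0, 5), (40, 1), (0, 3)]
print(reconstruct(lossy))       # [100, 100, 100, 100, 100, 100, 140, 140, 140, 140]
```

In this row the reconstructed values are off by at most one brightness level. In general, dropped differences can add up along a row, so a practical coder has to limit that drift.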
Source Coding: Discrete Cosine Transform
  This type of coding takes into account specific image characteristics and the sensitivity of human vision.
  Luminance (brightness): the overall response to the total light energy, which depends on the wavelength of the light. (How light or dark the pixel is.)
  Saturation: how pure the color is, or how much it is diluted by the addition of white.
  The human eye is more sensitive to certain wavelengths than to others (more to yellow and green; less to red and violet).
  DCT: the basic image unit is a block of 8x8 pixels. The 64 pixel values are transformed into a set of DCT weights for these characteristics.
Discrete Cosine Transformation (DCT)
  DCT transforms the image from the spatial domain to the frequency domain.
  Typical pictures have minimal changes in colour or luminance between two adjacent pixels.
  The frequency representation describes the amount of variation: the high frequencies have small coefficients, and the big coefficients of the low frequencies contain most of the information. (The coefficients are the weights.)
  Remove the small coefficients and round off the remaining coefficient values, so that less data can represent the whole image.
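To see the energy gather in the low frequencies, the sketch below applies a direct 2-D DCT-II to one 8x8 block. The block is a made-up smooth gradient, and the orthonormal scaling is one common convention (an assumption, not taken from the slides). The direct four-loop form is slow but easy to read.

```python
import math

N = 8  # block size used on the slide

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct O(N^4) form)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out

# Hypothetical smooth block: brightness rises gently from left to right.
block = [[100 + 2 * y for y in range(N)] for x in range(N)]
coeffs = dct_2d(block)

total_energy = sum(value ** 2 for row in coeffs for value in row)
low_energy = sum(coeffs[u][v] ** 2 for u in range(2) for v in range(2))

print(round(coeffs[0][0], 1))              # 856.0, i.e. 8 x the mean brightness
print(low_energy / total_energy > 0.99)    # True: the 2x2 low-frequency corner dominates
```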
Discrete Cosine Transformation (DCT)
  DCT can be used for photos.
  DCT is poorly suited to vector graphics or two-coloured pictures (black text, white background), where sharp edges dominate.
  DCT can be a lossless technique: use all coefficients and do not round off values.
  Or it can be lossy: information can be cut out. At the quantization step, the divisor selected for the matrix can reduce precision.
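The effect of the divisor can be sketched with a single row of coefficients. The coefficient values are invented, and one divisor is used for the whole row as a simplification; JPEG uses a matrix with a separate divisor per coefficient.

```python
def quantize(coeffs, divisor):
    """Divide each coefficient by the divisor and round to the nearest integer."""
    return [round(c / divisor) for c in coeffs]

def dequantize(levels, divisor):
    """Approximate reconstruction: multiply the integer levels back up."""
    return [level * divisor for level in levels]

def worst_error(coeffs, divisor):
    """Largest difference between original and reconstructed coefficients."""
    restored = dequantize(quantize(coeffs, divisor), divisor)
    return max(abs(a - b) for a, b in zip(coeffs, restored))

# Hypothetical DCT coefficients for one row of a block (made-up values).
coeffs = [856.0, -36.4, 0.0, -3.8, 0.0, -1.1, 0.0, -0.3]

for divisor in (1, 4, 16):
    levels = quantize(coeffs, divisor)
    # Columns: divisor, integer levels, number of zeros, worst reconstruction error
    print(divisor, levels, levels.count(0), round(worst_error(coeffs, divisor), 2))
```

A larger divisor produces more zeros, which helps the later run-length stage, at the price of a reconstruction error of up to half the divisor.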
Compression for images, video and audio: Basic Capture Encoding Steps
  1. Image Preparation: choose resolution and frame rate. (Choose compression format.)
  2. Image Processing: DCT. Lossy or lossless.
  3. Quantization: map the values in the data flow onto a limited set of discrete levels, comparable to analog-to-digital conversion.
  4. Second-stage Encoding: use Run Length Encoding or Huffman coding on the digital string. Lossless.
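For the second-stage encoding in step 4, a Huffman code can be sketched as below. The input list of quantized levels is invented, and the code table is built from that input alone; real formats such as JPEG define how their tables are stored, which is not shown here.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    counts = Counter(symbols)
    if len(counts) == 1:  # degenerate input with a single distinct symbol
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

def decode(bits, code):
    """Read bits left to right; a prefix code needs no separators."""
    reverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return out

# Hypothetical quantized levels: mostly zeros, a few other values.
levels = [0] * 12 + [1] * 3 + [-1] * 2 + [5]
code = huffman_code(levels)
bits = "".join(code[s] for s in levels)
print(len(bits))                      # 27 bits, against 36 for a fixed 2-bit code
print(decode(bits, code) == levels)   # True: the stage is lossless
```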
JPEG (Joint Photographic Experts Group)
  JPEG allows different resolutions for the individual components (interleaved / non-interleaved), using lossless and lossy modes. (There are 29 modes for compressing images.)
  Lossy JPEG uses 1. simplification (DCT) (lossy stage) and 2. entropy coding (Huffman or RLE) (lossless stage).
  Lossless JPEG has 7 predictive methods. Main categories:
    The next pixel on a line is the same as the previous one on the same line.
    The next pixel on a line is the same as the one directly above it.
    The next pixel on a line is related to the previous 3, e.g. an average.
  After the predicted pixel value is formed, it is compared with the actual pixel value, and the difference is encoded. If the differences are large you save little space, but no information is lost.
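The predict-and-encode-the-difference idea can be sketched with three simplified predictors. This is not the JPEG standard's exact predictor set: the "average" mode here uses only the left and upper neighbors, missing neighbors at the border are taken as 0, and the small image is invented.

```python
def predict(image, r, c, mode):
    """Predict pixel (r, c) from already-coded neighbors; 0 where none exist."""
    left = image[r][c - 1] if c > 0 else 0
    above = image[r - 1][c] if r > 0 else 0
    if mode == "left":
        return left
    if mode == "above":
        return above
    if mode == "average":
        return (left + above) // 2
    raise ValueError(f"unknown mode: {mode}")

def encode(image, mode):
    """Replace each pixel by its difference from the predicted value."""
    return [[image[r][c] - predict(image, r, c, mode)
             for c in range(len(image[0]))] for r in range(len(image))]

def decode(residuals, mode):
    """Rebuild pixels in raster order, so every neighbor is known when needed."""
    image = [[0] * len(residuals[0]) for _ in residuals]
    for r in range(len(residuals)):
        for c in range(len(residuals[0])):
            image[r][c] = residuals[r][c] + predict(image, r, c, mode)
    return image

# Hypothetical 3x4 grayscale image with smooth changes.
image = [[52, 55, 61, 66],
         [54, 56, 62, 68],
         [57, 59, 64, 70]]

for mode in ("left", "above", "average"):
    residuals = encode(image, mode)
    print(mode, residuals, decode(residuals, mode) == image)
```

Away from the border the differences are small numbers, which a later Huffman stage can code in few bits, and decoding always returns the original image.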
JPEG image: the original is a 921,600 byte file. Left is 252,906 bytes using JPEG lossless compression. Right is 33,479 bytes at the first level of lossy compression that a person can detect.
JPEG at higher compression ratios, distortions apparent: 41.1:1 (22,418 bytes), 64.9:1 (14,192 bytes), 102.6:1 (8,978 bytes).
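The compression ratios follow from the file sizes quoted on the two JPEG slides. A short check, with labels that are only descriptive names chosen here:

```python
ORIGINAL_BYTES = 921_600

# File sizes in bytes as quoted on the slides; the labels are informal.
compressed_sizes = {
    "lossless": 252_906,
    "first detectable loss": 33_479,
    "lossy, stronger": 22_418,
    "lossy, stronger still": 14_192,
    "lossy, strongest": 8_978,
}

ratios = {label: ORIGINAL_BYTES / size for label, size in compressed_sizes.items()}
for label, ratio in ratios.items():
    print(f"{label}: {ratio:.2f}:1")
```

The last three ratios come out at roughly 41, 65 and 103 to 1, in line with the figures on the slide.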
GIF (Graphics Interchange Format)
  GIF was developed by CompuServe to send images over telephone lines.
  It allows interlacing, so that a coarse version of the image appears early in the download.
  It supports only 256 colors or shades of gray.
  It uses Lempel-Ziv-Welch universal coding to encode whole bytes. This is faster than LZ that encodes at the bit level.
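The core of Lempel-Ziv-Welch coding can be sketched as below. This is the textbook algorithm on whole bytes, not the GIF file format: GIF's variable-width codes, clear codes and dictionary size limit are left out, and the input bytes are invented.

```python
def lzw_compress(data: bytes):
    """Return a list of integer codes; the dictionary starts with all 256 bytes."""
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate               # keep extending the known string
        else:
            codes.append(table[current])      # emit the longest known string
            table[candidate] = len(table)     # learn the new, longer string
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes):
    """Rebuild the same dictionary on the fly and expand the codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):              # code defined by this very step
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)

pixels = b"AAAAAAAABBBBAAAAAAAABBBB"  # hypothetical row of palette indexes
codes = lzw_compress(pixels)
print(len(pixels), len(codes))
print(lzw_decompress(codes) == pixels)  # True
```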
Digital Video: a sequence of images
  Full-motion video takes about 25 to 30 images/second (shown as 50 interlaced fields/second in PAL television).
  Each image in the sequence closely resembles the previous image.
  Edges of moving objects change the most.
  Different applications have different needs.
Digital Video: Needs
  Digital Movie Editing: accuracy.
  Digitized TV: low-quality image, high bit rate.
  HDTV: higher-quality image, higher bit rate.
  Video Discs (DVD): high-capacity storage, TV quality; (CD-ROM): lower-capacity storage.
  Internet Video: low access speed, supports only a low-quality picture.
  Video Teleconferencing: low-quality picture, but real time; audio cannot take delays.
MPEG (Moving Picture Experts Group)
  Algorithms for compression of audio and video:
  A movie is presented as a sequence of frames, and the frames typically have temporal correlations.
  If parts of frame #1 do not change, they do not need to be encoded again in frame #2 and frame #3.
  Motion vectors describe the re-use of a 16x16 pixel area in other frames.
  New areas are described as differences.
  MP3 is Layer 3 of the MPEG audio standard. Layers 1 and 2 are also audio layers, not video.
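Finding a motion vector can be sketched as an exhaustive block search. To keep the example small it uses 8x8 frames and a 2x2 block rather than 16x16 areas; the frames, the search range and the sum-of-absolute-differences cost are assumptions for illustration, and real encoders use faster search strategies.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(frame_a[ay + j][ax + i] - frame_b[by + j][bx + i])
               for j in range(size) for i in range(size))

def find_motion_vector(previous, current, x, y, size, search):
    """Return (cost, dx, dy): where in `previous` the block at (x, y) came from."""
    height, width = len(previous), len(previous[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = x + dx, y + dy
            if 0 <= px <= width - size and 0 <= py <= height - size:
                cost = sad(current, previous, x, y, px, py, size)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best

def make_frame(obj_x, obj_y, width=8, height=8, obj=2):
    """Dark frame (value 10) with one bright obj x obj square (value 200)."""
    frame = [[10] * width for _ in range(height)]
    for j in range(obj):
        for i in range(obj):
            frame[obj_y + j][obj_x + i] = 200
    return frame

previous = make_frame(1, 1)   # object at column 1, row 1
current = make_frame(3, 2)    # object has moved right by 2 and down by 1

cost, dx, dy = find_motion_vector(previous, current, x=3, y=2, size=2, search=2)
print(cost, dx, dy)  # 0 -2 -1
```

A cost of 0 means the block can be copied from the previous frame with the vector alone; a non-zero cost is the difference that would still have to be encoded.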
MPEG: single frame from a high-quality 5.26 second video sequence, compressed from an original size of 5.648 Mbytes to 2.437 Mbytes with the MPEG-1 compressor, set to preserve essentially all of the quality of the video sequence. See the distortions in the single frame.
MPEG frame types
  MPEG frame types with different temporal correlations:
  I-frames (intracoded full frames) are transmitted periodically.
  P-frames (predictive frames) and B-frames (bidirectional frames) are used for temporal correlations.
  P-frames look to the “past” (an earlier frame); B-frames look to both the “past” and the “future” (an earlier and a later frame).
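Because a B-frame depends on a later frame, that later frame has to reach the decoder first. The sketch below reorders a display sequence accordingly. The frame labels and the rule for trailing B-frames are assumptions for illustration, not taken from the MPEG specification.

```python
def transmission_order(display):
    """Reorder frames so each B-frame follows both of its reference frames."""
    out, pending_b = [], []
    for frame in display:
        if frame.startswith("B"):
            pending_b.append(frame)   # must wait for the next I- or P-frame
        else:
            out.append(frame)         # the reference frame is sent first
            out.extend(pending_b)     # then the B-frames that depend on it
            pending_b.clear()
    return out + pending_b            # assumption: trailing B-frames go last

display = ["I1", "B2", "B3", "P4", "B5", "B6", "P7"]
print(transmission_order(display))
# ['I1', 'P4', 'B2', 'B3', 'P7', 'B5', 'B6']
```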
MPEG modes
  MPEG-1: compressed video for storage on CDs, 40 kbps to 1.2 Mbps. Used on websites; can store movies on CDs.
  MPEG-2: high-quality video or HDTV, 4 to 10 Mbps. Can store full-length high-quality movies on DVDs.
  MPEG-4: low-quality video teleconferencing. Supports low bit rates, 4.8 kbps to 64 kbps.
Other Video Compression standards
  H.261 (Px64): video compression for video conferences. Compression in real time for use with ISDN. Compressed data stream = p x 64 kbps, where p = 1 to 30. There are 2 resolutions: CIF (Common Intermediate Format) and QCIF (Quarter Common Intermediate Format).
  Digital Video Interactive (DVI): Intel/IBM technology for digital video. DVI PLV (Production Level Video, for VCR), DVI RTV (Real Time Video, lower quality).
Storage Options: see additional notes. Disc technology: CD-ROM, RW, etc.; DVD.
System Requirements for real time environments
  1. The processing time for all system components must be predictable.
  2. Reservation of all types of resources must be possible.
  3. Short response time for time-critical processes.
  4. The system must be stable at high load.
Problems with continuous media
  Computers handle data as discrete, so continuous media, like video and audio, must be viewed as periodic and discrete by the computer.
  There are few file system services for continuous media applications.
  Kernel and hardware availability varies over time, so it is difficult to schedule processes.
  Many interrupts have to be handled by the system.
  Reservation of resources, like CPU time, is not possible on a standard desktop PC.
  Delivery delays on the network make scheduling periodic processing of the media difficult.
Possible responses to Problems
  The hardware architecture and the system software in desktop computers are not adapted for handling continuous media.
  Reserve: network bandwidth, if you can.
  Hardware support: use a dedicated server, or replace the single asynchronous bus.
  Operating system software is not suited to scheduling multimedia services or reserving resources. But application-level languages such as SMIL can help.
  SMIL (Synchronized Multimedia Integration Language) is an XML-based language to mix media presentations over low-speed connections (>= 28.8 kbps). See www.w3c.org