T305: Digital Communications
Arab Open University - Lebanon, Tutorial 12
Block III – Video Coding

Introduction
Digital video has a number of advantages over analogue for broadcast TV:
The effect of transmission impairments on picture quality is far less than in the analogue case. In particular, 'ghost' pictures due to the presence of multiple signal transmission paths and reflections are eliminated.
Bandwidth is nearly always at a premium, and digital television allows more channels to be accommodated in a given bandwidth.
Different types of programme material, such as teletext or subtitles in several languages, can be accommodated much more flexibly with digital coding.

Introduction
The message coding technique adopted for Digital Video Broadcasting (DVB) was MPEG-2, a pre-existing standard appropriate for a wide range of video applications. MPEG stands for Moving Picture Experts Group. MPEG has defined a number of standards for the compression of moving pictures, including MPEG-1 and MPEG-2, both of which appear in this tutorial.

Introduction
In both film and television, moving scenes are shown as a series of fixed pictures, usually generated at a rate of about 25 per second. There is often very little change between consecutive pictures, and MPEG-2 coding takes advantage of this to achieve high degrees of compression.

The structure of video pictures
Monochrome pictures
The electron beam in the cathode ray tube (CRT) is made to scan the whole visible surface of the screen in a zig-zag pattern called a raster, shown schematically below.
Fig. A simple raster scanning pattern. The trace is blanked out during the flyback.

Monochrome pictures
During this forward motion, the beam current is modulated, that is, its magnitude is varied so as to produce a spot of varying brightness on the screen in such a way as to build up a display of the transmitted image. The beam moves much faster and is blanked (effectively zero current) as it flies back from right to left. When a scan is complete, the beam flies back to the top of the screen and starts producing the next picture. This is very much like the cinema, in which the illusion of motion is created by a succession of still pictures. In early cinemas, mechanical constraints on the film projectors led to there being too few pictures, or frames, per second. This produced a flickering sensation, hence the name 'flicks'.

Monochrome pictures
In the case of digital picture coding, the analogue variation of brightness along each line has to be converted into a series of discrete digital samples. The picture is therefore coded as a series of dots called picture elements, or pixels. Thus, the resolution of digital displays is normally expressed in terms of the number of pixels as, for instance, 800 × 600 pixels: here, 800 is the number of pixels in one horizontal line and 600 is the number of lines on the visible portion of the screen. Other common standards include 640 × 480, the 'cheap and cheerful' standard, and 1024 × 768, or even 1600 × 1200 for detailed graphic work.

Adding color
Vision is a very complex subject, but it is sufficient to appreciate that the human eye has three types of receptor, sensitive to red, green and blue light. By combining sources of these three primary colors and choosing appropriate intensities for each, we can produce effectively any shade of color the human eye can perceive. Color television tubes, for instance, have three electron beams focused in such a way that each beam can only land on one of three separate sets of phosphor spots on the screen, each set producing one of the three primary colors when energized. The spots lie close together, so that the eye perceives a single color resulting from the combination of the local intensities of the three primaries.

Adding color
The black and white (that is, shades of grey) signal, known as the luminance signal, can be obtained from a suitably weighted sum of the three color signals. The luminance signal, Y, can be expressed as:

Y = 0.299R + 0.587G + 0.114B

Two more signals are needed to enable the R, G, B color signals to be reconstituted before feeding to the display device. The signals used are:

Cb = 0.564(B − Y)    Cr = 0.713(R − Y)

Cb and Cr are known as the color difference, chrominance or chroma signals. They are defined by the ITU-T for digital video systems.
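To make the weightings concrete, here is a minimal Python sketch (ours, for illustration; the function name rgb_to_ycbcr is our own) that converts normalized R, G, B values to Y, Cb, Cr using the formulas above:

```python
def rgb_to_ycbcr(r, g, b):
    """Convert normalized R, G, B (0.0-1.0) to Y, Cb, Cr.

    Uses the weightings quoted above: Y is a weighted sum of
    the primaries; Cb and Cr are scaled color differences.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.564 * (b - y)
    cr = 0.713 * (r - y)
    return y, cb, cr

# Pure white gives maximum luminance and zero chrominance:
print(rgb_to_ycbcr(1.0, 1.0, 1.0))  # (1.0, 0.0, 0.0)
```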

Digital conversion
In a typical video camera, the image of the scene is scanned electronically to produce the chrominance and luminance signals for each picture line. The output of a video camera is essentially analogue and has to be converted to digital form in order to take advantage of the many digital processing techniques which are available. The output bandwidths of analogue broadcast TV cameras and many other analogue video sources are typically about 5 to 6 MHz for the luminance signal and about half of this for each of the chrominance signals.

Digital conversion
Sampling rates numerically equal to at least twice the bandwidths are required, and the ITU-T has standardized on sampling frequencies of 13.5 MHz for the luminance signal and 6.75 MHz for each of the chrominance signals. Both types of signal are quantized using eight bits per sample and linear quantization. The compression rates that have been achieved for video coding are most impressive: MPEG-2 coding allows broadcast-quality pictures to be transmitted at around 4 Mbit/s.
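As a sanity check on these figures (the arithmetic below is our own, not from the slides): sampling at 13.5 MHz for luminance plus 6.75 MHz for each of the two chrominance signals, at 8 bits per sample, gives the uncompressed bit-rate, which can then be compared with the 4 Mbit/s quoted for MPEG-2.

```python
# Uncompressed bit-rate for the ITU-T studio sampling rates.
luma_rate = 13.5e6          # luminance samples per second
chroma_rate = 6.75e6        # samples per second, each of Cb and Cr
bits_per_sample = 8

raw_bit_rate = (luma_rate + 2 * chroma_rate) * bits_per_sample
print(raw_bit_rate / 1e6)   # 216.0 Mbit/s uncompressed

# Compression factor implied by 4 Mbit/s broadcast MPEG-2:
print(raw_bit_rate / 4e6)   # 54.0
```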

Sampling formats
Since the chrominance (color) sampling rate is half the luminance sampling rate, the question arises as to when the chrominance samples are taken relative to the luminance samples. Figure (a) represents the luminance sampling. The figure represents part of a camera scanning raster, and the circles show the times when the camera output luminance signal is sampled. The samples are taken consecutively along each line at the sampling rate (that is, a sample is taken every 1/(13.5 × 10⁶) s ≈ 0.074 μs).

Sampling formats
Fig. Video sampling.
The Cb and Cr chrominance signals are sampled at half the luminance rate, and an obvious way of doing this is to take chrominance samples which coincide with alternate luminance ones. This is shown in Figure (b), and is known as 4:2:2 sampling. In many cases it is possible to reduce the number of chrominance samples further while still producing acceptable pictures. There are various ways of doing this, one of which can be used for MPEG coding: it is known as 4:2:0 sampling and is shown in Figure (c), where each cross represents a pair of chrominance (Cb and Cr) samples.
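To illustrate the idea (this sketch is ours, not from the tutorial), 4:2:0 sampling can be modelled as keeping half the chrominance samples both horizontally and vertically, so each chrominance plane carries a quarter of the luminance sample count; one common simple implementation averages each 2 × 2 neighborhood.

```python
import numpy as np

def subsample_420(chroma):
    """Reduce a chrominance plane to 4:2:0 resolution.

    Averages each 2x2 block of samples, halving the resolution
    horizontally and vertically. Assumes even dimensions.
    """
    h, w = chroma.shape
    blocks = chroma.reshape(h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3))

cb = np.arange(16.0).reshape(4, 4)   # toy 4x4 chrominance plane
print(subsample_420(cb).shape)       # (2, 2): a quarter of the samples
```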

Sampling formats
Where lower resolution is acceptable, a source intermediate format (SIF) can be used.

The coding of still pictures
MPEG is designed to squeeze out as much redundancy as possible in order to achieve high levels of compression. This is done in two stages:
Spatial compression uses the fact that, in most pictures, there is considerable correlation between neighboring areas (and, hence, a high degree of redundancy in the data directly obtained by sampling) to compress each picture in a video sequence separately.
Temporal compression uses the fact that, in most picture sequences, there is normally very little change during the 1/25 s interval between one picture and the next. The resulting high degree of correlation between consecutive pictures allows a considerable amount of further compression.

The discrete cosine transform
The first stage of spatial compression uses a variant of the Fourier transform known as the discrete cosine transform (DCT), applied to 8 × 8 blocks of data.

The discrete cosine transform
The DCT is the transform used for JPEG and MPEG coding. It is a reversible transform: applied to n original samples, it yields n amplitude values, and applying the reverse transform to these n amplitudes enables one to recover the original sample values. If the high-frequency components are sufficiently small, then setting them to zero before carrying out the reverse transform will produce a picture which, to a human observer, is effectively the same as the original one. This is the essence of the compression process.

The discrete cosine transform
The DCT can be used to take advantage of correlations between pixels horizontally along a picture line. The same technique could be used on vertical columns of pixels, but a much higher degree of compression can be achieved by using it simultaneously in both directions. This is done by applying a two-dimensional DCT to rectangular 8 × 8 blocks of pixels. The two-dimensional DCT applied to the 64 luminance values of an 8 × 8 block yields 64 amplitudes of two-dimensional spatial cosine functions, which are shown below. The spatial frequencies range from 0 (the dc term) to 7 in both directions, and each basis function varies as a cosine in both the horizontal and vertical directions.

The discrete cosine transform
Fig. Example of an 8 × 8 DCT.
In the figure above, each amplitude applies to a different component, and the way the amplitudes are ordered in the right-hand transform output block is shown below.
Fig. Block notation.
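As an aside (a sketch of ours, not part of the course material), the two-dimensional DCT just described can be computed by applying a one-dimensional DCT along the rows and then along the columns; the sketch below uses SciPy's dct routine and checks two of the properties quoted in these slides.

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    """Two-dimensional DCT-II of a block: a 1-D DCT applied
    along one axis, then along the other."""
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def idct2(coeffs):
    """Inverse 2-D DCT, recovering the original block."""
    return idct(idct(coeffs, axis=0, norm='ortho'), axis=1, norm='ortho')

block = np.random.randint(0, 256, (8, 8)).astype(float)
coeffs = dct2(block)

# The top-left coefficient is the dc term: with 'ortho'
# normalization it equals 8 times the block's mean luminance.
print(np.isclose(coeffs[0, 0], 8 * block.mean()))   # True

# Reversibility: the inverse transform recovers the block.
print(np.allclose(idct2(coeffs), block))            # True
```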

The discrete cosine transform
The output block is organized so that the horizontal frequencies increase from left to right and the vertical frequencies increase from top to bottom. The top-left component, with zero vertical and horizontal frequencies, is the dc term, which represents the average luminance of the block.

Thresholding and requantization
Humans are not very sensitive to fine detail at low luminance levels. This allows higher spatial frequency components to be eliminated. Also, in general, humans are less sensitive to the contribution of high-frequency components compared with lower ones. This is taken into account by using requantization: fewer bits are used for the higher-frequency components than for the low-frequency ones. The DCT output block of amplitudes is reproduced in Table 4.1.

Thresholding and requantization
A requantization table stored in the encoder is used. One of the tables used for luminance coding in MPEG-2 is shown in Table 4.2. Each amplitude value in the DCT output table is divided by the corresponding number in the quantization table, and the result, rounded to the nearest integer, replaces the original amplitude, as sketched below.
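A minimal sketch of that divide-and-round step (the quantization matrix below is illustrative only; it is not the actual Table 4.2, which is not reproduced here):

```python
import numpy as np

def requantize(dct_coeffs, quant_table):
    """Divide each DCT amplitude by the corresponding entry of
    the quantization table and round to the nearest integer.
    Large divisors at high frequencies drive many of those
    components to zero."""
    return np.round(dct_coeffs / quant_table).astype(int)

# Illustrative (hypothetical) table: divisors grow with spatial
# frequency, so high-frequency amplitudes are coded more coarsely.
i, j = np.indices((8, 8))
quant_table = 8 + 4 * (i + j)

coeffs = np.zeros((8, 8))
coeffs[0, 0], coeffs[7, 7] = 824.0, 5.0
print(requantize(coeffs, quant_table)[0, 0])   # 824 / 8 -> 103
print(requantize(coeffs, quant_table)[7, 7])   # 5 / 64 -> 0
```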

Thresholding and requantization
[Tables 4.1 (the DCT output amplitudes) and 4.2 (an MPEG-2 luminance requantization table) are shown on this slide.]

Zig-zag scan and run-length encoding
In general, the higher the frequencies, the more zeros there are in the requantized block. In order to take advantage of this, the requantized values are rearranged for further processing in the order shown below, which places them in order of ascending frequency for the horizontal and vertical directions combined. The result is that there are relatively long sequences consisting entirely of zeros. This is rather like the sequences of pels of the same 'colour' (i.e. black or white) in faxed documents and, as in that case, run-length encoding leads to useful compression.
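The zig-zag ordering itself can be generated rather than tabulated: coefficients are visited along the anti-diagonals of the block, in order of increasing combined frequency, with the traversal direction alternating from one diagonal to the next. A minimal sketch (ours, for illustration):

```python
def zigzag_order(n=8):
    """Return the (row, col) index pairs of an n x n block in
    zig-zag order: anti-diagonals of increasing combined
    frequency, traversed in alternating directions."""
    indices = [(i, j) for i in range(n) for j in range(n)]
    return sorted(
        indices,
        key=lambda ij: (ij[0] + ij[1],                 # anti-diagonal index
                        ij[0] if (ij[0] + ij[1]) % 2   # odd diagonals: top-down
                        else -ij[0]))                  # even diagonals: bottom-up

print(zigzag_order()[:6])
# [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```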

Zig-zag scan and run-length encoding
The dc term is coded separately using differential coding. This just involves sending the difference between the value of the dc term and that of the contiguous block encoded immediately before. The small jumps in average luminance from one block to the next can cause the block structure used for coding to become apparent, an effect known as blocking. Blocking is minimized by differential coding of the dc term, because the small difference between consecutive values allows greater accuracy for the dc terms and thus reduces the size of the steps arising from the quantization process.

Zig-zag scan and run-length encoding
For the Table 4.3 values, zig-zag scanning gives 103, followed by two zeros, −2, 1, 1, two zeros, 1, 42 zeros, 1, and finally 12 zeros. Assuming that the difference between the current and previous dc terms was 4, the data after zig-zag scanning and run-length coding would be sent as:
4, 2, −2, 0, 1, 0, 1, 2, 1, 42, 1, 12
That is, the dc difference first, then a (run-of-zeros, value) pair for each nonzero amplitude, and finally the length of the trailing run of zeros.
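A minimal sketch (ours) of this run-length step, reproducing the numbers above:

```python
def run_length_encode(zigzag, prev_dc):
    """Run-length encode a zig-zag-scanned block: the dc
    difference first, then (zero-run, value) pairs, then the
    length of the trailing run of zeros."""
    out = [zigzag[0] - prev_dc]      # differential coding of the dc term
    run = 0
    for v in zigzag[1:]:
        if v == 0:
            run += 1
        else:
            out += [run, v]
            run = 0
    out.append(run)                  # trailing zeros
    return out

# The sequence described above: 103, two zeros, -2, 1, 1,
# two zeros, 1, 42 zeros, 1, and 12 final zeros.
zz = [103, 0, 0, -2, 1, 1, 0, 0, 1] + [0] * 42 + [1] + [0] * 12
print(run_length_encode(zz, prev_dc=99))
# [4, 2, -2, 0, 1, 0, 1, 2, 1, 42, 1, 12]
```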

Huffman coding
The final step in the coding of single pictures is to use Huffman coding for the pairs of numbers resulting from the run-length encoding. Separate tables, stored in the encoder, are used for the run lengths and for the luminance values. The tables have been constructed from the statistics of typical luminance data; a sketch of how such a code is built appears at the end of this slide.
Summary of spatial coding
The coding for the chrominance is essentially the same as for the luminance, but different quantization tables and Huffman code tables are used, based on the statistics of typical sets of chrominance data and on relevant features of our perception of color. The overall coding process for a single picture is summarized below. The decoder, at the receiving end, carries out the same processes in reverse, using the same set of tables as the encoder.
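Returning to the Huffman step above: the sketch below (ours, and generic; the actual MPEG tables are standardized in advance rather than computed on the fly) shows how a Huffman code assigns short codewords to frequent symbols, such as short zero-runs, and long codewords to rare ones.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a Huffman code from a list of symbols: frequent
    symbols get short codewords, rare ones long codewords."""
    freq = Counter(symbols)
    # Each heap entry: (frequency, tie-breaker, {symbol: codeword}).
    heap = [(f, i, {s: ''}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, i, c2 = heapq.heappop(heap)
        # Merge the two rarest subtrees, prefixing 0 and 1.
        merged = {s: '0' + w for s, w in c1.items()}
        merged.update({s: '1' + w for s, w in c2.items()})
        heapq.heappush(heap, (f1 + f2, i, merged))
    return heap[0][2]

# Short zero-runs dominate, so they get the shortest codewords:
runs = [0] * 8 + [1] * 4 + [2] * 2 + [42]
codes = huffman_code(runs)
print({s: len(w) for s, w in sorted(codes.items())})
# {0: 1, 1: 2, 2: 3, 42: 3}  (codeword lengths in bits)
```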

Summary of spatial coding
The coding techniques can be divided into two categories:
Reversible or lossless coding, for which the exact data can be recovered after decoding. Huffman and run-length encoding are examples. The DCT is also effectively reversible, although some errors are in fact introduced through rounding and other effects. Reversible coding preserves all the information contained in the signal.
Non-reversible or lossy coding, which causes some information to be lost irrecoverably. The requantization, which reduces the number of bits per sample, is non-reversible.

Fig. Summary of spatial coding.

The coding of moving pictures – MPEG
When pictures are produced at a rate of 25 per second, there cannot be much change between one picture and the next as a scene evolves in time, except, occasionally, when the camera or video editor cuts abruptly from one scene to another. There is, therefore, a considerable amount of temporal redundancy, which is exploited in MPEG by, in effect, transmitting only the differences between one picture and the next. MPEG-1 was designed for use in conjunction with compact discs (CDs) used for multimedia material and, although it cannot cope with the much higher video bit-rates required for broadcast television, it can cope with the much lower rates required for audio coding.