A Picture is Worth a Thousand Words Milton Chen. What’s a Picture Worth? A thousand words - Descartes (1596-1650) A thousand bytes - modern translation.


1 A Picture is Worth a Thousand Words Milton Chen

2 What’s a Picture Worth? A thousand words - Descartes (1596-1650) A thousand bytes - modern translation –1000 * 5 * 5 / 3 ≈ 8,000 bits 75,000 bytes - ATSC/MPEG-2 –20 M / 30 ≈ 600,000 bits
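The arithmetic on this slide can be checked with a short Python sketch. The slide gives only the numbers, so the meaning attached to each factor below (1000 words, about 5 letters per word, about 5 bits per letter, roughly 3:1 compression, a 20 Mbit/s channel at 30 frames per second) is an assumed reading, not something the source states.

```python
# Assumed interpretation of "1000 * 5 * 5 / 3" (not stated on the slide):
# 1000 words, ~5 letters per word, ~5 bits per letter, ~3:1 text compression.
words, letters_per_word, bits_per_letter, compression = 1000, 5, 5, 3
text_bits = words * letters_per_word * bits_per_letter / compression
text_bytes = text_bits / 8

# Assumed interpretation of "20 M / 30": ~20 Mbit/s shared by 30 frames/s.
channel_bits_per_second = 20_000_000
frames_per_second = 30
frame_bits = channel_bits_per_second / frames_per_second
frame_bytes = frame_bits / 8

print(f"text:  {text_bits:.0f} bits, about {text_bytes:.0f} bytes")
print(f"frame: {frame_bits:.0f} bits, about {frame_bytes:.0f} bytes")
```

The exact quotients are about 8,333 bits and about 666,667 bits; the slide rounds them down to 8,000 bits (a thousand bytes) and 600,000 bits (75,000 bytes).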

3 Frequency Response of the Eye Lens - low pass Photoreceptors - low pass Lateral inhibition - high pass –edge is important

4 Today’s Video Coding YUV (lossy) -> Motion -> DCT -> Quantize (lossy) -> Order -> Entropy Designed for natural scenes => Higher frequency DCT coefficients are quantized more => Sharp edges are not well preserved
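Why coarse quantization of high-frequency DCT coefficients hurts sharp edges can be seen in a small 1-D Python sketch. It is an illustration under stated assumptions, not the codec's actual quantizer: the high-frequency coefficients are simply set to zero as a crude stand-in for heavy quantization, and a single 8-sample row replaces the 8x8 block.

```python
import math

N = 8  # block length; this sketch is 1-D, real codecs transform 8x8 blocks

def scale(k):
    # Orthonormal DCT-II scaling factor.
    return math.sqrt((1 if k == 0 else 2) / N)

def dct(samples):
    # Forward DCT-II of one N-sample row.
    return [scale(k) * sum(x * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                           for n, x in enumerate(samples))
            for k in range(N)]

def idct(coeffs):
    # Inverse transform (DCT-III with the same orthonormal scaling).
    return [sum(scale(k) * c * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k, c in enumerate(coeffs))
            for n in range(N)]

edge = [0, 0, 0, 0, 255, 255, 255, 255]    # a sharp black-to-white edge
coeffs = dct(edge)

# Crude stand-in for heavy quantization: keep only the two lowest frequencies.
low_pass = coeffs[:2] + [0.0] * (N - 2)

exact = idct(coeffs)      # all coefficients kept: the edge survives
blurred = idct(low_pass)  # high frequencies dropped: the edge becomes a ramp

sharp_jump = exact[4] - exact[3]
blurred_jump = blurred[4] - blurred[3]
print([round(v) for v in blurred])
print(round(sharp_jump), round(blurred_jump))
```

With all coefficients kept, the step between samples 3 and 4 is the full 255 levels; with only the two lowest frequencies it shrinks to a small fraction of that, which is the smearing that makes text and graphics look poor.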

5 What’s Wrong with Today’s Video Coding Poor performance for –text (channel logo, stock ticks) –graphics –anything with sharp edges

6 Desirable Features Postproduction support Personalized delivery / presentation Interactive Error resilience More compression Facilitate search / indexing (MPEG-7)

7 Outline Why MPEG-4 Overview Systems Layer Visual Coding –Arbitrarily shaped video –Meshed video –Face and body

8 Goals of MPEG-4 One content –convergence of DTV, computer graphics, and WWW –broadcast, internet, local User interactivity Higher compression rates Robustness in mobile environment

9 MPEG-4 Applications Interactive TV (broadcast) –Home-shopping, Interactive game show Virtual workspace (internet) –virtual meeting, collaborative design Infotainment (local) –Virtual-City-Guide

10 MPEG-4 Key Concepts Independent coding of objects –allow user interactivity (client & server) –higher compression rates Provide tools as well as solutions –allow content specific and user defined compression algorithms

11 MPEG-4 History Started in July 1993 Originally for low-bit-rate applications Version 1 to be standardized by January 1999 Continue work on version 2, etc.

12 MPEG-4 Standard 1) Systems (manage streams, composition) 2) Visual (natural and synthetic) 3) Audio (natural and synthetic) 4) Conformance Testing 5) Reference Software 6) Delivery Multimedia Integration Framework (medium abstraction layer)

16 Previous Work in Object Coding Synthetic High System (Schreiber ‘59) Contour-Texture Approach (Kocher & Kunt ‘82) Object-Based Video Coder (Musmann et al. ‘89) Talisman (Torborg & Kajiya ‘96) Blue screen matting (Vlahos ‘64)

17 Shape Coding Bitmap-based –1 means in, 0 means out –Chroma-keying, GIF89a –G4 fax standard Contour-based –chain code –polygon/curve approximation –Fourier descriptor

18 Chain Code Follows the contour and encodes the direction of the next boundary pel 4 or 8 directions for an avg. of 1.2 or 1.4 bits per boundary pel Extensions –length –angular resolution
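A minimal Python sketch of an 8-direction chain code follows. The direction numbering (0 for +x, then counter-clockwise in 45 degree steps) and the helper names are assumptions made for illustration; the slide does not fix them. The differential form is included because it concentrates the symbols near zero, which is the kind of skew an entropy coder can exploit; the sketch does not attempt to reproduce the slide's bits-per-pel figures.

```python
# Hypothetical numbering: 0 = +x, increasing counter-clockwise in 45 degree steps.
DIRECTIONS_8 = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}

def chain_code(contour):
    """Encode an ordered list of neighbouring boundary pels as directions."""
    codes = []
    for (x0, y0), (x1, y1) in zip(contour, contour[1:]):
        codes.append(DIRECTIONS_8[(x1 - x0, y1 - y0)])
    return codes

def differential(codes):
    """Keep the first direction, then store each change of direction mod 8."""
    return [codes[0]] + [(b - a) % 8 for a, b in zip(codes, codes[1:])]

# A closed 2x2 square traced pel by pel, ending where it started.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2),
          (1, 2), (0, 2), (0, 1), (0, 0)]
codes = chain_code(square)
diffs = differential(codes)
print(codes)  # [0, 0, 2, 2, 4, 4, 6, 6]
print(diffs)  # [0, 0, 2, 0, 2, 0, 2, 0]
```

Only the starting pel and the direction list need to be transmitted; the decoder rebuilds the contour by walking the directions.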

19 Polygon Approximation Add control points until maximum error is below threshold Threshold <= 1.4 pel for CIF (352*288) video Extension –curves of various order
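The "add control points until maximum error is below threshold" loop can be sketched in Python as below. The splitting rule used here (always add the contour point farthest from the current polygon, in the style of the Ramer-Douglas-Peucker algorithm) is a swapped-in, well-known technique chosen for illustration; the slide does not say which rule the MPEG-4 experiments used. Function names are hypothetical, and the contour is treated as an open polyline for brevity.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def approximate_polygon(contour, max_error):
    """Add control points until every contour pel is within max_error."""
    keep = [0, len(contour) - 1]  # indices of the current control points
    while True:
        worst_err, worst_idx = 0.0, None
        for start, end in zip(keep, keep[1:]):
            for i in range(start + 1, end):
                d = point_segment_distance(contour[i], contour[start], contour[end])
                if d > worst_err:
                    worst_err, worst_idx = d, i
        if worst_idx is None or worst_err <= max_error:
            return [contour[i] for i in keep]
        keep.append(worst_idx)
        keep.sort()

# An L-shaped boundary, approximated with the slide's 1.4 pel threshold.
l_shape = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0),
           (4, 1), (4, 2), (4, 3), (4, 4)]
polygon = approximate_polygon(l_shape, max_error=1.4)
print(polygon)  # [(0, 0), (4, 0), (4, 4)]
```

Lowering the threshold adds control points and bits; raising it gives a coarser, cheaper shape.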

20 Fourier Descriptor Translation, rotation, and scale invariant Sample contour -> (x_i, y_i)_i, (y_{i+1} - y_i) / (x_{i+1} - x_i) Compute Fourier Series coefficients Good for recognition, but not an efficient shape coder
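The invariance claim can be illustrated with a small Python sketch. Note the substitution: instead of the slope-based contour signature on the slide, this uses the common complex-coordinate variant (each sample treated as x + iy), which is easier to show in a few lines. The function name, the number of coefficients kept, and the assumption that the contour is sampled counter-clockwise (so that coefficient 1 is non-zero) are all choices made for this example.

```python
import cmath
import math

def fourier_descriptor(points, n_coeffs=4):
    """Magnitudes of DFT coefficients 2..n_coeffs+1, normalized by coefficient 1.

    Dropping coefficient 0 removes translation, dividing by |c_1| removes
    scale, and taking magnitudes removes rotation and starting point.
    Assumes a counter-clockwise contour so that c_1 is non-zero.
    """
    n = len(points)
    z = [complex(x, y) for x, y in points]
    coeffs = [sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / n
              for k in range(n)]
    scale = abs(coeffs[1])
    return [abs(coeffs[k]) / scale for k in range(2, 2 + n_coeffs)]

# Eight samples around a square, counter-clockwise.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]

# The same shape scaled by 3, rotated by 0.7 rad, and shifted by (5, -2).
transform = 3 * cmath.exp(0.7j)
moved = []
for x, y in square:
    w = transform * complex(x, y) + complex(5, -2)
    moved.append((w.real, w.imag))

original_descriptor = fourier_descriptor(square)
moved_descriptor = fourier_descriptor(moved)
print(all(abs(a - b) < 1e-9 for a, b in zip(original_descriptor, moved_descriptor)))
```

Because the descriptor deliberately discards position, size, and orientation, it suits shape recognition; a shape coder has to transmit exactly that discarded information, which is consistent with the slide's remark that it is not an efficient coder.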

21 MPEG-4 Experiments Chroma-keying –color bleeding –need to decode whole frame to get shape Bitmap and contour-based coding are similar in: –error resilience –coding efficiency Bitmap-based is simpler for hardware due to regular memory access

22 MPEG-4 Shape Coding Three types of macroblocks –transparent, opaque, and object boundary Context-based arithmetic encoder Macroblocks can be subsampled Texture padded with 0 or mean value Transparency –constant: one 8 bit value –arbitrary: treat it like color
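The three macroblock types can be sketched in Python as a classification pass over a binary alpha mask. This shows only the labelling step, under the assumption that the mask is a list of rows of 0/1 values whose dimensions are multiples of the block size; the context-based arithmetic coding of boundary blocks is not attempted. The function name and the `block` parameter are hypothetical (MPEG-4 macroblocks are 16x16; the example uses 2x2 to stay readable).

```python
def classify_macroblocks(mask, block=16):
    """Label each block of a binary alpha mask.

    mask  -- list of rows, 1 = inside the object, 0 = outside
    block -- block edge length in pels (hypothetical parameter)
    """
    labels = []
    for top in range(0, len(mask), block):
        row_labels = []
        for left in range(0, len(mask[0]), block):
            pels = [mask[y][x]
                    for y in range(top, top + block)
                    for x in range(left, left + block)]
            if not any(pels):
                row_labels.append("transparent")  # nothing to code
            elif all(pels):
                row_labels.append("opaque")       # texture only, no shape bits
            else:
                row_labels.append("boundary")     # shape must be coded
        labels.append(row_labels)
    return labels

mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
labels = classify_macroblocks(mask, block=2)
print(labels)  # [['transparent', 'opaque'], ['boundary', 'opaque']]
```

Only the boundary blocks need per-pel shape information, which is why the split into three types saves bits.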

23 Meshed Video 2D mesh tessellates the video into patches Motion vector for each vertex Texture warped in each patch
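Warping texture inside one triangular patch can be sketched in Python with barycentric coordinates: a point keeps its weights relative to the three vertices while the vertices move by their motion vectors, which gives an affine map per patch. This assumes a triangular mesh and forward mapping of a single point; the function names are hypothetical, and a real decoder would typically map destination pels back to the source and interpolate texture.

```python
def barycentric(p, a, b, c):
    """Weights (w_a, w_b, w_c) such that p = w_a*a + w_b*b + w_c*c."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    w_a = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w_b = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w_a, w_b, 1.0 - w_a - w_b

def warp_point(p, triangle, motion_vectors):
    """Move p with its patch: same weights, displaced vertices."""
    moved = [(vx + dx, vy + dy)
             for (vx, vy), (dx, dy) in zip(triangle, motion_vectors)]
    weights = barycentric(p, *triangle)
    x = sum(w * v[0] for w, v in zip(weights, moved))
    y = sum(w * v[1] for w, v in zip(weights, moved))
    return x, y

triangle = [(0, 0), (4, 0), (0, 4)]        # one mesh patch
motion_vectors = [(1, 0), (0, 0), (0, 2)]  # one vector per vertex

print(warp_point((0, 0), triangle, motion_vectors))  # (1.0, 0.0)
print(warp_point((1, 1), triangle, motion_vectors))  # (1.5, 1.5)
```

Because each vertex moves independently, a patch can rotate, scale, or shear, which a single translational block vector cannot express.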

24 Meshed Video - Motivation Motion Modeling –Translational-block motion does not model rotation, scaling, reflection, and shear Shape Modeling –Possible without depth

25 Meshed Video - Applications Compression –better motion compensation –transmit texture only at key frames –spatio-temporal interpolation (zooming, frame-rate up-conversion) Manipulation –augmented reality –transfiguration (replace billboards) Indexing / searching

26 Face Face object –Default face model with terminal –Facial Definition Parameter or user supplied model/texture –Facial Animation Parameter plus Amplification and Filters –Lip Shape Animation from phoneme

27 Facial Definition Parameter

28 Facial Animation Parameter

29 Body Like the face

30 Ultimate Compression Technique Computer Graphics ??? Block based DCT(MPEG-1/2) Arbitrary shaped video (MPEG-4) Meshed video (MPEG-4) Image based rendering Textured 3D graphics Geometry only 3D graphics

