1
A Picture is Worth a Thousand Words Milton Chen
2
What’s a Picture Worth? A thousand words - Descartes (1596-1650) A thousand bytes - modern translation –1000 * 5 * 5 / 3 ≈ 8,000 bits 75,000 bytes - ATSC/MPEG-2 –20 M / 30 ≈ 600,000 bits
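The arithmetic on this slide can be checked with a short sketch. The meaning of each factor is an assumption, since the slide does not label them: 1000 words, 5 letters per word, 5 bits per letter, and a division by 3 for the text; a 20 Mbit/s ATSC/MPEG-2 stream at 30 frames per second for the picture. The function and parameter names are illustrative only.

```python
# Back-of-the-envelope figures from the slide.
# ASSUMPTION: the factors mean words * letters/word * bits/letter / 3,
# and bit rate / frame rate. The slide itself does not label them.

def text_bits(words=1000, letters_per_word=5, bits_per_letter=5, divisor=3):
    """Bits needed for a thousand words under the assumed reading."""
    return words * letters_per_word * bits_per_letter / divisor


def frame_bits(bitrate_bps=20_000_000, frames_per_second=30):
    """Average bits available per video frame."""
    return bitrate_bps / frames_per_second


if __name__ == "__main__":
    for label, bits in (("thousand words", text_bits()),
                        ("one video frame", frame_bits())):
        print(f"{label}: {bits:.0f} bits, {bits / 8:.0f} bytes")
```

The unrounded values are about 8,333 bits and about 666,667 bits, which the slide rounds down to 8,000 and 600,000.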
3
Frequency Response of the Eye Lens - low pass Photoreceptors - low pass Lateral inhibition - high pass –edge is important
4
Today’s Video Coding YUV (lossy) -> Motion -> DCT -> Quantize (lossy) -> Order -> Entropy Designed for natural scenes => Higher-frequency DCT coefficients are quantized more => Sharp edges are not well preserved
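A minimal sketch of why frequency-dependent quantization hurts edges. It applies an 8-point DCT to a smooth ramp and to a sharp step, quantizes the coefficients, and compares the reconstruction error. The step sizes are hypothetical and are not taken from any MPEG quantization matrix; they only share the property that the step grows with frequency.

```python
import math

N = 8  # one row of an 8x8 block


def _scale(k):
    # Orthonormal DCT-II scaling factors.
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def dct(x):
    """Orthonormal 1-D DCT-II of an N-sample row."""
    return [_scale(k) * sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N)
                            for n in range(N))
            for k in range(N)]


def idct(c):
    """Inverse of dct() above."""
    return [sum(_scale(k) * c[k] * math.cos(math.pi * (n + 0.5) * k / N)
                for k in range(N))
            for n in range(N)]


def quantize(c, steps):
    """Uniform quantization: snap each coefficient to a multiple of its step."""
    return [round(ck / q) * q for ck, q in zip(c, steps)]


def max_error(x, steps):
    """Largest per-sample error after DCT -> quantize -> inverse DCT."""
    y = idct(quantize(dct(x), steps))
    return max(abs(a - b) for a, b in zip(x, y))


# HYPOTHETICAL step sizes: coarser steps at higher frequencies.
STEPS = [8 + 12 * k for k in range(N)]

ramp = [100 + 4 * n for n in range(N)]        # smooth, "natural" content
edge = [50, 50, 50, 50, 200, 200, 200, 200]   # sharp edge, as in text or graphics

if __name__ == "__main__":
    print("ramp max error:", max_error(ramp, STEPS))
    print("edge max error:", max_error(edge, STEPS))
```

The step signal puts much of its energy in the high-frequency coefficients, so it loses more when those are quantized coarsely.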
5
What’s Wrong with Today’s Video Coding Poor performance for –text (channel logo, stock ticks) –graphics –anything with sharp edges
6
Desirable Features Postproduction support Personalized delivery / presentation Interactive Error resilience More compression Facilitate search / indexing (MPEG-7)
7
Outline Why MPEG-4 Overview Systems Layer Visual Coding –Arbitrarily shaped video –Meshed video –Face and body
8
Goals of MPEG-4 One content –convergence of DTV, computer graphics, and WWW –broadcast, internet, local User interactivity Higher compression rates Robustness in mobile environment
9
MPEG-4 Applications Interactive TV (broadcast) –Home-shopping, Interactive game show Virtual workspace (internet) –virtual meeting, collaborative design Infotainment (local) –Virtual-City-Guide
10
MPEG-4 Key Concepts Independent coding of objects –allow user interactivity (client & server) –higher compression rates Provide tools as well as solutions –allow content specific and user defined compression algorithms
11
MPEG-4 History Started in July 1993 Originally for low-bit-rate applications Version 1 to be standardized by January 1999 Continue work on version 2, etc.
12
MPEG-4 Standard 1) Systems (manage streams, composition) 2) Visual (natural and synthetic) 3) Audio (natural and synthetic) 4) Conformance Testing 5) Reference Software 6) Delivery Multimedia Integration Framework (medium abstraction layer)
16
Previous Work in Object Coding Synthetic High System (Schreiber ’59) Contour-Texture Approach (Kocher & Kunt ’82) Object-Based Video Coder (Musmann et al. ’89) Talisman (Torborg & Kajiya ’96) Blue screen matting (Vlahos ’64)
17
Shape Coding Bitmap-based –1 means in, 0 means out –Chroma-keying, GIF89a –G4 fax standard Contour-based –chain code –polygon/curve approximation –Fourier descriptor
18
Chain Code Follows the contour and encodes the direction of the next boundary pel 4 or 8 directions for an avg. of 1.2 or 1.4 bits per boundary pel Extensions –length –angular resolution
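A minimal sketch of an 8-direction chain code, assuming the contour is already available as an ordered list of neighbouring pels. The direction numbering (0 = +x, increasing in 45 degree steps) is a common convention chosen here for illustration, not one specified by the slide.

```python
# Direction index -> (dx, dy). ASSUMED numbering: 0 is +x, then 45 degree steps.
DIRS8 = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]


def chain_code(contour):
    """Encode an ordered list of 8-connected boundary pels as direction codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(contour, contour[1:]):
        codes.append(DIRS8.index((x1 - x0, y1 - y0)))
    return codes


def decode(start, codes):
    """Rebuild the boundary pels from a start pel and its chain code."""
    points = [start]
    for code in codes:
        dx, dy = DIRS8[code]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points


if __name__ == "__main__":
    # A closed 2x2 square traced pel by pel.
    square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2),
              (1, 2), (0, 2), (0, 1), (0, 0)]
    codes = chain_code(square)
    print(codes)
    assert decode(square[0], codes) == square
```

Stored at fixed length, each step costs 2 bits for 4 directions or 3 bits for 8. The lower averages quoted on the slide presumably rely on entropy coding of the direction sequence, which this sketch does not attempt.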
19
Polygon Approximation Add control points until maximum error is below threshold Threshold <= 1.4 pel for CIF (352*288) video Extension –curves of various order
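The refinement loop on this slide can be sketched with a Douglas-Peucker style split: keep the end points, find the contour point farthest from the current segment, and add it as a control point if its error exceeds the threshold. This is one common way to do it and is an assumption here; the slide does not say which refinement rule was used. The contour is treated as an open polyline for simplicity.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def approximate(points, threshold=1.4):
    """Return control points so that no contour point is farther than threshold."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    dists = [point_segment_distance(p, a, b) for p in points[1:-1]]
    worst = max(range(len(dists)), key=dists.__getitem__)
    if dists[worst] <= threshold:
        return [a, b]
    split = worst + 1  # index of the new control point in `points`
    left = approximate(points[:split + 1], threshold)
    right = approximate(points[split:], threshold)
    return left[:-1] + right  # drop the duplicated split point


if __name__ == "__main__":
    corner = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0),
              (4, 1), (4, 2), (4, 3), (4, 4)]
    print(approximate(corner))
```

For the L-shaped contour in the example, the only control point added is the corner pel, since every other point lies on one of the two resulting segments.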
20
Fourier Descriptor Translation, rotation, and scale invariant Sample contour -> $(x_i, y_i)$ -> $i,\ (y_{i+1} - y_i) / (x_{i+1} - x_i)$ Compute Fourier Series coefficients Good for recognition, but not an efficient shape coder
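A minimal sketch of how the invariances arise. Note the swap: instead of the slope signal on the slide, it uses the complex-coordinate signature z = x + iy, a common textbook variant, because the invariances are easy to see there. Dropping the zero-frequency coefficient removes translation, dividing by the magnitude of the first coefficient removes scale, and keeping only magnitudes removes rotation.

```python
import cmath
import math


def dft(z):
    """Plain O(n^2) discrete Fourier transform, normalized by n."""
    n = len(z)
    return [sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / n
            for k in range(n)]


def descriptor(points):
    """Translation-, rotation-, and scale-invariant descriptor of a closed contour.

    ASSUMPTION: uses z = x + iy as the contour signal, not the slope signal.
    """
    coeffs = dft([complex(x, y) for x, y in points])
    scale = abs(coeffs[1])  # assumed non-zero for the contours used here
    # Skip coeffs[0] (position) and coeffs[1] (used for normalization).
    return [abs(c) / scale for c in coeffs[2:]]


def transform(points, angle, scale, shift):
    """Rotate, scale, and translate a contour (helper for the demo)."""
    rot = scale * cmath.exp(1j * angle)
    moved = [rot * complex(x, y) + complex(*shift) for x, y in points]
    return [(m.real, m.imag) for m in moved]


if __name__ == "__main__":
    shape = [(0, 0), (2, 0), (4, 0), (4, 1), (4, 3), (2, 3), (0, 3), (0, 1)]
    other = transform(shape, math.radians(30), 2.5, (7, -3))
    a, b = descriptor(shape), descriptor(other)
    print(max(abs(u - v) for u, v in zip(a, b)))  # close to zero
```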
21
MPEG-4 Experiments Chroma-keying –color bleeding –need to decode whole frame to get shape Bitmap and contour-based coding are similar in: –error resilience –coding efficiency Bitmap-based is simpler for hardware due to regular memory access
22
MPEG-4 Shape Coding Three types of macroblocks –transparent, opaque, and object boundary Context-based arithmetic encoder Macroblocks can be subsampled Texture padded with 0 or mean value Transparency –constant: one 8 bit value –arbitrary: treat it like color
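The three macroblock types can be illustrated with a small classifier over a binary shape mask (1 means inside the object, 0 means outside). This is a sketch of the classification idea only, with a hypothetical function name; it does not implement the context-based arithmetic encoder, subsampling, or padding.

```python
def classify_macroblocks(mask, size=16):
    """Label each size x size block of a binary mask.

    mask is a list of rows of 0/1 values. Returns a 2-D list of labels:
    "transparent" (all 0), "opaque" (all 1), or "boundary" (mixed).
    """
    rows, cols = len(mask), len(mask[0])
    labels = []
    for top in range(0, rows, size):
        row_labels = []
        for left in range(0, cols, size):
            block = [mask[y][x]
                     for y in range(top, min(top + size, rows))
                     for x in range(left, min(left + size, cols))]
            inside = sum(block)
            if inside == 0:
                row_labels.append("transparent")
            elif inside == len(block):
                row_labels.append("opaque")
            else:
                row_labels.append("boundary")
        labels.append(row_labels)
    return labels


if __name__ == "__main__":
    # Tiny 4x4 mask with 2x2 "macroblocks" to keep the example readable.
    mask = [[0, 0, 1, 1],
            [0, 0, 1, 1],
            [0, 1, 1, 1],
            [0, 0, 1, 1]]
    for row in classify_macroblocks(mask, size=2):
        print(row)
```

Only the mixed blocks carry per-pel shape detail, so under this reading they are the ones that need the bitmap coder; the other two types can be described by their label alone.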
23
Meshed Video 2D mesh tessellates the video into patches Motion vector for each vertex Texture warped in each patch
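A minimal sketch of warping inside one patch, assuming triangular patches and linear (barycentric) interpolation of the three vertex motion vectors, which gives an affine warp per triangle. The function names are illustrative and the interpolation rule is an assumption; the slide does not specify the patch shape or the warp model.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to triangle a, b, c."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb


def warp_point(p, triangle, motion_vectors):
    """Move p by the motion interpolated from the triangle's vertex motion vectors."""
    weights = barycentric(p, *triangle)
    dx = sum(w * mv[0] for w, mv in zip(weights, motion_vectors))
    dy = sum(w * mv[1] for w, mv in zip(weights, motion_vectors))
    return (p[0] + dx, p[1] + dy)


if __name__ == "__main__":
    triangle = [(0, 0), (4, 0), (0, 4)]
    # Vertex motion for a 90 degree rotation about the origin:
    # (0,0) stays, (4,0) -> (0,4), (0,4) -> (-4,0).
    motion_vectors = [(0, 0), (-4, 4), (-4, -4)]
    print(warp_point((1, 1), triangle, motion_vectors))
```

In the example, the three vertex vectors describe a rotation, and an interior point follows it. A single translational vector for the whole patch could not express that motion.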
24
Meshed Video - Motivation Motion Modeling –Translational-block motion does not model rotation, scaling, reflection, and shear Shape Modeling –Possible without depth
25
Meshed Video - Applications Compression –better motion compensation –transmit texture only at key frames –spatio-temporal interpolation (zooming, frame-rate up-conversion) Manipulation –augmented reality –transfiguration (replace billboards) Indexing / searching
26
Face Face object –Default face model with terminal –Facial Definition Parameter or user supplied model/texture –Facial Animation Parameter plus Amplification and Filters –Lip Shape Animation from phoneme
27
Facial Definition Parameter
28
Facial Animation Parameter
29
Body Like the face
30
Ultimate Compression Technique Computer Graphics ??? Block-based DCT (MPEG-1/2) Arbitrarily shaped video (MPEG-4) Meshed video (MPEG-4) Image-based rendering Textured 3D graphics Geometry-only 3D graphics