MPEG-4, NETWORKED MULTIMEDIA STANDARD MULTIMEDIA SYSTEMS IREK DEFEE
MULTIMEDIA DATA REPRESENTATION
MULTIMEDIA SYSTEMS CAN BE BUILT ONLY IF THERE IS A REPRESENTATION OF MULTIMEDIA DATA. SUCH A REPRESENTATION SHOULD BE STANDARDIZED SO THAT EVERYBODY CAN USE IT. THERE ARE MANY SUCH STANDARDS; WE WILL GIVE AN EXAMPLE OF THE MOST ADVANCED ONE.
MPEG-4 IS A STANDARD WHICH ENABLES EFFICIENT REPRESENTATION OF COMPLEX MULTIMEDIA DATA FOR FULLY INTERACTIVE APPLICATIONS. THE MPEG-4 STANDARD IS OBJECT-BASED.
VIDEO
OBJECT-BASED TREATMENT AND COMPRESSION MAKE THE MPEG-4 STANDARD VERY FLEXIBLE, MODERN AND EFFICIENT. THE BASIC VIDEO AND AUDIO PARTS OF THIS STANDARD ARE USED IN DIGITAL TELEVISION AND BLU-RAY DISCS, BUT MPEG-4 COVERS ALL DATA TYPES AND ALSO INCLUDES SPECIAL TOOLS FOR PARTICULAR APPLICATIONS. WE GIVE EXAMPLES OF THEM NEXT.
Synthetic Visual Tools in MPEG-4
Motivation
- A new type of data is appearing in multimedia applications: synthetic
- Both synthetic and natural data can co-exist in today’s applications
- This new data needs to be compressed and streamed in most applications
- New technologies are needed for compression and streaming of synthetic data
Some examples of applications
- 3D video
- Augmented reality
- Telepresence
- Scientific visualization
- Virtual reality
- …
Application example: Telepresence
The room is generated by computer.
Application example: Scientific visualization
Application example: Virtual 3D world
Synthetic Visual Tools in MPEG-4
- Version 1
  - Face animation
  - 2D dynamic mesh
  - Scalable coding of synthetic texture
  - View-dependent scalable coding of texture
- Version 2
  - Body animation
  - 3D model compression
Face Animation
Example of applications
- Virtual meeting, telepresence, video-conferencing, ...
- Virtual storyteller, virtual actor, user interface, ...
- Games, avatars, ...
Face animation
- Face: an object ready for rendering and animation
- A realistic representation of a “human” face
- Capable of animation by a reasonable set of parameters driven by speech, facial expressions, or others
Face animation
Shape, texture and expressions:
- specified by parameters in the incoming bitstream
- remote as well as local control of these parameters
Initial face object
- Gaze along the Z axis
- All face muscles relaxed
- Eyelids tangential to the iris
- Pupil one-third of full eye
- Lips: in contact; horizontal
- Mouth: closed; upper and lower teeth touching
- Tongue: flat; tip touching front teeth
Face animation parameters
Three sets of parameters are used to describe a face and its animation characteristics:
- Facial Definition Parameters (FDPs)
- Facial Animation Parameters (FAPs)
- Facial Interpolation Transform (FIT)
Facial Definition Parameters - FDPs
Define a specific face via:
- 3D feature points
- 3D mesh/scene graph
- Face texture
- Face Animation Table (FAT)
Face Feature Points
- Normalize animation parameters
- Find feature correspondence in different faces
- Roughly define shape
3D mesh and feature points
Texture
FDPs: Facial Definition Parameters
- Two modes:
  - To customize the face model at the receiver to a particular face
  - To download a face model along with its animation information
- Generally sent once per session for calibration and/or download
- Could be sent more often for “special effects”
Facial Animation Parameters - FAPs
- Represent a complete set of facial actions, which allows representation of most natural facial expressions
- All FAPs involving translational movement are expressed in terms of Facial Animation Parameter Units (FAPUs)
- This allows consistent interpretation of FAPs on any facial model
FAPUs
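A minimal sketch of why expressing FAPs in FAPUs gives a consistent interpretation on any face model. It assumes that a FAPU is a fixed fraction of a distance measured on the neutral face; the divisor 1024 and the names `FaceModel`, `mouth_width` and `fap_to_displacement` are illustrative assumptions, not taken from the slides.

```python
from dataclasses import dataclass

FAPU_DIVISOR = 1024  # assumed normalization constant, not stated in the slides


@dataclass
class FaceModel:
    """Hypothetical face model described by one neutral-face distance."""
    name: str
    mouth_width: float  # distance between mouth corners, in model units

    def mouth_width_fapu(self) -> float:
        # One FAPU is a fixed fraction of a distance measured on this face.
        return self.mouth_width / FAPU_DIVISOR


def fap_to_displacement(fap_value: int, fapu: float) -> float:
    """Convert a transmitted FAP amplitude into a model-space displacement."""
    return fap_value * fapu


small = FaceModel("small", mouth_width=50.0)
large = FaceModel("large", mouth_width=100.0)

fap = 256  # the same transmitted value is applied to both faces
d_small = fap_to_displacement(fap, small.mouth_width_fapu())
d_large = fap_to_displacement(fap, large.mouth_width_fapu())

# The absolute displacements differ, but each is the same fraction
# of its own face's mouth width.
print(d_small / small.mouth_width, d_large / large.mouth_width)
```

The same FAP value therefore moves a feature point by the same proportion of the face on both models, which is the property the FAPU normalization is meant to provide.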
FAPs: Facial Animation Parameters
- 2 high-level FAPs:
  - Viseme (visual correlate of a phoneme)
  - Expression (joy, anger, fear, disgust, sadness, surprise)
    - textual description of expression parameters
    - points to groups of FAPs used together to achieve an expression
- 66 low-level FAPs
  - associated with the displacement or rotation of the facial feature points
What is not standardized?
- The way to extract the parameters
  - Markers
  - Speech driven
  - Image analysis and feature extraction
- The choice of which parameters to code and with which precision
  - Quantization
  - Rate control
FITs: Facial Interpolation Rules
- Specification of interpolation rules for some/all FAPs by the sender
- Sender specifies the FAP Interpolation Graph (FIG) and a set of interpolation functions
- Allows a higher degree of control over the animation results
What is standardized?
- FDPs
  - BIFS syntax and semantics
  - Rules for decoding and adaptation
- FAPs
  - Bitstream syntax and semantics
  - Rules for decoding and animation
- FITs
  - Syntax and semantics
  - Decoding rules
MPEG-4 AUDIO
This part of the standard is for audio signals. It covers a very broad range:
- Speech coding
- General audio coding
- Scalable audio coding
Media Objects and Associated Operations
- Natural audio
- Synthetic audio
- Control
- Operations on objects
  - Synchronize
  - Decode
  - Compose into compound objects
  - Present
  - Interact
Advantages of Object Framework
- Each signal is coded with the most efficient coding system
  - Natural
  - Synthetic
- Composition of objects into an audio scene
  - Rate conversion
  - Mix and equalization
  - Effects
- Final mix is done in the terminal
System Overview
Objects are delivered separately, synchronized, decoded and composed.
Audio Object Functionalities
- Signal compression
- Scalability
  - bit rate
  - signal bandwidth
  - presentation rate
  - encoder or decoder complexity
- Extraction and re-use
- Robustness to channel errors
Scalability depending on the bit rate
Application Domains / Profiles
- Speech
  - low-rate speech coders and TTS (Text-to-Speech)
- Scalable
  - speech coders
  - general audio coders
  - all coders in scalable configuration
- Synthetic
  - wavetable synthesis
  - score-driven synthesis
  - TTS
MPEG-4 Speech Coding: Overview
- Excellent compression by using a source model
  - Linear Predictive Coding (LPC)
  - Pitch or noise excitation
- Better compression than “general audio” coders
  - only for “clean speech” from a single talker
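The source model named above can be sketched as an excitation signal (a pitch pulse train or noise) passed through an all-pole linear prediction filter. This is a generic LPC synthesis illustration, not the HVXC or CELP decoder; the function names, the pitch period of 80 samples and the first-order coefficient are assumptions chosen only to keep the example short.

```python
import random


def excitation(n, voiced, pitch_period=80, seed=0):
    """Pulse train for voiced speech, white noise for unvoiced speech."""
    if voiced:
        return [1.0 if i % pitch_period == 0 else 0.0 for i in range(n)]
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def lpc_synthesize(exc, coeffs, gain=1.0):
    """All-pole filter: s[n] = gain * e[n] + sum_k coeffs[k] * s[n-1-k]."""
    out = []
    for n, e in enumerate(exc):
        s = gain * e
        for k, a in enumerate(coeffs):
            if n - 1 - k >= 0:
                s += a * out[n - 1 - k]
        out.append(s)
    return out


coeffs = [0.5]  # illustrative first-order predictor, chosen to be stable
voiced = lpc_synthesize(excitation(160, voiced=True), coeffs)
unvoiced = lpc_synthesize(excitation(160, voiced=False), coeffs)
```

Only the filter coefficients, the gain and the excitation description need to be transmitted per frame rather than the waveform itself, which is where the compression gain of a source-model coder comes from.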
Speech Coders
- Harmonic Vector Excitation Coder (HVXC)
- Code Excited Linear Prediction (CELP)
- Wideband CELP
Communication Characteristics of Coders
- Low bit rate
  - HVXC: 1.2 kb/s to 1.7 kb/s variable rate; 2.0 kb/s to 4.0 kb/s constant rate
  - CELP: 4.0 kb/s to 24 kb/s constant rate
- Low one-way delay
  - HVXC: 33.5 ms to 56 ms
  - CELP: 15 ms to 45 ms
- Not compromised for modem signals
Bit Rate Scalability
- Parameters coded using multi-stage Vector Quantization
  - base plus enhancement layer
- Enhancement layers can be stripped in the
  - server
  - channel
  - decoder
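A minimal two-stage vector quantization sketch showing how a base layer plus an enhancement layer yields bit rate scalability. The codebooks `BASE` and `ENH` and the helper names are made up for illustration; the actual MPEG-4 codebooks and parameter vectors are not given in the slides.

```python
def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest(codebook, v):
    """Index of the codebook vector closest to v."""
    return min(range(len(codebook)), key=lambda i: dist2(codebook[i], v))


BASE = [(0.0, 0.0), (1.0, 1.0), (1.0, -1.0)]                   # stage 1: coarse
ENH = [(0.25, 0.0), (-0.25, 0.0), (0.0, 0.25), (0.0, -0.25)]   # stage 2: residual


def encode(v):
    """Return (base index, enhancement index) for vector v."""
    i = nearest(BASE, v)
    residual = tuple(p - q for p, q in zip(v, BASE[i]))
    j = nearest(ENH, residual)
    return i, j


def decode(i, j=None):
    """Decode with the base layer only, or with the enhancement layer added."""
    out = BASE[i]
    if j is not None:
        out = tuple(p + q for p, q in zip(out, ENH[j]))
    return out


v = (0.75, 1.0)
i, j = encode(v)
base_only = decode(i)   # enhancement index dropped somewhere on the way
full = decode(i, j)     # both layers received
```

Because the base index is decodable on its own, a server, the channel or the decoder can discard the enhancement index and still obtain a coarser reconstruction.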
Parameter Update Scalability
- Linear Prediction Model
  - updated every frame
  - interpolated every sub-frame
- Excitation gain
  - updated every sub-frame
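The idea of sending linear prediction parameters once per frame and interpolating them for each sub-frame can be sketched as follows. Plain linear interpolation on a generic parameter vector is an assumption made here for illustration; the slides do not state the parameter domain or the interpolation rule used by the coders.

```python
def interpolate_subframes(prev_params, curr_params, n_subframes):
    """Blend the previous frame's parameters toward the current frame's,
    producing one parameter set per sub-frame (the last equals curr_params)."""
    result = []
    for s in range(1, n_subframes + 1):
        w = s / n_subframes
        result.append([(1 - w) * p + w * c
                       for p, c in zip(prev_params, curr_params)])
    return result


prev = [0.0, 1.0]   # parameters received for the previous frame
curr = [1.0, 3.0]   # parameters received for the current frame
sub = interpolate_subframes(prev, curr, 4)
for params in sub:
    print(params)
```

The decoder thus obtains smoothly varying filter parameters at the sub-frame rate while the bitstream carries them only at the frame rate.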
MPEG-4 BIFS
MPEG-4 DATA INCLUDE VARIOUS MEDIA TYPES WHICH CAN BE USED AT THE SAME TIME. THIS REQUIRES A MECHANISM FOR THEIR ORGANIZATION IN TIME AND SPACE. THIS MECHANISM IN THE MPEG-4 STANDARD IS CALLED BIFS (BINARY FORMAT FOR SCENES).
BIFS: WHY?
- MPEG-4 is an object-based system, so a scene description is needed to compose the objects
- BIFS is the MPEG-4 scene description protocol
  - to compose MPEG-4 objects
  - to describe interaction with MPEG-4 objects
  - to animate MPEG-4 objects
Example of an MPEG-4 Audiovisual Scene (1)
- 2D audio-visual scene: audio and video + scrolling text and still images
- 2D audio-visual scene: audio and video + still images
Example of an MPEG-4 Audiovisual Scene (2)
- 3D audio-visual scene: 3D world + arbitrarily shaped video + still images + 3D objects
BIFS Scene Features (v2)
- Body animation
- Advanced audio
  - Perceptual approach to modify natural source
  - Acoustic properties for physically based audio rendering
- Stream and server control
  - VCR controls and application-specific messaging
- Extensibility (prototypes)
  - Definition of new BIFS interfaces
- Hierarchical 3D objects
  - Progressive loading and local degradation of 3D mesh
- Web interface
  - Linking and embedding of a web page
MPEG-4 Systems Principle
- We have a data stream describing the whole scene: Scene Description Stream
- We have a data stream describing which objects are there: Object Descriptor Stream
- We have separate data streams for each object: Visual Stream, Audio Stream, ...
[Diagram: the streams above feed the Interactive Scene Description]
BIFS content in MPEG-4 system
[Diagram: stages DELIVERY (BIFS-Update ES, BIFS-Anim ES), DECODING, SCENE GRAPH MANAGEMENT, COMPOSITION, RENDERING, PRESENTATION; node groups: 2D Nodes, 3D Nodes, 2D+3D Nodes, Audio Nodes, FBA, S&N Sound, Interaction; legend: VRML Nodes, MPEG-4 Nodes, MPEG-4 Streams]
MPEG-4: An Integrated Multimedia System
[Diagram: Network, TransMux, FlexMux, Elementary Streams, Decoding, Primitive AV Objects, Scene Description Information and Object Descriptor (organised by BIFS), Composition and Rendering, MPEG-4 Interactive Scene, Display and Local User Interaction]
BIFS Delivery: BIFS Command
[Diagram: scene graph with nodes Root, Transform, BikeSwitch (Switch value -1), Bike, Body Transform, Body, Head, Left Arm, Right Arm, Left Leg, Right Leg; BIFS-Command ES carrying commands RS … CV]
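A rough sketch of how a command stream can modify a scene graph like the one in the diagram. The `Node` class, the `which_choice` field name, the tree layout and `replace_field` are hypothetical stand-ins and not BIFS syntax; the only behaviour assumed from VRML-style Switch nodes is that a choice of -1 selects no child.

```python
class Node:
    """Hypothetical scene graph node with named fields and child nodes."""

    def __init__(self, name, children=None, **fields):
        self.name = name
        self.children = children or []
        self.fields = fields


def find(node, name):
    """Depth-first search for a node by name."""
    if node.name == name:
        return node
    for child in node.children:
        hit = find(child, name)
        if hit is not None:
            return hit
    return None


def visible(node):
    """Names of nodes that would be rendered; a switch shows at most one child."""
    names = [node.name]
    if "which_choice" in node.fields:
        idx = node.fields["which_choice"]
        kids = [node.children[idx]] if 0 <= idx < len(node.children) else []
    else:
        kids = node.children
    for kid in kids:
        names.extend(visible(kid))
    return names


def replace_field(root, node_name, field, value):
    """Hypothetical scene update command: overwrite one field of one node."""
    find(root, node_name).fields[field] = value


scene = Node("Root", [
    Node("Transform", [
        Node("BikeSwitch", [Node("Bike")], which_choice=-1),
        Node("Body Transform", [Node("Body"), Node("Head")]),
    ])
])

before = visible(scene)                            # bike hidden
replace_field(scene, "BikeSwitch", "which_choice", 0)
after = visible(scene)                             # bike now part of the scene
```

A single small command changes what is presented without retransmitting the scene, which is the point of delivering updates as a separate stream.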
BIFS Delivery: BIFS Anim
[Diagram: scene graph with nodes Root, Transform, BikeSwitch, Bike, Body Transform, Body, Head, Left Arm, Right Arm, Left Leg, Right Leg; BIFS-Anim ES carrying frames ... P P P I]
BIFS Scene Compression
- Compression factor 10-25 on scene text files
  - Context dependency
  - Hierarchical, linear quantization of scene data
  - Differential multiple fields coding and mesh coding integration (v2)
- BIFS-Anim: factor 15-30 compression of animation
  - Linear quantization
  - Predictive coding (including rotation and normals)
  - Adaptive arithmetic encoding
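The first two BIFS-Anim steps listed above, linear quantization followed by predictive coding, can be sketched as below. The value range, the 8-bit precision and the function names are assumptions for illustration, and the adaptive arithmetic coder that would follow is omitted.

```python
def quantize(value, lo, hi, bits):
    """Map a value in [lo, hi] to an integer level using uniform steps."""
    levels = (1 << bits) - 1
    clamped = min(max(value, lo), hi)
    return round((clamped - lo) / (hi - lo) * levels)


def dequantize(q, lo, hi, bits):
    """Map an integer level back to a value in [lo, hi]."""
    levels = (1 << bits) - 1
    return lo + q / levels * (hi - lo)


def encode(values, lo, hi, bits):
    """First sample is sent as-is (intra), later ones as differences (predictive)."""
    qs = [quantize(v, lo, hi, bits) for v in values]
    return [qs[0]] + [b - a for a, b in zip(qs, qs[1:])]


def decode(symbols, lo, hi, bits):
    """Accumulate the differences and undo the quantization."""
    qs, acc = [], 0
    for i, s in enumerate(symbols):
        acc = s if i == 0 else acc + s
        qs.append(acc)
    return [dequantize(q, lo, hi, bits) for q in qs]


positions = [0.0, 0.1, 0.2, 0.3, 0.4]   # an animated field sampled over time
symbols = encode(positions, lo=0.0, hi=1.0, bits=8)
restored = decode(symbols, lo=0.0, hi=1.0, bits=8)
```

For smooth motion the differences are small and clustered around a few values, which is what makes the subsequent entropy coding stage effective.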
BIFS Scene Features
- Audio/video (objects) playback
- 2D composition & graphics
  - 2D composition, basic shapes
- 3D composition & graphics
  - Full VRML capabilities
- Advanced audio composition
- Interactivity and behavior
  - Local manipulation and animation of objects
- Scripting (JavaScript)
  - Programming of behaviors
- Face animation
MPEG-4 & BIFS based services
[Diagram: Client, Stored Content, Broadcast, BIFS, Live Source / User Communication]
Conclusion
- BIFS provides a rich toolkit for composition of MPEG-4 media in a very flexible and general way
- BIFS can be profiled to best fit the application area
- Provides a good mix of
  - Functionality
  - Complexity
  - Compression