Introduction to MPEG-4 MC2008 2018/11/19 MC2009.


1 Introduction to MPEG-4 MC2008 2018/11/19 MC2009

2 Outline
  Multimedia
  MPEG-4 Profiles
  Key Features of MPEG-4 Systems
  DMIF
  Audiovisual Objects and Scene Graph
  Editing, Composition and Rendering
  Coding Basics
  Coding Techniques

3 Multimedia
What is multimedia?
  Combination of audio, video, image, graphic, and text.
  Coverage of all human I/O's.
Why does multimedia need to be coded?


5 Multimedia Coding for Different Applications
Mobile devices
  Low data rate, error resilience, scalability
Streaming service
  Scalability, low to medium data rate, interactivity
On-disk distribution (DVD)
  Interactivity
Broadcast
On-demand services

6 Profiles in MPEG-4
  Visual Profiles
  Audio Profiles
  Graphics Profiles
  Scene Graph Profiles
  MPEG-J Profiles
  Object Descriptor Profile

7 NewPred 2018/11/19 MC2009

8 H.263 Baseline 2018/11/19 MC2009

9 Key Features of MPEG-4 Systems
Provides a consistent and complete architecture for the coded representation of the desired combination of streamed elementary audio-visual information. Covers a broad range of applications, functionality and bit rates. Through profile and level definitions, it establishes a framework that allows consistent progression from simple applications (e.g., an audio broadcast application with graphics) to more complex ones (e.g., a virtual reality home theater). 2018/11/19 MC2009

10 Key Features of MPEG-4 Systems (2)
A set of tools for the representation of multimedia content:
  the OD framework: a framework for object description,
  BIFS: a binary language for the representation (format) of interactive 2D and 3D multimedia scene descriptions,
  SDM and SyncLayer: a framework for monitoring and synchronizing elementary data streams, and
  MPEG-J: programmable extensions to access and monitor MPEG-4 content.

11 Key Features of MPEG-4 Systems (3)
MPEG-4 Systems defines an efficient mapping of MPEG-4 content onto existing delivery infrastructures:
  FlexMux: an efficient and simple multiplexing tool to optimize the carriage of MPEG-4 data (into different QoS channels),
  extensions allowing the carriage of MPEG-4 content on MPEG-2 and IP systems, and
  a flexible file format for authoring, streaming and exchanging MPEG-4 data.

12 MPEG-4 ISO/IEC 14496 Terminal Architecture
2018/11/19 MC2009

13 Systems
  Timing Model
  Buffer Model
  Multiplexing of Streams
  Synchronization of Streams
  The Compression Layer
  Object Description Framework
  Scene Description Streams
  Audio-visual Streams
  Upchannel Streams

14 Systems Decoder Model 2018/11/19 MC2009


16 ISO/IEC 14496 Terminal Architecture
2018/11/19 MC2009

17 Network-based Multimedia System
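A network-based multimedia system can be viewed as four layers: the Application Layer, Compression Layer, Transport Layer, and Transmission Layer. Traffic shaping and scalable rate control (SRC) are both commonly used to mitigate the effects of network delay jitter and of limited network resources (such as bandwidth and buffers). Traffic shaping is a transport-layer method, whereas the SRC discussed in this paper is a compression-layer method. The basic idea of traffic shaping is to shape the traffic pattern to the desired characteristics before the video is encoded, for example by fixing the maximum delay and the instantaneous peak rate. The whole system, from sender to receiver, then allocates appropriate resources and priorities according to the given QoS. SRC works the other way around: it compresses the source video so that the compressed result fits the available network resources, for example a playback rate of 10 frames per second and an accumulated delay of at most 500 ms. The goal of the SRC in this paper is to manage and use network bandwidth efficiently, so as to deliver video quality good enough to support current multimedia applications.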

18 The Objectives of DMIF
Delivery Multimedia Integration Framework
  to hide the delivery technology details from the DMIF User
  to manage real-time, QoS-sensitive channels
  to allow service providers to log resources per session for usage accounting
  to ensure interoperability between end-systems


20 DMIF Communication Architecture
signaling 2018/11/19 MC2009

21 High View of a Service Activation
2018/11/19 MC2009

22 Audiovisual Objects
An audiovisual scene is composed of “objects”
  Different objects are mixed on the screen
Visual
  Video
  Animated face & body; 2D and 3D animated meshes
  Text and graphics
Audio
  General audio: mono, stereo, and multichannel
  Speech
  Synthetic sounds (“Structured audio”)
  Environmental spatialization

23 Example of MPEG-4 Video Objects
Rectangular shape video object Arbitrary shape video object Animated Face 2018/11/19 MC2009 From Olivier Avaro


25 The Scene Graph 2018/11/19 MC2009

26
  Composition
  Description & Synchronization
  Delivery of streaming data
  Interaction with media objects
  Management and identification of intellectual property

27 Major Components 2018/11/19 MC2009

28 Media Objects Composition Rendering Scene Graph 2018/11/19 MC2009

29 Adding or Removing Objects (1)

30 Adding or Removing Objects (2)
2018/11/19 MC2009 From Igor S. Pandžić

31 Adding or Removing Objects (3)
Applications
  Video conferencing
    Real-time, automatic
    Separate foreground (communication partner) from background
  Object tracking in video
    May allow off-line and semi-automatic operation
    Separate moving object from others

32 MPEG-4 Coding Basics 2018/11/19 MC2009

33 Toolbox Approach tools for synthetic scenes tools for natural scenes
ALGORITHMS PROFILES 2018/11/19 MC2009

34 Coding Techniques
Video objects
  Shape
  Motion vectors
  Texture
Audio objects
  MPEG AAC (Advanced Audio Coding)
  TTS (Text-To-Speech)
Face and Body
  Animation parameters
2D Mesh
  Triangular patches
  Motion vectors

35 Content-based Audio-Visual Representation
Audio-Visual Object (AVO)
  Video object component (video object plane, VOP)
    natural or synthetic
    2D or 3D
  Audio object component
    mono, stereo or multi-channel

36 Video Object Planes (VOP)
Characteristics of VOPs
  may have different spatial and temporal resolutions
  may be associated with different degrees of accessibility
  sub-VOPs may be separated or overlapping
VOP type
  Traditional I, P, B types
  S-VOP (Sprite) for background

37 Video Object Plane Type
[Figure: VOP types along the time axis: I-VOP, P-VOP, B-VOP and S-VOP]

38 Content-based Object Manipulation
  change of the spatial position of a VOP
  application of a spatial scaling factor to a VOP
  change of the speed with which a VOP moves
  insertion of new VOPs
  deletion of an object in the scene
  change of the scene area

39 Segmentation Process
Depending on the application, segmentation can be performed
  Online (real-time) or offline (non-real-time)
  Automatically or semi-automatically
Examples
  Video conferencing
    real-time, automatic
    separate foreground (communication partner) from background
  Object tracking in video
    may allow off-line and semi-automatic operation
    separate moving object from others

40 Compression
Improved coding efficiency
  5-64 kbps for mobile applications
  up to 20 Mbps for TV/film applications
  subjectively better quality compared to existing standards
Coding of multiple concurrent data streams
  can code multiple views of a scene efficiently, e.g. stereo video

41 Coding VO in MPEG-4
Reduce temporal redundancy
  Motion estimation for arbitrarily shaped VOPs
  Padding and modified block (polygon) matching motion estimation
[Figure: I-VOP, P-VOP and B-VOP along the time axis]

42 Encoding of Visual Objects
Binary alpha block Motion vector Context-based arithmetic encoding Texture DCT 2018/11/19 MC2009

43 New Coding Features
  For each macroblock, the motion vectors can be computed on a 16 × 16 or 8 × 8 block basis
  Unrestricted motion estimation: prediction can extend over the image boundary
  Overlapped block motion compensation
  Each component of texture can range from 1 to 12 bits
  More robust coding

44 Robust Video Coding
Resynchronization
  Allows insertion of resync markers within each VOP
  Video packet header: includes macroblock number, quantizer value and timing information
Data partition
  Allows shape, motion and texture data to be separated within a packet
Reversible VLC
  Offers partial recovery from errors

45 Sprite VOP
  Represents the background image
  Can be used for very efficient coding of scenes involving camera pan and zoom
  Much larger than the image size and thus requires more memory

46 Example of Sprite VOP 2018/11/19 MC2009

47 Object Mesh
  Useful for animation, content manipulation, content overlay, merging natural and synthetic video, and others
  Tessellate with triangular patches
  Define a motion vector for each node
  The 2D motion of video objects is represented by the motion vectors of the node points
  Motion compensation is achieved by warping the texture map corresponding to each patch with an affine transform
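The affine warp of a single triangular patch can be sketched as follows. This is a minimal illustration, not the normative MPEG-4 procedure: the function names, the node coordinates and the motion vectors are all hypothetical. It relies on the fact that an affine map preserves barycentric coordinates, so a point inside the patch moves to the same weighted combination of the displaced nodes.

```python
def barycentric(p, tri):
    """Barycentric weights of point p with respect to triangle tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (p[0] - x3) + (x3 - x2) * (p[1] - y3)) / det
    w2 = ((y3 - y1) * (p[0] - x3) + (x1 - x3) * (p[1] - y3)) / det
    return w1, w2, 1.0 - w1 - w2


def warp_point(p, nodes, motion_vectors):
    """Map p by the affine transform defined by the three node motion vectors."""
    # Displace each node by its motion vector.
    moved = [(x + dx, y + dy) for (x, y), (dx, dy) in zip(nodes, motion_vectors)]
    # An affine map keeps barycentric weights, so reuse them on the moved nodes.
    weights = barycentric(p, nodes)
    return (
        sum(w * x for w, (x, _) in zip(weights, moved)),
        sum(w * y for w, (_, y) in zip(weights, moved)),
    )


# Hypothetical patch: node positions and one motion vector per node.
nodes = [(0.0, 0.0), (8.0, 0.0), (0.0, 8.0)]
motion_vectors = [(1.0, 0.0), (2.0, 1.0), (0.0, 2.0)]
warped = warp_point((4.0, 4.0), nodes, motion_vectors)
print(warped)
```

Each node lands exactly on its displaced position, and interior points are interpolated between them, which is why only the node motion vectors need to be transmitted.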

48 Example of Object Mesh 2018/11/19 MC2009

49 Face Animation
Face model
  Default face model
  Download from the encoder
Low-level facial animation
  A set of 66 facial animation parameters
High-level facial animation
  A set of primary facial expressions like joy, sadness, surprise and disgust
Speech animation
  14 visemes for mouth shape
  Text-to-speech synthesizer

50 Facial Animation 2018/11/19 MC2009 From Eine Übersicht

51 Still Texture Coding
Discrete Wavelet Transform (DWT)
  Spatial and quality scalability
  Uses a 2D Daubechies (9, 3)-tap biorthogonal filter
  The lowest band is losslessly coded by arithmetic coding
  Higher bands are coded by multilevel quantization, zero-tree scanning and arithmetic coding

52 Audio Coding
  Different bit rates, different types of source material and different algorithms
  Combination of parameter-based coding, LPC-based coding and time/frequency-based coding
  High quality speech at 2 kbps: Harmonic Vector eXcitation Coding (HVXC)
  Text-to-Speech (TTS)

53 Natural Audio Coder
[Figure: quality levels (telephone, cellular, AM, FM, CD) against bit rate (2, 4, 8, 16, 32, 64 kbit/s) for parametric speech (HVXC), high quality speech (CELP), parametric audio (HILN) and general audio (AAC, TwinVQ)]
From Olivier Dechazal

54 Multiview Video 2018/11/19 MC2009

55 Stereo Sequence Coding
Multiview profile of MPEG-2
  The left-view sequence Sl is coded first.
  For the right-view sequence, each frame is predicted from the corresponding frame in Sl based on an estimated disparity field, and the prediction error image is coded.
[Figure: frame types: right view P B B B; left view I B B P]

56 Intermediate View Synthesis
xl,n xc,n xr,n 2018/11/19 MC2009

57
[Figure panels: original left; original right; regular mesh on the left image; corresponding mesh on the right image; predicted right image by BMA (32.03 dB); predicted right image by mesh (27.48 dB)]
The mesh-based scheme yields a visually more accurate prediction
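The dB figures on this slide are presumably PSNR values; the slide does not say so, so that reading is an assumption. Under that assumption, a minimal sketch of the measure is given below, with hypothetical pixel data and an assumed 8-bit peak value of 255.

```python
import math


def psnr(original, predicted, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    # Flatten the row lists so the images can be compared sample by sample.
    flat_o = [v for row in original for v in row]
    flat_p = [v for row in predicted for v in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_o, flat_p)) / len(flat_o)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical 2 x 2 images, not taken from the slide.
original = [[100, 100], [100, 100]]
predicted = [[110, 100], [100, 90]]
value = psnr(original, predicted)
print(round(value, 2))
```

A measure of this kind averages squared errors over the whole image, which is one reason a prediction with a lower dB figure can still look more accurate.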

58 MPEG-4 Coding Techniques
  Shape Coding
  Shape-adaptive DCT
  Object-based Inter-frame Coding
  Overlapped Motion Estimation
  Bit-plane Coding and FGS

59 Object-Based Coding 2018/11/19 MC2009

60 Shape Coding
Bitmap Coding
  Context-Based Arithmetic Encoding (CAE)
Contour Coding
  Chain Coding
  Baseline Shape Coding
  Polygon Approximation
  Skeleton-Based Shape Coding
Quadtree Coding

61 Context-Based Arithmetic Encoding
[Figure: bounding box divided into 16 × 16 blocks: transparent block, boundary blocks, opaque block; conditional entropy coding]

62 Context-Based Arithmetic Encoding
[Figure: bounding box divided into 16 × 16 blocks: transparent block, boundary blocks, opaque block; conditional entropy coding]
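The block classification shown in the figure can be sketched as follows. This is a simplified illustration, assuming a binary alpha mask stored as a list of rows; the function name and the test mask are hypothetical, and the arithmetic coder that handles the boundary blocks is not shown.

```python
def classify_blocks(mask, block=16):
    """Label each block of a binary alpha mask as transparent, opaque or boundary."""
    rows, cols = len(mask), len(mask[0])
    labels = []
    for by in range(0, rows, block):
        row_labels = []
        for bx in range(0, cols, block):
            pixels = [
                mask[y][x]
                for y in range(by, min(by + block, rows))
                for x in range(bx, min(bx + block, cols))
            ]
            if not any(pixels):
                row_labels.append("transparent")  # entirely outside the object
            elif all(pixels):
                row_labels.append("opaque")       # entirely inside the object
            else:
                row_labels.append("boundary")     # contour passes through
            # Only boundary blocks need pixel-level shape coding.
        labels.append(row_labels)
    return labels


# Hypothetical 32 x 32 mask: object occupies columns 16..31 of rows 0..23.
mask = [[1 if x >= 16 and y < 24 else 0 for x in range(32)] for y in range(32)]
labels = classify_blocks(mask)
print(labels)
```

Transparent and opaque blocks can be signalled with a single mode flag, so the conditional entropy coder is only needed where the contour actually crosses a block.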

63 Chain Coding
[Figure: contour with starting points and chain code symbols 3 3 3 3 2 3 3 2 2 2 1 2 1 2 1 1 1 1; direction numbering for 4-connected and 8-connected neighborhoods]

64 Chain Coding
[Figure: contour with starting points and chain code symbols 0 7 0 6 6 5 6 4 4 3 3 2 0 1 2; direction numbering for 4-connected and 8-connected neighborhoods]
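A chain code stores a contour as a starting point followed by one direction symbol per step. The sketch below assumes a common 8-connected numbering (0 = east, counting counterclockwise in 45-degree steps, with y increasing upward); the numbering in the slide's figure may differ, and the contour is hypothetical.

```python
# Assumed direction numbering: 0 = east, then counterclockwise, y pointing up.
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]


def chain_encode(points):
    """Turn a list of neighboring contour points into direction symbols."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # index() raises ValueError if two points are not 8-connected neighbors.
        codes.append(DIRECTIONS.index((x1 - x0, y1 - y0)))
    return codes


def chain_decode(start, codes):
    """Rebuild the contour points from a starting point and its chain code."""
    points = [start]
    for c in codes:
        dx, dy = DIRECTIONS[c]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points


# Hypothetical contour fragment.
contour = [(0, 0), (1, 0), (2, 1), (2, 2), (1, 2), (0, 1)]
codes = chain_encode(contour)
print(codes)
```

With 8-connected steps each symbol needs 3 bits, and with 4-connected steps 2 bits, instead of a full coordinate pair per contour point.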

65 Differential Chain Code
DCC records the move (forward, leftward or rightward) between two consecutive directional links.
Example: F F F R L F L R
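The forward/left/right moves can be derived from an ordinary 4-connected chain code, as in the sketch below. The direction numbering (0 = east, 1 = north, 2 = west, 3 = south, with y increasing upward) and the input sequence are assumptions made for illustration.

```python
# Turn between two consecutive links, as (current - previous) mod 4.
# Assumed numbering: 0 = east, 1 = north, 2 = west, 3 = south (y points up).
TURNS = {0: "F", 1: "L", 3: "R"}


def differential_chain_code(codes):
    """Convert a 4-connected chain code into forward/left/right moves."""
    moves = []
    for prev, cur in zip(codes, codes[1:]):
        turn = (cur - prev) % 4
        if turn not in TURNS:
            # A turn of 2 would mean stepping straight back onto the contour.
            raise ValueError("direction reversal is not a valid contour move")
        moves.append(TURNS[turn])
    return "".join(moves)


# Hypothetical chain code: east, east, north, north, east, south.
moves = differential_chain_code([0, 0, 1, 1, 0, 3])
print(moves)
```

Because only three moves are possible, and "forward" tends to dominate on smooth contours, the differential symbols are cheaper to entropy-code than absolute directions.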

66 Baseline Shape Coding
[Figure: contour samples S1, S2, ..., S14 over a horizontal baseline]
  Distance between contour sample S23 and the baseline: D(S23)
  Trace and get distances: TPs (S7, S9, S12, S22)

67 Polygon Approximation
[Figure: contour approximation with distances d1, d2, d3]
  Select vertices that are optimal in the rate-distortion sense.
  Splines are adopted to approximate the contour.

68 Skeleton-Based Shape Coding
2018/11/19 MC2009

69 Quadtree Coding 2018/11/19 MC2009

70 Shape-adaptive DCT 2018/11/19 MC2009

71 Inter-frame Coding
Reconstruction of Object Shape
  MVS = MVPS + MVDS
  MVS: MV for shape
  MVPS: prediction
  MVDS: difference (BAC)
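As a worked instance of the relation MVS = MVPS + MVDS, the sketch below shows both directions: the encoder sends only the difference, and the decoder adds it back to the prediction. The function names and the vector values are hypothetical, and how the predictor is chosen is not covered here.

```python
def encode_shape_mv(mvs, mvps):
    """Encoder side: MVDS = MVS - MVPS, the difference that gets transmitted."""
    return (mvs[0] - mvps[0], mvs[1] - mvps[1])


def reconstruct_shape_mv(mvps, mvds):
    """Decoder side: MVS = MVPS + MVDS."""
    return (mvps[0] + mvds[0], mvps[1] + mvds[1])


# Hypothetical vectors, in pixels.
mvps = (3, -1)   # prediction
mvs = (4, -3)    # true shape motion vector found by the encoder
mvds = encode_shape_mv(mvs, mvps)
print(mvds, reconstruct_shape_mv(mvps, mvds))
```

When the prediction is good the difference is small, which is what makes it cheaper to code than the full vector.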

72 The context for Inter-frame Coding
2018/11/19 MC2009

73 Overlapped Motion Estimation
2018/11/19 MC2009

74 Weighting Coefficients in Overlapped Motion Estimation
2018/11/19 MC2009

75 Fine Granularity Scalable
[Figure: quality (bad, moderate, good) against channel bitrate (low to high)]

76 FGS Video Encoder Structure
2018/11/19 MC2009

77 Bit-plane Coding
[Figure: quantized residual (5 7 8 6 2 4 3 1 4 6 8 2) → binary transfer (MSB to LSB) → reordering → run-length coding → enhancement layer bitstream]
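The bit-plane pipeline in the figure can be sketched as follows, using the residual values shown on the slide. This is a simplified illustration: the function names are hypothetical, the values are assumed to be non-negative magnitudes, and the run-length step only lists the zero runs before each 1 rather than reproducing the entropy coding of the standard.

```python
def to_bit_planes(values):
    """Split non-negative integers into bit-planes, most significant first."""
    n_planes = max(values).bit_length()
    return [[(v >> b) & 1 for v in values] for b in range(n_planes - 1, -1, -1)]


def from_bit_planes(planes, n_planes):
    """Rebuild values from the first len(planes) planes out of n_planes."""
    values = [0] * len(planes[0])
    for i, plane in enumerate(planes):
        shift = n_planes - 1 - i
        for j, bit in enumerate(plane):
            values[j] |= bit << shift
    return values


def zero_runs(plane):
    """Length of the zero run before each 1; trailing zeros are implied."""
    runs, run = [], 0
    for bit in plane:
        if bit:
            runs.append(run)
            run = 0
        else:
            run += 1
    return runs


residual = [5, 7, 8, 6, 2, 4, 3, 1, 4, 6, 8, 2]  # values from the slide
planes = to_bit_planes(residual)
msb_runs = zero_runs(planes[0])
coarse = from_bit_planes(planes[:2], len(planes))  # keep only the top 2 planes
print(msb_runs, coarse)
```

Truncating the plane list models what happens when the enhancement layer is cut at an arbitrary point: the decoder still recovers a coarser version of every residual, which is what gives the scheme its fine granularity.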

78 FGS Video Decoder Structure
2018/11/19 MC2009

79 Binary Shape Encoder 2018/11/19 MC2009

80 Padding 2018/11/19 MC2009

