"Digital Media Primer" Yue-Ling Wong, Copyright (c)2013 by Pearson Education, Inc. All rights reserved.
Chapter 6: Fundamentals of Digital Video. Part 6: MPEG Compression, Streaming Video, and Progressive Download
In this lecture, you will learn: basic concepts of MPEG compression; types of MPEG; applications of MPEG; what a GOP (group of pictures) in MPEG is; how GOP settings affect MPEG video file size; how GOP settings affect MPEG picture quality; what streaming video and progressive download are.
MPEG: Moving Picture Experts Group. A committee that develops standards for encoding video. The standards allow high compression. Standards: MPEG-1, MPEG-2, MPEG-4.
What Happened to MPEG-3? It is NOT MP3 (which is an audio format). It was intended for HDTV. The HDTV specifications were merged into MPEG-2.
MPEG-1: Video quality comparable to VHS. Originally intended for Web and CD-ROM playback. Frame sizes up to 352 × 240 pixels. Video format for VCD (VideoCD) before DVD became widespread.
MPEG-2: Supports DVD-video, HDTV, and HDV standards. For DVD video production: export video into DVD MPEG-2 format. For HDV video production: export video into HDV's MPEG-2 format.
MPEG-4: Newer standard of the MPEG family. Different encoding approach from MPEG-1 and MPEG-2 (discussed after MPEG-1 and MPEG-2 compression in this lecture).
How the Compression Works: MPEG-1 and MPEG-2
Properties of Typical Video: Neighboring frames are very similar. This property is called temporal redundancy. MPEG compression exploits temporal redundancy to reduce video file size by looking for motion differences from one frame to the next. This technique is called motion compensation.
Basic Ideas of Motion Compensation: A video frame image is read in as a reference frame. The next video frame image is read. The image content of this current frame (the target frame) is compared with that of the reference frame one block of pixels at a time.
Comparing Reference and Target Frames, Case 1: If a pixel block is identical at the same location in both frames, there is no need to encode the target frame's pixel block. Just save an instruction to refer to the block in the reference image, which requires less space than encoding the whole pixel block.
Comparing Reference and Target Frames, Case 2: If a pixel block at the same location in both frames is not identical, search the reference image for a match (because the content may have moved to another location). Case 2a: No match is found. Case 2b: A match is found.
Case 2a: No Match Is Found. The whole pixel block is encoded. Thus, there is no saving in file size.
Case 2b: A Match Is Found. The displacement information of the block is saved. Displacement information is a 2-dimensional value indicating how much the block moves horizontally and vertically from the reference frame to the target frame. It is called a motion vector. It is much smaller than the encoded whole pixel block (thus the saving in file size).
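The three cases can be sketched in a few lines of Python. This is a minimal illustration, not how a real MPEG encoder is written: the names `get_block` and `encode_block`, the `search_range` parameter, and the labels "same", "moved", and "intra" are all assumptions made for this example, and the search accepts only exact matches, as on the slides.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale pixel values, indexed as frame[row][col]


def get_block(frame: Frame, top: int, left: int, size: int) -> List[List[int]]:
    """Return the size x size pixel block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def encode_block(reference: Frame, target: Frame, top: int, left: int,
                 size: int, search_range: int
                 ) -> Tuple[str, Optional[Tuple[int, int]]]:
    """Classify one target block (labels are hypothetical, for illustration).

    Returns ("same", None) for Case 1, ("moved", (dx, dy)) for Case 2b,
    and ("intra", None) for Case 2a.
    """
    block = get_block(target, top, left, size)

    # Case 1: identical block at the same location in the reference frame.
    if get_block(reference, top, left, size) == block:
        return ("same", None)

    # Case 2: search nearby positions in the reference frame for an exact match.
    height, width = len(reference), len(reference[0])
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ref_top, ref_left = top - dy, left - dx
            if 0 <= ref_top <= height - size and 0 <= ref_left <= width - size:
                if get_block(reference, ref_top, ref_left, size) == block:
                    # Case 2b: the block moved by (dx, dy) pixels from the
                    # reference frame to the target frame (the motion vector).
                    return ("moved", (dx, dy))

    # Case 2a: no match was found, so the whole pixel block must be encoded.
    return ("intra", None)
```

Only the "intra" result costs a full block of pixel data; "same" and "moved" store just a reference or a two-number motion vector, which is where the file-size saving comes from.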
Illustration of the Ideas: Let's consider the first 4 frames of a video. [Figure: Frames 1 to 4] Each square grid cell in a frame represents a block of pixels.
Illustration of the Ideas: Frame 1 is read in. [Figure: Frames 1 to 4]
Illustration of the Ideas: Frame 1 is used as a reference frame. [Figure: Frames 1 to 4] The red color highlights the pixel blocks that are encoded. The whole reference frame is encoded. No saving in file size.
Illustration of the Ideas: Frame 2 is read in. [Figure: Frames 1 to 4]
Illustration of the Ideas: Each pixel block in Frame 2 is searched for in Frame 1 to find a match. [Figure: Frames 1 to 4]
Illustration of the Ideas: Each pixel block in Frame 2 is searched for in Frame 1 to find a match. [Figure: Frames 1 to 4] The blue color highlights some pixel blocks that are found in Frame 1. The yellow color highlights some pixel blocks that are not found in Frame 1 (but later in the process they are found in the next reference frame).
Illustration of the Ideas: Frame 3 is read in. [Figure: Frames 1 to 4]
Illustration of the Ideas: Each pixel block in Frame 3 is searched for in Frame 1 to find a match. [Figure: Frames 1 to 4] The process is similar to that for Frame 2.
Illustration of the Ideas about Motion Vector: Take Frames 1 and 2 as an example. [Figure: Frame 1 and Frame 2]
Illustration of the Ideas about Motion Vector: Let's look at the pixel block showing the car's windshield. [Figure: Frame 1 and Frame 2]
Illustration of the Ideas about Motion Vector: Let's arrange the frames vertically to see the displacement more easily. [Figure: Frame 1 above Frame 2]
Illustration of the Ideas about Motion Vector: The red line indicates the displacement of the pixel block. [Figure: Frame 1 above Frame 2]
So how are reference frames chosen? We first need to understand an important concept: the group of pictures (GOP).
Motivation for Understanding GOP. To understand: (1) the terminology related to GOP, because it is used by MPEG, which in turn is used by DVD video and HDTV, and because it comes up in video production application dialog boxes when you export video into MPEG format; (2) the impact of the GOP parameters on video file size and quality; (3) why some video editing programs may not support frame-accurate editing of MPEG.
Group of Pictures (GOP): Specifies the grouping structure of frames. Frame types in a structure: I-frames, P-frames, B-frames. An MPEG video contains one or more repeating GOPs.
Number of Frames in a GOP: Each GOP in a video has a fixed number of frames. The number is the N parameter. Examples: DVD-compliant MPEG-2: N = 15. HDV: N = 15.
I-frames: Stands for intraframes. Encoded using only the information within the frame (intracoding). Use spatial compression, similar to JPEG compression. No temporal compression. Least compressed of the three types. The reference frame in the previous 4-frame example.
I-frames: Each GOP starts with an I-frame. Each GOP has only one I-frame.
P-frames: Stands for predicted frames. Encoded using the information from the previous I- or P-frame as the reference frame if a match of a pixel block is found. The 2nd and 3rd frames in the previous 4-frame example.
B-frames: Stands for bidirectional frames. Located between I- and P-frames. Encoded using information from the previous and subsequent I- and/or P-frames as the reference frames.
Example GOP, DVD-compliant MPEG-2: I B B P B B P B B P B B P B B. N = 15, i.e., number of frames in a GOP = 15.
Example GOP, DVD-compliant MPEG-2: I B B P B B P B B P B B P B B. M = 3, i.e., number of frames between non-B-frames plus one = 2 + 1 = 3.
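To see how N and M together determine the frame sequence, here is a minimal sketch that builds one GOP from the two parameters. The function name `gop_pattern` is made up for this example, and it assumes the simple regular structure shown on the slides (one I-frame at the start, then a P-frame every M frames).

```python
def gop_pattern(n: int, m: int) -> str:
    """Build one GOP: an I-frame first, a P-frame every m frames, B-frames elsewhere.

    n is the total number of frames in the GOP (the N parameter).
    m is one plus the number of B-frames between non-B-frames (the M parameter).
    """
    if n < 1 or m < 1:
        raise ValueError("n and m must be positive")
    frames = []
    for position in range(n):
        if position == 0:
            frames.append("I")      # every GOP starts with its only I-frame
        elif position % m == 0:
            frames.append("P")      # non-B-frames are spaced m frames apart
        else:
            frames.append("B")      # everything in between is a B-frame
    return " ".join(frames)


print(gop_pattern(15, 3))  # I B B P B B P B B P B B P B B
```

With N = 15 and M = 3 this reproduces the DVD-compliant example above.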
Revisit the 4-frame video example: Suppose N = 15, M = 3.
Example GOP: N = 15, M = 3. I B B P B B P B B P B B P B B. [Figure: Frame 1]
Example GOP: N = 15, M = 3. I B B P B B P B B P B B P B B. [Figure: Frame 1] I-frame: least compressed.
Example GOP: N = 15, M = 3. I B B P B B P B B P B B P B B. [Figure: Frames 1 to 4] P-frame (Frame 4): based on changes from Frame 1.
Example GOP: N = 15, M = 3. I B B P B B P B B P B B P B B. [Figure: Frames 1 to 4] B-frames (Frames 2 and 3): most compressed. Based on changes from Frames 1 and 4.
How GOP Settings Affect File Size: Shorter GOP (i.e., lower N): larger file size. Rationale: An MPEG-2 video consists of a repeating GOP structure. Each GOP contains one I-frame. I-frames are the least compressed of the three frame types, and thus take up more storage space. A shorter GOP means more GOPs in an MPEG-2 video. More GOPs means more I-frames in the video.
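The rationale above can be checked with a little arithmetic. The sketch below counts I-frames for a given N and compares rough total sizes. The per-frame costs in `ASSUMED_FRAME_COST` are invented, unitless numbers chosen only so that I > P > B; they are not measurements from any real encoder, and all function names are hypothetical.

```python
import math
from typing import Dict, Optional

# Hypothetical relative storage cost per frame type (illustrative only).
ASSUMED_FRAME_COST: Dict[str, int] = {"I": 100, "P": 40, "B": 15}


def count_i_frames(total_frames: int, n: int) -> int:
    """Each GOP of n frames holds exactly one I-frame."""
    return math.ceil(total_frames / n)


def frame_type(index: int, n: int, m: int) -> str:
    """Frame type at a 0-based index, for a regular GOP with parameters n and m."""
    position = index % n
    if position == 0:
        return "I"
    return "P" if position % m == 0 else "B"


def estimated_size(total_frames: int, n: int, m: int,
                   frame_cost: Optional[Dict[str, int]] = None) -> int:
    """Sum the assumed cost of every frame in the video."""
    cost = frame_cost or ASSUMED_FRAME_COST
    return sum(cost[frame_type(i, n, m)] for i in range(total_frames))


# 900 frames is 30 seconds of video at 30 frames per second.
for n in (15, 6):
    print(n, count_i_frames(900, n), estimated_size(900, n, 3))
```

Under these assumed costs, lowering N from 15 to 6 raises the I-frame count from 60 to 150 and the estimated total grows with it, which matches the slide's claim that a shorter GOP gives a larger file.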
How GOP Settings Affect Picture Quality: Shorter GOP (i.e., lower N): better picture quality. Rationale: I-frames are compressed using only the information within the frame, rather than predicted from other frames, and a shorter GOP means more I-frames in the video.
How GOP Structure Affects Frame-Accurate Video Editing: Some video editing programs may not support frame-accurate editing of MPEG-2 because it is more complex than for other video formats. Rationale: The information for a P-frame depends on the information of its previous I- or P-frame. A B-frame depends on the information of its previous and subsequent I- or P-frames. Thus, it is more complex to trim frames out of an MPEG-2 video than out of other video formats.
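The dependencies described above can be made concrete with a short sketch. It assumes a regular GOP that starts with an I-frame, uses 0-based frame positions, and treats position len(pattern) as the next GOP's I-frame. The names `reference_frames` and `dependents` are hypothetical.

```python
from typing import List


def reference_frames(pattern: List[str], index: int) -> List[int]:
    """Return the positions of the frames that frame `index` is predicted from.

    Assumes pattern[0] is the I-frame. Position len(pattern) stands for the
    I-frame that starts the next GOP.
    """
    kind = pattern[index]
    if kind == "I":
        return []  # intracoded: needs no other frame
    # Nearest earlier I- or P-frame.
    previous = max(i for i in range(index) if pattern[i] in ("I", "P"))
    if kind == "P":
        return [previous]
    # B-frame: also needs the nearest later I- or P-frame.
    following = next(
        (i for i in range(index + 1, len(pattern)) if pattern[i] in ("I", "P")),
        len(pattern),
    )
    return [previous, following]


def dependents(pattern: List[str], index: int) -> List[int]:
    """Return the positions of frames that directly use frame `index` as a reference."""
    return [i for i in range(len(pattern)) if index in reference_frames(pattern, i)]


gop = list("IBBPBBPBBPBBPBB")
print(reference_frames(gop, 1))  # the first B-frame needs the I-frame and the first P-frame
print(dependents(gop, 3))        # frames that break if the first P-frame is cut
```

Cutting a single P-frame directly affects the B-frames on both sides of it and the next P-frame, so an editor cannot simply drop frames at an arbitrary point without re-encoding the frames around the cut.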
Applications of MPEG-4: Covers a wide range of data rates. Low end of the data rate: video playback on mobile devices. High end of the data rate: HDTV. Also used on handheld and portable game devices (e.g., Sony PSP).
MPEG-4 Coding Approach: Uses media objects. A scene may contain separate media objects. This is not the frame-based coding used in MPEG-1 and MPEG-2. Conventional frame-based video can be converted to MPEG-4 because a frame can be treated as a media object (a degenerate case).
Ways of Playing Video
Two Ways of Playing Video: Play from disk. Play over a network.
Play from Disk: An entire clip needs to be on disk before it can be played. Played from hard drive, CD, or DVD.
Play over a Network: The video can be played while it is being downloaded. Can also be played from disk. Two approaches: streaming video and progressive download.
Streaming Video: Plays video as soon as enough data has arrived. Examples: Streaming QuickTime, RealVideo, Windows Media Video (WMV).
Streaming Video: Requires a streaming server to stream video. Allows saving several different compression levels of a video in a single file. The server chooses the compression level to match the speed of the network connection. Buffering: wait time depends on network speed.
Progressive Download: Plays video as soon as enough data has arrived. Does not require special servers. Example: QuickTime fast-start, created by saving the QuickTime movie as self-contained using QuickTime Pro.
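As a rough illustration of the server choosing a compression level, the sketch below picks the highest encoded data rate that fits the connection. This is an assumed selection rule, not the actual logic of any particular streaming server. The function name `choose_bitrate` and the data rates in the example are made up.

```python
from typing import Sequence


def choose_bitrate(available_kbps: Sequence[int], connection_kbps: float) -> int:
    """Pick the highest encoded data rate that does not exceed the connection speed.

    Falls back to the lowest available rate if even that one is too fast for
    the connection (the viewer then waits longer while the video buffers).
    """
    fitting = [rate for rate in available_kbps if rate <= connection_kbps]
    return max(fitting) if fitting else min(available_kbps)


# Hypothetical file holding three compression levels of the same video.
levels = [300, 700, 1500]
print(choose_bitrate(levels, 1000))  # 700
```

A viewer on a faster connection would be served the 1500 kbps version from the same file, while a slow connection gets the most compressed one.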
Review Questions Note to instructor: Depending on your preference, you may want to go over the review questions at the end of this lecture as an instant review or at the beginning of next lecture to refresh students' memory of this lecture.
Review Question: True/False: MP3 audio is MPEG-3. Answer: False
Review Question: ___ provides a video quality comparable to VHS and is the file format for VCD. A. MPEG-1 B. MPEG-2 C. MPEG-3 D. MPEG-4. Answer: A
Review Question: ___ supports the DVD-video, HDV, and HDTV standards. A. MPEG-1 B. MPEG-2 C. MPEG-3 D. MPEG-4. Answer: B
Review Question: True/False: A typical MPEG-2 video consists of a repeating GOP structure. Answer: True
Review Question: Motion compensation is a key technique in ___ compression. A. asymmetric B. lossless C. lossy D. spatial E. temporal. Answer: E
Review Question: ___ is encoded using only the information within that frame. A. B-frame B. I-frame C. P-frame. Answer: B
Review Question: ___ is encoded using only the previous I- or P-frame as the reference frame. A. B-frame B. I-frame C. P-frame. Answer: C
Review Question: ___ is encoded using the previous and subsequent I- and/or P-frames as the reference frames. A. B-frame B. I-frame C. P-frame. Answer: A
Review Question: ___ is the least compressed among the three frame types. A. B-frame B. I-frame C. P-frame. Answer: B
Review Question: The N parameter of the GOP refers to ___. A. the number of B-frames in a GOP B. the number of I-frames in a GOP C. the number of P-frames in a GOP D. the total number of frames in a GOP E. one plus the number of frames between the I- and P-frame, the P- and P-frame, and the P- and next GOP's I-frame. Answer: D
Review Question: The M parameter of the GOP refers to ___. A. the number of B-frames in a GOP B. the number of I-frames in a GOP C. the number of P-frames in a GOP D. the total number of frames in a GOP E. one plus the number of frames between the I- and P-frame, the P- and P-frame, and the P- and next GOP's I-frame. Answer: E