
EE569 Digital Video Processing 1 Roadmap Introduction Intra-frame coding Inter-frame coding Object-based and scalable video coding* –Why object-based?


1 EE569 Digital Video Processing 1 Roadmap Introduction Intra-frame coding Inter-frame coding Object-based and scalable video coding* –Why object-based? motion segmentation, shape coding, R-D optimization –scalability issues Spatial/temporal/quality scalabilities

2 EE569 Digital Video Processing 2 Object-based Video Coding Waveform-based coding discussed so far uses a simple source model (e.g., H.261/263/264, MPEG-1/-2) –Does not consider the semantic content (e.g., objects and their shapes) of the video Object-based video coding identifies objects (or regions) in a video and encodes them. Potential benefits include –Improved coding efficiency –Improved visual quality (e.g., no blocking artifacts) –Content description –Content-based interactivity Also called “content-dependent video coding” –The buzzword for MPEG-4, but less successful than expected (so the important question is why it does not work so well)

3 EE569 Digital Video Processing 3 Essential Tasks in Object-based Video Coding Object/region segmentation –Separate pixels based on their color, texture, motion characteristics –Closely related to motion detection and segmentation –Intrinsically ill-defined and desperate for a breakthrough 2D shape modeling and coding –Not all shapes are equally probable –Subtle implications for video coding (hidden pitfalls) 2D texture modeling and coding –Extension of existing block-based MCP to region-based MCP –Deformable textures (tradeoff between spatial and temporal prediction)

4 EE569 Digital Video Processing 4 Object/Region Segmentation The major challenge in content/object-based coding Common approaches for segmentation in a still image: gray-level thresholding, clustering, edge detection, region growing, splitting and merging Object segmentation in video –Motion information can be utilized, but how? –Should we trust motion cues or spatial cues more?

5 EE569 Digital Video Processing 5 Motion-based Segmentation Motion-based segmentation: to segment an image using motion information –We can first estimate the motion field and then segment the motion field –However, estimation and segmentation are like two sides of the same coin

6 EE569 Digital Video Processing 6 A Mind-bothering Example [Figure: Frame 1, Frame 2] It is easy to convince yourself that the tree branches are moving, but how do we know the sky is still? What if it were also moving at the same speed (wouldn't we observe the same intensity patterns, because the sky is a smooth region)?

7 EE569 Digital Video Processing 7 Implications for Video Coding True motion representation might be useful to computer vision and motion perception, but it is not indispensable in video coding The fundamental reason lies in the relationship between motion representation and video coding: how to tolerate the uncertainty in motion? The same issue remains in object-based image coding: how to tolerate the uncertainty in shape? (we will discuss this in more detail later)

8 EE569 Digital Video Processing 8 Simplified Segmentation: Change Detection To detect the changing parts in a video from time $t_i$ to time $t_j$, we compute a difference image and threshold the difference by $T$: $d_{ij}(x,y) = 1$ if $|f(x,y,t_i) - f(x,y,t_j)| > T$, and $d_{ij}(x,y) = 0$ otherwise. $d_{ij}(x,y)$ can be further processed, e.g., to remove isolated 1’s, or to group 1’s that are close to each other
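The thresholding and clean-up steps on this slide can be sketched in a few lines. This is a minimal illustration, not the course's reference code: frames are assumed to be lists of rows of gray levels, and the names `change_mask` and `remove_isolated` as well as the 8-neighbour rule for "isolated" are assumptions made for the example.

```python
def change_mask(frame_i, frame_j, threshold):
    """Return d_ij: 1 where |f(x,y,t_i) - f(x,y,t_j)| > threshold, else 0."""
    return [
        [1 if abs(a - b) > threshold else 0 for a, b in zip(row_i, row_j)]
        for row_i, row_j in zip(frame_i, frame_j)
    ]


def remove_isolated(mask):
    """Clear every 1 that has no 8-connected neighbour equal to 1 (assumed rule)."""
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for y in range(h):
        for x in range(w):
            if mask[y][x] == 1:
                neighbours = sum(
                    mask[ny][nx]
                    for ny in range(max(0, y - 1), min(h, y + 2))
                    for nx in range(max(0, x - 1), min(w, x + 2))
                    if (ny, nx) != (y, x)
                )
                if neighbours == 0:
                    out[y][x] = 0
    return out


# Toy frames: two adjacent pixels change (kept), one lone pixel changes (removed).
frame_i = [[10, 10, 10, 10],
           [10, 10, 10, 10],
           [10, 10, 10, 10]]
frame_j = [[10, 90, 95, 10],
           [10, 10, 10, 10],
           [10, 10, 10, 80]]
mask = change_mask(frame_i, frame_j, threshold=20)
cleaned = remove_isolated(mask)
```

The choice of threshold and of the neighbourhood used for clean-up both trade missed changes against false alarms.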

9 EE569 Digital Video Processing 9 Change Detection: Pros and Cons Simple to implement; fast Detects all changes Detects even unwanted changes Positive and negative changes detected (occlusion) Difficult to quantify motion Requires a static reference frame

10 EE569 Digital Video Processing 10 Change Detection: An Example Monitor the traffic

11 EE569 Digital Video Processing 11 Without a Static Reference Frame Background extraction methods –Ad-hoc median detector (your CA#6) –To eliminate the impact of (small) moving objects, use the “robust estimator” approach to iteratively remove the outliers –More sophisticated approaches model the background with a mixture of Gaussian distributions and use graph-cut based optimization
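A per-pixel temporal median is one simple way to read the "median detector" bullet. The sketch below is an assumption about what such a detector could look like, not the assignment's solution; `median_background` is a hypothetical name and frames are lists of rows of gray levels.

```python
from statistics import median


def median_background(frames):
    """Estimate the background as the per-pixel median over all frames."""
    h, w = len(frames[0]), len(frames[0][0])
    return [[median(f[y][x] for f in frames) for x in range(w)] for y in range(h)]


# Five one-row frames: a bright object (200) crosses a flat background (10).
frames = [
    [[200, 10, 10, 10]],
    [[10, 200, 10, 10]],
    [[10, 10, 200, 10]],
    [[10, 10, 10, 200]],
    [[10, 10, 10, 10]],
]
background = median_background(frames)
```

The median works here because each pixel shows the background in more than half of the frames; a slow or large object would violate that assumption.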

12 EE569 Digital Video Processing 12 Simplified Segmentation: Global Motion Estimation Planar homography (feature-based) –Homogeneous coordinates –Conditions for planar homography –Homography estimation from feature correspondence Hierarchical model-based GME (feature-less) –Directly minimize an energy function (the MSE of MCP errors) –Solve the optimization problem in a coarse-to-fine fashion (more robust and efficient)
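As a small illustration of the homogeneous-coordinates bullet: a planar homography is a 3x3 matrix applied to (x, y, 1), followed by division by the third coordinate. The function name `apply_homography` and the example matrices are made up for this sketch.

```python
def apply_homography(H, x, y):
    """Map the point (x, y) through the 3x3 homography H (list of 3 rows)."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity")
    # Divide by the homogeneous coordinate to return to image coordinates.
    return xh / w, yh / w


# A pure translation is the special case with last row (0, 0, 1).
translation = [[1, 0, 5], [0, 1, -2], [0, 0, 1]]
moved = apply_homography(translation, 3, 4)

# A non-trivial last row produces the perspective division.
projective = [[1, 0, 0], [0, 1, 0], [0.5, 0, 1]]
warped = apply_homography(projective, 2, 4)
```

Because H is only defined up to scale, it has eight degrees of freedom, which is why four point correspondences are the usual minimum for estimating it.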

13 EE569 Digital Video Processing 13 Plane Homography

14 EE569 Digital Video Processing 14 Model-based GME Target function for minimization Solution: Gauss-Newton method where Bergen, J. R., Anandan, P., Hanna, K. J., and Hingorani, R. “Hierarchical Model-Based Motion Estimation.” In Proc. of the Second European Conference on Computer Vision, pp. 237-252, 1992

15 EE569 Digital Video Processing 15 Multi-resolution GME

16 EE569 Digital Video Processing 16 Numerical Example

17 EE569 Digital Video Processing 17 Summary for Change Detection and Global Motion Estimation Motion segmentation becomes relatively easier to solve when either the camera is still or the background objects lie on a plane Latest advances include joint motion segmentation and estimation using level-set methods (PDE-based formulation) Mansouri, A.-R. and Konrad, J., "Multiple motion segmentation with level sets," IEEE Transactions on Image Processing, vol. 12, no. 2, pp. 201-220, Feb. 2003

18 EE569 Digital Video Processing 18 2-D Shape Modeling and Coding Bitmap coding: a binary map specifying whether or not a pixel belongs to an object –A special case of the general alpha-map Contour coding: code only the contour of the object or the region –Chain codes –Polygon approximation –Spline approximation
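To make the chain-code bullet concrete, here is a sketch of an 8-direction (Freeman-style) chain code. The direction numbering and the image convention (y grows downward) are assumptions of this example; the slide does not specify them.

```python
# Code k maps to the step (dx, dy); y grows downward as in image coordinates.
DIRECTIONS = [(1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1)]
CODE_OF = {step: code for code, step in enumerate(DIRECTIONS)}


def chain_encode(points):
    """Encode a contour (list of 8-connected pixels) as start point + 3-bit codes."""
    codes = [
        CODE_OF[(x1 - x0, y1 - y0)]
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]
    return points[0], codes


def chain_decode(start, codes):
    """Rebuild the contour by walking the codes from the start point."""
    points = [start]
    for code in codes:
        dx, dy = DIRECTIONS[code]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points


# Closed contour of a 3x3 square, traversed clockwise on screen.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1), (0, 0)]
start, codes = chain_encode(square)
```

Each move costs 3 bits before entropy coding, compared with one bit per pixel of the whole bounding area for a bitmap.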

19 EE569 Digital Video Processing 19 Image Matting (Soft segmentation) Not for coding but for interactive editing

20 EE569 Digital Video Processing 20 2-D Texture Modeling and Coding* Shape-adaptive DCT Shape-adaptive wavelet transform

21 EE569 Digital Video Processing 21 Roadmap Introduction Intra-frame coding –Review of JPEG Inter-frame coding –Conditional Replenishment (CR) –Motion Compensated Prediction (MCP) Scalable video coding –3D subband/wavelet coding and recent trend

22 EE569 Digital Video Processing 22 Scalable vs. Multicast What is scalable coding? [Figure: Multicast: foreman.yuv is coded into foreman128k.cod, foreman256k.cod, foreman512k.cod and foreman1024k.cod. Scalable coding: foreman.yuv is coded into a single foreman.cod serving the rates 1024, 512, 256 and 128.]

23 EE569 Digital Video Processing 23 Spatial scalability 10111…0101000…110100

24 EE569 Digital Video Processing 24 Temporal scalability [Figure: bitstream 10111…0101000…110100; frames 0, 1, 2, 3, 4, 5, … at 30 Hz; frames 0, 2, 4, 6, 8, … at 15 Hz; frames 0, 4, 8, 12, … at 7.5 Hz]

25 EE569 Digital Video Processing 25 SNR (Rate) scalability [Figure: bitstream 10111…0101000…110100; PSNR_avg = 30 dB, PSNR_avg = 35 dB, PSNR_avg = 40 dB] PSNR_i: PSNR of frame i
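For reference, per-frame PSNR and its average over a sequence can be computed as below. The slide's own formula did not survive extraction, so taking PSNR_avg as the arithmetic mean of the per-frame values is an assumption; frames are flat lists of 8-bit samples and the function names are illustrative.

```python
import math


def psnr(original, decoded, peak=255):
    """PSNR in dB of one frame: 10 * log10(peak^2 / MSE)."""
    if len(original) != len(decoded) or not original:
        raise ValueError("frames must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


def average_psnr(originals, decodeds):
    """Assumed definition: mean of the per-frame PSNR values."""
    values = [psnr(o, d) for o, d in zip(originals, decodeds)]
    return sum(values) / len(values)


frames = [[100, 120, 140, 160], [90, 110, 130, 150]]
decoded = [[101, 119, 141, 159], [92, 108, 132, 148]]
per_frame = [psnr(o, d) for o, d in zip(frames, decoded)]
avg = average_psnr(frames, decoded)
```

Averaging the per-frame MSE first and converting to dB afterwards gives a different number, so it is worth stating which convention a reported PSNR uses.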

26 EE569 Digital Video Processing 26 Scalability via Bit-Plane Coding $A = \pm(a_0 + a_1 2 + a_2 2^2 + \dots + a_7 2^7)$, with a sign bit, $a_0$ the Least Significant Bit (LSB) and $a_7$ the Most Significant Bit (MSB). Examples: $A = 129$ gives sign $= +$, $a_0 a_1 a_2 \dots a_7 = 10000001$; sign $= -$, $a_0 a_1 a_2 \dots a_7 = 00110011$ gives $A = -(4 + 8 + 64 + 128) = -204$

27 EE569 Digital Video Processing 27 Why Is DPCM Bad for Scalability? [Figure: frames 1, 2, 3, … coded as I and P frames in a base layer and in enhancement layers 1 and 2 (I base, I enh1, I enh2); annotations: "suffer from drifting problem", "suffer from coding efficiency loss"]
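The two examples on this slide can be reproduced with a short sign-magnitude sketch. The function names and the `planes_received` parameter are illustrative assumptions; the point is that dropping the least significant planes still yields a coarser reconstruction.

```python
def to_bit_planes(value, num_bits=8):
    """Split value into a sign and bits [a_0, ..., a_{n-1}], a_0 being the LSB."""
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    if magnitude >= 1 << num_bits:
        raise ValueError("magnitude does not fit in num_bits")
    bits = [(magnitude >> k) & 1 for k in range(num_bits)]
    return sign, bits


def from_bit_planes(sign, bits, planes_received=None):
    """Reconstruct from only the planes_received most significant bit planes."""
    n = len(bits)
    if planes_received is None:
        planes_received = n
    lowest = n - planes_received
    return sign * sum(bits[k] << k for k in range(lowest, n))


sign_a, bits_a = to_bit_planes(129)    # slide example: 10000001
sign_b, bits_b = to_bit_planes(-204)   # slide example: 00110011
coarse = from_bit_planes(sign_b, bits_b, planes_received=2)  # MSB planes only
exact = from_bit_planes(sign_b, bits_b)
```

Sending planes from the MSB down gives an embedded bitstream: truncating it anywhere leaves a valid, lower-quality value.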

28 EE569 Digital Video Processing 28 Fine Granular Scalability (FGS) ~2dB gap H.264 with/without FGS option Foreman sequence (5fps) Base layer 20 kbps Enhancement layer variable bit-rate Efficiency gap

29 EE569 Digital Video Processing 29 3D Wavelet/Subband Coding t x y 2D spatial WT+1D temporal WT

30 EE569 Digital Video Processing 30 Wavelet Video Coder [Block diagram: original video frames 0 to 7; temporal wavelet transform producing bands H, LH, LLH and LLL; spatial wavelet transform; embedded quantization & entropy coding] [Taubman & Zakhor, 1994] [Ohm, 1994] [Choi & Woods, 1999] [Hsiang & Woods, VCIP ’99] ... and others

31 EE569 Digital Video Processing 31 Motion-Adaptive 3D Wavelet Transform Recall Haar transform Motion-adaptive Haar transform $W$, $W^{-1}$: forward and backward motion vector Lifting-based implementation

32 EE569 Digital Video Processing 32 Lifting [Figure: lifting structure with P and U steps and motion compensation. Analysis: even frames and odd frames in, low band and high band out. Synthesis: low band and high band in, even frames and odd frames out] [Secker & Taubman, 2001] [Popescu & Bottreau, 2001]
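A sketch of how a lifting-based, motion-adaptive Haar step can be organized follows. The slide's equations were lost, so this assumes the common unnormalized form (high = odd minus warped even; low = even plus half of the warped-back high band), reads P and U as predict and update, and uses a circular shift on 1-D "frames" as a hypothetical stand-in for the motion operators $W$ and $W^{-1}$.

```python
def shift(frame, d):
    """Circularly shift a 1-D frame by d samples (stand-in for motion warping)."""
    n = len(frame)
    return [frame[(i - d) % n] for i in range(n)]


def analyze(even, odd, d=0):
    """Predict step gives the high band, update step gives the low band."""
    high = [o - p for o, p in zip(odd, shift(even, d))]        # odd - W(even)
    low = [e + u / 2 for e, u in zip(even, shift(high, -d))]   # even + W^-1(high)/2
    return low, high


def synthesize(low, high, d=0):
    """Undo the lifting steps in reverse order; exact for any invertible-or-not warp."""
    even = [l - u / 2 for l, u in zip(low, shift(high, -d))]
    odd = [h + p for h, p in zip(high, shift(even, d))]
    return even, odd


# An "object" of height 8 moves one sample to the right between the two frames.
even = [0, 8, 0, 0]
odd = [0, 0, 8, 0]
low_mc, high_mc = analyze(even, odd, d=1)      # motion-aligned
low_plain, high_plain = analyze(even, odd)     # plain Haar, no motion
```

With the correct displacement the high band vanishes, which is the energy compaction the motion-adaptive transform is after; synthesis recovers both frames exactly because each lifting step is simply subtracted back out.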

33 EE569 Digital Video Processing 33 MC Wavelet Coding vs. H.264/AVC [Plot: luminance PSNR (dB) versus bit-rate (Mbps) for scalable MC 5/3 wavelet and non-scalable H.264/AVC. Sequence: Mobile CIF. H.264/AVC: high complexity RD control, CABAC, PBBPBBP..., 5 prev/3 future reference frames] Data courtesy of M. Flierl [Taubman & Secker, VCIP 2003] courtesy D. Taubman

34 EE569 Digital Video Processing 34 Wavelet Synthesis with Lossy Motion Vector [Block diagram: video in, motion estimator, MC wavelet transform, embedded encoding, decoder, inverse wavelet transform, video out; annotation: minimize $J = D + \lambda R$] [Taubman & Secker, ICIP03]

35 EE569 Digital Video Processing 35 R-D Performance with Lossy Motion Vector [Plot: video PSNR (dB) versus bit-rate (kbps) for CIF Foreman; curves: embedded wavelet coefficients with lossless motion, non-embedded single-rate, embedded wavelet coefficients with lossy motion] [Taubman & Secker, VCIP 2003] courtesy D. Taubman

36 EE569 Digital Video Processing 36 Surprising Success of ITU-T Rec. H.263 What H.263 was developed for: analog videophone ... and what it was used for: Internet video streaming.

37 EE569 Digital Video Processing 37 What is Streaming Video? Download mode: no delay bound Streaming mode: delay bound [Figure: data path from the source (cnn.com) through AccessSW nodes and domains A, B and C of the Internet to receivers 1 and 2 (RealPlayer)]

38 EE569 Digital Video Processing 38 Outline Challenges for quality video transport An architecture for video streaming –Video compression –Application-layer QoS control –Continuous media distribution services –Streaming server –Media synchronization mechanisms –Protocols for streaming media Summary

39 EE569 Digital Video Processing 39 Time-varying Available Bandwidth No bandwidth reservation [Figure: data path from the source (cnn.com) through AccessSW nodes and domains A and B to a receiver (RealPlayer) at 56 kb/s; cases R >= 56 kb/s and R < 56 kb/s]

40 EE569 Digital Video Processing 40 Time-varying Delay Delayed packets regarded as lost [Figure: data path from the source (cnn.com) through AccessSW nodes and domains A and B to a receiver (RealPlayer) at 56 kb/s]

41 EE569 Digital Video Processing 41 Effect of Packet Loss No retransmission [Figure: data path from the source through AccessSW nodes and domains A and B to the receiver; decoded video with no packet loss and with loss of packets]

42 EE569 Digital Video Processing 42 Unicast vs. Multicast Unicast Multicast Pros and cons?

43 EE569 Digital Video Processing 43 Heterogeneity For Multicast Network heterogeneity Receiver heterogeneity What quality? [Figure: source and receivers 1, 2 and 3 connected across domains A, B and C of the Internet via AccessSW nodes, a gateway, Ethernet and telephone networks, with link rates of 64 kb/s, 1 Mb/s and 256 kb/s]

44 EE569 Digital Video Processing 44 Outline Challenges for quality video transport An architecture for video streaming –Video compression –Application-layer QoS control –Continuous media distribution services –Streaming server –Media synchronization mechanisms –Protocols for streaming media Summary

45 EE569 Digital Video Processing 45 Architecture for Video Streaming

46 EE569 Digital Video Processing 46 Video Compression [Figure: a layered coder outputs Layer 0, Layer 1 and Layer 2; decoders (D) and adders combine them; rates shown: 1 Mb/s, 256 kb/s, 64 kb/s] Layered video encoding/decoding. D denotes the decoder.

47 EE569 Digital Video Processing 47 Application of Layered Video IP multicast [Figure: source and receivers 1, 2 and 3 connected across domains A, B and C of the Internet via AccessSW nodes, a gateway, Ethernet and telephone networks, with link rates of 64 kb/s, 1 Mb/s and 256 kb/s]

48 EE569 Digital Video Processing 48 Application-layer QoS Control Congestion control (using rate control): –Source-based, requires rate-adaptive compression or rate shaping –Receiver-based –Hybrid Error control: –Forward error correction (FEC) –Retransmission –Error resilient compression –Error concealment

49 EE569 Digital Video Processing 49 Congestion Control Window-based vs. rate control (pros and cons?) [Figure: window-based control; rate control]

50 EE569 Digital Video Processing 50 Source-based Rate Control

51 EE569 Digital Video Processing 51 Video Multicast How to extend source-based rate control to multicast? Limitation of source-based rate control in multicast Trade-off between bandwidth efficiency and service flexibility

52 EE569 Digital Video Processing 52 Receiver-based Rate Control IP multicast for layered video [Figure: source and receivers 1, 2 and 3 connected across domains A, B and C of the Internet via AccessSW nodes, a gateway, Ethernet and telephone networks, with link rates of 64 kb/s, 1 Mb/s and 256 kb/s]
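With layered video over IP multicast, each receiver can decide on its own how many layers to join. The sketch below assumes that the three rates in the figure (64 kb/s, 256 kb/s, 1 Mb/s) are cumulative rates after adding each layer and that 1 Mb/s means 1000 kb/s; both are assumptions, as is the function name.

```python
# Assumed cumulative rate (kb/s) after subscribing to layers 0, 0-1, and 0-2.
CUMULATIVE_RATES_KBPS = [64, 256, 1000]


def layers_to_subscribe(available_kbps, cumulative_rates=CUMULATIVE_RATES_KBPS):
    """Return how many layers (starting from the base layer) fit the bandwidth."""
    count = 0
    for rate in cumulative_rates:
        if rate <= available_kbps:
            count += 1
        else:
            break  # higher layers depend on lower ones, so stop at the first miss
    return count


modem = layers_to_subscribe(64)        # base layer only
dsl = layers_to_subscribe(300)         # base + first enhancement layer
ethernet = layers_to_subscribe(1000)   # all three layers
too_slow = layers_to_subscribe(56)     # not even the base layer fits
```

The source sends each layer once, and the heterogeneity is resolved at the receivers, which is the contrast with source-based rate control.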

53 EE569 Digital Video Processing 53 Error Control FEC –Channel coding –Source coding-based FEC –Joint source/channel coding Delay-constrained retransmission Error resilient compression Error concealment

54 EE569 Digital Video Processing 54 Channel Coding
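The content of this slide was an image that did not survive, so the following is not a reconstruction of it. It is only the simplest possible illustration of channel-coding FEC for packet loss: one XOR parity packet protects a group of k equal-length packets and lets the receiver rebuild any single lost packet. All names are made up for the example.

```python
def xor_parity(packets):
    """Build one parity packet as the byte-wise XOR of equal-length packets."""
    parity = bytearray(len(packets[0]))
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)


def recover(received, parity):
    """Rebuild the single missing packet (marked None) from the others + parity."""
    missing = [i for i, packet in enumerate(received) if packet is None]
    if len(missing) != 1:
        raise ValueError("this code can repair exactly one lost packet")
    rebuilt = bytearray(parity)
    for packet in received:
        if packet is not None:
            for i, byte in enumerate(packet):
                rebuilt[i] ^= byte
    repaired = list(received)
    repaired[missing[0]] = bytes(rebuilt)
    return repaired


sent = [b"vid0", b"vid1", b"vid2"]
parity = xor_parity(sent)
received = [b"vid0", None, b"vid2"]   # the second packet was lost in transit
repaired = recover(received, parity)
```

Stronger block codes can repair several losses per group at the cost of more redundancy and more delay before decoding.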

55 EE569 Digital Video Processing 55 Delay-constrained Retransmission
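One common way to state the delay constraint is as a receiver-side test: ask for a retransmission only if the repaired packet can still arrive before its playout deadline. The rule below is an assumed formulation (the slide's figure is lost), and the parameter names are illustrative; all times are in seconds.

```python
def should_request_retransmission(now, rtt, slack, playout_deadline):
    """Request a resend only if now + RTT + slack is before the playout deadline."""
    return now + rtt + slack < playout_deadline


# The packet is due for display at t = 0.20 s; a round trip takes 0.10 s.
in_time = should_request_retransmission(now=0.0, rtt=0.10, slack=0.02,
                                        playout_deadline=0.20)
too_late = should_request_retransmission(now=0.15, rtt=0.10, slack=0.02,
                                         playout_deadline=0.20)
```

When the test fails, the receiver falls back on error concealment instead of wasting bandwidth on a packet that would arrive after it is needed.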

56 EE569 Digital Video Processing 56 Outline Challenges for quality video transport An architecture for video streaming –Video compression –Application-layer QoS control –Continuous media distribution services –Streaming server –Media synchronization mechanisms –Protocols for streaming media Summary

58 EE569 Digital Video Processing 58 Continuous Media Distribution Services Content replication (caching & mirroring) Network filtering/shaping/thinning Application-level multicast (overlay networks)

59 EE569 Digital Video Processing 59 Caching What is caching? Why use caching? WWW means World Wide Wait? Pros and cons?

60 EE569 Digital Video Processing 60 Outline Challenges for quality video transport An architecture for video streaming –Video compression –Application-layer QoS control –Continuous media distribution services –Streaming server –Media synchronization mechanisms –Protocols for streaming media Summary

61 EE569 Digital Video Processing 61 Streaming Server Different from a web server –Timing constraints –Video-cassette-recorder (VCR) functions (e.g., fast forward/backward, random access, and pause/resume). Design of streaming servers –Real-time operating system –Special disk scheduling schemes

62 EE569 Digital Video Processing 62 Media Synchronization Why media synchronization? Example: lip-synchronization (video/audio)

63 EE569 Digital Video Processing 63 Protocols for Streaming Video Network-layer protocol: Internet Protocol (IP) Transport protocol: –Lower layer: UDP & TCP –Upper layer: Real-time Transport Protocol (RTP) & RTP Control Protocol (RTCP) Session control protocol: –Real-Time Streaming Protocol (RTSP): RealPlayer –Session Initiation Protocol (SIP): Microsoft Windows MediaPlayer; Internet telephony

64 EE569 Digital Video Processing 64 Protocol Stacks

65 EE569 Digital Video Processing 65 Summary Challenges for quality video transport –Time-varying available bandwidth –Time-varying delay –Packet loss An architecture for video streaming –Video compression –Application-layer QoS control –Continuous media distribution services –Streaming server –Media synchronization mechanisms –Protocols for streaming media

