Video Motion Interpolation for Special Effect Applications Timothy K. Shih, Senior Member, IEEE, Nick C. Tang, Joseph C. Tsai, and Jenq-Neng Hwang, Fellow, IEEE IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 41, NO. 5, SEPTEMBER 2011
Outline Introduction System Overview Motion Layer Segmentation and Tracking Motion Interpolation Using Video Inpainting Experimental Results Conclusion
Introduction
Background Video forgery (video falsifying): a technique for generating fake videos by altering, combining, or creating new video content. For instance, the outcome of a 100 m race in the Olympic Games is changed.
Introduction Example of video forgery: original video frame and falsified result.
Objective To create a forged video that is almost indistinguishable from the original video. To create special effects in video editing applications.
Introduction To change the content of a video, the following techniques are commonly used: object tracking, motion interpolation, video inpainting, and video layer fusing.
Introduction Contributions of this paper: 1) It is the first time that video forgery is attempted based on video inpainting techniques. 2) A new concept called guided inpainting for motion interpolation of video objects is proposed. 3) A guided quasi-3-D (i.e., X, Y, and time) video inpainting mechanism is proposed.
System Overview
System Overview 1) Motion Layer Segmentation: Separates background and tracked object. 2) Motion Prediction: Finds reference stick figure to predict cycle of motion. 3) Motion Interpolation: Motion analysis, patch assertion, and motion completion via inpainting.
System Overview 4) Background Inpainting: Inpaints background of different camera motions. 5) Layer Fusion: Merges an object layer and a background layer.
Motion Layer Segmentation and Tracking
Motion Layer Segmentation and Tracking Separate target objects from the background. Adopt the Mean Shift feature space analysis algorithm [2] for color region segmentation. [2] D. Comaniciu and P. Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 5, pp. 603–619, May 2002.
Initial segmentation of objects from their background
ALGORITHM: REFERENCE STICK FIGURE TRACKING Frame 0: manually selected region C0; Mean Shift algorithm gives C0’.
ALGORITHM: REFERENCE STICK FIGURE TRACKING Frame 1: revised fast tracking mechanism [6] gives bounding box B1. [6] K. Hariharakrishnan and D. Schonfeld, “Fast object tracking using adaptive block matching,” IEEE Trans. Multimedia, vol. 7, no. 5, pp. 853–859, Oct. 2005.
ALGORITHM: REFERENCE STICK FIGURE TRACKING Frame 1: Mean Shift algorithm gives B1’; comparing the color segments of C0’ (frame 0) and B1’ (frame 1) gives C1’.
ALGORITHM: REFERENCE STICK FIGURE TRACKING Frame 1: applying dilation to C1’ gives C1*.
ALGORITHM: REFERENCE STICK FIGURE TRACKING Comparing corresponding pixel p in C1* (frame 1) and C0 (frame 0), using L in LUV color space with L2 = 2: if (p in C1*) − (p in C0) > L2, exclude p in C1’; else keep p in C1’.
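The dilation and pixel comparison steps might be sketched as follows. This is only an illustration under several assumptions not stated on the slides: regions are sets of (x, y) pixels, "corresponding" pixels share the same coordinates in both frames, dilation grows the region by one pixel in the four axis directions, and pixels without a counterpart are dropped. The names `dilate` and `refine_region` are hypothetical.

```python
def dilate(region):
    """Grow a set of (x, y) pixels by one pixel in the four axis directions
    (an assumed structuring element)."""
    grown = set(region)
    for x, y in region:
        grown.update({(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)})
    return grown


def refine_region(c1_prime, lum0, lum1, l2=2.0):
    """Dilate C1' into C1*, then keep only pixels that pass the L test.

    lum0 and lum1 map (x, y) -> L value (LUV) in frame 0 and frame 1.
    A pixel is excluded when L in frame 1 minus L in frame 0 exceeds l2,
    which follows the signed comparison as written on the slide.
    """
    c1_star = dilate(c1_prime)
    kept = set()
    for p in c1_star:
        if p not in lum0 or p not in lum1:
            continue  # assumption: pixels with no counterpart are dropped
        if lum1[p] - lum0[p] > l2:
            continue  # luminance changed too much: exclude p
        kept.add(p)
    return kept
```

With a uniform frame in which a single neighbouring pixel brightens by more than L2, that pixel is the only one removed from the dilated region.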
Motion Segmentation Different parts of the target may move in different directions. Decomposing an object into different regions. Using a revised block searching algorithm [10] to compute the motion map. [10] J. Jia, Y.-W. Tai, T.-P. Wu, and C.-K. Tang, “Video repairing under variable illumination using cyclic motions,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 5, pp. 832–839, May 2006.
Motion Segmentation The Mean Shift color segmentation can also be revised to deal with motion segmentation: Based on blocks, not pixels Important for video inpainting Ghost shadows can be eliminated
Original video frame. Corresponding result of color segmentation by using [2]. Example of tracked object and estimated vectors.
Motion Interpolation Using Video Inpainting
Motion Interpolation Using Video Inpainting Motions of the target object need to be interpolated. Video inpainting is used to obtain the interpolated figures. Motion interpolation may create background holes.
Motion Interpolation of Target Objects A target object can be segmented into a layer. Motion interpolation is required to produce a slow motion of the target layer. Frames tn, tn+1, tn+2, tn+3: original, interpolated, interpolated, original.
General Inpainting Strategy To obtain the interpolated figures, a rule-based thinning algorithm [1] is used to obtain the stick figures of target objects. Stick figures are used to guide the selection of patches copied from the original video. [1] M. Ahmed and R. Ward, “A rotation invariant rule-based thinning algorithm for character recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 12, pp. 1672–1678, Dec. 2002.
General Inpainting Strategy Quasi-3-D video space (2-D plus time). Using 3-D patches in quasi-3-D video inpainting produces smooth movement.
Prediction and Interpolation of Cyclic Motion Consider the following scenario: 1) It is common for target objects to perform actions in a repeated cycle. 2) A stick figure can be used to estimate the relative positions of patches (e.g., head, body, and legs).
Prediction and Interpolation of Cyclic Motion Stick figures and the contours of target objects can be used to predict repeated cycles.
Prediction and Interpolation of Cyclic Motion Missing stick figure can be reproduced by: 1) Searching for similar reference stick figures in a repeated motion cycle 2) Interpolation of two known stick figures
ALGORITHM: REFERENCE STICK FIGURE SEARCHING
ALGORITHM: REFERENCE STICK FIGURE SEARCHING (timeline: x − r … x … x + r) r: the number of frames in a repeated cycle. The index range function of a given frame number x is defined as idx(x) = [x + r − 2, x + r − 1, x + r, x + r + 1, x + r + 2] ∪ [x − r − 2, x − r − 1, x − r, x − r + 1, x − r + 2].
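The index range function translates directly into code. The sketch below lists the candidate frames one cycle ahead of and one cycle behind frame x, each with a margin of two frames, as in the definition of idx(x). The `spread` and `num_frames` parameters are assumptions added for illustration; dropping indices that fall outside the video is not part of the stated definition.

```python
def idx(x, r, spread=2, num_frames=None):
    """Candidate frame indices for the reference stick figure of frame x.

    r is the number of frames in a repeated cycle. The result lists the
    window around x + r followed by the window around x - r.
    """
    forward = [x + r + d for d in range(-spread, spread + 1)]
    backward = [x - r + d for d in range(-spread, spread + 1)]
    candidates = forward + backward
    if num_frames is not None:
        # Assumption: indices outside the video are simply discarded.
        candidates = [i for i in candidates if 0 <= i < num_frames]
    return candidates


print(idx(40, 12))  # [50, 51, 52, 53, 54, 26, 27, 28, 29, 30]
```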
STICK FIGURE INTERPOLATION Panels: Oa; Ob; union of Oa and Ob; thinning result.
Motion Interpolation Algorithm Extends our image inpainting algorithm to motion interpolation. Considers a video as a 2-D plus time domain.
Motion Interpolation Algorithm Φ3 is a source space. Ω3 is a target space. Φ3 ∩ Ω3 = ∅ (an empty set).
ALGORITHM: PATCH ASSERTION
ALGORITHM: PATCH ASSERTION Example of patch assertion: patches on stick figure; result of patch assertion; contour ω of (a) (searched in the nearby motion cycle).
ALGORITHM: MOTION INTERPOLATION The main algorithm: let ∂Ω3 be a front surface on Ω3 adjacent to Φ3, where Φ3 is the source region and Ω3 is the target region.
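One way to picture the front surface ∂Ω3 is as the set of target voxels that touch the source space. The sketch below assumes voxels are (x, y, t) tuples and that "adjacent" means sharing a face (6-connectivity); neither choice is stated on the slides, and `front_surface` is a hypothetical name.

```python
# Face-sharing neighbours in (x, y, t); an assumed notion of adjacency.
NEIGHBORS_6 = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def front_surface(target, source):
    """Return the target voxels that have at least one neighbour in source.

    target and source are disjoint sets of (x, y, t) voxels, matching
    the requirement that the source and target spaces do not intersect.
    """
    front = set()
    for x, y, t in target:
        for dx, dy, dt in NEIGHBORS_6:
            if (x + dx, y + dy, t + dt) in source:
                front.add((x, y, t))
                break
    return front
```

For a 3-by-3-by-3 hole inside a 5-by-5-by-5 volume, every hole voxel except the central one lies on the front surface.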
ALGORITHM: MOTION INTERPOLATION Given a 3-D patch Ψp centered at the point p, let the priority of Ψp be P(p) = C(p) × D(p).
ALGORITHM: MOTION INTERPOLATION C(p): confidence term, the percentage of useful information inside a patch centered at p. The size of a 3-D patch is denoted as |Ψ3| = 27 pixels.
ALGORITHM: MOTION INTERPOLATION D(p): data term. Computes the percentage of edge pixels in the patch (instead of computing the isophote [3]). var(Ψp): the color variation of the patch [21]. [3] A. Criminisi, P. Perez, and K. Toyama, “Region filling and object removal by exemplar-based image inpainting,” IEEE Trans. Image Process., vol. 13, no. 9, pp. 1200–1212, Sep. 2004. [21] T. K. Shih, N. C. Tang, W.-S. Yeh, T.-J. Chen, and W. Lee, “Video inpainting and implant via diversified temporal continuations,” in Proc. 2006 ACM Multimedia Conf., Santa Barbara, CA, Oct. 23–27, 2006, pp. 133–136.
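The priority P(p) = C(p) × D(p) can be illustrated with a small sketch. It rests on several assumptions: voxels are (x, y, t) tuples, C(p) is read as the fraction of the 27-voxel patch that lies in the source space, and D(p) is reduced to the fraction of edge pixels in the patch. How var(Ψp) enters D(p) is not specified here, so it is left out. Filling the highest-priority patch first follows the exemplar-based approach of [3] and is assumed to carry over. All function names are hypothetical.

```python
from itertools import product


def patch(p, radius=1):
    """Voxels of the (2 * radius + 1)^3 patch centered at p = (x, y, t)."""
    x, y, t = p
    offsets = range(-radius, radius + 1)
    return [(x + dx, y + dy, t + dt) for dx, dy, dt in product(offsets, repeat=3)]


def confidence(p, source):
    """C(p): fraction of the patch that lies in the source (known) space."""
    cells = patch(p)
    return sum(q in source for q in cells) / len(cells)


def data_term(p, edges):
    """D(p), simplified: fraction of patch voxels marked as edge pixels."""
    cells = patch(p)
    return sum(q in edges for q in cells) / len(cells)


def priority(p, source, edges):
    """P(p) = C(p) * D(p)."""
    return confidence(p, source) * data_term(p, edges)


def next_patch_center(front, source, edges):
    """Pick the front voxel with the highest priority (assumed fill order)."""
    return max(front, key=lambda p: priority(p, source, edges))
```

A front voxel surrounded by known, edge-rich content therefore gets filled before one deep inside the hole, whose confidence is zero.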
ALGORITHM: MOTION INTERPOLATION Example of motion interpolation via inpainting
Inpainting Camera Motions Uses the mechanism proposed in [22]. Ensures that no “ghost shadows” are created in the background. Segments motions into different regions. The inpainted area in the previous frame needs to be incorporated. [22] T. K. Shih, N. C. Tang, and J.-N. Hwang, “Ghost shadow removal in multilayered video inpainting,” in Proc. IEEE 2007 Int. Conf. Multimedia Expo, Beijing, China, Jul. 2–5, pp. 1471–1474.
Layer Fusion Video layers need to be merged to produce the forged video. The fusion process merges an object layer and a background layer (with the contour of the object layer computed from the object tracking).
ALGORITHM: LAYER FUSION
ALGORITHM: LAYER FUSION Object layer: obj. Dilation area of obj: δobj. Corresponding area on background layer: δbkg. δobj pixel intensity → υL.
ALGORITHM: LAYER FUSION Histogram of the intensity difference in δobj and δbkg (axes: intensity difference, count; marked point υL at (−2, 70)).
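The histogram of intensity differences can be sketched as below. This only shows how such a histogram might be built. It assumes that δobj and δbkg cover the same pixel coordinates, that intensities are stored in dictionaries keyed by (x, y), and that the difference is taken as object minus background. How the resulting values are then used to blend the layers is not covered. The name `difference_histogram` is hypothetical.

```python
from collections import Counter


def difference_histogram(obj_intensity, bkg_intensity, ring):
    """Count intensity differences (object minus background) over a ring.

    obj_intensity and bkg_intensity map (x, y) -> integer intensity.
    ring is the set of pixels in the dilation area around the object.
    """
    return Counter(obj_intensity[p] - bkg_intensity[p] for p in ring)


# Tiny illustrative input; the values are made up.
ring = {(0, 0), (1, 0), (2, 0)}
obj = {(0, 0): 100, (1, 0): 98, (2, 0): 120}
bkg = {(0, 0): 102, (1, 0): 100, (2, 0): 110}
hist = difference_histogram(obj, bkg, ring)
peak_difference, peak_count = hist.most_common(1)[0]
```

The most frequent difference (here `peak_difference`) corresponds to the kind of peak that a point such as (−2, 70) would mark on the histogram.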
Experimental Results
Experimental Results Block size used in patch assertion: 10-by-10 pixels. Patch size used in motion interpolation: 3-by-3-by-3 pixels. Hardware: 2.1 GHz CPU with 2 GB RAM.
Evaluation Motion Layer Segmentation, Motion Prediction, Motion Interpolation, Background Inpainting, Layer Fusion
Limitations of the Proposed Mechanism Shadows cannot be tracked precisely. Only intensity is used to merge layers (chrominance information is not considered). A sophisticated 3-D reconstruction mechanism needs to be investigated. Slow motion can only be produced at half the original speed.
Conclusion
Conclusion This paper: proposes an interesting technology to alter the behavior of moving objects in a video; effectively extends the inpainting technique to a quasi-3-D space; allows a video to be separated into several layers, played at different speeds, and then merged.
Conclusion A series of difficult problems is solved. Solutions are successfully integrated. Whether a fake video looks real is mostly a subjective judgment. An authoring tool needs to be developed to allow users to specify the spatiotemporal fluctuation property.