Unwrap Mosaics: A new representation for video editing Alex Rav-Acha et al. In SIGGRAPH 2008 발표 이성호 2009 년 1 월 22 일.

Slides:

Advertisements

Similar presentations

Coherent Laplacian 3D protrusion segmentation Oxford Brookes Vision Group Queen Mary, University of London, 11/12/2009 Fabio Cuzzolin.

Advertisements

Bayesian Belief Propagation

For Internal Use Only. © CT T IN EM. All rights reserved. 3D Reconstruction Using Aerial Images A Dense Structure from Motion pipeline Ramakrishna Vedantam.

Investigation Into Optical Flow Problem in the Presence of Spatially-varying Motion Blur Mohammad Hossein Daraei June 2014 University.

Analysis of Contour Motions Ce Liu William T. Freeman Edward H. Adelson Computer Science and Artificial Intelligence Laboratory Massachusetts Institute.

Character Animation from 2D Pictures and 3D Motion Data ACM Transactions on Graphics 2007.

Face Alignment with Part-Based Modeling

MASKS © 2004 Invitation to 3D vision Lecture 7 Step-by-Step Model Buidling.

Modeling the Shape of People from 3D Range Scans

Shape From Light Field meets Robust PCA

GrabCut Interactive Image (and Stereo) Segmentation Carsten Rother Vladimir Kolmogorov Andrew Blake Antonio Criminisi Geoffrey Cross [based on Siggraph.

Forward-Backward Correlation for Template-Based Tracking Xiao Wang ECE Dept. Clemson University.

Robust Object Tracking via Sparsity-based Collaborative Model

Computer Vision Optical Flow

Boundary matting for view synthesis Samuel W. Hasinoff Sing Bing Kang Richard Szeliski Computer Vision and Image Understanding 103 (2006) 22–32.

A new approach for modeling and rendering existing architectural scenes from a sparse set of still photographs Combines both geometry-based and image.

Exchanging Faces in Images SIGGRAPH ’04 Blanz V., Scherbaum K., Vetter T., Seidel HP. Speaker: Alvin Date: 21 July 2004.

Final Class: Range Data registration CISC4/689 Credits: Tel-Aviv University.

A Study of Approaches for Object Recognition

Probabilistic video stabilization using Kalman filtering and mosaicking.

Direct Methods for Visual Scene Reconstruction Paper by Richard Szeliski & Sing Bing Kang Presented by Kristin Branson November 7, 2002.

Stereo & Iterative Graph-Cuts Alex Rav-Acha Vision Course Hebrew University.

Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Stereo Computation using Iterative Graph-Cuts

Linearizing (assuming small (u,v)): Brightness Constancy Equation: The Brightness Constraint Where:),(),(yxJyxII t  Each pixel provides 1 equation in.

COMP 290 Computer Vision - Spring Motion II - Estimation of Motion field / 3-D construction from motion Yongjik Kim.

3D Rigid/Nonrigid RegistrationRegistration 1)Known features, correspondences, transformation model – feature basedfeature based 2)Specific motion type,

Hand Signals Recognition from Video Using 3D Motion Capture Archive Tai-Peng Tian Stan Sclaroff Computer Science Department B OSTON U NIVERSITY I. Introduction.

Accurate, Dense and Robust Multi-View Stereopsis Yasutaka Furukawa and Jean Ponce Presented by Rahul Garg and Ryan Kaminsky.

David Luebke Modeling and Rendering Architecture from Photographs A hybrid geometry- and image-based approach Debevec, Taylor, and Malik SIGGRAPH.

Mosaics CSE 455, Winter 2010 February 8, 2010 Neel Joshi, CSE 455, Winter Announcements  The Midterm went out Friday  See to the class.

Multimodal Interaction Dr. Mike Spann

Prakash Chockalingam Clemson University Non-Rigid Multi-Modal Object Tracking Using Gaussian Mixture Models Committee Members Dr Stan Birchfield (chair)

Automatic Registration of Color Images to 3D Geometry Computer Graphics International 2009 Yunzhen Li and Kok-Lim Low School of Computing National University.

BACKGROUND LEARNING AND LETTER DETECTION USING TEXTURE WITH PRINCIPAL COMPONENT ANALYSIS (PCA) CIS 601 PROJECT SUMIT BASU FALL 2004.

Linearizing (assuming small (u,v)): Brightness Constancy Equation: The Brightness Constraint Where:),(),(yxJyxII t  Each pixel provides 1 equation in.

The Brightness Constraint

Exploitation of 3D Video Technologies Takashi Matsuyama Graduate School of Informatics, Kyoto University 12 th International Conference on Informatics.

Object Stereo- Joint Stereo Matching and Object Segmentation Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on Michael Bleyer Vienna.

High-Resolution Interactive Panoramas with MPEG-4 발표자 : 김영백 임베디드시스템연구실.

Detecting Curved Symmetric Parts using a Deformable Disc Model Tom Sie Ho Lee, University of Toronto Sanja Fidler, TTI Chicago Sven Dickinson, University.

Image stitching Digital Visual Effects Yung-Yu Chuang with slides by Richard Szeliski, Steve Seitz, Matthew Brown and Vaclav Hlavac.

#MOTION ESTIMATION AND OCCLUSION DETECTION #BLURRED VIDEO WITH LAYERS

CS 4487/6587 Algorithms for Image Analysis

1 Howard Schultz, Edward M. Riseman, Frank R. Stolle Computer Science Department University of Massachusetts, USA Dong-Min Woo School of Electrical Engineering.

Vehicle Segmentation and Tracking From a Low-Angle Off-Axis Camera Neeraj K. Kanhere Committee members Dr. Stanley Birchfield Dr. Robert Schalkoff Dr.

Raquel A. Romano 1 Scientific Computing Seminar May 12, 2004 Projective Geometry for Computer Vision Projective Geometry for Computer Vision Raquel A.

Joint Tracking of Features and Edges STAN BIRCHFIELD AND SHRINIVAS PUNDLIK CLEMSON UNIVERSITY ABSTRACT LUCAS-KANADE AND HORN-SCHUNCK JOINT TRACKING OF.

Non-Ideal Iris Segmentation Using Graph Cuts

Yizhou Yu Texture-Mapping Real Scenes from Photographs Yizhou Yu Computer Science Division University of California at Berkeley Yizhou Yu Computer Science.

Large-Scale Matrix Factorization with Missing Data under Additional Constraints Kaushik Mitra University of Maryland, College Park, MD Sameer Sheoreyy.

Linearizing (assuming small (u,v)): Brightness Constancy Equation: The Brightness Constraint Where:),(),(yxJyxII t  Each pixel provides 1 equation in.

Image-Based Rendering Geometry and light interaction may be difficult and expensive to model –Think of how hard radiosity is –Imagine the complexity of.

Representing Moving Images with Layers J. Y. Wang and E. H. Adelson MIT Media Lab.

Linearizing (assuming small (u,v)): Brightness Constancy Equation: The Brightness Constraint Where:),(),(yxJyxII t  Each pixel provides 1 equation in.

Motion estimation Parametric motion (image alignment) Tracking Optical flow.

Processing Images and Video for An Impressionist Effect Automatic production of “painterly” animations from video clips. Extending existing algorithms.

Energy minimization Another global approach to improve quality of correspondences Assumption: disparities vary (mostly) smoothly Minimize energy function:

Motion Estimation of Moving Foreground Objects Pierre Ponce ee392j Winter March 10, 2004.

Reconstruction of a Scene with Multiple Linearly Moving Objects Mei Han and Takeo Kanade CISC 849.

A Forest of Sensors: Using adaptive tracking to classify and monitor activities in a site Eric Grimson AI Lab, Massachusetts Institute of Technology

Robust Visual Motion Analysis: Piecewise-Smooth Optical Flow

Real Time Dense 3D Reconstructions: KinectFusion (2011) and Fusion4D (2016) Eleanor Tursman.

The Brightness Constraint

Dynamical Statistical Shape Priors for Level Set Based Tracking

Vehicle Segmentation and Tracking in the Presence of Occlusions

The Brightness Constraint

The Brightness Constraint

Analysis of Contour Motions

Presentation transcript:

Unwrap Mosaics: A new representation for video editing Alex Rav-Acha et al. In SIGGRAPH 2008 발표 이성호 2009 년 1 월 22 일

Abstract A new representation for video Modeling the image-formation process –From an object’s texture map to the image Modulated by an object-space occlusion mask –Recover “unwrap mosaic” Editing Re-composition –Into the original space –Resizing objects –Repainting textures –Copying/cutting/pasting objects –Attaching effects layers to deforming objects 2

Introduction Unwrap mosaics –A wide range of Editing –Deforming surface –Self-occlusion –Provides references for feature-point tracking 3

Reconstructiong 3D model Not easy Software packages –[2d3 Ltd. 2008; Thorm¨ahlen and Broszio 2008] Extensions to nonrigid scenes –[Bregler et al. 2000; Brand 2001; Torresani et al. 2008] Less reliable under occlusion 4

Dense reconstruction from video Restricted to rigid scenes –[Seitz et al. 2006] Using interactive tools –[Debevec et al. 1996; van den Hengel et al. 2007] –Also for rigid scenes Triangulation of the sparse points from nonrigid structure –Surprisingly troublesome 5

Unwrap mosaics Ro recover the object’s texture map –Rather than its 3D shape –Directly from video recovered texture map will be a –2D-to-2D mapping –Sequence of binary masks modeling occlusion –Unwrap mosaic 6

A video will typically be represented –by an assembly of several unwrap mosaics one per object, and one for the background. Edits –performed on mosaic itself –Without converting to 3D Main contribution –Recover the unwrap mosaic from images Energy minimization procedure 7

Figure 2: Reconstruction overview. Steps in constructing an unwrap mosaic representation of a video sequence. Steps 1 to 3a form an initial estimate of the model parameters, and step 3 is an energy minimization procedure which refines the model to subpixel accuracy. 8

Segmentation Segment the sequence into –independently moving objects “video cut and paste” [Li et al. 2005] Allow for user interaction 9

Tracking Recover the texturemap of –a deforming 3D object –from a sequence of 2D images texture map may be assumed to be constant –although the model is changing its shape Interestpoint detection and tracking –[Sand and Teller 2006, for example] 10

Embedding view the sparse point tracks –as a high dimensional projection of the 2D surface parameters 11

Mosaic stitching a map from the tracked points in each image –to the (u; v) parameter space. A variation of [Agarwala et al. 2004] –emerges naturally from the energy formulation. 12

Track refinement the mosaic is good enough –to create a reference template to match against the original frames reduces any drift that may have been present after the original tracking phase 13

Using the model for video editing Edit the texture map –For example by drawing on it –Warp it via the recovered mapping –Combine with the other layers of the original sequence Re-rendered mosaic will not exactly match the original sequence –warped by the 2D–2D mapping –masked by the occlusion masks –alpha-blended with the original image Remove layers –The removed area is filled in because the mapping is defined even in occluded areas. 14

Limitations Textured surfaces are required for point tracking –Low texture –One dimensional textures –motion blur The assumption of a smoothly varying smooth 3D surface –objects with significant protrusions the dinosaur in figure 11 The assumption of smoothly varying lighting –strong shadows will disrupt tracking limited to disc-topology objects –rotating cylinder will be reconstructed as a long tapestry see figure 11 15

The unwrap mosaic model Image generation model –how an image sequence is constructed from a collection of unwrap mosaics –a fitting problem Energy minimization –via nonlinear optimization Initial estimate for the –2D-2D mapping –Texture map –Can be obtained from sparse 2D tracking data. 16

17

18

Point-spread functions 19

20

21

22

23

24

Discrete energy formulation 25

26

Data cost 27

28

Constraints 29

Mapping smoothness 30

Visibility smoothness 31

Minimizing the energy 32

Minimizing over C: stitching 33

34

35

Reparametrization and embedding 36

37

38

Minimizing over w: dense mapping Simply use MATLAB’s griddata to interpolate –No guarantee of minimizing the original energy 39

Minimizing over w and b: dense mapping with occlusion 40

41

Lighting 42

User interaction and hinting Segmentatinon –the user selects objects in a small number of frames, –and the segmentation is propagated using optical flow or sparse tracks. mosaic coverage 43

44

Tuning parameters The robust kernel width τ –is set to match an estimate of image noise. –Set to 5/255 gray-levels Scale parameter τ3 –40 pixels Except the face, which had many outlier tracks spatial smoothness λwl –controls the amount of deformation of the mapping –Constant for all but the “boy” sequence 45

Results Synthetic sequence 46

Synthetic sequence There is no concept of a “ground truth” –visually evaluate the recovered mosaic (figure 7) About 30% of mosaic pixels visible –in any one frame Self occlusion near the nose 47

Face sequence Few high-contrast points coupled –with strong lighting 48

Giraffe sequence: foreground A logo –is placed on the foreground giraffe’s back and head With optical flow, –the annotation drifts by about 10 pixels in 30 frames, while the Unwrap mosaic –shows no visible drift. 49

Boy sequence –both sides of his head and torso are shown –variable focus –motion blur –considerable foreground occlusion Ear is doubled –could be fixed by editing as in section 4 50

Dinosaur sequence Not a smooth reparametrization –of the object’s true “texture map” iterative automatic segmentation –Might separate mosaics for each model component: arms, torso, tail Unwrap mosaic technique is –deformation agnostic 51

Related work Wang and Adelson’s paper [1994] –about how layers might apply to self-occludng objects Irani et al. [1995] –more elaborate parametric transformations Layered Depth Images [Shade et al. 1998] Still photos to mosaics [Bhat et al. 2007] –for rigid scenes Optic flow –energy-based formulation [Bruhn et al. 2005] 52

Related work Point tracking –Sand and Teller [2006] Fitting motion models –corresponding to multiple 3D motions Bhat et al. [2007] Discover texture coordinates –for predefined 3D surface models Zigelman et al. [2002] Use of energy formulations –to regularize models –and to make modelling assumptions coherent and explicit [Fleet et al. 2002, Brox et al. 2004, Bruhn et al. 2005] 53

Discussion Without 3D shape recovery –manipulations of the video which respect the three-dimensionaliy of the scene Viewing reconstruction –as an embedding of tracks into 2D Embedding would be unstable –For more constrained models –Large amounts of missing data that self-occlusion causes We use approximate algorithms –No guarantees of globally minimizing the overall energy 54

Generalizations Automatically segment the layers –Many existing methods could be used –Sparse point tracks are clustered into independent motions Optimizing each layer simultaneously To allow non-boolean masks b Apply matrix factorization to the input tracks –to recover a deformable 3D shape –can place constraints on the mapping which would allow outlier removal –allow exact visibility to be computed 55