Character Animation from 2D Pictures and 3D Motion Data ACM Transactions on Graphics 2007.

Outline: Introduction, Overview, Method (Joint Selection, Camera and Model Pose Determination, Initial Model Fitting, Animation), Restrictions, Future Work

In recent years, research on 2D image manipulation has received a huge amount of interest. Very powerful solutions for problems such as matting [Sun et al. 2004], image completion [Drori et al. 2003], texture synthesis [Efros and Leung 1999], and rigid image manipulation [Igarashi et al. 2005] have been presented. Based on these and similar methods, it has now become possible to explore interesting new ideas to reanimate still pictures, for example, as done by Chuang et al. See video

Why not reconstruct a textured 3D model? One would have to create a fully textured 3D model and adapt it per image; for highly detailed characters, the required manual model-adaptation process would be impractical, even before applying sophisticated 3D rendering. Moreover, simple 2D morphing and blending often leads to more convincing results than sophisticated 3D reconstruction and rendering. Hence, we present a purely image-based approach that combines realistic images with realistic 3D motion data in order to generate visually convincing animations of 2D characters.

The 3D motion data consists of a sequence of poses of a predefined human skeleton model. Each pose contains the respective positions and orientations of the skeleton bones and joints in 3D space.

Step 1: Reconstruct a camera model and a model pose that best fit the 2D character in the image. Step 2: Prepare the subsequent animation by initially fitting a generalized shape template to the user-selected pose of the character. Step 3: Render in real time, using projected 3D joint positions from different poses in the motion sequence to control the deformation of the 2D shape in image space.

Joint selection: establish the necessary correspondences between the subject's 2D shape and the 3D skeleton model.

Finding a camera projection model P: a variety of linear and nonlinear algorithms exist for this problem [Hartley and Zisserman 2003; Triggs et al. 2000], which estimate P by minimizing, for example, the reprojection error from 3D to 2D: min_P Σ_i ||x_i − P X_i||², where the X_i are the 3D joint positions and the x_i the corresponding 2D image points. However, since these methods often do not impose any constraints on P, the resulting projection generally does not correspond to a geometrically plausible Euclidean camera model.
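As a concrete sketch of this error term (a minimal NumPy version; the function name and argument layout are illustrative, and the correspondences would come from the user-selected joints):

```python
import numpy as np

def reprojection_error(P, X3d, x2d):
    """Sum of squared 2D distances between projected 3D joints and image points.

    P   : 3x4 camera projection matrix
    X3d : (n, 3) array of 3D joint positions
    x2d : (n, 2) array of corresponding 2D image positions
    """
    n = X3d.shape[0]
    Xh = np.hstack([X3d, np.ones((n, 1))])  # homogeneous 3D coordinates
    proj = (P @ Xh.T).T                     # (n, 3) homogeneous image points
    proj = proj[:, :2] / proj[:, 2:3]       # dehomogenize
    return np.sum((proj - x2d) ** 2)
```

Any of the linear or nonlinear estimators mentioned above can be seen as minimizing this quantity over P.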

Parameterized camera projection P = K R [Id | −C], where
K: intrinsic calibration matrix
R: extrinsic rotation
C: camera center
δ_x and δ_y: scaling factors
(p_x, p_y): image center
s: skew factor
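Assembling P from these parameters can be sketched as follows (dx, dy, s, px, py mirror the intrinsics listed above; the exact factoring of focal length into the scaling factors is an assumption of this sketch):

```python
import numpy as np

def build_projection(dx, dy, s, px, py, R, C):
    """Assemble P = K R [Id | -C] from intrinsic and extrinsic parameters.

    dx, dy   : scaling factors (delta_x, delta_y)
    s        : skew factor
    (px, py) : image center
    R        : 3x3 extrinsic rotation
    C        : camera center in world coordinates
    """
    K = np.array([[dx,  s,   px],
                  [0.0, dy,  py],
                  [0.0, 0.0, 1.0]])
    Id_C = np.hstack([np.eye(3), -C.reshape(3, 1)])  # [Id | -C]
    return K @ R @ Id_C
```

With unit intrinsics, identity rotation, and the camera at the origin, this reduces to the canonical projection [Id | 0].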

To compute an optimized projection matrix P(Δf, α, β, γ, C) based on the remaining seven free parameters, we use an iterative Levenberg-Marquardt solver to minimize the reprojection error, starting from the aforementioned initial solution.
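A minimal, generic Levenberg-Marquardt loop (with numerical Jacobians) illustrates the solver used here; the damping schedule and iteration count are illustrative choices, not the paper's:

```python
import numpy as np

def levenberg_marquardt(residual_fn, p0, n_iter=50, lam=1e-3):
    """Minimize sum(residual_fn(p)**2) over p with damped Gauss-Newton steps."""
    p = np.asarray(p0, dtype=float)
    for _ in range(n_iter):
        r = residual_fn(p)
        # finite-difference Jacobian of the residual vector
        eps = 1e-6
        J = np.empty((r.size, p.size))
        for j in range(p.size):
            dp = np.zeros_like(p)
            dp[j] = eps
            J[:, j] = (residual_fn(p + dp) - r) / eps
        # damped normal equations: (J^T J + lam I) step = -J^T r
        A = J.T @ J + lam * np.eye(p.size)
        step = np.linalg.solve(A, -J.T @ r)
        if np.sum(residual_fn(p + step) ** 2) < np.sum(r ** 2):
            p = p + step
            lam *= 0.5   # success: trust the Gauss-Newton direction more
        else:
            lam *= 10.0  # failure: fall back toward gradient descent
    return p
```

For the camera problem, `residual_fn` would map the seven parameters to the stacked per-joint reprojection residuals.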

Finding the optimal 3D pose X_o: we couple the camera and model pose estimation into a single optimization problem with eight degrees of freedom by simply running the above algorithm over all poses contained in the motion data sequence, and taking the pose with the minimal reprojection error as X_o. Our algorithm might fail to converge to a proper solution when the user-defined pose is too different from any pose within the selected 3D motion data sequence. In such cases, the user has the option to deactivate model joints and to choose a specific, fixed field of view.
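The coupled search reduces to a loop over the sequence; `fit_camera` stands in for the per-pose camera optimization above and is assumed to return the fitted projection and its residual error:

```python
def best_pose(poses, fit_camera):
    """Return (X_o, P, error) for the pose with minimal reprojection error.

    poses      : iterable of skeleton poses from the motion sequence
    fit_camera : callable pose -> (P, reprojection_error)
    """
    best = None
    for pose in poses:
        P, err = fit_camera(pose)
        if best is None or err < best[2]:
            best = (pose, P, err)
    return best
```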

The basic idea is to use a set of generic shape templates T, which represent the topological changes of a generalized character model for different viewpoints of the camera. For a given image, a shape template can be selected automatically by exploiting the reconstructed extrinsic camera data R and C in relation to the best-matching model pose X_o.

For the deformation of T, we use the as-rigid-as-possible (ARAP) shape manipulation technique of Igarashi et al. [2005], with deformation steps T → I → F → D.

To generate textures for each animation layer, we decompose T' by a front-to-back depth-peeling algorithm. For each of these layers, including the background, we complete disoccluded texture regions using an interactive method for image completion [Pavić et al. 2006].

Since ARAP aims at rigidly preserving the original 2D triangle shapes, it does not lead to plausible deformations here: perspective scaling and distortion effects are completely ignored.

Our solution is to generalize the ARAP technique to an as-similar-as-possible (ASAP) technique: we estimate the triangles' perspective distortion and fit the distorted triangles f' to I. Each bone b = X_i − X_j is defined by two neighboring joints in a pose X; with it we associate a local coordinate frame L, which changes according to the bone's movement.

We first consider the triangles in T_o which are uniquely associated with a single bone b_o. The 2D vertices v of such a triangle are unprojected to 3D by mapping them to a point V_o on the plane spanned by the vectors l_x and l_y of b_o's local frame L_o. To avoid a cardboard effect in the photo animation at grazing viewing angles, we "inflate" the shape template by adding an offset αl_z to the 3D position V_o, where the factor α is zero for boundary vertices of the template T_o and increases for vertices closer to the bone.

V_o is expressed in local coordinates with respect to b_o. For another frame X_t of the motion data, we obtain the 3D position of a particular vertex by transforming its local coordinates back to global coordinates V_t and then projecting this point onto the image plane: v = P V_t.
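The local-to-global transfer and projection can be sketched as follows (the (u, v, w) local-coordinate layout along the frame axes is an illustrative assumption):

```python
import numpy as np

def transfer_vertex(local_uvw, origin_t, lx_t, ly_t, lz_t, P):
    """Map a vertex stored in a bone's local frame to the image for pose X_t.

    local_uvw              : (u, v, w) coordinates along the local frame axes
    origin_t               : bone frame origin at time t (3-vector)
    lx_t, ly_t, lz_t       : frame axes at time t (3-vectors)
    P                      : 3x4 camera projection matrix
    """
    u, v, w = local_uvw
    V_t = origin_t + u * lx_t + v * ly_t + w * lz_t  # back to global 3D
    Vh = P @ np.append(V_t, 1.0)                     # homogeneous projection
    return Vh[:2] / Vh[2]                            # image-plane position
```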

We simplify the rigid fitting step I → F by computing a closed-form solution for the optimal rotation which minimizes the squared error over the vertices p_i ∈ f and the vertices q_i of the corresponding triangle in I.
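This closed-form rotation is a 2D Procrustes fit; a sketch (centering at the centroids, which is a standard choice and an assumption here) reads:

```python
import numpy as np

def optimal_rotation_2d(p, q):
    """Closed-form 2D rotation minimizing sum ||R(p_i - p_c) - (q_i - q_c)||^2.

    p, q : (n, 2) arrays of corresponding vertex positions
    """
    p = p - p.mean(axis=0)  # center both point sets
    q = q - q.mean(axis=0)
    # optimal angle from summed dot and cross products of correspondences
    dot = np.sum(p * q)
    cross = np.sum(p[:, 0] * q[:, 1] - p[:, 1] * q[:, 0])
    theta = np.arctan2(cross, dot)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])
```

Because the optimum has a closed form, no iterative solve is needed per triangle.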

One restriction of our technique is that it obviously does not work for motions where the character changes its moving direction, or where it turns its head, for example. This would imply on-the-fly switching of the shape template and, even more importantly, resynthesizing the occluded textures would be much more difficult.

Future work: methods for automatic pose estimation would reduce the amount of user interaction when the 2D character is in a relatively common pose, such as standing or walking. Generating smooth transitions from the user-selected pose to the best-matching pose of the 3D motion, or allowing transitions between different 3D motions. Integrating global effects such as shadows or reflections to improve the visual appearance of some scenes. A larger set of 3D motions would allow us to animate animals, or even plants.

Depending on the mismatch between the contour of the template and the character in the input image, the boundaries of D are “snapped” to the character’s silhouette using a 2D-snakes approach [Kass et al. 1987]. Those parts of the silhouette which do not contain enough information for a stable convergence, such as occluded regions of the body, can be manually improved by the user. This refined shape is defined as the final aligned shape T’.