Computational Photography and Video: Video Synthesis Prof. Marc Pollefeys.


Last Week: HDR

Schedule: Computational Photography and Video
24 Feb: Introduction to Computational Photography
3 Mar: More on Cameras, Sensors and Color (Assignment 1)
10 Mar: Warping, Mosaics and Morphing (Assignment 2)
17 Mar: Blending and Compositing (Assignment 3)
24 Mar: High Dynamic Range (Assignment 4)
31 Mar: Video Synthesis (Project proposals)
7 Apr: Easter holiday, no classes
14 Apr: TBD (Papers)
21 Apr: TBD (Papers)
28 Apr: TBD (Papers)
5 May: Project update
12 May: TBD (Papers)
19 May: Papers
26 May: Papers
2 June: Final project presentation

Breaking out of 2D. Now we are ready to break out of 2D. But must we go to full 3D? 4D?

Today’s schedule: Tour Into the Picture (slides borrowed from Alexei Efros, who built on Steve Seitz’s and David Brogan’s) and Video Textures (slides from Arno Schödl).

We want more of the plenoptic function. We want real 3D scene walk-throughs: camera rotation and camera translation. Can we do it from a single photograph? On to 3D…

Camera rotations with homographies. [Original image and virtual camera rotations; St. Petersburg photo by A. Tikhonov.]

Camera translation: does it work? [Figure: synthetic projection plane PP, with PP1 and PP2.]

Yes, with a planar scene (or one far away). PP3 is a projection plane of both centers of projection, so we are OK!

So, what can we do here? Model the scene as a set of planes! Now, just need to find the orientations of these planes.

Some preliminaries: projective geometry. Ames Room.

Silly Euclid: Trix are for kids! Parallel lines???

The projective plane. Why do we need homogeneous coordinates? They let us represent points at infinity, homographies, perspective projection, and multi-view relationships. What is the geometric intuition? A point in the image is a ray in projective space: each point (x, y) on the image plane is represented by the ray (sx, sy, s) through the camera center (0, 0, 0), and all points on the ray are equivalent: (x, y, 1) ≅ (sx, sy, s).
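As an aside (not from the slides), the ray equivalence can be sketched in a few lines of Python; the helper names are made up for illustration:

```python
# Homogeneous coordinates: an image point (x, y) corresponds to the ray (s*x, s*y, s).
def to_homogeneous(x, y):
    """Lift an image point to a homogeneous 3-vector (choosing s = 1)."""
    return (x, y, 1.0)

def from_homogeneous(p):
    """Map a homogeneous 3-vector back to the image plane (divide by s)."""
    x, y, s = p
    return (x / s, y / s)

# All points on the ray are equivalent: (x, y, 1) ~ (s*x, s*y, s).
p = to_homogeneous(2.0, 3.0)
q = (4.0, 6.0, 2.0)  # the same ray, sampled at s = 2
assert from_homogeneous(p) == from_homogeneous(q) == (2.0, 3.0)
```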

Projective lines. What does a line in the image correspond to in projective space? A line is a plane of rays through the origin: all rays (x, y, z) satisfying ax + by + cz = 0. A line can therefore also be represented as a homogeneous 3-vector l = (a, b, c).

Point and line duality. A line l is a homogeneous 3-vector, and it is perpendicular to every point (ray) p on the line: l · p = 0. What is the line l spanned by rays p1 and p2? l is perpendicular to both p1 and p2, so l = p1 × p2 (l is the plane normal). What is the intersection of two lines l1 and l2? p is perpendicular to both l1 and l2, so p = l1 × l2. Points and lines are dual in projective space: given any formula, we can switch the meanings of points and lines to get another formula.
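A small numeric sketch of this duality (the example points are made up, not from the slides):

```python
# Point/line duality on the projective plane: the line through two points is
# their cross product, and the intersection of two lines is the cross product
# of the two line vectors (both only up to scale).
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

p1, p2 = (0.0, 0.0, 1.0), (1.0, 1.0, 1.0)  # two image points (homogeneous)
l = cross(p1, p2)                           # line spanned by p1 and p2
assert dot(l, p1) == 0 and dot(l, p2) == 0  # l is perpendicular to both rays

l2 = (1.0, 0.0, -1.0)                       # the vertical line x = 1
p = cross(l, l2)                            # intersection of the two lines
assert (p[0]/p[2], p[1]/p[2]) == (1.0, 1.0)
```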

Ideal points and lines. An ideal point (“point at infinity”) p ≅ (sx, sy, 0) corresponds to a ray parallel to the image plane: it has infinite image coordinates. The ideal line l ≅ (0, 0, 1) is likewise parallel to the image plane.

Vanishing points. A vanishing point is the projection of a point at infinity (an ideal point). [Figure: camera center, image plane, ground plane, vanishing point.]

Vanishing points (2D). [Figure: camera center, image plane, a line on the ground plane, and its vanishing point.]

Vanishing points: properties. Any two parallel lines have the same vanishing point v. The ray from the camera center C through v is parallel to the lines. An image may have more than one vanishing point.

Vanishing lines. Multiple vanishing points: any set of parallel lines on the plane defines a vanishing point (e.g. v1, v2). The union of all of these vanishing points is the horizon line, also called the vanishing line. Note that different planes define different vanishing lines.

Computing vanishing points. A point on the line P(t) = P0 + tD goes to the point at infinity P∞ ≅ (Dx, Dy, Dz, 0) as t → ∞, and the vanishing point v is the projection of P∞. Both depend only on the line direction D, so parallel lines P0 + tD and P1 + tD intersect at the same P∞.
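A quick numeric illustration (a hypothetical unit-focal-length pinhole camera and made-up numbers): points further and further along the line project closer and closer to the image of the direction D.

```python
# Pinhole projection with focal length 1, camera at the origin:
# the images of P0 + t*D converge to the projection of the direction D itself.
def project(P):
    X, Y, Z = P
    return (X / Z, Y / Z)

P0, D = (1.0, 2.0, 4.0), (1.0, 0.0, 1.0)
pts = [project((P0[0] + t*D[0], P0[1] + t*D[1], P0[2] + t*D[2]))
       for t in (1.0, 10.0, 1000.0)]
v = project(D)  # vanishing point = image of the direction (depends only on D)
assert abs(pts[-1][0] - v[0]) < 1e-2 and abs(pts[-1][1] - v[1]) < 1e-2
```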

Computing vanishing lines. The vanishing line l is the intersection of the horizontal plane through the camera center C with the image plane. We can compute l from two sets of parallel lines on the ground plane. All points at the same height as C project onto l; points higher than C project above l. This provides a way of comparing the heights of objects in the scene.

Fun with vanishing points

“Tour into the Picture” (SIGGRAPH ’97) Horry, Anjyo, Arai Create a 3D “theatre stage” of five billboards Specify foreground objects through bounding polygons Use camera transformations to navigate through the scene

The idea: many scenes (especially paintings) can be represented as an axis-aligned box volume (i.e. a stage). Key assumptions: all walls of the volume are orthogonal; the camera view plane is parallel to the back of the volume; the camera up vector is normal to the volume bottom. How many vanishing points does the box have? Three, but two are at infinity, so this is single-point perspective. We can use the vanishing point to fit the box to the particular scene!

Fitting the box volume. The user controls the inner box and the vanishing point placement (# of DOF???). Q: What’s the significance of the vanishing point location? A: It’s at eye level: the ray from the COP to the VP is perpendicular to the image plane.

High Camera Example of user input: vanishing point and back face of view volume are defined

Low Camera Example of user input: vanishing point and back face of view volume are defined

High Camera vs. Low Camera: comparison of how the image is subdivided for two different camera positions. You should see how moving the vanishing point corresponds to moving the eyepoint in the 3D world.

Left Camera Another example of user input: vanishing point and back face of view volume are defined

Right Camera Another example of user input: vanishing point and back face of view volume are defined

Left Camera vs. Right Camera: comparison of two camera placements. The corresponding subdivisions match the view you would see if you looked down a hallway.

2D to 3D conversion. First, we can get ratios. [Figure: back plane with vanishing point; left/right and top/bottom measurements.]

2D to 3D conversion. The size of the user-defined back plane must equal the size of the camera plane (orthogonal sides). Use the top-versus-side ratio to determine the relative height and width dimensions of the box. The left/right and top/bottom ratios determine part of the 3D camera placement. [Figure: left/right, top/bottom, camera position.]

Depth of the box. We can compute it by similar triangles (CVA vs. CV’A’), but we need to know the focal length f (or the FOV). Note: we can compute the position of any object on the ground by simple unprojection. What about things off the ground?

DEMO. Now that we know the 3D geometry of the box, we can texture-map the box walls with texture from the image. (TIP demo: link to web page with example code.)

Foreground objects. Use a separate billboard for each. For this to work, three separate images are used: the original image, a mask to isolate the desired foreground objects, and the background with the objects removed.

Foreground objects. Add a vertical rectangle for each foreground object. We can compute the 3D coordinates P0, P1 since they lie on a known plane; P2, P3 can be computed as before (similar triangles).

Foreground TIP movie (UVA example).

See also: Tour into the Picture with water surface reflection, and Tour into the Video by Kang and Shin.

Today’s schedule: Tour Into the Picture (slides borrowed from Alexei Efros, who built on Steve Seitz’s and David Brogan’s) and Video Textures (slides from Arno Schödl).

Markov chains. The probability of going from state i to state j in n time steps is p_ij(n) = Pr(X_n = j | X_0 = i), and the single-step transition is p_ij = Pr(X_1 = j | X_0 = i). The n-step transition satisfies the Chapman–Kolmogorov equation: for any 0 &lt; k &lt; n, p_ij(n) = Σ_r p_ir(k) p_rj(n−k).
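A toy check of the Chapman–Kolmogorov property (the 2-state weather chain and its numbers are made up for illustration): the n-step transition matrix is the n-th power of the single-step matrix, so splitting n = k + (n − k) gives the same answer.

```python
# Chapman-Kolmogorov on a toy 2-state chain: P^(3) == P^(1) @ P^(2).
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][r] * B[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

P = [[0.9, 0.1],   # sunny -> sunny / rainy
     [0.5, 0.5]]   # rainy -> sunny / rainy

P2 = matmul(P, P)          # 2-step transition probabilities
P3 = matmul(P2, P)         # 3-step transition probabilities
P3_split = matmul(P, P2)   # Chapman-Kolmogorov with k = 1, n = 3
assert all(abs(P3[i][j] - P3_split[i][j]) < 1e-12
           for i in range(2) for j in range(2))
```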

Regular Markov chain: a class of Markov chains in which the starting state of the chain has little or no impact on p(X) after many steps.

Markov chain. What if we know today’s and yesterday’s weather?

Text synthesis. Shannon [’48] proposed a way to generate English-looking text using N-grams: assume a generalized Markov model; use a large text to compute the probability distribution of each letter given the N−1 previous letters; then, starting from a seed, repeatedly sample this Markov chain to generate new letters. It also works for whole words: WE NEED TO EAT CAKE.
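The sampling loop above can be sketched in a few lines (a minimal letter-level N = 3 model; the tiny corpus and function names are invented for illustration):

```python
import random

# Shannon-style N-gram synthesis (N = 3): learn the distribution of the next
# letter given the previous N-1 letters, then repeatedly sample from a seed.
def build_model(text, n=3):
    model = {}
    for i in range(len(text) - n + 1):
        ctx, nxt = text[i:i + n - 1], text[i + n - 1]
        model.setdefault(ctx, []).append(nxt)
    return model

def synthesize(model, seed, length, rng):
    out = seed
    while len(out) < length:
        choices = model.get(out[-2:])  # last N-1 = 2 letters of the output
        if not choices:                # no continuation seen in the corpus
            break
        out += rng.choice(choices)
    return out

corpus = "we need to eat cake. we need to eat. we need cake."
model = build_model(corpus)
text = synthesize(model, "we", 40, random.Random(0))
assert text.startswith("we") and len(text) <= 40
```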

Mark V. Shaney (Bell Labs) Results (using alt.singles corpus): – “As I've commented before, really relating to someone involves standing next to impossible.” – “One morning I shot an elephant in my arms and kissed him.” – “I spent an interesting evening recently with a grain of salt”

Video Textures. Arno Schödl, Richard Szeliski, David Salesin, Irfan Essa. Microsoft Research, Georgia Tech. (Link to local version; Gondry example.)

Still photos

Video clips

Video textures

Problem statement: video clip → video texture.

Our approach How do we find good transitions?

Finding good transitions. Compute the L2 distance D(i, j) between all pairs of frames (frame i vs. frame j). Similar frames make good transitions.

Markov chain representation Similar frames make good transitions

Transition costs. Transition from i to j if the successor of i is similar to j. Cost function: C(i→j) = D(i+1, j).

Transition probabilities. The probability of a transition, P(i→j), is inversely related to its cost: P(i→j) ∝ exp(−C(i→j) / σ²). A high σ yields more random transitions; a low σ stays closer to the original clip.
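A minimal sketch of this cost-to-probability mapping (the costs and σ values below are made up for illustration):

```python
import math

# P(i->j) ~ exp(-C(i->j) / sigma^2), normalized over the candidate frames j.
def transition_probs(costs, sigma):
    w = [math.exp(-c / sigma**2) for c in costs]
    total = sum(w)
    return [x / total for x in w]

costs = [0.1, 0.5, 2.0]  # C(i->j) for three candidate next frames j
p_low = transition_probs(costs, sigma=0.3)   # low sigma: nearly deterministic
p_high = transition_probs(costs, sigma=5.0)  # high sigma: close to uniform
assert abs(sum(p_low) - 1.0) < 1e-12
assert p_low[0] > p_high[0]                  # low sigma favors the cheapest jump
assert max(p_high) - min(p_high) < 0.1       # high sigma is nearly uniform
```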

Preserving dynamics

Cost for transition i → j, comparing a small window of frames around the jump: C(i→j) = Σ_k w_k D(i+k+1, j+k).

Preserving dynamics – effect. Cost for transition i → j: C(i→j) = Σ_k w_k D(i+k+1, j+k).

Dead ends. No good transition at the end of the sequence.

Future cost. Propagate future transition costs backward; iteratively compute the new cost F(i→j) = C(i→j) + α min_k F(j→k). This is a form of Q-learning.
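The backward propagation above can be sketched as a small value-iteration loop (the 3-frame cost matrix and the α value are made up; frame 2 is constructed as a dead end):

```python
# Future-cost iteration F(i->j) = C(i->j) + alpha * min_k F(j->k),
# run to (approximate) convergence; alpha < 1 makes the iteration contract.
def future_costs(C, alpha=0.9, iters=100):
    n = len(C)
    F = [row[:] for row in C]              # initialize F = C
    for _ in range(iters):
        m = [min(F[j]) for j in range(n)]  # min_k F(j->k) for each frame j
        F = [[C[i][j] + alpha * m[j] for j in range(n)] for i in range(n)]
    return F

C = [[0.2, 1.0, 5.0],
     [1.0, 0.2, 0.3],
     [9.0, 9.0, 9.0]]  # frame 2 is a dead end: every jump out is expensive
F = future_costs(C)
# The future cost penalizes transitions INTO the dead-end frame, not just
# the immediate jump: F adds the expensive escape from frame 2 to C[0][2].
assert F[0][2] > F[0][1] + 5
```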

Future cost – effect

Finding good loops Alternative to random transitions Precompute set of loops up front

Visual discontinuities Problem: Visible “Jumps”

Crossfading Solution: Crossfade from one sequence to the other.

Morphing. Interpolation task: compute correspondence between pixels of all frames, then interpolate pixel position and color in the morphed frame, based on [Shum 2000].

Results – crossfading/morphing

Jump cut vs. crossfade vs. morph (video).

Crossfading

Frequent jump & crossfading

Video portrait Useful for web pages

Video portrait – 3D: combine with IBR (image-based rendering) techniques.

Region-based analysis Divide video up into regions Generate a video texture for each region

Automatic region analysis

User-controlled video textures. The user selects a target frame range: slow / variable / fast.

Video-based animation. Like sprites in computer games: extract sprites from real video, then interactively control the desired motion. ©1985 Nintendo of America Inc.

Video sprite extraction Blue screen matting and velocity estimation

Video sprite control. The transition cost is augmented with a term that penalizes motion deviating from the user’s desired direction.

Video sprite control needs the future cost computation. Precompute future costs for a few angles, then switch between the precomputed angles according to user input. [GIT-GVU-00-11]

Interactive fish

Summary. Video clips → video textures: define a Markov process, preserve dynamics, avoid dead ends, and disguise visual discontinuities.

Discussion Some things are relatively easy

Discussion Some are hard

A final example

Michel Gondry train video

Video Sprites (link to summary).

Trainable Videorealistic Speech Animation. SIGGRAPH 2002 paper by Tony Ezzat, Gadi Geiger, and Tomaso Poggio, at the Center for Biological and Computational Learning, MIT. See also MikeTalk (from 1998) and Mary101 (the web page of this research).