Answering ‘Where am I?’ by Nonlinear Least Squares

Presentation transcript:

Answering ‘Where am I?’ by Nonlinear Least Squares Daniel Eaton CPSC 542b December 5th, 2005 12/5/2005 Daniel Eaton

Motivation (1/4) “Where am I” - ICCV05 Programming Contest [Given some images of a city taken from known locations, estimate the locations of other images]. My entry was competitive until optimization became a critical component for success. Image Source: Rick Szeliski, ICCV ‘05

Motivation (2/4) My contest entry: dataset matches. Find the two most similar labeled images.

Motivation (3/4) Given multiple views of the same scene, one can do structure and motion reconstruction. Input: 2+ views and ‘interest points’ (visually discriminative parts of the image) matched across images. Output: the 3D positions of the corresponding world points and the location and orientation of the cameras, all up to an overall scale factor. This output is relative to a local coordinate system; with 3 views (2 labeled), one can rotate local to world coordinates and resolve the scale ambiguity.

Motivation (4/4) Sample input (2 view):

Formulation (1/1) Phrase structure and motion reconstruction as a nonlinear least squares problem. Each point correspondence is associated with a 3D point, which is projected into the 2+ images. The residual is the difference between the measured point location and the projected point location. We want to minimize the summed squared residual over all 3D points and projections, with respect to the camera and structure parameters. Projection is a nonlinear function.
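The formulation above can be sketched in a few lines. This is only an illustration under assumptions that are not in the slides: a simple pinhole camera described by a rotation R, translation t and focal length f, and made-up points and cameras. The function names are hypothetical.

```python
import numpy as np

def project(R, t, f, X):
    """Pinhole projection of 3D points X (N x 3) to image points (N x 2)."""
    Xc = X @ R.T + t                    # world -> camera coordinates
    return f * Xc[:, :2] / Xc[:, 2:3]   # perspective divide makes this nonlinear

def residuals(cameras, X, observations):
    """Measured minus projected locations, stacked over all views and points."""
    parts = [obs - project(R, t, f, X)
             for (R, t, f), obs in zip(cameras, observations)]
    return np.concatenate(parts).ravel()

def cost(cameras, X, observations):
    """Sum of squared reprojection residuals."""
    r = residuals(cameras, X, observations)
    return float(r @ r)

# Tiny two-view example with made-up cameras and points (assumed data).
rng = np.random.default_rng(0)
X_true = rng.uniform(-1, 1, size=(5, 3)) + np.array([0.0, 0.0, 5.0])
cameras = [(np.eye(3), np.zeros(3), 1.0),
           (np.eye(3), np.array([0.5, 0.0, 0.0]), 1.0)]
observations = [project(R, t, f, X_true) for R, t, f in cameras]

print(cost(cameras, X_true, observations))        # cost at the true structure
print(cost(cameras, X_true + 0.1, observations))  # cost at a perturbed structure
```

With two views and N points the residual vector has 4N entries (two image coordinates per view per point), which is where the row count of J quoted later comes from.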

Implementation (1/6) Used the Levenberg-Marquardt algorithm. Why? It has already been applied successfully to many vision problems, and it is easy to implement in Matlab (after deriving all the derivatives analytically). Adopted a heuristic for increasing/decreasing the trust region size (rather than the method introduced in class). Used an elliptical trust region to speed up convergence (hand tuned for these parameters).
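A minimal sketch of such a loop is given below. It is not the Matlab code from the talk: the factor-of-10 damping update and the diag(JᵀJ) scaling (one common way to get an elliptical trust region) are assumptions, and the exponential-fit toy problem stands in for the reconstruction problem.

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, p0, lam=1e-3, iters=50):
    """Basic Levenberg-Marquardt with a multiply/divide-by-10 damping heuristic."""
    p = np.asarray(p0, dtype=float)
    r = residual(p)
    for _ in range(iters):
        J = jacobian(p)
        A = J.T @ J
        # Damped normal equations; scaling the damping by diag(A) makes the
        # trust region elliptical rather than spherical.
        step = np.linalg.solve(A + lam * np.diag(np.diag(A)), -J.T @ r)
        r_new = residual(p + step)
        if r_new @ r_new < r @ r:
            p, r, lam = p + step, r_new, lam / 10.0  # accept: enlarge trust region
        else:
            lam *= 10.0                              # reject: shrink trust region
    return p

# Toy problem (assumed): fit y = a * exp(b * x) to noise-free samples.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * x)

def residual(p):
    return p[0] * np.exp(p[1] * x) - y

def jacobian(p):
    e = np.exp(p[1] * x)
    return np.column_stack([e, p[0] * x * e])  # analytic derivatives

p_hat = levenberg_marquardt(residual, jacobian, p0=[1.0, 0.0])
print(p_hat)
```

Small damping makes the step close to Gauss-Newton; large damping makes it a short, scaled gradient-descent step, which is why rejected steps increase `lam`.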

Implementation (2/6) The data is noisy and contains outliers, so a more robust cost function than L2 was needed. The Huber function defines a threshold on the residual: if the residual is greater than the threshold, the point is considered an outlier and the L1 error is used for it; if it is less, the L2 error is used. The function is once continuously differentiable. With Huber, convergence is slowed (and computation time increased), but accuracy in the estimated camera parameters is greatly increased.
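For reference, the Huber function in its usual textbook form can be written as below. The threshold name `delta` and the 1/2 scaling are conventions assumed here; the slides do not give the exact constants used.

```python
import numpy as np

def huber(r, delta):
    """Huber cost: quadratic (L2) inside the threshold, linear (L1) outside."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))

def huber_grad(r, delta):
    """Derivative of the Huber cost; continuous at |r| = delta."""
    return np.clip(r, -delta, delta)

r = np.array([-3.0, -1.0, 0.5, 1.0, 3.0])
print(huber(r, delta=1.0))
print(huber_grad(r, delta=1.0))
```

The two branches agree in both value and slope at the threshold, which is what makes the function once continuously differentiable; the second derivative jumps there, which is consistent with the slower convergence noted above.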

Implementation (3/6) Robust cost function comparison (error on synthetic tests):

Implementation (4/6) The inner loop involves solving the normal equations. Tried the Matlab \ and pinv functions, but they will not scale to large numbers of input points (N). J is approx. 4N x 3N, and both of the above cost O(N^3). Solution: find the same step p by minimizing an equivalent least-squares objective.
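The equations on this slide did not survive in the transcript. In the standard Levenberg-Marquardt setting they would plausibly take the following form, where the residual vector r, the damping parameter λ and the scaling matrix D are notation assumed here rather than taken from the slides:

```latex
% Damped normal equations for the step p
\left( J^\top J + \lambda D^\top D \right) p = -J^\top r

% Equivalent least-squares problem that never forms J^\top J
\min_{p} \; \left\|
  \begin{bmatrix} J \\ \sqrt{\lambda}\, D \end{bmatrix} p
  + \begin{bmatrix} r \\ 0 \end{bmatrix}
\right\|_2^2
```

Setting the gradient of the second expression to zero gives back the first, so both define the same step; the second form only requires products with J and its transpose, which suits an iterative sparse solver.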

Implementation (5/6) J is very sparse, so the minimization can be performed quickly with LSQR. Two-view case:
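As an illustration of the idea, the sketch below solves a damped step with SciPy's `lsqr` and checks it against a dense solve. It is an assumption-laden stand-in: the talk used Matlab, the Jacobian here is a random block-diagonal matrix covering only the structure (3D point) columns, and identity scaling is used for the damping.

```python
import numpy as np
from scipy.sparse import block_diag
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(0)
N = 100  # number of 3D points (made-up size)

# Each point contributes 4 residual rows (2 views x 2 coordinates) that depend
# only on its own 3 coordinates, giving a sparse 4N x 3N Jacobian.
J = block_diag([rng.standard_normal((4, 3)) for _ in range(N)], format="csr")
r = rng.standard_normal(4 * N)   # stand-in residual vector
lam = 0.1                        # damping parameter (assumed value)

# lsqr minimises ||J p - b||^2 + damp^2 ||p||^2; here b = -r, damp = sqrt(lam).
p = lsqr(J, -r, damp=np.sqrt(lam), atol=1e-12, btol=1e-12)[0]

# Dense reference: the damped normal equations, cubic cost in N.
Jd = J.toarray()
p_ref = np.linalg.solve(Jd.T @ Jd + lam * np.eye(3 * N), -Jd.T @ r)

print(J.shape, J.nnz, np.max(np.abs(p - p_ref)))
```

LSQR only needs matrix-vector products with J and its transpose, so its per-iteration cost grows with the number of nonzeros rather than with the full matrix size.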

Implementation (6/6) Unexpected benefit of using LSQR:

Results (1/6) Experiment 1: Mandarin orange box, two view Input:

Results (2/6) Experiment 1: Mandarin orange box, two view Output:

Results (3/6) Experiment 2: Mandarin orange box, three view (5am results) Input:

Results (4/6) Experiment 2: Mandarin orange box, three view (5am results) Output:

Results (5/6) What was I looking at?

Results (6/6) Input:

Conclusion (1/1) I now have the knowledge and infrastructure to complete my contest entry. I could have done much better in the contest with what I know now.

Questions? Thanks for listening!