Camera Calibration Course web page: vision.cis.udel.edu/cv March 24, 2003 Lecture 17
Announcements
No class Wednesday
Homework 3 due Wednesday
Midterm on Friday
–Focus on lecture material up to probability last week; use readings for depth and understanding, but I won’t go there for new topics
–Definitions, some calculations (e.g., convolution)
Outline
Estimating the camera matrix
–Least-squares
–Extracting the calibration matrix
–Nonlinear least-squares
Estimating radial distortion
The Camera Matrix P
The transformation performed by a pinhole camera on an arbitrary point $\mathbf{X}$ in world coordinates is written as:
$$\mathbf{x} = P\mathbf{X}$$
The 3 x 4 projective camera matrix P has 11 degrees of freedom (DOF): 5 intrinsic, 3 rotation, 3 translation
Applications: Football First-Down Line courtesy of Sportvision
Applications: Virtual Advertising courtesy of Princeton Video Image
First-Down Line, Virtual Advertising: How? The 3-D geometry of the line, advertising rectangle, etc. in world coordinates can be directly transformed into image coordinates for a given camera by projecting it with P
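As a rough illustration of that projection step, the sketch below maps the two endpoints of a world-space line segment into pixel coordinates. The intrinsics, pose, and endpoint values are made-up numbers, and `project` is a hypothetical helper name, not something from the lecture.

```python
import numpy as np

def project(P, X):
    """Project Nx3 world points to Nx2 pixel coordinates with a 3x4 matrix P."""
    Xh = np.hstack([X, np.ones((len(X), 1))])   # homogeneous 4-vectors
    xh = Xh @ P.T                               # homogeneous image points
    return xh[:, :2] / xh[:, 2:3]               # divide out the third coordinate

# Illustrative camera: every value here is an assumption
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                                   # camera axes aligned with world axes
t = np.array([[0.0], [0.0], [10.0]])
P = K @ np.hstack([R, t])

# Endpoints of a line lying in the world plane Z = 0
line_world = np.array([[-5.0, 1.0, 0.0],
                       [5.0, 1.0, 0.0]])
line_image = project(P, line_world)             # where to draw the line in the frame
```

Drawing the overlay then amounts to rendering a segment between the two rows of `line_image`.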
The Problem: Estimating P
Given a number of correspondences between 3-D points and their 2-D image projections $\mathbf{X}_i \leftrightarrow \mathbf{x}_i$, we would like to determine P such that $\mathbf{x}_i = P\mathbf{X}_i$ for all i
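A Calibration Target
(figure: calibration target with world axes X, Y, Z, a 3-D point $\mathbf{X}_i$ and its image projection $\mathbf{x}_i$; courtesy of B. Wilburn)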
Obtaining the Points
One method
–Edge detection (e.g., Canny) on image of calibration target
–Fit straight lines to detected edge segments
–Intersect lines to find corners
Another method
–Directly search for corners using feature detection
–Confirm with edge information
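The "intersect lines to find corners" step can be sketched with homogeneous coordinates, where both the line through two points and the intersection of two lines are cross products. This covers only the intersection step (edge detection and line fitting are assumed to have been done already), and the function names and sample points are hypothetical.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line through two 2-D points."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def intersect(l1, l2):
    """Intersection of two homogeneous lines, or None if they are (nearly) parallel."""
    x = np.cross(l1, l2)
    if abs(x[2]) < 1e-12:
        return None
    return x[:2] / x[2]

# Two fitted edges of a calibration square (sample values for illustration)
edge_a = line_through((0.0, 0.0), (4.0, 4.0))
edge_b = line_through((0.0, 4.0), (4.0, 0.0))
corner = intersect(edge_a, edge_b)
```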
Estimating P: The Direct Linear Transformation (DLT) Algorithm
Given a number of correspondences between 3-D points and their 2-D image projections $\mathbf{X}_i \leftrightarrow \mathbf{x}_i$, we would like to determine P such that $\mathbf{x}_i = P\mathbf{X}_i$ for all i
This is an equation involving homogeneous vectors, so $P\mathbf{X}_i$ and $\mathbf{x}_i$ only point in the same direction; they are not strictly equal
We can specify “same directionality” by using a cross product formulation:
$$\mathbf{x}_i \times P\mathbf{X}_i = \mathbf{0}$$
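DLT Camera Matrix Estimation: Preliminaries
Let $\mathbf{x}_i = (x_i, y_i, w_i)^T$ (remember that $\mathbf{X}_i$ has 4 elements)
Denoting the j-th row of P by $\mathbf{p}^{jT}$ (a 4-element row vector), we have:
$$P\mathbf{X}_i = \begin{pmatrix} \mathbf{p}^{1T}\mathbf{X}_i \\ \mathbf{p}^{2T}\mathbf{X}_i \\ \mathbf{p}^{3T}\mathbf{X}_i \end{pmatrix}$$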
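DLT Camera Matrix Estimation: Step 1
Then by the definition of the cross product, $\mathbf{x}_i \times P\mathbf{X}_i$ is given explicitly as:
$$\mathbf{x}_i \times P\mathbf{X}_i = \begin{pmatrix} y_i\,\mathbf{p}^{3T}\mathbf{X}_i - w_i\,\mathbf{p}^{2T}\mathbf{X}_i \\ w_i\,\mathbf{p}^{1T}\mathbf{X}_i - x_i\,\mathbf{p}^{3T}\mathbf{X}_i \\ x_i\,\mathbf{p}^{2T}\mathbf{X}_i - y_i\,\mathbf{p}^{1T}\mathbf{X}_i \end{pmatrix}$$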
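DLT Camera Matrix Estimation: Step 2
$\mathbf{p}^{jT}\mathbf{X}_i = \mathbf{X}_i^T\mathbf{p}^j$ (a scalar equals its own transpose), so we can rewrite the preceding as
$$\begin{pmatrix} y_i\,\mathbf{X}_i^T\mathbf{p}^3 - w_i\,\mathbf{X}_i^T\mathbf{p}^2 \\ w_i\,\mathbf{X}_i^T\mathbf{p}^1 - x_i\,\mathbf{X}_i^T\mathbf{p}^3 \\ x_i\,\mathbf{X}_i^T\mathbf{p}^2 - y_i\,\mathbf{X}_i^T\mathbf{p}^1 \end{pmatrix} = \mathbf{0}$$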
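DLT Camera Matrix Estimation: Step 3
Collecting terms, this can be written as a matrix product
$$\begin{pmatrix} \mathbf{0}^T & -w_i\mathbf{X}_i^T & y_i\mathbf{X}_i^T \\ w_i\mathbf{X}_i^T & \mathbf{0}^T & -x_i\mathbf{X}_i^T \\ -y_i\mathbf{X}_i^T & x_i\mathbf{X}_i^T & \mathbf{0}^T \end{pmatrix} \begin{pmatrix} \mathbf{p}^1 \\ \mathbf{p}^2 \\ \mathbf{p}^3 \end{pmatrix} = \mathbf{0}$$
where $\mathbf{0}^T = (0, 0, 0, 0)$. This is a 3 x 12 matrix times a 12-element column vector $\mathbf{p} = (\mathbf{p}^{1T}, \mathbf{p}^{2T}, \mathbf{p}^{3T})^T$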
DLT Camera Matrix Estimation: Step 4
There are only two linearly independent rows here
–The third row is obtained by adding $x_i$ times the first row to $y_i$ times the second and scaling the sum by $-1/w_i$
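DLT Camera Matrix Estimation: Step 4 (continued)
So we can eliminate one row to obtain the following linear matrix equation for the i-th pair of corresponding points:
$$\begin{pmatrix} \mathbf{0}^T & -w_i\mathbf{X}_i^T & y_i\mathbf{X}_i^T \\ w_i\mathbf{X}_i^T & \mathbf{0}^T & -x_i\mathbf{X}_i^T \end{pmatrix} \begin{pmatrix} \mathbf{p}^1 \\ \mathbf{p}^2 \\ \mathbf{p}^3 \end{pmatrix} = \mathbf{0}$$
Write this as $A_i\mathbf{p} = \mathbf{0}$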
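A small numerical check of Steps 3 and 4: the sketch below builds the 3 x 12 matrix of $\mathbf{x}_i \times P\mathbf{X}_i = \mathbf{0}$ for one correspondence, then keeps its first two rows as $A_i$. The camera matrix and the 3-D point are arbitrary values chosen for illustration, and `dlt_rows` is a hypothetical name.

```python
import numpy as np

def dlt_rows(X, x):
    """3x12 matrix of x × PX = 0 for one correspondence.
    X: homogeneous 4-vector, x: homogeneous 3-vector (x, y, w)."""
    xi, yi, wi = x
    O = np.zeros(4)
    return np.array([
        np.hstack([O, -wi * X, yi * X]),
        np.hstack([wi * X, O, -xi * X]),
        np.hstack([-yi * X, xi * X, O]),
    ])

# Arbitrary camera matrix and world point (assumed values)
P = np.array([[800.0, 0.0, 320.0, 100.0],
              [0.0, 800.0, 240.0, 50.0],
              [0.0, 0.0, 1.0, 10.0]])
X = np.array([1.0, 2.0, 3.0, 1.0])
x = P @ X                      # exact homogeneous projection

A3 = dlt_rows(X, x)            # all three rows of the cross product
A_i = A3[:2]                   # the two independent rows kept in Step 4
p = P.ravel()                  # (p1^T, p2^T, p3^T)^T

residual = A_i @ p             # zero (up to rounding) for the true P
rank = np.linalg.matrix_rank(A3, tol=1e-8)   # only two independent rows
```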
DLT Camera Matrix Estimation: Step 5
We need at least 5½ point correspondences to solve for $\mathbf{p}$
–Each point correspondence yields 2 equations (the two rows of $A_i$)
–There are 11 unknowns in the 3 x 4 homogeneous matrix P (represented in vector form by $\mathbf{p}$): 12 entries, less one for the arbitrary overall scale
Stack the $A_i$ to get the homogeneous linear system $A\mathbf{p} = \mathbf{0}$
DLT Camera Matrix Estimation: Step 6
Minimum number of correspondences
–Solve linear system exactly
6 or more correspondences: Overdetermined
–Seek best solution in least-squares sense
(figure courtesy of Vanderbilt U.)
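DLT Camera Matrix Estimation: Least-Squares
Want to solve $A\mathbf{p} = \mathbf{0}$
Don’t want the trivial solution $\mathbf{p} = \mathbf{0}$
–Can arbitrarily choose scale (since it’s a homogeneous vector), so require $\|\mathbf{p}\| = 1$
–Minimize $\|A\mathbf{p}\|$ subject to this constraint: compute the singular value decomposition (SVD) $A = UDV^T$ (a non-negative diagonal matrix between two orthogonal matrices) with the singular values sorted in descending order, and take $\mathbf{p}$ as the last column of V, i.e., the right singular vector of the smallest singular value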
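Putting the DLT steps together, a minimal NumPy sketch might look like the following. It assumes inhomogeneous inputs (so $w_i = 1$), skips the normalization that is advisable in practice, and uses a hypothetical function name; `numpy.linalg.svd` returns $V^T$ with singular values in descending order, so the last row of `Vt` is the last column of V.

```python
import numpy as np

def dlt_camera_matrix(X, x):
    """Estimate a 3x4 camera matrix from N >= 6 correspondences.
    X: Nx3 world points, x: Nx2 image points (w_i is taken to be 1)."""
    n = len(X)
    Xh = np.hstack([X, np.ones((n, 1))])       # homogeneous world points
    A = np.zeros((2 * n, 12))
    for i in range(n):
        xi, yi = x[i]
        # Row 1 of A_i: (0^T, -w X^T, y X^T) with w = 1
        A[2 * i, 4:8] = -Xh[i]
        A[2 * i, 8:12] = yi * Xh[i]
        # Row 2 of A_i: (w X^T, 0^T, -x X^T) with w = 1
        A[2 * i + 1, 0:4] = Xh[i]
        A[2 * i + 1, 8:12] = -xi * Xh[i]
    _, _, Vt = np.linalg.svd(A)
    p = Vt[-1]                                 # unit vector minimizing ||A p||
    return p.reshape(3, 4)                     # rows are p^1T, p^2T, p^3T
```

The result is defined only up to scale (and sign), so compare it with a reference matrix after dividing both by a common entry, for example the bottom-right one.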
Practical Considerations for P Estimation
Should have about 30 or more correspondences for a good solution
Normalize points beforehand for better numerical conditioning
–Subtract centroids, scale so that average distance from origin is $\sqrt{2}$ for 2-D points and $\sqrt{3}$ for 3-D points, with transformations T, U, respectively
–“Denormalize” solution P’ by applying inverse scaling, translation transformations: $P = T^{-1} P' U$
Degenerate configurations of 3-D points: No unique solution
–Types
  –Points are on union of plane and single line containing camera center
  –Points are on a twisted cubic (space curve) together with the camera center
–Other calibration methods avoid some of these limitations, particularly the co-planarity one
(figure courtesy of J. Bouguet)
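The normalization and denormalization steps can be sketched as below. The same helper serves for 2-D points (giving T) and 3-D points (giving U); the function names are hypothetical.

```python
import numpy as np

def normalizing_transform(pts):
    """Homogeneous (d+1)x(d+1) similarity that moves the centroid of Nxd points
    to the origin and scales them to an average distance of sqrt(d) from it."""
    d = pts.shape[1]
    centroid = pts.mean(axis=0)
    mean_dist = np.linalg.norm(pts - centroid, axis=1).mean()
    s = np.sqrt(d) / mean_dist
    T = np.eye(d + 1)
    T[:d, :d] *= s                 # isotropic scaling
    T[:d, d] = -s * centroid       # translation applied before the scaling
    return T

def denormalize(P_norm, T, U):
    """Undo the normalization: P = T^-1 P' U."""
    return np.linalg.inv(T) @ P_norm @ U
```

In use, T comes from the image points and U from the world points; the DLT is run on the transformed points to get P’, and `denormalize(P_norm, T, U)` maps it back to the original coordinates.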
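Extracting the Camera Calibration Matrix K from P
Remember that $P = K[R \mid \mathbf{t}]$, where the camera’s intrinsic parameters are given by the upper-triangular matrix
$$K = \begin{pmatrix} \alpha_x & s & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{pmatrix}$$
with 5 parameters: the focal lengths in pixels $\alpha_x, \alpha_y$, the skew $s$, and the principal point $(x_0, y_0)$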
Anatomy of P
Let $\mathbf{c} = (\mathbf{c}'^T, 1)^T$ be the camera center in world coordinates. It is the one point that has no image: its projection is the zero vector, $P\mathbf{c} = \mathbf{0}$
Since $P = K[R \mid \mathbf{t}]$, the above equation holds when $\mathbf{t} = -R\mathbf{c}'$, so we have $P = K[R \mid -R\mathbf{c}']$
Let M denote the left 3 x 3 submatrix of P
Then $M = KR$ and we can write $P = [M \mid -M\mathbf{c}']$
Extracting K: Method
Consider $P = [M \mid -M\mathbf{c}']$
1. Find camera center by solving $P\mathbf{c} = \mathbf{0}$
–Use same SVD approach as for $A\mathbf{p} = \mathbf{0}$
2. Find camera orientation & internal parameters by factoring M with RQ decomposition (product of upper-triangular & orthogonal matrix): $M = KR$
–Eliminate decomposition ambiguity by requiring K’s diagonal entries to be positive
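The two steps above can be sketched in NumPy as follows. NumPy has no RQ routine, so this version builds one from `numpy.linalg.qr` and a row/column reversal (SciPy's `scipy.linalg.rq` offers the factorization directly). The sign flip on M and the final division by `K[2, 2]` are added conventions that rely on P being defined only up to scale; the function names are hypothetical.

```python
import numpy as np

def rq3(M):
    """RQ decomposition of a 3x3 matrix: M = R @ Q with R upper-triangular
    and Q orthogonal, obtained from QR of the row-reversed transpose."""
    E = np.flipud(np.eye(3))               # exchange matrix, E @ E = I
    Q_, R_ = np.linalg.qr((E @ M).T)
    return E @ R_.T @ E, E @ Q_.T

def decompose_camera(P):
    """Split P ~ K [R | -R c'] into K, R and the camera center c'.
    Assumes a finite camera (M invertible, center not at infinity)."""
    # Step 1: camera center is the null vector of P
    _, _, Vt = np.linalg.svd(P)
    c = Vt[-1]
    c_prime = c[:3] / c[3]
    # Step 2: factor the left 3x3 block
    M = P[:, :3]
    if np.linalg.det(M) < 0:               # choose the overall sign so det(R) = +1
        M = -M
    K, R = rq3(M)
    D = np.diag(np.where(np.diag(K) < 0, -1.0, 1.0))
    K, R = K @ D, D @ R                    # make K's diagonal positive (D @ D = I)
    return K / K[2, 2], R, c_prime         # scale K so its last entry is 1
```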
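Estimating P via Nonlinear Minimization
DLT method of minimizing $\|A\mathbf{p}\|$: Efficient but not optimal, since $\|A\mathbf{p}\|$ is an algebraic error rather than a distance measured in the image
A better solution can often be obtained by finding P that minimizes a sum of squared differences between measured and projected image points:
$$\sum_i d(\mathbf{x}_i, P\mathbf{X}_i)^2$$
where d is the Euclidean distance in the image
This is a nonlinear function of the entries of P; we can’t express it as a matrix product
Can solve with an iterative method for function minimization like Levenberg-Marquardt, which blends gradient descent with Gauss-Newton steps (see link on course web page for details)
–Initialize with DLT estimate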
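As a rough sketch of this refinement, the code below minimizes the summed squared reprojection error with a small hand-rolled Levenberg-Marquardt loop and a forward-difference Jacobian. It is a teaching sketch under simplifying assumptions (all 12 entries of P are free parameters, fixed iteration count, no convergence test), and the function names are hypothetical; a library solver such as `scipy.optimize.least_squares` with `method="lm"` would be the usual choice in practice.

```python
import numpy as np

def project(p, X):
    """Project Nx3 world points with the 3x4 matrix stored row-wise in the 12-vector p."""
    Xh = np.hstack([X, np.ones((len(X), 1))])
    xh = Xh @ p.reshape(3, 4).T
    return xh[:, :2] / xh[:, 2:3]

def residuals(p, X, x):
    """Stacked reprojection errors (measured minus projected) as a flat vector."""
    return (x - project(p, X)).ravel()

def refine_camera_matrix(P0, X, x, iters=100, lam=1e-3, h=1e-7):
    """Refine an initial estimate P0 (e.g. from the DLT) by damped least squares."""
    p = P0.ravel() / np.linalg.norm(P0)        # fix the scale for conditioning
    r = residuals(p, X, x)
    for _ in range(iters):
        # Forward-difference Jacobian of the residual vector
        J = np.empty((r.size, p.size))
        for k in range(p.size):
            dp = np.zeros_like(p)
            dp[k] = h
            J[:, k] = (residuals(p + dp, X, x) - r) / h
        H = J.T @ J
        damped = H + lam * np.diag(np.diag(H) + 1e-12)
        step = np.linalg.lstsq(damped, -J.T @ r, rcond=None)[0]
        r_new = residuals(p + step, X, x)
        if r_new @ r_new < r @ r:              # accept: move toward Gauss-Newton
            p, r, lam = p + step, r_new, max(lam * 0.1, 1e-9)
        else:                                  # reject: move toward gradient descent
            lam *= 10.0
    return p.reshape(3, 4)
```

Because a step is kept only when it lowers the error, the returned matrix is never worse than the initial estimate in the reprojection sense.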
Correcting Radial Distortion
(figure panels: Distorted, After correction; courtesy of Shawn Becker)
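Modeling Radial Distortion
Some function $L(r)$ of the distance $r$ to the center of distortion $\mathbf{x}_c$ in the image (usually close to the principal point)
Approximate nonlinear distortion function with Taylor polynomial: $L(r) = 1 + \kappa_1 r + \kappa_2 r^2 + \dots$
To estimate, just include the coefficients $\kappa_k$ in the nonlinear minimization: instead of modeling $\mathbf{x}_i = P\mathbf{X}_i$, we write it as:
$$\mathbf{x}_i = \mathbf{x}_c + L(r_i)\,(\tilde{\mathbf{x}}_i - \mathbf{x}_c), \qquad \tilde{\mathbf{x}}_i = P\mathbf{X}_i, \quad r_i = \|\tilde{\mathbf{x}}_i - \mathbf{x}_c\|$$
where $\tilde{\mathbf{x}}_i$ is the ideal (undistorted) projection in inhomogeneous image coordinates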
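A polynomial radial distortion model of this kind can be sketched as below. The specific form $L(r) = 1 + \kappa_1 r + \kappa_2 r^2 + \dots$ is one common choice and an assumption here (other texts keep only even powers of r); the function name, distortion center, and coefficient values are illustrative.

```python
import numpy as np

def apply_radial_distortion(x, center, kappa):
    """Distort ideal image points radially about a center.
    x: Nx2 ideal points, center: length-2 distortion center,
    kappa: coefficients (k1, k2, ...) of L(r) = 1 + k1 r + k2 r^2 + ..."""
    d = x - center
    r = np.linalg.norm(d, axis=1, keepdims=True)          # distance to the center
    L = 1.0 + sum(k * r ** (n + 1) for n, k in enumerate(kappa))
    return center + L * d                                  # scale each offset by L(r)

# Illustrative use: one point at distance 1 from the center, one at the center
center = np.array([320.0, 240.0])
ideal = np.array([[321.0, 240.0], [320.0, 240.0]])
distorted = apply_radial_distortion(ideal, center, kappa=(0.1,))
```

Inside the nonlinear minimization, this function would be applied to the projected points before they are compared with the measurements, with the entries of `kappa` added to the parameter vector.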