Motion ECE 847: Digital Image Processing Stan Birchfield

Name: Motion ECE 847: Digital Image Processing Stan Birchfield
Uploaded: 2017-07-10T19:04:23+00:00
Duration: PTM23S50
Channel: Meryl Chapman

Motion ECE 847: Digital Image Processing Stan Birchfield
Clemson University

What if you only had one eye?
Depth perception is still possible by moving the eye

Parallax
Motion parallax – apparent displacement of an object viewed along two different lines of sight

Head movements for depth perception
Some animals move their heads to generate parallax (e.g., pigeon, praying mantis)

Motion field and optical flow
Motion field – The actual 3D motion projected onto the image plane
Optical flow – The “apparent” motion of the brightness pattern in an image (sometimes called optic flow)

When are the two different?
Barber pole illusion
Television / movies (no motion field)
Rotating ping-pong ball (no optical flow)

Optical flow breakdown
Perhaps an aperture problem (discussed later)
* From Marc Pollefeys COMP

Another illusion from G. Bradski, CS223B

Motion field
Point in world:
Projection onto image:
Motion of point:
where
Expanding yields:

Motion field (cont.)
Velocity of projection:
Expanding yields:
Note that rotation gives no depth information

Motion field (cont.)
Two special cases:
Note: This is a radial field emanating from
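
The equations on the three motion field slides did not survive in this transcript. The block below is a hedged reconstruction using the standard rigid-motion formulation. The focal length f, the translational velocity T = (T_x, T_y, T_z), the angular velocity ω = (ω_x, ω_y, ω_z), the point p_0, and the sign convention are all assumptions and may differ from the notation on the original slides.

```latex
\begin{align}
% point in world, projection onto image, motion of point
\mathbf{P} &= [X,\; Y,\; Z]^T, \qquad
\mathbf{p} = f\,\frac{\mathbf{P}}{Z}, \qquad
\mathbf{V} = -\mathbf{T} - \boldsymbol{\omega} \times \mathbf{P} \\
% velocity of projection
\mathbf{v} &= \frac{d\mathbf{p}}{dt} = f\,\frac{Z\,\mathbf{V} - V_z\,\mathbf{P}}{Z^2} \\
% expanding component by component
v_x &= \frac{T_z x - T_x f}{Z} - \omega_y f + \omega_z y
       + \frac{\omega_x x y}{f} - \frac{\omega_y x^2}{f} \\
v_y &= \frac{T_z y - T_y f}{Z} + \omega_x f - \omega_z x
       - \frac{\omega_y x y}{f} + \frac{\omega_x y^2}{f} \\
% special case 1: pure translation with T_z \neq 0
\mathbf{v} &= \frac{T_z}{Z}\,(\mathbf{p} - \mathbf{p}_0), \qquad
\mathbf{p}_0 = \left( \frac{f T_x}{T_z},\; \frac{f T_y}{T_z} \right) \\
% special case 2: pure translation with T_z = 0
\mathbf{v} &= -\frac{f}{Z}\,(T_x,\; T_y)
\end{align}
```

In this form only the translational terms are divided by Z, which is consistent with the remark that rotation gives no depth information. Under the same assumptions, the first special case is a radial field and the second is a parallel field.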

Optical Flow Assumptions: Brightness Constancy
* Slide from Michael Black, CS

Optical Flow Assumptions:
* Slide from Michael Black, CS

Optical flow
Image at time t; image at time t + Δt
Brightness constancy assumption:
Taylor series expansion:
Putting together yields:

Optical flow (cont.)
From previous slide:
Divide both sides by Δt and take the limit:
or (standard optical flow equation)
More compactly,
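
The equations for these two slides are missing from the transcript. The following is a sketch of the usual derivation that the slide labels describe. The displacement symbols Δx and Δy, the subscript notation for partial derivatives, and the vector u = (u, v)^T are assumed notation, not recovered from the slides.

```latex
\begin{align}
% brightness constancy assumption
I(x + \Delta x,\; y + \Delta y,\; t + \Delta t) &= I(x, y, t) \\
% first-order Taylor series expansion
I(x + \Delta x,\; y + \Delta y,\; t + \Delta t) &\approx
  I(x, y, t) + I_x\,\Delta x + I_y\,\Delta y + I_t\,\Delta t \\
% putting together
I_x\,\Delta x + I_y\,\Delta y + I_t\,\Delta t &= 0 \\
% divide by \Delta t and take the limit \Delta t \to 0
I_x\,\frac{dx}{dt} + I_y\,\frac{dy}{dt} + I_t &= 0 \\
% standard optical flow equation, with u = dx/dt and v = dy/dt
I_x u + I_y v + I_t &= 0 \\
% more compactly
\nabla I^T \mathbf{u} &= -I_t
\end{align}
```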

Aperture problem
From previous slide:
Key idea: Any function looks linear through a small “aperture”
This is one equation, two unknowns! (Underconstrained problem)
[Figure: isophotes I(x,y,t) = ξ and I(x,y,t+Δt) = ξ, with the gradient, the true motion, and another possible answer]
We can only compute the component of motion in the direction of the gradient:
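
As a small illustration of the last point, the sketch below computes the flow component along the gradient (the normal flow) from a single constraint Ix u + Iy v + It = 0. The function name `normal_flow` and the sample numbers are made up for this example.

```python
def normal_flow(ix, iy, it):
    """Return the minimum-norm (u, v) satisfying ix*u + iy*v + it = 0.

    This solution is parallel to the gradient (ix, iy); any motion
    along the isophote can be added without violating the constraint.
    """
    g2 = ix * ix + iy * iy
    if g2 == 0.0:
        raise ValueError("zero gradient: no motion information at this pixel")
    scale = -it / g2
    return scale * ix, scale * iy


# A vertical edge (gradient along x) whose true motion is (1.0, 0.5).
ix, iy = 2.0, 0.0
true_u, true_v = 1.0, 0.5
it = -(ix * true_u + iy * true_v)  # temporal derivative implied by brightness constancy

u, v = normal_flow(ix, iy, it)
print((u, v))  # prints (1.0, 0.0): the 0.5 along the edge is not recoverable
```

The recovered vector still satisfies the constraint exactly, but so does the true motion, which is the ambiguity the slide describes.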

Aperture Problem Exposed
Motion along just an edge is ambiguous
from G. Bradski, CS223B

Two approaches to overcome aperture problem
Horn-Schunck (1980)
  Assume neighboring pixels have similar motion
  Add regularization term to enforce smoothness
  Compute (u,v) for every pixel in image → dense optical flow
Lucas-Kanade (1981)
  Assume neighboring pixels have the same motion
  Use additional equations to solve for motion of pixel
  Compute (u,v) for a small number of pixels (features); each feature treated independently of other features → sparse optical flow

Lucas-Kanade
Recall scalar equation with two unknowns:
Assume neighboring pixels have the same motion:
where N is the number of pixels in the window
or
Can solve this directly using least squares, or …

Lucas-Kanade
Multiply by A^T:
or
or (2x2 matrix) (unknown motion) = (2x1 vector)
Note:
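
A minimal sketch of the 2x2 solve follows. It assumes the usual stacking: each row of A is the gradient (Ix, Iy) at one pixel of the window and b holds the negated temporal derivatives, so the normal equations are A^T A d = A^T b with d = (u, v). The function name and the synthetic window are illustrative.

```python
def lucas_kanade_window(ix, iy, it):
    """Least-squares flow (u, v) for one window of N pixels.

    ix, iy, it: equal-length sequences of spatial and temporal derivatives.
    """
    # Entries of the 2x2 matrix A^T A and the 2x1 vector A^T b (b = -it).
    sxx = sum(gx * gx for gx in ix)
    sxy = sum(gx * gy for gx, gy in zip(ix, iy))
    syy = sum(gy * gy for gy in iy)
    bx = -sum(gx * gt for gx, gt in zip(ix, it))
    by = -sum(gy * gt for gy, gt in zip(iy, it))

    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("A^T A is singular (aperture problem)")
    # Closed-form inverse of the 2x2 matrix.
    u = (syy * bx - sxy * by) / det
    v = (sxx * by - sxy * bx) / det
    return u, v


# Synthetic window: gradients in several directions, all moving by (0.3, -0.2).
ix = [1.0, 0.0, 1.0, 2.0, -1.0]
iy = [0.0, 1.0, 1.0, -1.0, 2.0]
true_u, true_v = 0.3, -0.2
it = [-(gx * true_u + gy * true_v) for gx, gy in zip(ix, iy)]

u, v = lucas_kanade_window(ix, iy, it)
```

If every gradient in the window points the same way, the determinant is zero and the function raises, which is the aperture problem showing up as a singular matrix.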

RGB version
For color images, we can get more equations by using all color channels
E.g., for a 7x7 window we have 49*3 = 147 equations!

Improving accuracy
Recall our small motion assumption:
This is not exact
To do better, we need to add higher order terms back in:
This is a polynomial root finding problem
Can solve using Newton’s method (also known as the Newton-Raphson method)
Lucas-Kanade method does one iteration of Newton’s method
Better results are obtained via more iterations
* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Iterative Lucas-Kanade Algorithm
Solve 2x2 equation to get motion for each pixel
Shift second image using estimated motion (use interpolation to improve accuracy)
Repeat until convergence

Lucas-Kanade Algorithm
solving 2x2 equation is easy

Linear interpolation
[Figure: 1D function f sampled at x0-1, x0, x0+1, x0+2]
f(x) ≈ f(x0) + (x - x0) ( f(x0+1) - f(x0) )
     = f(x0) + a ( f(x0+1) - f(x0) )
     = (1 - a) f(x0) + a f(x0+1), where a = x - x0

Bilinear interpolation
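[Figure: 2D function; the point (x,y) lies between columns x0, x0+1 and rows y0, y0+1, with a = x - x0, b = y - y0, and corner values f00 = f(x0,y0), f10 = f(x0+1,y0), f01 = f(x0,y0+1), f11 = f(x0+1,y0+1)]
f(x0,y) ≈ (1-b) f00 + b f01
f(x0+1,y) ≈ (1-b) f10 + b f11
f(x,y) ≈ (1-a) [(1-b) f00 + b f01] + a [(1-b) f10 + b f11]
       = (1-a)(1-b) f00 + (1-a) b f01 + a (1-b) f10 + a b f11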

Interpolating along x first:
f(x,y0) ≈ (1-a) f00 + a f10
f(x,y0+1) ≈ (1-a) f01 + a f11
f(x,y) ≈ (1-b) [(1-a) f00 + a f10] + b [(1-a) f01 + a f11]
       = (1-b)(1-a) f00 + (1-b) a f10 + b (1-a) f01 + b a f11
(same result)

Simply compute a doubly weighted average of the four nearest pixels (weights a and b)
Be careful to handle boundaries correctly
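
The weighted average above can be sketched as follows. The image is taken to be a list of rows indexed `img[row][col]`, and out-of-range coordinates are clamped to the image; clamping is just one simple boundary policy chosen for this example, not something the slides specify.

```python
import math


def bilinear(img, x, y):
    """Sample img at real-valued (x, y); x is the column, y is the row."""
    h, w = len(img), len(img[0])
    # Clamp to the valid range so all four neighbors exist.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    a, b = x - x0, y - y0  # fractional offsets in x and y

    f00, f10 = img[y0][x0], img[y0][x1]
    f01, f11 = img[y1][x0], img[y1][x1]
    return ((1 - a) * (1 - b) * f00 + a * (1 - b) * f10
            + (1 - a) * b * f01 + a * b * f11)


img = [[0.0, 10.0],
       [20.0, 30.0]]
print(bilinear(img, 0.5, 0.5))  # prints 15.0, the average of the four pixels
print(bilinear(img, 5.0, 5.0))  # prints 30.0, clamped to the bottom-right pixel
```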

How does Lucas-Kanade work?
Recall Newton’s method (or Newton-Raphson)
To find the root of a function, use a first-order approximation
Start with initial guess
Iterate:
  Use derivative to find root, under the assumption that the function is linear
  Result becomes the new estimate for the next iteration

Optical Flow: 1D Case
Brightness Constancy Assumption:
Because no change in brightness with time: Ix v + It = 0
from G. Bradski, CS223B

Tracking in the 1D case: ? from G. Bradski, CS223B

Tracking in the 1D case:
[Figure labels: temporal derivative, spatial derivative]
Assumptions: brightness constancy, small motion
from G. Bradski, CS223B

Tracking in the 1D case:
Iterating helps refine the velocity vector
Temporal derivative at 2nd iteration
Can keep the same estimate for spatial derivative
Converges in about 5 iterations
from G. Bradski, CS223B
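
The 1D iteration can be sketched as below. Two callables stand in for interpolated image rows at times t and t+1, the spatial derivative is estimated once by a central difference and reused, and each pass recomputes only the temporal difference after shifting by the current estimate. The sine signal, the starting point, and the finite-difference step are arbitrary choices for illustration.

```python
import math


def track_1d(f_prev, f_next, x0, iterations=5):
    """Estimate d such that f_next(x0 + d) matches f_prev(x0)."""
    h = 1e-4
    ix = (f_prev(x0 + h) - f_prev(x0 - h)) / (2 * h)  # spatial derivative, kept fixed
    if ix == 0.0:
        raise ValueError("zero spatial derivative: cannot track here")
    d = 0.0
    for _ in range(iterations):
        it = f_next(x0 + d) - f_prev(x0)  # temporal derivative after shifting by d
        d -= it / ix                      # v = -It / Ix, accumulated into d
    return d


true_shift = 0.4
f_prev = math.sin
f_next = lambda x: math.sin(x - true_shift)  # same signal moved right by 0.4

one_step = track_1d(f_prev, f_next, 0.3, iterations=1)
refined = track_1d(f_prev, f_next, 0.3, iterations=5)
```

A single iteration gives only an approximate displacement; repeating the update shrinks the error quickly, in line with the slide's remark about convergence in a handful of iterations.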

From 1D to 2D tracking
1D:
2D:
One equation, two velocity unknowns (u,v)
from G. Bradski, CS223B

From 1D to 2D tracking
We get at most “Normal Flow”: with one point we can only detect movement along the brightness gradient (perpendicular to the edge). Solution is to take a patch of pixels around the pixel of interest.
* Slide from Michael Black, CS

From 1D to 2D tracking
The math is very similar:
Aperture problem
Window size here ~ 11x11
from G. Bradski, CS223B

When does Lucas-Kanade work?
A^T A must be invertible
In the real world, all matrices are invertible (noise makes exact singularity very unlikely)
Instead, we must ensure that A^T A is well-conditioned (not close to singular):
  Both eigenvalues are large
  Ratio of eigenvalues λmax / λmin is not too large

Eigenvalues of Hessian
Z is the gradient covariance matrix
Related to autocorrelation of I (Moravec interest operator)
Sometimes called Hessian
Recall from PCA: (square root of) eigenvalues give the axis lengths of the best-fitting ellipse
Large eigenvalues mean large gradient vectors
Small ratio means information in all directions
[Figure: scatter plot of gradient vectors, axes Ix and Iy]

Finding good features
Three cases:
  λ1 and λ2 small: not enough texture for tracking
  λ1 large, λ2 small: on intensity edge (aperture problem: only motion perpendicular to edge can be found)
  λ1 and λ2 large: good feature to track
[Figures: gradient scatter plots for the three cases, axes Ix and Iy]
M. Pollefeys,
Note: Even though tracking is a two-frame problem, we can determine good features from only one frame

Finding good features
In practice, eigenvalues cannot be too large, because image values range from 0 to 255
Solution: Threshold minimum eigenvalue (Shi and Tomasi 1994)
An alternative approach (Harris and Stephens 1988):
  Note that det(Z) = λ1 λ2 and trace(Z) = λ1 + λ2
  Use det(Z) - k trace(Z)^2, where k = 0.04
  The second term reduces the effect of having a small eigenvalue with a large dominant eigenvalue
  (known as Harris corner detector or Plessey operator)
Another alternative: det(Z) / trace(Z) = 1 / (1/λ1 + 1/λ2)
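
To compare the three measures on concrete numbers, the sketch below evaluates them for a symmetric 2x2 matrix Z = [[zxx, zxy], [zxy, zyy]]. The function name and the three sample matrices (flat region, edge, corner) are invented for illustration.

```python
import math


def corner_measures(zxx, zxy, zyy, k=0.04):
    """Return (min eigenvalue, Harris measure, det/trace) for a 2x2 matrix Z."""
    det = zxx * zyy - zxy * zxy
    trace = zxx + zyy
    # Smaller eigenvalue of a symmetric 2x2 matrix in closed form.
    disc = math.sqrt(max(0.0, (trace / 2.0) ** 2 - det))
    lam_min = trace / 2.0 - disc
    harris = det - k * trace ** 2
    harmonic = det / trace if trace > 0.0 else 0.0
    return lam_min, harris, harmonic


flat = corner_measures(0.01, 0.0, 0.01)     # both eigenvalues small
edge = corner_measures(100.0, 0.0, 0.01)    # one large, one small
corner = corner_measures(100.0, 0.0, 80.0)  # both large

for name, m in (("flat", flat), ("edge", edge), ("corner", corner)):
    print(name, m)
```

On these samples the edge gets a negative Harris value while the corner gets a large positive one, and the minimum eigenvalue and det/trace measures both rank the corner far above the edge and the flat region.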

Good features code
% Harris Corner detector - by Kashif Shahzad
% bw is assumed to be a grayscale image of class double
sigma=2; thresh=0.1; sze=11; disp=0;
% Derivative masks
% (rows 2 and 3 of the mask are missing from the transcript; a 3x3 Prewitt mask is assumed)
dy = [-1 0 1; -1 0 1; -1 0 1];
dx = dy'; % dx is the transpose matrix of dy
% Ix and Iy are the horizontal and vertical edges of image
Ix = conv2(bw, dx, 'same');
Iy = conv2(bw, dy, 'same'); % Calculating the gradient of the image Ix and Iy
% Gaussian window used to smooth the gradient products
g = fspecial('gaussian',max(1,fix(6*sigma)), sigma);
Ix2 = conv2(Ix.^2, g, 'same'); % Smoothed squared image derivatives
Iy2 = conv2(Iy.^2, g, 'same');
Ixy = conv2(Ix.*Iy, g, 'same');
% My preferred measure according to research paper
cornerness = (Ix2.*Iy2 - Ixy.^2)./(Ix2 + Iy2 + eps);
% We should perform nonmaximal suppression and threshold
mx = ordfilt2(cornerness,sze^2,ones(sze)); % Grey-scale dilate
cornerness = (cornerness==mx)&(cornerness>thresh); % Find maxima
[rws,cols] = find(cornerness); % Find row,col coords.
clf ; imshow(bw); hold on;
p=[cols rws];
plot(p(:,1),p(:,2),'or');
title('\bf Harris Corners')
from Sebastian Thrun, CS223B Computer Vision, Winter 2005

Example (s=0.1) from Sebastian Thrun, CS223B Computer Vision, Winter 2005

Feature tracking
Identify features and track them over video
Usually use a few hundred features
When many features have been lost, renew feature detection
Assume small difference between frames
Potential large difference overall
Two problems:
  Motion between frames may be large
  Translation assumption is fine between consecutive frames, but long periods of time introduce deformations

Revisiting the small motion assumption
Is this motion small enough?
Probably not: it’s much larger than one pixel (2nd order terms dominate)
How might we solve this problem?
* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Reduce the resolution!
* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Coarse-to-fine optical flow estimation
[Figure: Gaussian pyramids of image It-1 and image I; a displacement of u = 10 pixels at full resolution becomes u = 5, 2.5, and 1.25 pixels at successively coarser levels]
slides from Bradski and Thrun

Coarse-to-fine optical flow estimation
[Figure: Gaussian pyramids of image It-1 and image I; run iterative L-K at the coarsest level, warp and upsample, run iterative L-K at the next level, and so on down to full resolution]
slides from Bradski and Thrun
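
The bookkeeping of the coarse-to-fine scheme can be shown with a toy 1D model. Here `limited_tracker` is a made-up stand-in for iterative Lucas-Kanade that can only recover residual motions of up to 1.5 pixels per level; it is not a real tracker, and the 1.5 pixel limit is an arbitrary assumption chosen to mimic the small motion requirement.

```python
def limited_tracker(residual, max_step=1.5):
    """Stand-in for iterative L-K: recovers a residual only if it is small."""
    return max(-max_step, min(max_step, residual))


def coarse_to_fine(u_true, num_levels):
    """Estimate a displacement of u_true pixels with a num_levels pyramid."""
    u = 0.0
    for level in range(num_levels - 1, -1, -1):  # coarsest level first
        u_level_true = u_true / (2 ** level)     # motion as seen at this level
        u += limited_tracker(u_level_true - u)   # refine after warping by u
        if level > 0:
            u *= 2.0                             # upsample estimate to the next finer level
    return u


print(coarse_to_fine(10.0, 4))  # prints 10.0: levels see 1.25, 2.5, 5, 10 pixels
print(coarse_to_fine(10.0, 1))  # prints 1.5: a single level cannot reach 10 pixels
```

With four levels the coarsest level sees only 1.25 pixels of motion, so every residual stays within reach of the limited tracker; with one level the estimate saturates.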

Alternative derivation
Compute translation assuming it is small
differentiate:
Affine is also possible, but a bit harder (6x6 instead of 2x2)
M. Pollefeys,

Example
Simple displacement is sufficient between consecutive frames, but not to compare to reference template
M. Pollefeys,

Example M. Pollefeys,

Synthetic example M. Pollefeys,

Good features to keep tracking
Perform affine alignment between first and last frame
Stop tracking features with too large errors
M. Pollefeys,

Errors in Lucas-Kanade
What are the potential causes of errors in this procedure?
Suppose A^T A is easily invertible
Suppose there is not much noise in the image
When our assumptions are violated:
  Brightness constancy is not satisfied
  The motion is not small
  A point does not move like its neighbors
    Window size is too large
    What is the ideal window size?
* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Feature matching vs. tracking
Image-to-image correspondences are key to passive triangulation-based 3D reconstruction
Matching: extract features independently and then match by comparing descriptors
Tracking: extract features in the first image and then try to find the same feature back in the next view
What is a good feature?
M. Pollefeys,

Horn-Schunck
Dense optical flow
Minimize: (data term) + (smoothness term)
where
Note: We want to find u and v to minimize the integral, but u and v are themselves functions of x and y!

Calculus of variations
A function maps values to real numbers
  To find the value that minimizes a function, use calculus
A functional maps functions to real numbers
  To find the function that minimizes a functional, use calculus of variations

Calculus of variations
Integral is stationary where the Euler-Lagrange equations hold
Labels: independent variable, dependent variable, explicit function
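
The formula for this slide is missing from the transcript. As a reminder of the standard one-variable result, with x as the independent variable, f(x) as the dependent variable, and F as the explicit function (these symbol names are assumptions):

```latex
\begin{align}
J[f] &= \int_{x_1}^{x_2} F\big(x,\, f(x),\, f'(x)\big)\, dx \\
% J is stationary where the Euler-Lagrange equation holds
\frac{\partial F}{\partial f} - \frac{d}{dx}\,\frac{\partial F}{\partial f'} &= 0
\end{align}
```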

C.o.v. applied to o.f.
Labels: explicit function, independent variables
Euler-Lagrange equations

C.o.v. applied to o.f. (cont.)
Expand: Take derivatives:

Plug back in:
where
Rearrange: (Laplacian)
Approximate:
where (labels: difference of Gaussians, constant, mean)
What if λ = 0?
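
The intermediate equations are not preserved in the transcript. The sketch below follows the usual Horn-Schunck derivation under two stated assumptions: λ weights the smoothness term (the slides may place it differently), and the Laplacian is approximated by a constant times (local mean minus center value), with that constant absorbed into λ.

```latex
\begin{align}
% integrand: data term plus smoothness term
F &= (I_x u + I_y v + I_t)^2 + \lambda\,(u_x^2 + u_y^2 + v_x^2 + v_y^2) \\
% Euler-Lagrange equations for two dependent variables u(x,y), v(x,y)
\frac{\partial F}{\partial u}
  - \frac{\partial}{\partial x}\frac{\partial F}{\partial u_x}
  - \frac{\partial}{\partial y}\frac{\partial F}{\partial u_y} &= 0, \qquad
\frac{\partial F}{\partial v}
  - \frac{\partial}{\partial x}\frac{\partial F}{\partial v_x}
  - \frac{\partial}{\partial y}\frac{\partial F}{\partial v_y} = 0 \\
% take derivatives and plug back in
I_x\,(I_x u + I_y v + I_t) - \lambda\,\nabla^2 u &= 0, \qquad
I_y\,(I_x u + I_y v + I_t) - \lambda\,\nabla^2 v = 0 \\
% approximate the Laplacian
\nabla^2 u &\approx \bar{u} - u, \qquad \nabla^2 v \approx \bar{v} - v \\
% rearranged pair of linear equations per pixel
(\lambda + I_x^2)\,u + I_x I_y\,v &= \lambda\,\bar{u} - I_x I_t \\
I_x I_y\,u + (\lambda + I_y^2)\,v &= \lambda\,\bar{v} - I_y I_t
\end{align}
```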

Solve equation:
This is a pair of equations for each pixel. The combined system of equations is 2N x 2N, where N is the number of pixels in the image. Because it is a sparse matrix, iterative methods (Jacobi iterations, Gauss-Seidel, successive over-relaxation) are more appropriate.
Iterate:
where
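
A minimal Jacobi-style sketch of the iteration follows. It assumes the common form of the update in which the smoothness weight λ appears in the denominator, u ← ū - Ix (Ix ū + Iy v̄ + It) / (λ + Ix² + Iy²) and likewise for v, and it uses a plain mean of the in-bounds 4-neighbors for ū and v̄, which is simpler than the weighted mean in the original Horn-Schunck paper. Images are lists of rows, and the checkerboard test data is synthetic.

```python
def neighbor_mean(f, r, c):
    """Mean of the in-bounds 4-neighbors of f[r][c] (needs at least 2 pixels)."""
    h, w = len(f), len(f[0])
    vals = [f[rr][cc]
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            if 0 <= rr < h and 0 <= cc < w]
    return sum(vals) / len(vals)


def horn_schunck(ix, iy, it, lam=1.0, iterations=100):
    """Dense flow (u, v) from derivative images, updating all pixels per sweep."""
    h, w = len(ix), len(ix[0])
    u = [[0.0] * w for _ in range(h)]
    v = [[0.0] * w for _ in range(h)]
    for _ in range(iterations):
        u_new = [[0.0] * w for _ in range(h)]
        v_new = [[0.0] * w for _ in range(h)]
        for r in range(h):
            for c in range(w):
                ub = neighbor_mean(u, r, c)
                vb = neighbor_mean(v, r, c)
                # Residual of the optical flow equation at the local means.
                resid = ix[r][c] * ub + iy[r][c] * vb + it[r][c]
                denom = lam + ix[r][c] ** 2 + iy[r][c] ** 2
                u_new[r][c] = ub - ix[r][c] * resid / denom
                v_new[r][c] = vb - iy[r][c] * resid / denom
        u, v = u_new, v_new
    return u, v


# Checkerboard of pixels that each constrain only u or only v.
n = 4
true_u, true_v = 0.5, -0.25
ix = [[1.0 if (r + c) % 2 == 0 else 0.0 for c in range(n)] for r in range(n)]
iy = [[0.0 if (r + c) % 2 == 0 else 1.0 for c in range(n)] for r in range(n)]
it = [[-(ix[r][c] * true_u + iy[r][c] * true_v) for c in range(n)] for r in range(n)]

u, v = horn_schunck(ix, iy, it)
```

Each pixel in this test suffers from the aperture problem on its own, yet the smoothness term propagates information between neighbors so that both components are recovered everywhere.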

Stationary iterative methods
Problem is to solve sparse linear system:
matrix = (diagonal) – (lower) – (upper)
Jacobi method (simply solve for each element, using existing estimates):
Gauss-Seidel is “sloppy Jacobi” (use updates as soon as they are available):
Successive over-relaxation (SOR) speeds convergence: (G-S iterate)
ω = 1 means Gauss-Seidel
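
The three methods can be sketched on a small dense system as follows. A real optical flow solver would exploit sparsity instead of looping over full rows; the 3x3 matrix, the right-hand side, and the relaxation factor 1.1 are arbitrary examples.

```python
def jacobi(a, b, iterations=50):
    """Jacobi: every element is updated from the previous sweep's estimates."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        x = [(b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
             for i in range(n)]
    return x


def sor(a, b, omega=1.0, iterations=50):
    """Successive over-relaxation; omega = 1.0 reduces to Gauss-Seidel."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            # Uses values already updated in this sweep (the "sloppy Jacobi" idea).
            sigma = sum(a[i][j] * x[j] for j in range(n) if j != i)
            gs = (b[i] - sigma) / a[i][i]  # Gauss-Seidel iterate
            x[i] = (1.0 - omega) * x[i] + omega * gs
    return x


# Diagonally dominant system whose exact solution is [1, 2, 3].
a = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]

x_jacobi = jacobi(a, b)
x_gs = sor(a, b, omega=1.0)
x_sor = sor(a, b, omega=1.1)
```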

Horn-Schunck Algorithm

Horn-Schunck Algorithm
Note that this λ is defined differently

Time-to-Collision

Beyond two-frame tracking

Background subtraction and frame differencing

Spatiotemporal volumes

Layers Wang and Adelson
