The Measurement of Visual Motion P. Anandan Microsoft Research.

Direct Methods: Methods for motion and/or shape estimation that recover the unknown parameters directly from measurable image quantities at each pixel in the image. Minimization step: Direct methods: Error measure based on dense measurable image quantities. Feature-based methods: Error measure based on distances between distinct features.

WHY BOTHER Visual Motion can be annoying –Camera instabilities, jitter –Measure it. Remove it. Visual Motion indicates dynamics in the scene –Moving objects, behavior –Track objects and analyze trajectories Visual Motion reveals spatial layout of the scene –Motion parallax

Video Enhancement Visual Motion can be annoying –Camera instabilities, jitter –Measure it. Remove it.

Temporal Information Visual Motion indicates dynamics in the scene –Moving objects, behavior –Track objects and analyze trajectories

Spatial Layout Visual Motion reveals spatial layout of the scene –Motion parallax Sprite Viewer Demo

Classes of Techniques Feature-based methods –Extract salient visual features (corners, textured areas) and track them over multiple frames –Analyze the global pattern of motion vectors of these features –Sparse motion fields, but possibly robust tracking –Suitable especially when image motion is large (tens of pixels) Direct methods (our focus today) –Directly recover image motion from spatio-temporal image brightness variations –Global motion parameters directly recovered without an intermediate feature motion calculation –Dense motion fields, but more sensitive to appearance variations –Suitable for video and when image motion is small (< 10 pixels)

The Brightness Constraint Brightness Constancy Equation: Or, better still, Minimize: Linearizing (assuming small (u,v)):

Gradient Constraint (or the Optical Flow Constraint) Minimizing: The gradient constraint – only one constraint for each pixel In general Hence,
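A standard way to write the steps named on the two slides above. The notation is an assumption, since the slide equations are not reproduced here: I and J denote the two frames, (u,v) the displacement at pixel (x,y), and I_t = J - I the temporal difference.

$$
\begin{aligned}
&\text{Brightness constancy:} && I(x,y) = J(x+u,\,y+v) \\
&\text{Error measure:} && E(u,v) = \sum_{x,y}\bigl(I(x,y) - J(x+u,\,y+v)\bigr)^2 \\
&\text{Linearization (small } (u,v)\text{):} && J(x+u,\,y+v) \approx J(x,y) + J_x u + J_y v \\
&\text{Gradient constraint:} && I_x u + I_y v + I_t \approx 0, \qquad I_t = J - I,\ \ \nabla J \approx \nabla I
\end{aligned}
$$

This is one scalar equation in two unknowns per pixel, so the component of (u,v) perpendicular to the gradient stays undetermined.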

Aperture Problem and Normal Flow

The gradient constraint: Defines a line in the (u,v) space (figure axes: u, v) Normal Flow:
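A minimal sketch of normal flow, assuming the gradient constraint is written as Ix*u + Iy*v + It = 0. The function name `normal_flow` and the `eps` threshold are illustrative choices, not taken from the slides. The normal flow is the point on the constraint line closest to the origin of the (u,v) plane.

```python
import numpy as np

def normal_flow(Ix, Iy, It, eps=1e-12):
    """Return the normal-flow vector (u_n, v_n) at one pixel, or None.

    The constraint Ix*u + Iy*v + It = 0 is a line in (u, v) space.
    Its point closest to the origin lies along the gradient direction.
    """
    g2 = Ix * Ix + Iy * Iy          # squared gradient magnitude
    if g2 < eps:
        return None                 # no gradient: the constraint says nothing
    scale = -It / g2
    return np.array([scale * Ix, scale * Iy])

n = normal_flow(3.0, 4.0, -5.0)
print(n)                            # lies on the line 3u + 4v - 5 = 0
```

Any vector of the form n + s*(-Iy, Ix) satisfies the same constraint, which is the aperture problem in one line.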

Local Patch Analysis

Combining Local Constraints (figure axes: u, v) etc.

Patch Translation Minimizing Assume a single velocity for all pixels within an image patch On the LHS: sum of the 2x2 outer product tensor of the gradient vector

Singularities and the Aperture Problem Algorithm: At each pixel, compute (u,v) by solving the 2x2 linear system whose matrix M is the sum of the outer products of the gradient vectors over the patch. M is singular if all gradient vectors point in the same direction -- e.g., along an edge -- and, of course, trivially singular if the summation is over a single pixel -- i.e., only normal flow is available (aperture problem). Corners and textured areas are OK.
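The patch-translation solve and its singularity check can be sketched as follows. This assumes the gradient constraint Ix*u + Iy*v + It = 0 with It = J - I, central-difference gradients from `np.gradient`, and a hypothetical eigenvalue threshold `min_eig`. None of these specifics come from the slides.

```python
import numpy as np

def patch_translation(I, J, min_eig=1e-6):
    """Estimate a single (u, v) for a patch from two same-sized float arrays.

    Returns None when the 2x2 matrix M is (near-)singular, i.e. when the
    patch only supports normal flow (aperture problem).
    """
    Iy, Ix = np.gradient(I)          # np.gradient returns d/drow (y) first
    It = J - I                       # temporal difference
    M = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    if np.linalg.eigvalsh(M)[0] < min_eig:   # smallest eigenvalue of M
        return None
    return np.linalg.solve(M, b)

ys, xs = np.mgrid[-5:6, -5:6].astype(float)
I = xs**2 + ys**2                             # textured in both directions
J = (xs - 0.2)**2 + (ys + 0.1)**2             # I shifted by (0.2, -0.1)
print(patch_translation(I, J))                # close to (0.2, -0.1)
print(patch_translation(xs, xs + 0.5))        # pure ramp (an edge): None
```

Thresholding the smallest eigenvalue is one common way to detect the singular case; the slides only state the condition, not a specific test.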

Iterative Refinement Estimate velocity at each pixel using one iteration of Lucas and Kanade estimation. Warp one image toward the other using the estimated flow field (easier said than done). Refine estimate by repeating the process.
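The estimate-warp-refine loop is easiest to see in 1-d. The sketch below assumes a single global shift u with I(x) = J(x + u), linear interpolation via `np.interp` for the warp, and a fixed iteration count. The function name and these choices are illustrative, not the slides' implementation.

```python
import numpy as np

def refine_shift_1d(I, J, n_iter=10):
    """Iteratively estimate u such that I(x) is approximately J(x + u)."""
    x = np.arange(I.size, dtype=float)
    u = 0.0
    for _ in range(n_iter):
        Jw = np.interp(x + u, x, J)      # warp J toward I with current u
        Jx = np.gradient(Jw)             # spatial gradient of warped signal
        It = Jw - I                      # remaining temporal difference
        du = -np.sum(Jx * It) / np.sum(Jx * Jx)   # one gradient-based step
        u += du                          # accumulate the refinement
    return u

x = np.arange(64, dtype=float)
bump = lambda s: np.exp(-(x - 32.0 - s) ** 2 / (2 * 6.0 ** 2))
I, J = bump(0.0), bump(1.7)              # J is I moved right by 1.7 samples
print(refine_shift_1d(I, J))             # close to 1.7
```

Each pass linearizes around the current warp, so the residual displacement seen by the gradient step shrinks from one iteration to the next.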

Motion and Gradients Consider a 1-d signal; assume a linear function of x (figure: intensity I plotted against x at t=0 and t=1)

Iterative refinement (figure axes: x, t) BUT!!

Limits of the gradient method 1. Fails when intensity structure within window is poor 2. Fails when the displacement is large (typical operating range is motion of 1 pixel) –Linearization of brightness is suitable only for small displacements Also, brightness is not strictly constant in images –actually less problematic than it appears, since we can pre-filter images to make them look similar

Pyramids Pyramids were introduced as a multi-resolution image computation paradigm in the early 80s. The most popular pyramid is the Burt pyramid, which foreshadows wavelets Two kinds of pyramids: Low pass or “Gaussian pyramid” Band-pass or “Laplacian pyramid”

Gaussian Pyramid Convolve image with a small Gaussian kernel –Typically 5x5 Subsample (decimate by 2) to get lower resolution image Repeat for more levels A sequence of low-pass filtered images
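A minimal sketch of the blur-and-decimate recipe above. The separable 5-tap binomial kernel [1, 4, 6, 4, 1]/16 and reflective border handling are assumptions standing in for the "small Gaussian kernel" on the slide, and the function names are illustrative.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0   # assumed 5-tap kernel

def _blur(img):
    """Separable 5x5 low-pass filter with reflected borders."""
    h, w = img.shape
    pad = np.pad(img, 2, mode="reflect")
    rows = sum(wk * pad[:, k:k + w] for k, wk in enumerate(KERNEL))   # along x
    return sum(wk * rows[k:k + h, :] for k, wk in enumerate(KERNEL))  # along y

def gaussian_pyramid(image, levels):
    """Return [full resolution, half resolution, ...] with `levels` entries."""
    pyramid = [np.asarray(image, dtype=float)]
    for _ in range(levels - 1):
        pyramid.append(_blur(pyramid[-1])[::2, ::2])   # blur, then decimate by 2
    return pyramid

pyr = gaussian_pyramid(np.random.default_rng(0).random((32, 32)), levels=3)
print([p.shape for p in pyr])
```

Blurring before subsampling is what keeps each coarser level a low-pass version of the previous one instead of an aliased copy.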

Laplacian pyramid Laplacian as Difference of Gaussian Band-pass filtered images Highlights edges at different spatial scales For matching, this is less sensitive to image illumination changes But more noisy than using Gaussians

Coarse-to-Fine Estimation (figure: pyramid of image I and pyramid of image J; at each level, J is warped to Jw with the current estimate and the estimate is refined) u=10 pixels, u=5 pixels, u=2.5 pixels, u=1.25 pixels ==> small u and v...

Global Motion Models 2D Models: Affine, Quadratic, Planar projective transform (Homography) 3D Models: Instantaneous camera motion models, Homography+epipole, Plane+Parallax

Example: Affine Motion Substituting into the B.C. Equation: Each pixel provides 1 linear constraint in 6 global unknowns (minimum 6 pixels necessary) Least Square Minimization (over all pixels):
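The least-squares step can be sketched as below, under an assumed parameterization u = a1 + a2*x + a3*y, v = a4 + a5*x + a6*y, substituted into the gradient constraint Ix*u + Iy*v + It = 0. The parameter ordering and the function name `affine_flow` are illustrative, since the slide's own equation is not reproduced in the transcript.

```python
import numpy as np

def affine_flow(Ix, Iy, It, x, y):
    """Least-squares affine motion parameters (a1..a6) over all pixels.

    Each pixel contributes one row
        [Ix, x*Ix, y*Ix, Iy, x*Iy, y*Iy] . a = -It
    so at least 6 pixels with independent rows are needed.
    """
    A = np.stack([Ix, x * Ix, y * Ix, Iy, x * Iy, y * Iy], axis=-1).reshape(-1, 6)
    b = -np.asarray(It, dtype=float).reshape(-1)
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    return a

# Synthetic check: build It from known parameters and recover them.
rng = np.random.default_rng(0)
ys, xs = np.mgrid[0:8, 0:8].astype(float)
Ix, Iy = rng.normal(size=(8, 8)), rng.normal(size=(8, 8))
a_true = np.array([0.5, 0.01, -0.02, -0.3, 0.03, 0.02])
u = a_true[0] + a_true[1] * xs + a_true[2] * ys
v = a_true[3] + a_true[4] * xs + a_true[5] * ys
It = -(Ix * u + Iy * v)
print(affine_flow(Ix, Iy, It, xs, ys))
```

In practice this solve would sit inside the iterative, coarse-to-fine loop, since the linearization only holds for small residual motion.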

Other 2D Motion Models Quadratic – instantaneous approximation to planar motion Projective – exact planar motion

3D Motion Models Local Parameter: Instantaneous camera motion: Global parameters: Homography+Epipole Local Parameter: Residual Planar Parallax Motion Global parameters: Local Parameter:

Correlation and SSD For larger displacements, do template matching –Define a small area around a pixel as the template –Match the template against each pixel within a search area in next image. –Use a match measure such as correlation, normalized correlation, or sum-of-squares difference –Choose the maximum (or minimum) as the match –Sub-pixel interpolation also possible
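The template-matching steps above can be sketched with an exhaustive integer search and a sum-of-squares difference (SSD) measure. The function name, argument layout, and the choice to skip windows that fall outside the image are assumptions for illustration.

```python
import numpy as np

def ssd_match(template, image, top, left, radius):
    """Search integer displacements within +/-radius of (top, left).

    Returns the best (dy, dx) and the full SSD surface; entries whose
    window would leave the image stay at +inf.
    """
    h, w = template.shape
    size = 2 * radius + 1
    surface = np.full((size, size), np.inf)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + h > image.shape[0] or c + w > image.shape[1]:
                continue
            diff = image[r:r + h, c:c + w] - template
            surface[dy + radius, dx + radius] = np.sum(diff * diff)
    iy, ix = np.unravel_index(np.argmin(surface), surface.shape)
    return (int(iy) - radius, int(ix) - radius), surface

rng = np.random.default_rng(0)
frame0 = rng.random((32, 32))
frame1 = np.roll(frame0, shift=(2, -3), axis=(0, 1))   # content moves by (+2, -3)
template = frame0[10:17, 12:19]
best, surface = ssd_match(template, frame1, top=10, left=12, radius=4)
print(best)
```

Keeping the whole surface, not just the minimum, is what makes the later sub-pixel and confidence analysis possible.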

SSD Surface – Textured area

SSD Surface -- Edge

SSD Surface – homogeneous area

Discrete Search vs. Gradient Based Estimation Consider image I translated by The discrete search method simply searches for the best estimate. The gradient method linearizes the intensity function and solves for the estimate

Uncertainty in Local Estimation Consider image I translated by Now, This assumes uniform priors on the velocity field

Quadratic Approximation When are small After some fiddling around, we can show

Posterior uncertainty At edges is singular, but just take pseudo-inverse Note that the error is always convex, since is positive semi-definite i.e., even for occluded points and other false matches, this is the case… seems a bit odd!

Match plus confidence Numerically compute error for various Search for the peak Numerically fit a quadratic to around the peak Find sub-pixel estimate for and covariance If the matrix is negative, it is a false match Or even better, if you can afford it, simply maintain a discrete sampling of and
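One way to realize the quadratic fit around the peak, sketched under assumptions: the error is sampled on a 3x3 grid centred on the discrete minimum, the gradient and Hessian come from central differences, and the inverse Hessian is returned as an unscaled stand-in for the covariance (the noise-variance factor is left out). The function name is hypothetical.

```python
import numpy as np

def subpixel_peak(E):
    """Fit a quadratic to a 3x3 error patch E[row=y, col=x] around its minimum.

    Returns (offset_xy, inverse_hessian), or None when the fitted Hessian is
    not positive definite, which is treated here as a false match.
    """
    gx = (E[1, 2] - E[1, 0]) / 2.0
    gy = (E[2, 1] - E[0, 1]) / 2.0
    hxx = E[1, 2] - 2.0 * E[1, 1] + E[1, 0]
    hyy = E[2, 1] - 2.0 * E[1, 1] + E[0, 1]
    hxy = (E[2, 2] - E[2, 0] - E[0, 2] + E[0, 0]) / 4.0
    H = np.array([[hxx, hxy], [hxy, hyy]])
    if np.any(np.linalg.eigvalsh(H) <= 0):
        return None                         # not a proper minimum
    offset = -np.linalg.solve(H, [gx, gy])  # minimizer of the fitted quadratic
    return offset, np.linalg.inv(H)

ys, xs = np.mgrid[-1:2, -1:2].astype(float)
dx, dy = xs - 0.2, ys + 0.1
E = 2 * dx**2 + 3 * dy**2 + dx * dy         # true minimum at (0.2, -0.1)
offset, cov_like = subpixel_peak(E)
print(offset)
```

A flat or saddle-shaped patch fails the positive-definiteness test, which mirrors the slide's rule for rejecting false matches.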

Quadratic Approximation and Covariance Estimation When where

Choosing the Correlation Window Size Small windows lead to more false matches Large windows are better this way, but… –Neighboring flow vectors will be more correlated (since the template windows have more in common) –Flow resolution also lower (same reason) –More expensive to compute Another way to look at this: Small windows are good for local search: more precise but less smooth Large windows are good for global search: less precise but more smooth

Robust Estimation Standard Least Squares Estimation allows too much influence for outlying points

Robust Estimation Robust gradient constraint Robust SSD

Influence functions and Redescending Estimators GO TO THE BOARD

Reweighted Least-Squares Robust Minimization (over all pixels): An Outlier Measure Residual normal flow
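A minimal sketch of iteratively reweighted least squares for a single translation (u,v), assuming the gradient constraint Ix*u + Iy*v + It = 0 as the residual. The Geman-McClure weight used here is one redescending choice picked for illustration; the slides do not say which robust function was used, and `sigma` and the iteration count are hypothetical settings.

```python
import numpy as np

def irls_translation(Ix, Iy, It, sigma=1.0, n_iter=20):
    """Robustly fit (u, v) to Ix*u + Iy*v + It = 0 over many pixels.

    Weights follow w(r) = 2*sigma^2 / (sigma^2 + r^2)^2, which shrinks
    toward zero for large residuals r, so outliers lose influence.
    """
    A = np.column_stack([np.ravel(Ix), np.ravel(Iy)])
    b = -np.ravel(It)
    w = np.ones(b.size)                      # first pass is ordinary least squares
    for _ in range(n_iter):
        Aw = A * w[:, None]
        uv = np.linalg.solve(A.T @ Aw, Aw.T @ b)   # weighted normal equations
        r = A @ uv - b                       # per-pixel constraint residual
        w = 2.0 * sigma**2 / (sigma**2 + r**2) ** 2
    return uv, w

rng = np.random.default_rng(1)
Ix, Iy = rng.normal(size=200), rng.normal(size=200)
It = -(0.5 * Ix - 0.3 * Iy)                  # consistent with (u, v) = (0.5, -0.3)
It[:20] += 5.0                               # 10% gross outliers
uv, w = irls_translation(Ix, Iy, It)
print(uv)
```

The final weights double as an outlier measure: pixels with small weight are the ones the dominant motion does not explain.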

Outlier rejection Original Outliers Synthesized

Locking Property ORIGINAL ENHANCED

Dense 3D Reconstruction (Plane+Parallax) Original sequence, Plane-aligned sequence, Recovered shape: “block sequence” [Kumar-Anandan-Hanna’94]

Original sequence Plane-aligned sequence Recovered shape

Dense 3D Reconstruction Brightness Constancy constraint The intersection of the two line constraints uniquely defines the displacement. Epipolar line epipole p

Limitations Limited search range (up to 10% of the image size). Brightness constancy.

Multi-sensor Alignment Originals: IR and EO images

Direct Correlation Based Alignment

Example: Affine Motion Substituting into the B.C. Equation: Each pixel provides 1 linear constraint in 6 global unknowns, with coefficient vector $[\,I_x,\ xI_x,\ yI_x,\ I_y,\ xI_y,\ yI_y\,]$ (minimum 6 pixels necessary)

The Brightness Constraint Brightness Constancy Equation: Linearizing J (assume small (u,v)): In practice (although not necessary) one assumes

Motion Models Motion vector (u,v) has two unknowns (at each pixel) Brightness Constraint provides one equation One approach is to use a regularizer (e.g., Horn and Schunck) Alternative: Global Motion Model Constraint –e.g., Affine, Quadratic, Projective (planar) transform (2D Models) –Instantaneous camera motion, homography+epipole, Plane+Parallax, etc. (3D Models)

Coarse-to-Fine Estimation (figure: pyramid construction for I and for J; at each pyramid level, J is warped to Jw and the estimate is refined against I)

Affine Motion Estimation Each pixel provides one constraint Least square estimation; Minimize:

Coarse-to-Fine Estimation Problem: Brightness linearization assumes small (u,v). Solution: Iterative refinement Coarse-to-fine estimation within a pyramid
