Optic Flow and Motion Detection Cmput 615 Martin Jagersand
Image motion Somehow quantify the frame-to-frame differences in image sequences.Somehow quantify the frame-to-frame differences in image sequences. 1.Image intensity difference. 2.Optic flow dim image motion computation
Motion is used to: Attention: Detect and direct using eye and head motionsAttention: Detect and direct using eye and head motions Control: Locomotion, manipulation, toolsControl: Locomotion, manipulation, tools Vision: Segment, depth, trajectoryVision: Segment, depth, trajectory
Small camera re-orientation Note: Almost all pixels change!
MOVING CAMERAS ARE LIKE STEREO Locations of points on the object (the “structure”) The change in spatial location between the two cameras (the “motion”)
Classes of motion Still camera, single moving objectStill camera, single moving object Still camera, several moving objectsStill camera, several moving objects Moving camera, still backgroundMoving camera, still background Moving camera, moving objectsMoving camera, moving objects
The optic flow field Vector field over the image:Vector field over the image: [u,v] = f(x,y), u,v = Vel vector, x,y = Im pos [u,v] = f(x,y), u,v = Vel vector, x,y = Im pos FOE, FOC Focus of Expansion, ContractionFOE, FOC Focus of Expansion, Contraction
Motion/Optic flow vectors How to compute? –Solve pixel correspondence problem –given a pixel in Im1, look for same pixels in Im2 Possible assumptionsPossible assumptions 1.color constancy: a point in H looks the same in I –For grayscale images, this is brightness constancy 2.small motion: points do not move very far –This is called the optical flow problem
Optic/image flow Assume: 1.Image intensities from object points remain constant over time 2.Image displacement/motion small
Taylor expansion of intensity variation Keep linear terms Keep linear terms Use constancy assumption and rewrite:Use constancy assumption and rewrite: Notice: Linear constraint, but no unique solutionNotice: Linear constraint, but no unique solution
Aperture problem Rewrite as dot productRewrite as dot product Each pixel gives one equation in two unknowns:Each pixel gives one equation in two unknowns: n*f = k Min length solution: Can only detect vectors normal to gradient directionMin length solution: Can only detect vectors normal to gradient direction The motion of a line cannot be recovered using only local informationThe motion of a line cannot be recovered using only local ðñ á îx îy òó = r Im á îx îy òó f n f ît
Aperture problem 2
The flow continuity constraint Flows of nearby pixels or patches are (nearly) equalFlows of nearby pixels or patches are (nearly) equal Two equations, two unknowns:Two equations, two unknowns: n 1 * f = k 1 n 2 * f = k 2 Unique solution f exists, provided n 1 and n 2 not parallelUnique solution f exists, provided n 1 and n 2 not parallel f n f f n f
Sensitivity to error n 1 and n 2 might be almost paralleln 1 and n 2 might be almost parallel Tiny errors in estimates of k’s or n’s can lead to huge errors in the estimate of fTiny errors in estimates of k’s or n’s can lead to huge errors in the estimate of f f n f f n f
Using several points Typically solve for motion in 2x2, 4x4 or larger image patches.Typically solve for motion in 2x2, 4x4 or larger image patches. Over determined equation system:Over determined equation system: dIm = Mu dIm = Mu Can be solved in least squares sense using MatlabCan be solved in least squares sense using Matlab u = M\dIm u = M\dIm Can also be solved be solved using normal equationsCan also be solved be solved using normal equations u = (M T M) -1 *M T dIm u = (M T M) -1 *M T dIm
3-6D Optic flow Generalize to many freedooms (DOFs)Generalize to many freedooms (DOFs)
All 6 freedoms X Y Rotation Scale Aspect Shear
Conditions for solvability –SSD Optimal (u, v) satisfies Optic Flow equation When is this solvable? A T A should be invertible A T A entries should not be too small (noise) A T A should be well-conditioned Study eigenvalues: – 1 / 2 should not be too large ( 1 = larger eigenvalue)
Edge – gradients very large or very small – large 1, small 2
Low texture region – gradients have small magnitude – small 1, small 2
High textured region – gradients are different, large magnitudes – large 1, large 2
Observation This is a two image problem BUTThis is a two image problem BUT –Can measure sensitivity by just looking at one of the images! –This tells us which pixels are easy to track, which are hard –very useful later on when we do feature tracking...
Errors in Optic flow computation What are the potential causes of errors in this procedure?What are the potential causes of errors in this procedure? –Suppose A T A is easily invertible –Suppose there is not much noise in the image When our assumptions are violatedWhen our assumptions are violated –Brightness constancy is not satisfied –The motion is not small –A point does not move like its neighbors –window size is too large –what is the ideal window size?
Iterative Refinement Used in SSD/Lucas-Kanade tracking algorithmUsed in SSD/Lucas-Kanade tracking algorithm 1.Estimate velocity at each pixel by solving Lucas-Kanade equations 2.Warp H towards I using the estimated flow field - use image warping techniques 3.Repeat until convergence
Revisiting the small motion assumption Is this motion small enough?Is this motion small enough? –Probably not—it’s much larger than one pixel (2 nd order terms dominate) –How might we solve this problem?
Reduce the resolution!
image I image H Gaussian pyramid of image HGaussian pyramid of image I image I image H u=10 pixels u=5 pixels u=2.5 pixels u=1.25 pixels Coarse-to-fine optical flow estimation
image I image J Gaussian pyramid of image HGaussian pyramid of image I image I image H Coarse-to-fine optical flow estimation run iterative L-K warp & upsample......
Application: mpeg compression
HW accelerated computation of flow vectors Norbert’s trick: Use an mpeg-card to speed up motion computationNorbert’s trick: Use an mpeg-card to speed up motion computation
Other applications: Recursive depth recovery: Kostas and JaneRecursive depth recovery: Kostas and Jane Motion control (we will cover)Motion control (we will cover) SegmentationSegmentation TrackingTracking
Lab: Assignment1:Assignment1: Purpose:Purpose: –Intro to image capture and processing –Hands on optic flow experience See www page for details.See www page for details. Suggestions welcome!Suggestions welcome!
Organizing Optic Flow Cmput 615 Martin Jagersand
Questions to think about Readings: Book chapter, Fleet et al. paper. Compare the methods in the paper and lecture 1.Any major differences? 2.How dense flow can be estimated (how many flow vectore/area unit)? 3.How dense in time do we need to sample?
Organizing different kinds of motion Two examples: 1.Greg Hager paper: Planar motion 2.Mike Black, et al: Attempt to find a low dimensional subspace for complex motion
Remember: The optic flow field Vector field over the image:Vector field over the image: [u,v] = f(x,y), u,v = Vel vector, x,y = Im pos [u,v] = f(x,y), u,v = Vel vector, x,y = Im pos FOE, FOC Focus of Expansion, ContractionFOE, FOC Focus of Expansion, Contraction
(Parenthesis) Euclidean world motion -> image Let us assume there is one rigid object moving with velocity T and w = d R / dt For a given point P on the object, we have p = f P/z The apparent velocity of the point is V = -T – w x P Therefore, we have v = dp/dt = f (z V – V z P)/z 2
Component wise: Motion due to translation: depends on depth Motion due to rotation: independent of depth
Remember last lecture: Solving for the motion of a patchSolving for the motion of a patch Over determined equation system: Im t = Mu Im t = Mu Can be solved in e.g. least squares sense using matlab u = M\Im tCan be solved in e.g. least squares sense using matlab u = M\Im t t t+1
3-6D Optic flow Generalize to many freedooms (DOFs)Generalize to many freedooms (DOFs) Im = Mu
Know what type of motion (Greg Hager, Peter Belhumeur) u’ i = A u i + d E.g. Planar Object => Affine motion model: I t = g(p t, I 0 )
Mathematical Formulation Define a “warped image” gDefine a “warped image” g –f(p,x) = x’ (warping function), p warp parameters –I(x,t) (image a location x at time t) –g(p,I t ) = (I(f(p,x 1 ),t), I(f(p,x 2 ),t), … I(f(p,x N ),t))’ Define the Jacobian of warping functionDefine the Jacobian of warping function –M(p,t) = Model Model – I 0 = g(p t, I t ) (image I, variation model g, parameters p) – I = M(p t, I t ) p (local linearization M) Compute motion parametersCompute motion parameters – p = (M T M) -1 M T I where M = M(p t,I t )
Planar 3D motion From geometry we know that the correct plane- to-plane transform isFrom geometry we know that the correct plane- to-plane transform is 1.for a perspective camera the projective homography 2.for a linear camera (orthographic, weak-, para- perspective) the affine warp
Planar Texture Variability 1 Affine Variability Affine warp functionAffine warp function Corresponding image variabilityCorresponding image variability Discretized for imagesDiscretized for images
On The Structure of M u’ i = A u i + d Planar Object + linear (infinite) camera -> Affine motion model X Y Rotation Scale Aspect Shear
Planar Texture Variability 2 Projective Variability Homography warpHomography warp Projective variability:Projective variability: Where,Where, and and
Planar motion under perspective projection Perspective plane-plane transforms defined by homographiesPerspective plane-plane transforms defined by homographies
Planar-perspective motion 3 In practice hard to compute 8 parameter model stably from one image, and impossible to find out-of plane variationIn practice hard to compute 8 parameter model stably from one image, and impossible to find out-of plane variation Estimate variability basis from several images:Estimate variability basis from several images: Computed Estimated Computed Estimated
Another idea Black, Fleet) Organizing flow fields Express flow field f in subspace basis mExpress flow field f in subspace basis m Different “mixing” coefficients a correspond to different motionsDifferent “mixing” coefficients a correspond to different motions
Example: Image discontinuities
Mathematical formulation Let: Mimimize objective function: =Where Motion basis Robust error norm
Experiment Moving camera 4x4 pixel patches4x4 pixel patches Tree in foreground separates wellTree in foreground separates well
Experiment: Characterizing lip motion Very non-rigid!Very non-rigid!
Summary Three types of visual motion extractionThree types of visual motion extraction 1.Optic (image) flow: Find x,y – image velocities 2.3-6D motion: Find object pose change in image coordinates based more spatial derivatives (top down) 3.Group flow vectors into global motion patterns (bottom up) Visual motion still not satisfactorily solved problemVisual motion still not satisfactorily solved problem
Sensing and Perceiving Motion Cmput 610 Martin Jagersand
How come perceived as motion? Im = sin(t)U5+cos(t)U6Im = f1(t)U1+…+f6(t)U6
Counterphase sin grating Spatio-temporal patternSpatio-temporal pattern –Time t, Spatial x,y
Counterphase sin grating Spatio-temporal patternSpatio-temporal pattern –Time t, Spatial x,y Rewrite as dot product: = + Result: Standing wave is superposition of two moving waves
Analysis: Only one term: Motion left or rightOnly one term: Motion left or right Mixture of both: Standing waveMixture of both: Standing wave Direction can flip between left and rightDirection can flip between left and right
Reichardt detector QT movieQT movie
Several motion models Gradient: in Computer VisionGradient: in Computer Vision Correlation: In bio visionCorrelation: In bio vision Spatiotemporal filters: Unifying modelSpatiotemporal filters: Unifying model
Spatial response: Gabor function Definition:Definition:
Temporal response: Adelson, Bergen ’85 Note: Terms from taylor of sin(t) Spatio-temporal D=D s D t
Receptor response to Counterphase grating Separable convolutionSeparable convolution
Simplified: For our grating: (Theta=0)For our grating: (Theta=0) Write as sum of components:Write as sum of components: = exp(…)*(acos… + bsin…) = exp(…)*(acos… + bsin…)
Space-time receptive field
Combined cells Spat: Temp:Spat: Temp: Both:Both: Comb:Comb:
Result: More directionally specific responseMore directionally specific response
Energy model: Sum odd and even phase componentsSum odd and even phase components Quadrature rectifierQuadrature rectifier
Adaption: Motion aftereffect
Where is motion processed?
Higher effects:
Equivalence: Reich and Spat
Conclusion Evolutionary motion detection is importantEvolutionary motion detection is important Early processing modeled by Reichardt detector or spatio-temporal filters.Early processing modeled by Reichardt detector or spatio-temporal filters. Higher processing poorly understoodHigher processing poorly understood