Detection of image intensity changes
Detecting intensity changes
- Smooth the image intensities: reduces the effect of noise and sets the resolution or scale of the analysis
- Differentiate the smoothed intensities: transforms the image into a representation that facilitates the detection of intensity changes
- Detect and describe features in the transformed image (e.g. peaks or zero-crossings)
Smoothing the image intensities [figure: an intensity profile shown with progressively more smoothing]
Derivatives of the smoothed intensity [figure: the first derivative has "peaks" and the second derivative has "zero-crossings" where the intensity changes]
Convolution in one dimension [figure: intensity I(x) with values 10 and 20, a convolution operator G(x) with weights 1, 3, 10, and the convolution result G(x) * I(x): 180, 190, 220, 320, 350, 360]
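The slide's numbers can be reproduced with `np.convolve`; a minimal sketch, assuming the full operator is the symmetric weight vector (1, 3, 10, 3, 1) (only "1 3 10" survives in the slide text) applied to a step from intensity 10 to 20:

```python
import numpy as np

# Step-edge intensity signal and the slide's operator weights
I = np.array([10, 10, 10, 10, 10, 20, 20, 20, 20, 20], dtype=float)
G = np.array([1, 3, 10, 3, 1], dtype=float)   # assumed symmetric completion of "1 3 10"

# 'valid' keeps only positions where the operator fully overlaps the signal
result = np.convolve(I, G, mode='valid')
print(result)   # [180. 190. 220. 320. 350. 360.]
```

The output matches the result values shown on the slide, which suggests the operator there was not normalized; dividing G by its sum (18) would preserve overall brightness instead.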
Smoothing the image intensities
- Strategy 1: compute the average intensity in a neighborhood around each image position
- Strategy 2: compute a weighted average of the intensity values in a neighborhood around each image position, using a smooth function that weighs nearby intensities more heavily
- The Gaussian function works well; in one dimension: [figure: Gaussian profiles for small σ and large σ]
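A sketch of the weighted-average strategy with a sampled 1D Gaussian; the kernel radius and the σ values are illustrative, not from the slides:

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """Sampled 1D Gaussian, normalized to sum to 1 so brightness is preserved."""
    if radius is None:
        radius = int(3 * sigma)            # ±3σ captures ~99.7% of the mass
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2 * sigma**2))
    return g / g.sum()

# Step edge from intensity 10 to 20
I = np.r_[np.full(20, 10.0), np.full(20, 20.0)]
small = np.convolve(I, gaussian_kernel(1.0), mode='same')
large = np.convolve(I, gaussian_kernel(4.0), mode='same')
# A larger sigma spreads the edge transition over a wider range of positions
```

Plotting `small` and `large` shows the same edge blurred over a narrow vs. a wide interval, which is the "sets resolution or scale of analysis" point from the previous slide.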
The derivative of a convolution
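The identity behind this slide, for a smoothing filter G and image I in one dimension, is that differentiation commutes with convolution:

```latex
\frac{d}{dx}\bigl[G(x) * I(x)\bigr] = \frac{dG}{dx} * I(x)
```

So smoothing and then differentiating is equivalent to a single convolution with the derivative of the filter, dG/dx.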
Analyzing a 2D image [figures: the image; the image after smoothing and second derivative, with black = negative and white = positive; the zero-crossings]
Convolution in two dimensions [figure: a 2D convolution operator applied to an image, producing the convolution result]
Smoothing a 2D image: to smooth a 2D image I(x,y), we convolve it with a 2D Gaussian [figure: the image and the result of the convolution G(x,y) * I(x,y)]
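Because the 2D Gaussian factors into a product of two 1D Gaussians, 2D smoothing can be done as two passes of 1D convolution; a sketch (function name and parameters are illustrative):

```python
import numpy as np

def gaussian_smooth_2d(image, sigma):
    """Smooth a 2D image by convolving rows, then columns, with a 1D Gaussian.
    Valid because the 2D Gaussian is separable: G(x,y) = G(x) * G(y)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()                     # normalize so brightness is preserved
    # convolve each row, then each column
    out = np.apply_along_axis(np.convolve, 1, image, g, mode='same')
    out = np.apply_along_axis(np.convolve, 0, out, g, mode='same')
    return out

img = np.full((20, 20), 5.0)         # constant test image
out = gaussian_smooth_2d(img, sigma=1.0)
```

The separable form costs O(k) per pixel per pass instead of O(k²) for a direct 2D convolution with a k×k kernel.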
Differentiation in 2D: to differentiate the smoothed image, we use the Laplacian operator ∇² = ∂²/∂x² + ∂²/∂y². We can again combine the smoothing and derivative operations into a single ∇²G operator (displayed with sign reversed).
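Written out, with one common normalization of the combined operator (the slide displays it with sign reversed):

```latex
\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2},
\qquad
\nabla^2\bigl[G(x,y) * I(x,y)\bigr] = \bigl[\nabla^2 G\bigr] * I(x,y),
\qquad
\nabla^2 G(x,y) = \frac{1}{\pi\sigma^4}\left(\frac{x^2+y^2}{2\sigma^2} - 1\right)
e^{-\frac{x^2+y^2}{2\sigma^2}}
```

The middle identity is the 2D version of the derivative-of-a-convolution fact: smoothing with G and applying the Laplacian collapse into one convolution with ∇²G, the "Mexican hat" operator.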
Detecting intensity changes at multiple scales [figure: zero-crossings of convolutions of the image with ∇²G operators of small σ and large σ]
Computing the contrast of intensity changes
Projection from the retina 1st cortical stage of visual processing: primary visual cortex (or area V1)
Retinal ganglion cells receptive fields exhibit a center-surround structure, whose cross-section is the difference of two Gaussians
Detecting intensity changes at multiple scales: human vision uses multiple receptive field sizes in the same region of the visual field, and receptive field sizes increase with eccentricity (distance from the center of the visual field) [figure: zero-crossings of convolutions of the image with ∇²G operators of small σ and large σ]
Some simple cells respond best to lines of a particular contrast sign, orientation, and position [figure: firing rate vs. horizontal position for a light-line detector and a dark-line detector]
Some simple cells respond best to edges, again of a particular contrast sign, orientation, and position [figure: firing rate vs. horizontal position for a dark-to-light edge detector and a light-to-dark edge detector]. Large receptive fields capture coarse spatial structure; small receptive fields capture fine spatial structure.
Magic-eye "autostereograms" [figure: left and right images illustrating stereo disparity]
Stereo viewing geometry [figure: left and right eyes fixating a point]: positive disparity for points in front of the point of fixation; negative disparity for points behind the fixation point
Stereo viewing geometry [figure]: the larger the disparity, the further the point from the fixation point; zero disparity at the fixation point
Steps of the stereo process:
- extract features from the left and right images whose disparity we want to measure
- match the left and right image features and measure their disparity in position (the "stereo correspondence problem")
- use stereo disparity to compute depth
Constraints on stereo correspondence
- Uniqueness: each feature in the left image matches with only one feature in the right (and vice versa…)
- Similarity: matching features appear "similar" in the two images
- Continuity: nearby image features have similar disparities
- Epipolar constraint, simple version: matching features have similar vertical positions, but…
Epipolar constraint Possible matching candidates for pL lie along a line in the right image (the epipolar line…)
Solving the stereo correspondence problem
Measuring goodness of match between patches
(1) Sum of absolute differences:
    (1/n) Σpatch |pleft − pright|
    (optional: divide by n = number of pixels in the patch)
(2) Normalized correlation:
    (1/n) Σpatch (pleft − p̄left)(pright − p̄right) / (σpleft σpright)
where p̄ = average of the values within the patch and σ = standard deviation of the values within the patch
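Both match scores are a few lines of NumPy; a minimal sketch (function names are illustrative):

```python
import numpy as np

def sad(p_left, p_right):
    """Sum of absolute differences, divided by n as the slide's optional step."""
    n = p_left.size
    return np.abs(p_left - p_right).sum() / n

def normalized_correlation(p_left, p_right):
    """Normalized correlation: subtract each patch's mean, divide by the stds."""
    n = p_left.size
    l = p_left - p_left.mean()
    r = p_right - p_right.mean()
    return (l * r).sum() / (n * p_left.std() * p_right.std())

a = np.array([[1., 2.], [3., 4.]])
b = a * 2 + 5                         # same pattern, different brightness/contrast
print(sad(a, a))                      # 0.0 for identical patches
print(normalized_correlation(a, b))   # ~1.0: invariant to gain and offset
```

The difference matters in practice: SAD is cheap but penalizes brightness differences between cameras, while normalized correlation is invariant to a per-patch gain and offset.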
Region-based stereo matching algorithm

for each row r
    for each column c
        let pleft be a square patch centered on (r,c) in the left image
        initialize the best match score mbest to ∞ (for sum of absolute differences; for normalized correlation, start at −∞ and maximize instead)
        initialize the best disparity dbest
        for each disparity d from −drange to +drange
            let pright be a square patch centered on (r,c+d) in the right image
            compute the match score m between pleft and pright
            if (m < mbest), assign mbest = m and dbest = d
        record dbest in the disparity map at (r,c)

How are the assumptions used?
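The pseudocode above translates directly to NumPy; a sketch using sum of absolute differences, with illustrative parameter names and a skip for patches that fall outside the right image:

```python
import numpy as np

def region_based_stereo(left, right, patch_radius=2, d_range=8):
    """Dense disparity map: for each pixel, pick the horizontal shift d that
    minimizes the sum of absolute differences between square patches."""
    rows, cols = left.shape
    disparity = np.zeros((rows, cols), dtype=int)
    k = patch_radius
    for r in range(k, rows - k):
        for c in range(k, cols - k):
            p_left = left[r-k:r+k+1, c-k:c+k+1]
            best_score, best_d = np.inf, 0
            for d in range(-d_range, d_range + 1):
                if c + d - k < 0 or c + d + k + 1 > cols:
                    continue          # patch would fall outside the right image
                p_right = right[r-k:r+k+1, c+d-k:c+d+k+1]
                score = np.abs(p_left - p_right).sum()
                if score < best_score:
                    best_score, best_d = score, d
            disparity[r, c] = best_d
    return disparity
```

On a synthetic pair where the right image is the left shifted by 3 columns, the recovered disparity is 3 across the interior. Note how the constraints are built in: uniqueness (one best d per pixel), similarity (the SAD score), and the epipolar constraint (only horizontal shifts along the same row are searched).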
The real world works against us sometimes… left right
Example: Region-based stereo matching, using filtered images and sum of absolute differences (results before improvements) (from Carolyn Kim, 2013)
Properties of human stereo processing: use features for stereo matching whose position and disparity can be measured very precisely. Stereoacuity is only a few seconds of visual angle, a difference in depth of about 0.01 cm at a viewing distance of 30 cm.
Properties of human stereo processing Matching features must appear similar in the left and right images For example, a left stereo image cannot be fused with a negative of the right image…
Properties of human stereo processing: we only "fuse" objects within a limited range of depth around the fixation distance. Vergence eye movements are needed to fuse objects over a larger range of depths.
Properties of human stereo processing In the early stages of visual processing, the image is analyzed at multiple spatial scales… Stereo information at multiple scales can be processed independently
Matching features for the MPG stereo algorithm: zero-crossings of convolutions with ∇²G operators of different size [figure: zero-crossing maps for large (L), medium (M), and small (S) operators; large operators give rough disparities over a large range, small operators give accurate disparities over a small range]
[figure: left and right zero-crossings match at the large scale w, but the correct match lies outside the search range at the small scale]
[figure: after vergence eye movements, the correct match now lies inside the search range at the small scale]
Stereo images (Tsukuba, CMU)
Zero-crossings for stereo matching [figure: zero-crossing contours marked + and −]
Simplified MPG algorithm, Part 1. To determine initial correspondence:
(1) Find zero-crossings using a ∇²G operator with central positive width w
(2) For each horizontal slice:
    (2.1) Find the nearest neighbors in the right image for each zero-crossing fragment in the left image
    (2.2) Find the nearest neighbors in the left image for each zero-crossing fragment in the right image
    (2.3) For each pair of zero-crossing fragments that are closest neighbors of one another, let the right fragment be separated by δinitial from the left. Determine whether δinitial is within the matching tolerance m (where m = w/2). If so, consider the zero-crossing fragments matched with disparity δinitial
Simplified MPG algorithm, Part 2. To determine final correspondence:
(1) Find zero-crossings using a ∇²G operator with reduced width w/2
(2) For each horizontal slice:
    (2.1) For each zero-crossing in the left image:
        (2.1.1) Determine the nearest zero-crossing fragment in the left image that matched when the ∇²G operator width was w
        (2.1.2) Offset the zero-crossing fragment by a distance δinitial, the disparity of the nearest matching zero-crossing fragment found at the lower resolution with operator width w
    (2.2) Find the nearest neighbors in the right image for each zero-crossing fragment in the left image
    (2.3) Find the nearest neighbors in the left image for each zero-crossing fragment in the right image
    (2.4) For each pair of zero-crossing fragments that are closest neighbors of one another, let the right fragment be separated by δnew from the left. Determine whether δnew is within the reduced matching tolerance m/2. If so, consider the zero-crossing fragments matched with disparity δfinal = δnew + δinitial
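The mutual nearest-neighbor test in step (2) of Part 1 can be sketched for a single horizontal slice as follows; the function name and the list-of-column-positions representation are illustrative, not from the slides:

```python
def mutual_nearest_matches(zc_left, zc_right, tolerance):
    """Match zero-crossing positions (column indices) in one horizontal slice.
    A pair matches only if each is the other's nearest neighbor and the
    disparity is within the matching tolerance m (= w/2 on the slide)."""
    matches = []
    for xl in zc_left:
        xr = min(zc_right, key=lambda x: abs(x - xl))   # nearest in right image
        back = min(zc_left, key=lambda x: abs(x - xr))  # nearest in left image
        disparity = xr - xl
        if back == xl and abs(disparity) <= tolerance:
            matches.append((xl, xr, disparity))
    return matches

# Two fragments match with disparity 2; the isolated one at 40 finds no partner
pairs = mutual_nearest_matches([5, 20, 40], [7, 22, 60], tolerance=4)
print(pairs)   # [(5, 7, 2), (20, 22, 2)]
```

Part 2 would then rerun this at width w/2 with each left fragment offset by its coarse-scale δinitial and tolerance m/2, summing δnew + δinitial.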
[figures: coarse-scale zero-crossings (w = 8, m = 4); fine-scale matching using coarse-scale disparities as a guide (w = 4, m = 2); fine-scale matching ignoring coarse-scale disparities (w = 4, m = 2)]
Measuring image motion: the "aperture problem". "Local" motion detectors measure only the component of motion perpendicular to a moving edge, so the 2D velocity field is not determined uniquely from the changing image; an additional constraint is needed to compute a unique velocity field.
Assume pure translation or constant velocity [figure: velocity space (Vx, Vy)]. In practice: there is error in the initial motion measurements, velocities are not constant locally, and image features may span only a small range of orientations.
Measuring motion in one dimension [figure: intensity profile I(x) translating at velocity Vx]
Vx = velocity in the x direction; rightward movement: Vx > 0; leftward movement: Vx < 0; speed: |Vx| pixels per time step
From the spatial derivative ∂I/∂x and the temporal derivative ∂I/∂t:
    Vx = − (∂I/∂t) / (∂I/∂x)
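The 1D relation can be checked numerically on a translating ramp; a minimal sketch with synthetic values (the ramp and the velocity of 2 pixels per step are illustrative):

```python
import numpy as np

# A 1D intensity ramp I(x, t) = x - Vx*t moving rightward at Vx = 2 pixels/step
x = np.arange(10.0)
I_t0 = x.copy()
I_t1 = x - 2.0

dI_dx = np.gradient(I_t0)     # spatial derivative (exactly 1 for a unit ramp)
dI_dt = I_t1 - I_t0           # temporal derivative over one time step
Vx = -dI_dt / dI_dx
print(Vx)                     # 2.0 everywhere: rightward motion recovered
```

The sign works out as the slide describes: a rightward-moving ramp with positive slope makes the intensity at a fixed point decrease, so ∂I/∂t < 0 and Vx comes out positive.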
Measurement of motion components in 2D
(1) gradient of image intensity: ∇I = (∂I/∂x, ∂I/∂y)
(2) time derivative: ∂I/∂t
(3) velocity along the gradient: v (movement in the direction of the gradient: v > 0; movement opposite to the direction of the gradient: v < 0)
    v = − (∂I/∂t) / [(∂I/∂x)² + (∂I/∂y)²]^(1/2) = − (∂I/∂t) / |∇I|
[figure: true motion vs. the measured motion component]
2D velocities (Vx, Vy) consistent with v: all (Vx, Vy) such that the component of (Vx, Vy) in the direction of the gradient is v. Let (ux, uy) be the unit vector in the direction of the gradient and use the dot product:
    (Vx, Vy) · (ux, uy) = v, i.e. Vx ux + Vy uy = v
Details… Each motion component yields values ux, uy, and v, and hence one equation Vx ux + Vy uy = v. Example: one component with ∂I/∂x = 10, ∂I/∂y = −10, ∂I/∂t = −30, and a second with ∂I/∂x = 10, ∂I/∂y = 10, ∂I/∂t = −30, give two such equations; solve them for Vx and Vy.
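The two components on the slide pin down a unique velocity; a sketch that forms each component from its derivatives and solves the resulting 2×2 system:

```python
import numpy as np

def motion_component(Ix, Iy, It):
    """Unit gradient direction (ux, uy) and speed v along it, per the slides."""
    mag = np.hypot(Ix, Iy)            # |gradient| = sqrt(Ix^2 + Iy^2)
    return Ix / mag, Iy / mag, -It / mag

# The slide's two components
ux1, uy1, v1 = motion_component(10, -10, -30)
ux2, uy2, v2 = motion_component(10, 10, -30)

# Two constraints Vx*ux + Vy*uy = v form a 2x2 linear system
A = np.array([[ux1, uy1], [ux2, uy2]])
b = np.array([v1, v2])
Vx, Vy = np.linalg.solve(A, b)
print(Vx, Vy)    # approximately (3.0, 0.0): pure rightward motion
```

Geometrically, each component is a constraint line in (Vx, Vy) velocity space, and the solution is the lines' intersection.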
In practice… Previously, each motion component constrained (Vx, Vy) to a line Vx ux + Vy uy = v in velocity space. New strategy: find the single (Vx, Vy) that best fits all motion components together, i.e. the (Vx, Vy) that minimizes:
    Σ (Vx ux + Vy uy − v)²
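This least-squares fit is a standard linear regression; a sketch on synthetic noisy components generated from an assumed true velocity of (2, 1):

```python
import numpy as np

# Noisy motion components from a true velocity (Vx, Vy) = (2, 1) (synthetic)
rng = np.random.default_rng(1)
true_v = np.array([2.0, 1.0])
theta = rng.uniform(0, np.pi, 50)               # 50 gradient directions
U = np.column_stack([np.cos(theta), np.sin(theta)])
v = U @ true_v + rng.normal(0, 0.05, 50)        # v = Vx*ux + Vy*uy + noise

# Minimize sum (Vx*ux + Vy*uy - v)^2 over all components at once
(Vx, Vy), *_ = np.linalg.lstsq(U, v, rcond=None)
print(Vx, Vy)    # close to (2, 1)
```

With many components the noise averages out, which is exactly why fitting all constraints together beats intersecting two noisy constraint lines.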
When is the smoothest velocity field correct? When is it wrong? motion illusions
Computing the smoothest velocity field
- motion components at positions i−1, i, i+1 along a contour: Vxi uxi + Vyi uyi = vi
- change in velocity between neighbors: (Vxi+1 − Vxi, Vyi+1 − Vyi)
- Find the (Vxi, Vyi) that minimize:
    Σi [(Vxi uxi + Vyi uyi − vi)² + (Vxi+1 − Vxi)² + (Vyi+1 − Vyi)²]
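The minimization above can be sketched with plain gradient descent; the relative weight `lam` on the smoothness term, the step size, and the iteration count are assumptions for illustration:

```python
import numpy as np

def smoothest_velocity_field(ux, uy, v, lam=1.0, iters=2000, step=0.05):
    """Minimize sum_i (Vx_i*ux_i + Vy_i*uy_i - v_i)^2
       + lam * sum_i [(Vx_{i+1}-Vx_i)^2 + (Vy_{i+1}-Vy_i)^2]
    over velocities at n contour positions, by gradient descent."""
    n = len(v)
    Vx, Vy = np.zeros(n), np.zeros(n)
    for _ in range(iters):
        err = Vx * ux + Vy * uy - v                    # data-term residuals
        gx, gy = 2 * err * ux, 2 * err * uy            # data-term gradient
        # smoothness-term gradient: each difference pulls both neighbors together
        gx[:-1] += 2 * lam * (Vx[:-1] - Vx[1:])
        gx[1:]  += 2 * lam * (Vx[1:] - Vx[:-1])
        gy[:-1] += 2 * lam * (Vy[:-1] - Vy[1:])
        gy[1:]  += 2 * lam * (Vy[1:] - Vy[:-1])
        Vx -= step * gx
        Vy -= step * gy
    return Vx, Vy
```

When the components all come from a single translating contour, the recovered field is the constant true velocity: the data term is satisfied exactly and the smoothness term drives the field to be uniform.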
Two-stage motion measurement: motion components → 2D image motion (Movshon, Adelson, Gizzi & Newsome)
- V1: high percentage of cells selective for the direction of motion (especially in the layer that projects to MT)
- MT: high percentage of cells selective for the direction and speed of motion
- lesions in MT → behavioral deficits in motion tasks
Testing with sine-wave “plaids” moving plaid Movshon et al. recorded responses of neurons in area MT to moving plaids with different component gratings
Movshon et al. observations:
- Cortical area V1: all neurons behaved like component cells
- Cortical area MT: layers 4 & 6: component cells; layers 2, 3, 5: pattern cells
- Perceptually, two components are not integrated if there is a large difference in spatial frequency, a large difference in speed, or the components have different stereo disparities
Evidence for two-stage motion measurement!