CS 433/557 Algorithms for Image Analysis


CS 433/557 Algorithms for Image Analysis Template Matching Acknowledgements: Dan Huttenlocher

CS 433/557 Algorithms for Image Analysis: Matching and Registration
Template Matching
- intensity based (correlation measures)
- feature based (distance transforms)
Flexible Templates
- pictorial structures
- dynamic programming on trees
- generalized distance transforms
Extra Material:

Intensity Based Template Matching: Basic Idea
Find the best template "position" in the image (examples: a left ventricle template in a medical image, a face template in a photograph).

Intensity-Based Rigid Template Matching
The template T has its own coordinate system; a shift s maps pixel p in the template to pixel p+s in the image coordinate system. For each position s of the template, compute some goodness-of-"match" measure Q(s), e.g. the sum of squared differences over all pixels p in the template T:

Q(s) = Σ_{p∈T} ( I(p+s) − T(p) )²
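A minimal brute-force sketch of this measure (function and variable names are illustrative, not from the slides; real implementations typically use FFT-based correlation for speed):

```python
import numpy as np

def ssd_match(image, template):
    """Brute-force SSD template matching, a minimal sketch.

    Returns the cost map Q, where Q[y, x] is the sum of squared
    differences between the template and the image patch whose
    top-left corner sits at shift s = (y, x). Lower is better.
    """
    img = np.asarray(image, dtype=float)   # avoid uint8 overflow
    tpl = np.asarray(template, dtype=float)
    H, W = img.shape
    h, w = tpl.shape
    Q = np.empty((H - h + 1, W - w + 1))
    for y in range(Q.shape[0]):
        for x in range(Q.shape[1]):
            patch = img[y:y + h, x:x + w]
            Q[y, x] = np.sum((patch - tpl) ** 2)
    return Q

# best (lowest-cost) shift:
# s = np.unravel_index(np.argmin(Q), Q.shape)
```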

Intensity-Based Rigid Template matching image coordinate system template coordinate system s1 s2 Search over all plausible positions s and find the optimal one that has the largest goodness of match value Q(s)

Intensity-Based Rigid Template Matching
What if the intensities in your image are not exactly the same as in the template? (This may happen, e.g., due to a different gain setting at image acquisition.)

Other intensity-based goodness of match measures
- Normalized correlation
- Mutual information (next slide)
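Normalized correlation directly addresses the gain/bias problem raised above. A minimal sketch for a single position s (names are illustrative):

```python
import numpy as np

def ncc(patch, template):
    """Normalized correlation between the template and the image patch
    it overlaps at one position s; a minimal sketch. Subtracting the
    means and dividing by the norms makes the score invariant to gain
    and bias changes in intensity (values lie in [-1, 1])."""
    t = np.asarray(template, float) - np.mean(template)
    p = np.asarray(patch, float) - np.mean(patch)
    denom = np.sqrt((t * t).sum() * (p * p).sum())
    return float((t * p).sum() / denom) if denom > 0 else 0.0
```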

Other goodness of match measures: Mutual Information
Mutual information works even in extreme cases. In this example the spatial structure of the template and of the image object are similar, while the actual intensities are completely different.

Other goodness of match measures: Mutual Information
Fix s and consider the joint histogram of intensity "pairs" (T(p), I(p+s)). The joint histogram is spread out for a bad position s1 and more concentrated (peaked) for a good position s2. Mutual information between template T and image I (for a given transformation s) describes the "peakedness" of the joint histogram; it measures how well the spatial structures in T and I align.

Mutual Information (technical definition)
For two random variables X and Y, their mutual information is

MI(X,Y) = H(X) + H(Y) − H(X,Y)

where the entropy H(X) = −Σ_x Pr(x) log Pr(x) and the joint entropy H(X,Y) = −Σ_{x,y} Pr(x,y) log Pr(x,y) measure the "peakedness" of the marginal histogram (distribution) and of the joint histogram (distribution), respectively.

Mutual Information: computing MI for a given position s
We want to find the s that maximizes MI, which can be written as

MI(s) = Σ_{x,y} Pr(x,y) log ( Pr(x,y) / (Pr(x) Pr(y)) )

where Pr(x,y) is the joint distribution (normalized joint histogram) of template and image intensities for the fixed given s, and Pr(x), Pr(y) are its marginal distributions. NOTE: one has to be careful when computing this. For example, what if the joint histogram entry is 0 for a given pair (x,y)? (The corresponding term 0·log 0 is taken to be 0.)
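A sketch of the MI computation for one position s, assuming the overlapping image patch has already been cut out; the function name, bin count, and the use of numpy's histogram2d are my choices, not from the slides:

```python
import numpy as np

def mutual_information(patch, template, bins=32):
    """Mutual information between template intensities and the image
    patch it currently overlaps (one fixed position s); a sketch.

    Zero-probability bins are simply skipped (0 * log 0 := 0), which
    handles the Pr(x,y) = 0 caveat from the slide.
    """
    joint, _, _ = np.histogram2d(template.ravel(), patch.ravel(), bins=bins)
    pxy = joint / joint.sum()            # joint distribution Pr(x, y)
    px = pxy.sum(axis=1, keepdims=True)  # marginal Pr(x)
    py = pxy.sum(axis=0, keepdims=True)  # marginal Pr(y)
    nz = pxy > 0                         # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```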

Finding the optimal template position s
- Need to search over all feasible values of s.
- The template T could be large: the bigger the template, the more time we spend computing the goodness-of-match measure at each s.
- The search space (of feasible positions s) could be huge: besides translation/shift, position s could include scale, rotation angle, and other parameters (e.g. shear).
Q: How can we search efficiently over all s?

Finding the optimal template position s
One possible solution: a hierarchical approach. Subsample both the template and the image; note that the search space is significantly reduced, and the template size shrinks as well. Once a good solution (or solutions) is found at a coarser scale, move to a finer scale and refine the search in the neighborhood of the coarser-scale solution.
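A minimal sketch of this coarse-to-fine strategy, reusing the ssd_match sketch from above; the pyramid depth and refinement radius are arbitrary illustrative choices:

```python
import numpy as np

def coarse_to_fine_match(image, template, levels=3, radius=2):
    """Hierarchical template search, a minimal sketch. Full search
    happens only on the coarsest (smallest) level; finer levels only
    refine locally around the lifted coarse solution."""
    def down2(a):  # 2x subsampling by averaging 2x2 blocks
        h, w = a.shape[0] // 2 * 2, a.shape[1] // 2 * 2
        return a[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

    pyramid = [(np.asarray(image, float), np.asarray(template, float))]
    for _ in range(levels - 1):
        img, tpl = pyramid[-1]
        pyramid.append((down2(img), down2(tpl)))

    # exhaustive search at the coarsest level
    Q = ssd_match(*pyramid[-1])
    y, x = np.unravel_index(np.argmin(Q), Q.shape)

    for img, tpl in reversed(pyramid[:-1]):   # refine, coarse to fine
        y, x = 2 * y, 2 * x                   # lift the coarse solution
        h, w = tpl.shape
        cands = [(yy, xx)
                 for yy in range(max(0, y - radius),
                                 min(img.shape[0] - h, y + radius) + 1)
                 for xx in range(max(0, x - radius),
                                 min(img.shape[1] - w, x + radius) + 1)]
        costs = [np.sum((img[yy:yy + h, xx:xx + w] - tpl) ** 2)
                 for yy, xx in cands]
        y, x = cands[int(np.argmin(costs))]
    return y, x
```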

Feature Based Template Matching
- Features: edges, corners, … (found via filtering)
- Distance transforms of binary images
- Chamfer and Hausdorff matching
- Iterative Closest Point (ICP)

Feature-based Binary Templates/Models
What are features? Object edges, corners, junctions, etc. Features can be detected by the corresponding image filters. Intensity can also be considered a feature, but it may not be very robust (e.g. due to illumination changes).
A model (binary template) is a set of feature points in N-dimensional space (also called feature space). Each feature is defined by a descriptor (vector).

Binary Feature Templates (Models): 2D example
The model's features are represented by points; links may represent neighborhood relationships between the features of the model. A descriptor could be a 2D vector specifying a feature's position with respect to the model's coordinate system (reference point). Feature spaces can be 3D or higher: e.g., the position of an edge in a medical volume is a 3D vector, and even in 2D images edge features can be described by 3D vectors (add the edge's angular orientation to its 2D location). For simplicity, we will mainly concentrate on 2D feature space examples.

Matching a Binary Template to an Image
Let L denote the model's positioning and L(i) the position of feature i. At a fixed position L we can compute the match quality Q(L) using some goodness-of-match criterion. The object is detected at all positions that are local maxima of the function Q(L) and satisfy Q(L) ≥ K, where K is some presence threshold.
Example: Q(L) = number of (exact) matches between model features and image features (e.g. edges).

Exact feature matching is not robust
Counting exact matches may be sensitive to even minor deviations in shape between the model and the actual object appearance.

Distance Transform
More robust goodness-of-match measures use a distance transform of the image features:
- Detect the desirable image features (edges, corners, etc.) using appropriate filters.
- For every image pixel p, find the distance D(p) to the nearest image feature q.

Distance Transform
Given the (2D) image features, the distance transform is the function that assigns to each image pixel p a non-negative number: the distance from p to the nearest feature in the image I.

The distance transform can be visualized as a gray-scale image (figure: image features/edges and their distance transform).

The distance transform can be computed very efficiently, using a two-pass (forward/backward) algorithm that sweeps a small local mask over the image.
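As a concrete illustration, here is a hedged sketch of the classic two-pass chamfer algorithm the slides allude to (the 1.0/1.4 weights anticipate the mask discussion on the next slide; names are illustrative):

```python
import numpy as np

def distance_transform(features):
    """Two-pass (forward/backward) chamfer distance transform sketch.

    `features` is a boolean image; returns, for every pixel, an
    approximate L2 distance to the nearest feature using the classic
    chamfer weights (1.0 for axial, 1.4 for diagonal steps).
    """
    INF = 1e9
    D = np.where(features, 0.0, INF)
    H, W = D.shape
    # forward pass: top-left to bottom-right
    for y in range(H):
        for x in range(W):
            if y > 0:
                D[y, x] = min(D[y, x], D[y - 1, x] + 1.0)
                if x > 0:     D[y, x] = min(D[y, x], D[y - 1, x - 1] + 1.4)
                if x < W - 1: D[y, x] = min(D[y, x], D[y - 1, x + 1] + 1.4)
            if x > 0:
                D[y, x] = min(D[y, x], D[y, x - 1] + 1.0)
    # backward pass: bottom-right to top-left
    for y in range(H - 1, -1, -1):
        for x in range(W - 1, -1, -1):
            if y < H - 1:
                D[y, x] = min(D[y, x], D[y + 1, x] + 1.0)
                if x > 0:     D[y, x] = min(D[y, x], D[y + 1, x - 1] + 1.4)
                if x < W - 1: D[y, x] = min(D[y, x], D[y + 1, x + 1] + 1.4)
            if x < W - 1:
                D[y, x] = min(D[y, x], D[y, x + 1] + 1.0)
    return D
```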

Metric properties of discrete Distance Transforms
The two-pass algorithm uses a forward mask and a backward mask. With unit weights the result is the Manhattan (L1) metric, whose set of equidistant points is a diamond. With weights 1 (axial) and 1.4 (diagonal) one gets a better approximation of the Euclidean metric. The exact Euclidean (L2) distance transform can also be computed fairly efficiently (in linear time) without bigger masks: see www.cs.cornell.edu/~dph/matchalgs/

Goodness of Match via Distance Transforms
At each model position one can "probe" the distance transform values at the locations specified by the model (template) features, and use these values as evidence of proximity to image features.

Goodness of Match Measures using Distance Transforms
- Chamfer measure: the sum of the distance transform values "probed" by the template features.
- Hausdorff measure: the k-th largest value of the distance transform at the locations "probed" by the template features; equivalently, the number of template features whose "probed" distance transform values are below a fixed (small) threshold, i.e. count the template features "sufficiently" close to image features.
- Spatially coherent matching (later).
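A sketch of how both measures reduce to probing a precomputed distance transform D; the array names, the (y, x) feature-coordinate convention, and the assumption that all probes stay inside the image are mine:

```python
import numpy as np

def chamfer_score(D, feature_coords, s):
    """Chamfer measure at model position s = (dy, dx): the sum of
    distance transform values probed at the shifted template feature
    locations. `feature_coords` is an (n, 2) array of (y, x) points."""
    probes = feature_coords + np.asarray(s)
    return D[probes[:, 0], probes[:, 1]].sum()

def hausdorff_score(D, feature_coords, s, k):
    """(Partial) Hausdorff measure: k-th largest probed DT value."""
    probes = feature_coords + np.asarray(s)
    vals = np.sort(D[probes[:, 0], probes[:, 1]])
    return vals[-k]
```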

Hausdorff Matching is equivalent to counting matches against a dilated set of image features.

Spatial Coherence of Feature Matches
Two configurations can both have 50% of their features matched, yet one is spatially incoherent while the other has few "discontinuities" between neighboring features. The neighborhood is defined by the links between template/model features; this is what we mean by spatial coherence.

Spatially Coherent Matching
Separate the template/model features into three subsets:
- Matchable (red): near image features.
- Boundary (blue circles): matchable but "near" un-matchable features (the model's links define "near" for model features).
- Un-matchable (gray): far from image features.
Count the number of non-boundary matchable features.

Spatially Coherent Matching
The measure is the percentage of non-boundary matchable features (spatially coherent matches).

Comparing different match measures
Monte Carlo experiments with known object location and synthetic clutter and occlusion, matching edge locations:
- Varying percent clutter (probability of an edge pixel: 2.5–15%).
- Varying occlusion (a single missing interval, 10–25% of the boundary).
- Search over location, scale, and orientation.
(Figure: binary model edges; an image with 5% clutter.)

Comparing different match measures: ROC curves
Probability of false alarm versus detection, at 10% and 15% occlusion with 5% clutter:
- Chamfer is lowest, Hausdorff (f = 0.8) is highest.
- Chamfer with truncated distance is better than the trimmed variant.

ROC's for Spatial Coherence Matching
(Curves shown for clutter 3%, occlusion 20% and for clutter 5%, occlusion 40%.) A parameter defines the degree of connectivity between model features; if it is zero, the model features are not connected at all, and spatially coherent matching reduces to plain Hausdorff matching.

Edge Orientation Information
Match edge orientation (in addition to location), using edge normals or gradient direction: a 3D model feature space (2D location + orientation).
- Extract 3D (edge) features from the image as well; this requires a 3D distance transform of the image features (weight orientation versus location; the fast forward-backward pass algorithm still applies).
- Increases detection robustness and speeds up matching: better able to discriminate the object from clutter, and better able to eliminate cells in branch and bound search.

ROC's for Oriented Edge Pixels
Vast improvement over location-only matching for moderate clutter (images with 5% randomly generated contours): good for 20–25% occlusion rather than 2–5%.

Efficient search for good matching positions L
- The distance transform of the observed image features needs to be computed only once (a fast operation).
- Global search: compute the match quality for all possible template/model locations L, using a hierarchical approach to efficiently prune the search space.
- Alternatively, gradient descent from a given initial position (e.g. the Iterative Closest Point algorithm, …later): easily gets stuck at local minima and is sensitive to initialization.

Global Search: Hierarchical Search Space Pruning
The entire box of positions might be pruned out if the match quality is sufficiently bad at the center of the box (how? … in a moment).

Global Search: Hierarchical Search Space Pruning
If a box is not pruned, subdivide it into smaller boxes and test the centers of these smaller boxes.

Global Search: Hierarchical Search Space Pruning
Continue in this fashion until the object is localized.

Pruning a Box (preliminary technicality)
Location L′ is uniformly better than L″ if, for all model features i,

D(L′(i)) ≤ D(L″(i))

i.e. every model feature probes a distance transform value at L′ that is no larger than at L″. A uniformly better location is guaranteed to have a better match quality!

Pruning a Box (preliminary technicality)
Construct a hypothetical location L* that is uniformly better than every location L in the box. Then the match quality satisfies Q(L*) ≤ Q(L) for any L in the box (for chamfer-type measures, lower is better). If the presence test fails at L* (Q(L*) > K for the given threshold K), then every location L in the box must also fail the test. The entire box can be pruned by one test at L*!

Building L* for a Box of "Radius" n
Place L* at the center of the box. The value of the distance transform changes by at most 1 between neighboring pixels, so for any other position in the box the probed values can decrease by at most n (the box radius). Hence subtracting n from each distance transform value probed at the box center (clamping at zero) yields a hypothetical location that is uniformly better than every location in the box.

Global Hierarchical Search (Branch and Bound)
- The hierarchical search works in the more general case where "position" L includes translation, scale, and orientation of the model (an N-dimensional search space).
- The heuristic is guaranteed (admissible): it bounds how good the answer could be in an unexplored region, so the search cannot miss an answer.
- In the worst case it rules nothing out; in practice it rules out the vast majority of template locations (transformations).
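A minimal sketch of box pruning for a translations-only search with a chamfer-style presence test Q(L) ≤ K, using the slide's bound (DT changes by at most one per pixel of motion); power-of-two box sizes and in-bounds probes are assumed for brevity:

```python
import numpy as np

def branch_and_bound(D, feature_coords, K, box):
    """Prune-or-subdivide search over translations, a sketch.

    A box of half-size n centered at c gets the lower bound
    sum_i max(D[c + f_i] - n, 0); boxes whose bound already exceeds
    the threshold K are pruned whole. `box` is (y0, x0, size) with
    size a power of two. Returns accepted positions."""
    found, stack = [], [box]
    while stack:
        y0, x0, size = stack.pop()
        n = size // 2
        cy, cx = y0 + n, x0 + n                   # box center
        probes = feature_coords + (cy, cx)
        vals = D[probes[:, 0], probes[:, 1]]
        bound = np.maximum(vals - n, 0.0).sum()   # lower bound over box
        if bound > K:
            continue                              # prune the whole box
        if size == 1:
            found.append((cy, cx))                # passed the exact test
        else:                                     # subdivide into 4 boxes
            half = size // 2
            for dy in (0, half):
                for dx in (0, half):
                    stack.append((y0 + dy, x0 + dx, half))
    return found
```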

Local Search (gradient descent): the Iterative Closest Point algorithm
ICP: iterate until convergence:
1. Estimate a correspondence between each template feature i and some image feature located at F(i) (Fitzgibbon: use the DT).
2. Move the model to minimize the sum of distances between the corresponding features (like chamfer matching).
Alternatively, find a local move of the model that improves the DT-based match quality function Q(L).
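A toy translation-only ICP sketch; scipy's cKDTree is my choice for the nearest-neighbor step, and a full ICP would also estimate rotation (e.g. via the SVD/Procrustes solution):

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_translation(model_pts, image_pts, iters=50):
    """Toy ICP restricted to 2D translation, a minimal sketch.

    Alternates (1) nearest-neighbor correspondence and (2) the
    closed-form translation update (the mean residual), which is the
    least-squares minimizer for translation-only alignment."""
    tree = cKDTree(image_pts)
    t = np.zeros(2)
    for _ in range(iters):
        moved = model_pts + t
        _, idx = tree.query(moved)           # step 1: correspondences
        residual = image_pts[idx] - moved
        step = residual.mean(axis=0)         # step 2: best translation
        t += step
        if np.linalg.norm(step) < 1e-6:      # converged (maybe locally)
            break
    return t
```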

Problems with ICP and gradient descent matching
- Slow: can take many iterations, and each ICP iteration is slow due to the search for correspondences (Fitzgibbon improves this by using the DT).
- No convergence guarantees: can get stuck in local minima. Not much can be done about this, though it can be mitigated by using robust distance measures (e.g. a truncated Euclidean measure).

Observations on DT-based matching
- The main point of the DT: it allows measuring match quality without explicitly finding correspondences between pairs of model and image features (a hard problem!).
- Hierarchical search over the entire transformation space.
- It is important to use a robust distance: straight chamfer is very sensitive to outliers, while the truncated DT can be computed very fast.
- Fast exact or approximate DT methods exist (e.g. for the L2 metric).
- For edge features, use orientation too (edge normals or intensity gradients).

Rigid 2D templates: should we really care?
So far we have studied matching for 2D images and rigid 2D templates/models of objects. When do rigid 2D templates work?
- There are rigid 2D objects (e.g. fingerprints).
- A 3D object may always be imaged from the same viewpoint: controlled image-based databases (e.g. photos of employees or criminals), 2D satellite images that always view 3D objects from above, X-rays, microscope photography, etc.

More general 3D objects
- 3D image volumes and 3D objects: distance transforms, DT-based matching criteria, and hierarchical search techniques generalize easily (mainly medical applications).
- 2D images and 3D objects: 3D objects may be represented by a collection of 2D templates (e.g. tree-structured templates, next slide) or by flexible 2D templates (soon).

Tree-structured templates
Larger pair-wise differences between templates appear higher in the tree.

Tree-structured templates
Rule out multiple templates simultaneously:
- Speeds up matching.
- Coarse-to-fine search, where the coarse granularity can rule out many templates at once.
- Applies to a variety of DT-based matching measures: Chamfer, Hausdorff, robust Chamfer.

Flexible Templates
A flexible template combines a number of rigid templates (parts) connected by flexible "strings": parts connected by springs, plus an appearance model for each part. Used for human bodies and faces. Fischler & Elschlager, 1973; considerable recent work (e.g. Felzenszwalb & Huttenlocher, 2003).

Flexible Templates: Why?
- To account for significant deviation between the proportions of a generic model (e.g. an average face template) and the multitude of actual object appearances.
- Non-rigid (3D) objects may consist of multiple rigid parts, each with a (relatively) view-independent 2D appearance.

Flexible Templates: Formal Definition
- A set of parts v1, …, vn.
- A positioning (configuration) L = (l1, …, ln) specifies the locations of the parts.
- An appearance model m_i(l_i) gives the matching quality (cost) of part i at location l_i.
- An edge (v_i, v_j) ∈ E for connected parts expresses an explicit dependency between edge-connected parts.
- An interaction/connection energy c_ij(l_i, l_j), e.g. elastic energy.

Flexible Templates: Formal Definition
Find the configuration L (the locations of all parts) that minimizes

E(L) = Σ_i m_i(l_i) + Σ_{(v_i,v_j)∈E} c_ij(l_i, l_j)

The difficulty depends on the graph structure: which parts are connected (E) and how (c_ij). The general case takes exponential time.

Flexible Templates, a simplistic example from the past: Discrete Snakes
- What graph?
- What appearance model?
- What connection/interaction model?
- What optimization algorithm?

Flexible Templates, special cases: Pictorial Structures
- What graph?
- What appearance model? Intensity-based match measures, or DT-based match measures (binary templates).
- What connection/interaction model? Elastic springs.
- What optimization algorithm?

Dynamic Programming for Flexible Template Matching
DP can be used to minimize E(L) for tree graphs (no loops!).

Dynamic Programming for Flexible Template Matching
DP algorithm on trees: choose a post-order traversal for any selected "root" part.
- For each "leaf" part a, compute B_a(l_a) = m_a(l_a).
- Process a part only after its children are processed. If part i has one child a:
  B_i(l_i) = m_i(l_i) + min_{l_a} [ B_a(l_a) + c_ia(l_i, l_a) ]
  If part i has two (or more) children a, b, …, add one such min term per child.
- Select the best-energy position for the "root" and backtrack to the leaves.
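A sketch of this recursion in code; the data-structure conventions (children lists, dictionaries of costs) are assumptions for illustration:

```python
def dp_tree_match(root, children, m, c, positions):
    """DP for flexible template matching on a tree, a sketch.

    children[i] lists part i's children; m[i][l] is the appearance
    cost of part i at position l; c[(i, a)][l][la] is the interaction
    cost between part i at l and its child a at la.
    Runs in O(n * m^2), as stated on the next slide."""
    B = {}           # B[i][l] = best energy of i's subtree with i at l
    best_child = {}  # backpointers: best_child[(i, l, a)] = best la

    def solve(i):
        for a in children[i]:
            solve(a)                        # post-order: children first
        B[i] = {}
        for l in positions:
            total = m[i][l]
            for a in children[i]:           # one min-term per child
                la = min(positions, key=lambda x: B[a][x] + c[(i, a)][l][x])
                best_child[(i, l, a)] = la
                total += B[a][la] + c[(i, a)][l][la]
            B[i][l] = total

    solve(root)
    # choose the best root position, then backtrack toward the leaves
    L = {root: min(positions, key=lambda l: B[root][l])}
    stack = [root]
    while stack:
        i = stack.pop()
        for a in children[i]:
            L[a] = best_child[(i, L[i], a)]
            stack.append(a)
    return L
```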

Dynamic Programming for Flexible Template Matching
DP's complexity on trees (same as for 1D snakes): with n parts and m positions, O(n·m²). This is acceptable for local search, where m is relatively small (as in snakes), e.g. for tracking a flexible model from frame to frame in a video sequence.

Local Search: Tracking Flexible Templates (example: tracking a flexible model across consecutive video frames).

Searching in the whole image (large m)
When m = image size (or image size × rotations), the O(n·m²) complexity is no good. For some interaction models this can be improved to O(n·m), based on the Generalized Distance Transform (from computational geometry). This is an amazing complexity for matching n dependent parts: note that O(n·m) is the number of operations needed just to find n independent matches.

Generalized Distance Transform
Idea: improve the efficiency of the key computational step, min_{l_a}[ B_a(l_a) + c_ia(l_i, l_a) ], which is performed for each parent-child pair (n times) and costs O(m²) operations if done naively. Intuitively: if x and y describe all feasible positions of "parts" in the image, then energy functions over positions can be thought of as gray-scale images (e.g. like responses of the original image to some filters).

Generalized Distance Transform
Idea: improve the efficiency of the key computational step (O(m²) operations performed for each parent-child pair). Let c(x, y) = ||x − y|| (the distance between x and y): a reasonable interaction model! Then

D_E(y) = min_x [ E(x) + ||x − y|| ]

is called a Generalized Distance Transform of E.

From Distance Transform to Generalized Distance Transform
Assuming E(x) = 0 at the locations of binary image features and E(x) = ∞ elsewhere, D_E is the standard distance transform (of the image features).

From Distance Transform to Generalized Distance Transform
For a general energy E and any fixed y, D_E(y) = min_x [E(x) + ||x − y||] is called the generalized distance transform of E:
- D_E(y) may trade off the strength of E(x) against proximity to y.
- E(x) may represent non-binary image features (e.g. image intensity gradients).

Algorithm for computing the Generalized Distance Transform
A straightforward generalization of the forward-backward pass algorithm for standard distance transforms:
- Initialize D to E(x), instead of to (0 at features, ∞ elsewhere).
- Use the interaction cost between neighboring positions in place of the unit mask weight 1.
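A 1D sketch of this generalization; with E set to 0 at features and a large constant elsewhere it reduces to the ordinary distance transform, and the 2D case runs the same passes with the chamfer mask weights:

```python
import numpy as np

def generalized_dt_1d(E, w=1.0):
    """Generalized distance transform D(y) = min_x (E(x) + w*|x - y|)
    for a 1D array E, via the same forward/backward two-pass scheme
    as the standard chamfer transform; a sketch."""
    D = np.asarray(E, dtype=float).copy()  # init to E(x), not 0/infinity
    for x in range(1, len(D)):             # forward pass
        D[x] = min(D[x], D[x - 1] + w)
    for x in range(len(D) - 2, -1, -1):    # backward pass
        D[x] = min(D[x], D[x + 1] + w)
    return D
```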

Flexible Template Matching Complexity
Computing min_x [E(x) + ||x − y||] for all y via the Generalized Distance Transform takes O(m) operations instead of the previous O(m²) (m = number of positions x and y). This improves the complexity of flexible template matching on trees to O(n·m) in the case of ||x − y||-type interactions.

"Simple" Flexible Template Example: the Central Part Model
Consider the special case in which parts translate with respect to a common origin (useful, e.g., for faces):
- Parts v1, …, vn, with a distinguished central part v1.
- Connect each vi (i > 1) to v1 with elastic spring costs.
NOTE: for simplicity (only), we consider part positions that are translations only (no rotation or scaling of parts).

Central Part Model example
The "ideal" location of part i with respect to the central part at l1 is l1 + T_i, where T_i is a fixed translation vector for each i > 1. The "string" cost for deformation from this ideal location is

c_1i(l1, l_i) = || (l1 + T_i) − l_i ||

so the whole template energy is

E(L) = m_1(l1) + Σ_{i>1} [ m_i(l_i) + ||(l1 + T_i) − l_i|| ]

Central Part Model: summary of the search algorithm
- Matching cost: for each non-central part i > 1, compute the matching cost m_i(l_i) for all possible positions l_i of that part in the image.
- For each i > 1, compute the Generalized DT of m_i.
- For all possible positions l1 of the central part, compute the energy E(L) by probing each generalized DT at the ideal location l1 + T_i.
- Select the best location, or select all locations whose match quality is better than a fixed threshold.
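A sketch putting the pieces together for translation-only parts with an L1 "string" cost, for which the generalized DT separates exactly into row and column passes; all names and the wrap-around shift are illustrative simplifications:

```python
import numpy as np

def central_part_match(m_costs, offsets, w=1.0):
    """Central part model matching, a minimal sketch.

    m_costs[i] is an HxW appearance-cost map for part i (part 0 is
    the central part); offsets[i] is the fixed ideal translation T_i
    of part i relative to the center. Total work is O(n*m)."""
    def gdt2d(E):  # exact generalized DT for cost w*(|dy| + |dx|)
        D = np.asarray(E, dtype=float).copy()
        H, W = D.shape
        for y in range(H):                 # row passes
            for x in range(1, W):          D[y, x] = min(D[y, x], D[y, x-1] + w)
            for x in range(W - 2, -1, -1): D[y, x] = min(D[y, x], D[y, x+1] + w)
        for x in range(W):                 # column passes
            for y in range(1, H):          D[y, x] = min(D[y, x], D[y-1, x] + w)
            for y in range(H - 2, -1, -1): D[y, x] = min(D[y, x], D[y+1, x] + w)
        return D

    total = np.asarray(m_costs[0], dtype=float).copy()
    for i in range(1, len(m_costs)):
        D = gdt2d(m_costs[i])              # generalized DT of part i
        dy, dx = offsets[i]
        # probe D at the ideal location (center + T_i) for every center;
        # np.roll wraps at the borders, which is fine for a sketch
        total += np.roll(np.roll(D, -dy, axis=0), -dx, axis=1)
    # best center position (lowest total energy)
    return np.unravel_index(np.argmin(total), total.shape)
```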

Central Part Model for face detection

Search Algorithm for tree-based pictorial structures
The algorithm is basically the same as for the Central Part Model: each "parent" part knows the ideal positions of its "child" parts, and string deformations are accounted for by the Generalized Distance Transform of the children's positioning energies.