Image alignment Image from http://graphics.cs.cmu.edu/courses/15-463/2010_fall/


A look into the past http://blog.flickr.net/en/2010/01/27/a-look-into-the-past/

A look into the past Leningrad during the blockade http://komen-dant.livejournal.com/345684.html

Bing streetside images http://www.bing.com/community/blogs/maps/archive/2010/01/12/new-bing-maps-application-streetside-photos.aspx

Image alignment: Applications Panorama stitching

Image alignment: Applications Recognition of object instances

Image alignment: Challenges Small degree of overlap Intensity changes Occlusion, clutter

Feature-based alignment Search for sets of feature matches that agree in terms of: Local appearance Geometric configuration

Feature-based alignment: Overview Alignment as fitting Affine transformations Homographies Robust alignment Descriptor-based feature matching RANSAC Large-scale alignment Inverted indexing Vocabulary trees Application: searching the night sky

Alignment as fitting Previous lectures: fitting a model to features in one image: find the model M that minimizes Σ_i residual(x_i, M). Alignment: fitting a model to a transformation between pairs of features (matches) in two images: find the transformation T that minimizes Σ_i residual(T(x_i), x_i')

2D transformation models Similarity (translation, scale, rotation) Affine Projective (homography)

Let’s start with affine transformations Simple fitting procedure (linear least squares) Approximates viewpoint changes for roughly planar objects and roughly orthographic cameras Can be used to initialize fitting for more complex models

Fitting an affine transformation Assume we know the correspondences; how do we get the transformation? Model each match as x_i' ≈ M x_i + t; want to find M, t to minimize Σ_i ||M x_i + t - x_i'||^2

Fitting an affine transformation Linear system with six unknowns Each match gives us two linearly independent equations: need at least three to solve for the transformation parameters
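The linear system above can be sketched in code. This is a minimal illustration assuming numpy; the helper name fit_affine and the parameter layout are illustrative, not from the slides. Each match contributes two rows, and with at least three matches the system is solved by linear least squares.

```python
import numpy as np

def fit_affine(pts, pts_prime):
    """Least-squares affine transform: pts_prime ~ M @ pts + t.

    pts, pts_prime: (n, 2) arrays of matched points, n >= 3.
    Returns (M, t) with M a 2x2 matrix and t a 2-vector.
    """
    n = len(pts)
    A = np.zeros((2 * n, 6))
    b = np.zeros(2 * n)
    for i, (x, y) in enumerate(pts):
        # Each match contributes two linearly independent equations:
        A[2 * i]     = [x, y, 0, 0, 1, 0]   # x' = m11*x + m12*y + tx
        A[2 * i + 1] = [0, 0, x, y, 0, 1]   # y' = m21*x + m22*y + ty
        b[2 * i], b[2 * i + 1] = pts_prime[i]
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    M = p[:4].reshape(2, 2)
    t = p[4:]
    return M, t
```

With exact correspondences the solution recovers M and t exactly; with noisy matches it gives the least-squares fit.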

Fitting a plane projective transformation Homography: plane projective transformation (transformation taking a quad to another arbitrary quad)

Homography The transformation between two views of a planar surface The transformation between images from two cameras that share the same center

Application: Panorama stitching Source: Hartley & Zisserman

Fitting a homography Recall: homogeneous coordinates. Converting to homogeneous image coordinates: (x, y) → (x, y, 1). Converting from homogeneous image coordinates: (x, y, w) → (x/w, y/w). Equation for homography: λ [x', y', 1]^T = H [x, y, 1]^T

Fitting a homography Rewriting the homography equation as a cross product, [x', y', 1]^T × H [x, y, 1]^T = 0, gives 3 equations, only 2 of them linearly independent

Direct linear transform H has 8 degrees of freedom (9 parameters, but scale is arbitrary) One match gives us two linearly independent equations Homogeneous least squares: find h minimizing ||Ah||^2 subject to ||h|| = 1: the solution is the eigenvector of A^T A corresponding to the smallest eigenvalue Four matches needed for a minimal solution
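A minimal DLT sketch, assuming numpy (the helper name fit_homography is illustrative): stack two rows of A per match, then take the right singular vector of A with the smallest singular value, which is the eigenvector of A^T A with the smallest eigenvalue.

```python
import numpy as np

def fit_homography(pts, pts_prime):
    """Direct linear transform: estimate H with pts_prime ~ H @ pts
    in homogeneous coordinates.

    pts, pts_prime: (n, 2) arrays of matched points, n >= 4.
    """
    A = []
    for (x, y), (xp, yp) in zip(pts, pts_prime):
        # Two linearly independent equations per match:
        A.append([-x, -y, -1, 0, 0, 0, xp * x, xp * y, xp])
        A.append([0, 0, 0, -x, -y, -1, yp * x, yp * y, yp])
    A = np.array(A)
    # Homogeneous least squares: h is the right singular vector of A
    # with the smallest singular value (eigenvector of A^T A with the
    # smallest eigenvalue).
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]   # fix the arbitrary scale
```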

Robust feature-based alignment So far, we’ve assumed that we are given a set of “ground-truth” correspondences between the two images we want to align What if we don’t know the correspondences?

Robust feature-based alignment Extract features Compute putative matches Loop: Hypothesize transformation T Verify transformation (search for other matches consistent with T)

Generating putative correspondences Need to compare feature descriptors of local patches surrounding interest points

Feature descriptors Recall: feature detection and description

Feature descriptors Simplest descriptor: vector of raw intensity values How to compare two such vectors? Sum of squared differences (SSD) Not invariant to intensity change Normalized correlation Invariant to affine intensity change
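The two comparisons above can be sketched as follows (numpy assumed; the function names are illustrative). Note how normalized correlation subtracts the mean and divides by the norm, which cancels an affine intensity change I → aI + b, while SSD does not.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two patches.
    Not invariant to intensity change."""
    return np.sum((a.astype(float) - b.astype(float)) ** 2)

def normalized_correlation(a, b):
    """Normalized correlation between two patches.
    Invariant to affine intensity change I -> a*I + b."""
    a = a.astype(float).ravel()
    b = b.astype(float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
```

For a patch b that is an affine intensity change of a, SSD is large but normalized correlation is 1.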

Disadvantage of intensity vectors as descriptors Small deformations can affect the matching score a lot

Review: Alignment 2D alignment models Feature-based alignment outline Descriptor matching

Feature descriptors: SIFT Descriptor computation: Divide patch into 4x4 sub-patches Compute histogram of gradient orientations (8 reference angles) inside each sub-patch Resulting descriptor: 4x4x8 = 128 dimensions Advantage over raw vectors of pixel values Gradients less sensitive to illumination change Pooling of gradients over the sub-patches achieves robustness to small shifts, but still preserves some spatial information David G. Lowe. "Distinctive image features from scale-invariant keypoints.” IJCV 60 (2), pp. 91-110, 2004.
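The 4x4x8 pooling can be sketched as below. This is a deliberately simplified, numpy-based illustration of the idea only (no Gaussian weighting, no orientation normalization, no trilinear interpolation, no clipping); the name sift_like_descriptor is hypothetical, and real SIFT as described by Lowe does more.

```python
import numpy as np

def sift_like_descriptor(patch):
    """Simplified SIFT-style descriptor for a 16x16 grayscale patch.
    Returns a 128-d unit vector: 4x4 sub-patches x 8 orientation bins."""
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)                 # image gradients
    mag = np.hypot(gx, gy)                      # gradient magnitude
    ang = np.arctan2(gy, gx)                    # orientation in [-pi, pi]
    bins = ((ang + np.pi) / (2 * np.pi) * 8).astype(int) % 8
    desc = np.zeros((4, 4, 8))
    for i in range(16):
        for j in range(16):
            # Pool magnitude into the sub-patch's orientation histogram
            desc[i // 4, j // 4, bins[i, j]] += mag[i, j]
    desc = desc.ravel()
    n = np.linalg.norm(desc)
    return desc / n if n > 0 else desc
```

The pooling over sub-patches is what buys robustness to small shifts while keeping coarse spatial layout.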

Feature matching Generating putative matches: for each patch in one image, find a short list of patches in the other image that could match it based solely on appearance

Problem: Ambiguous putative matches Source: Y. Furukawa

Rejection of unreliable matches How can we tell which putative matches are more reliable? Heuristic: compare distance of nearest neighbor to that of second nearest neighbor Ratio of closest distance to second-closest distance will be high for features that are not distinctive Threshold of 0.8 provides good separation David G. Lowe. "Distinctive image features from scale-invariant keypoints.” IJCV 60 (2), pp. 91-110, 2004.
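Lowe's ratio heuristic can be sketched as follows (numpy assumed; the function name and brute-force nearest-neighbor search are illustrative): keep a match only when the nearest neighbor is much closer than the second nearest.

```python
import numpy as np

def ratio_test_matches(desc1, desc2, thresh=0.8):
    """Match each descriptor in desc1 to its nearest neighbor in desc2,
    keeping only matches whose nearest/second-nearest distance ratio is
    below thresh (Lowe's ratio test). Requires len(desc2) >= 2."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        j1, j2 = np.argsort(dists)[:2]
        if dists[j1] < thresh * dists[j2]:
            matches.append((i, j1))
    return matches
```

An ambiguous descriptor, whose two nearest neighbors are almost equally close, is rejected because its ratio is near 1.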

RANSAC The set of putative matches contains a very high percentage of outliers RANSAC loop: Randomly select a seed group of matches Compute transformation from seed group Find inliers to this transformation If the number of inliers is sufficiently large, re-compute least-squares estimate of transformation on all of the inliers Keep the transformation with the largest number of inliers
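The RANSAC loop can be sketched for the simplest case, a 2D translation, where a single match suffices to hypothesize the transformation (numpy assumed; the function name, iteration count, and inlier threshold are illustrative choices, not values from the slides).

```python
import random
import numpy as np

def ransac_translation(pts, pts_prime, n_iters=100, inlier_thresh=3.0):
    """RANSAC for a 2D translation between matched point sets.
    One match is enough to hypothesize the model.

    Returns (best translation, array of inlier indices)."""
    best_t, best_inliers = None, np.array([], dtype=int)
    for _ in range(n_iters):
        i = random.randrange(len(pts))          # random seed match
        t = pts_prime[i] - pts[i]               # hypothesized translation
        residuals = np.linalg.norm(pts + t - pts_prime, axis=1)
        inliers = np.flatnonzero(residuals < inlier_thresh)
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
            # Re-estimate on all inliers (least squares = mean offset)
            best_t = (pts_prime[inliers] - pts[inliers]).mean(axis=0)
    return best_t, best_inliers
```

Hypotheses seeded from outlier matches gather few inliers, so the translation supported by the most matches wins, exactly as in the example on the following slides.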

RANSAC example: Translation Putative matches

RANSAC example: Translation Select one match, count inliers

RANSAC example: Translation Select one match, count inliers

RANSAC example: Translation Select translation with the most inliers

Scalability: Alignment to large databases What if we need to align a test image with thousands or millions of images in a model database? Requires efficient putative match generation: approximate descriptor similarity search, inverted indices

Large-scale visual search: inverted indexing followed by reranking/geometric verification. Figure from: Kristen Grauman and Bastian Leibe, Visual Object Recognition, Synthesis Lectures on Artificial Intelligence and Machine Learning, April 2011, Vol. 5, No. 2, pp. 1-181

Example indexing technique: Vocabulary trees A test image is matched against the database via a vocabulary tree with inverted index. D. Nistér and H. Stewénius, Scalable Recognition with a Vocabulary Tree, CVPR 2006

Goal: find a set of representative prototypes or cluster centers in descriptor space to which descriptors can be quantized

K-means clustering Want to minimize the sum of squared Euclidean distances between points x_i and their nearest cluster centers m_k: minimize Σ_i ||x_i - m_{k(i)}||^2. Algorithm: Randomly initialize K cluster centers Iterate until convergence: Assign each data point to the nearest center Recompute each cluster center as the mean of all points assigned to it
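The two alternating steps can be sketched as follows (numpy assumed; the function name and the guard for empty clusters are illustrative choices).

```python
import numpy as np

def kmeans(X, k, n_iters=50, seed=0):
    """Plain k-means: minimize the sum of squared Euclidean distances
    from each point to its nearest cluster center."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]   # random init
    for _ in range(n_iters):
        # Assignment step: nearest center for each point
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        # Update step: each center = mean of its assigned points
        new_centers = np.array(
            [X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
             for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```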

K-means demo Source: http://shabal.in/visuals/kmeans/1.html Another demo: http://www.kovan.ceng.metu.edu.tr/~maya/kmeans/

Recall: Visual codebook for generalized Hough transform … Appearance codebook Source: B. Leibe

Hierarchical k-means clustering of descriptor space (vocabulary tree) We then run k-means on the descriptor space. In this setting, k defines what we call the branch factor of the tree, which indicates how fast the tree branches. In this illustration, k is three. We then run k-means again, recursively, on each of the resulting quantization cells. This defines the vocabulary tree, which is essentially a hierarchical set of cluster centers and their corresponding Voronoi regions. We typically use a branch factor of 10 and six levels, resulting in a million leaf nodes. We lovingly call this the Mega-Voc. Slide credit: D. Nister

Vocabulary tree/inverted index We now have the vocabulary tree, and the online phase can begin. To add an image to the database, we perform feature extraction. Each descriptor vector is then dropped down from the root of the tree and quantized very efficiently into a path down the tree, encoded by a single integer. Each node in the vocabulary tree has an associated inverted file index, and indices pointing back to the new image are added to the relevant inverted files. This is a very efficient operation that is carried out whenever we want to add an image. Slide credit: D. Nister

Populating the vocabulary tree/inverted index Model images Slide credit: D. Nister

Looking up a test image When we wish to query on an input image, we quantize the descriptor vectors of the input image in a similar way, and accumulate scores for the images in the database with so-called term frequency-inverse document frequency (tf-idf) weighting. This is effectively an entropy weighting of the information. The winner is the database image with the most information in common with the input image. Slide credit: D. Nister
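The tf-idf scoring can be sketched as below (stdlib only; the function name and the exact score formula, a dot product of idf-weighted word-count vectors, are illustrative of the idea rather than Nistér and Stewénius's exact normalization). Rare visual words get high idf weight, while words that appear in every database image contribute nothing.

```python
import math
from collections import Counter, defaultdict

def tfidf_scores(db_word_counts, query_words, n_images):
    """Score database images against a query by tf-idf over visual words.

    db_word_counts: dict image_id -> Counter of visual-word counts
    query_words: list of visual words quantized from the test image
    """
    # Document frequency of each visual word
    df = Counter()
    for counts in db_word_counts.values():
        df.update(counts.keys())
    idf = {w: math.log(n_images / df[w]) for w in df}
    q = Counter(query_words)
    scores = defaultdict(float)
    for img, counts in db_word_counts.items():
        for w, tf in counts.items():
            if w in q:
                scores[img] += q[w] * tf * idf.get(w, 0.0) ** 2
    return dict(scores)
```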

Cool application of large-scale alignment: searching the night sky http://www.astrometry.net/