Image Mosaicing
Shiran Stan-Meleh
Why do we need it? Satellite images, 360° views, panoramas.
Compact camera FOV = 50 x 35°; human FOV = 200 x 135°; panoramic mosaic = 360 x 180°.
How do we do it? Two methods:
Direct (appearance-based): search for the alignment where most pixels agree.
Feature-based: find a few matching features in both images and compute the transformation from them.
*Copied from Hagit Hel-Or ppt
How do we do it? Direct (appearance-based) methods
Manually…
Direct (appearance-based) methods
Define an error metric to compare the images, e.g. the sum of squared differences (SSD).
Define a search technique (simplest: full search).
Pros: simple algorithm; can handle complicated transformations; good for matching sequential frames in a video.
Cons: parameters must be estimated manually; can be very slow.
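The error metric and full search described above can be written in a few lines. This is an illustrative pure-Python sketch, not from the slides: the list-of-rows image format and the `best_shift` helper are my assumptions.

```python
def ssd(img_a, img_b):
    """Sum of squared differences between two equal-size grayscale images
    (each image is a list of rows of numbers)."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(img_a, img_b)
               for a, b in zip(row_a, row_b))

def best_shift(ref, moving, max_shift=2):
    """Full search over integer translations: return the (dx, dy) whose
    overlapping region has the lowest mean squared difference."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            err, n = 0.0, 0
            for y in range(h):
                for x in range(w):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        err += (ref[y][x] - moving[yy][xx]) ** 2
                        n += 1
            score = err / n
            if best is None or score < best[0]:
                best = (score, dx, dy)
    return best[1], best[2]
```

Even at toy scale the nested search loops make the "can be very slow" drawback above visible.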
How do we do it? Feature-based methods
Harris corner detection - C. Harris & M. Stephens (1988)
SIFT - David Lowe (1999)
PCA-SIFT - Y. Ke & R. Sukthankar (2004)
SURF - Bay & Tuytelaars (2006)
GLOH - Mikolajczyk & Schmid (2005)
HOG - Dalal & Triggs (2005)
GLOH (Gradient Location and Orientation Histogram) is a robust image descriptor for computer vision tasks: a SIFT-like descriptor that considers more spatial regions for its histograms, with the higher-dimensional descriptor reduced to 64 dimensions through principal component analysis (PCA).
HOG (Histogram of Oriented Gradients) counts occurrences of gradient orientations in localized portions of an image; it differs in that it is computed on a dense grid of uniformly spaced cells and uses overlapping local contrast normalization for improved accuracy.
Agenda
We will concentrate on feature-based methods, using SIFT for feature extraction and RANSAC for feature matching and transformation estimation.
Some Background: SIFT and RANSAC
What is SIFT? Scale Invariant Feature Transform
From Wikipedia: "an algorithm in computer vision to detect and describe local features in images. The algorithm was published by David Lowe in 1999."
Applications
Object recognition, robotic mapping and navigation, image stitching, 3D modeling, gesture recognition, video tracking, individual identification of wildlife, match moving.
Basic Steps
1. Scale-space extrema detection: construct scale space, take difference of Gaussians, locate DoG extrema
2. Keypoint localization
3. Orientation assignment
4. Build keypoint descriptors
1a. Construct Scale Space
Motivation: real-world objects are composed of different structures at different scales. For example, a tree can be examined at leaf level, where we can see its texture, or from a large distance, where it appears as a dot. Studies also show a close link between scale space and biological vision.
Explanation: represent the image at a series of blur levels. First octave: G(σ)∗I, G(kσ)∗I, G(k²σ)∗I. Second octave: G(2σ)∗I, G(2kσ)∗I, G(2k²σ)∗I.
1b. Take Difference of Gaussians
Experimentally, maxima of the Laplacian of Gaussian (LoG: σ²∇²G) give the best notion of scale (Mikolajczyk 2002), but the LoG is costly to compute, so we use the DoG approximation instead:
G(x,y,kσ) − G(x,y,σ) ≈ (k−1)σ²∇²G
*Distinctive Image Features from Scale-Invariant Keypoints, David G. Lowe
1c. Locate DoG Extrema
Find all extrema, i.e. points that are a minimum or maximum of their 3x3x3 neighborhood: 8 neighbors at the same scale plus 9 in each of the two adjacent DoG levels. *Distinctive Image Features from Scale-Invariant Keypoints, David G. Lowe
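The three sub-steps (scale space, DoG, extrema) can be sketched end to end in pure Python over a list-of-rows grayscale image. σ = 1.6 and k = √2 follow Lowe's paper; the rest is simplified to a single octave with no sub-pixel refinement, so this is an illustration, not the full algorithm.

```python
import math

def gauss_kernel(sigma):
    """Sampled 1-D Gaussian, truncated at 3 sigma and normalized to sum 1."""
    r = max(1, int(3 * sigma))
    k = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-r, r + 1)]
    s = sum(k)
    return [v / s for v in k]

def blur(img, sigma):
    """Separable Gaussian blur with edge clamping."""
    k = gauss_kernel(sigma)
    r = len(k) // 2
    h, w = len(img), len(img[0])
    clamp = lambda v, hi: max(0, min(hi, v))
    tmp = [[sum(img[y][clamp(x + i - r, w - 1)] * k[i] for i in range(len(k)))
            for x in range(w)] for y in range(h)]
    return [[sum(tmp[clamp(y + i - r, h - 1)][x] * k[i] for i in range(len(k)))
             for x in range(w)] for y in range(h)]

def dog_extrema(img, sigma=1.6, k=math.sqrt(2), levels=4):
    """Build a DoG stack and return (x, y, level) of strict 3x3x3 extrema."""
    blurred = [blur(img, sigma * k ** i) for i in range(levels)]
    dogs = [[[b2[y][x] - b1[y][x] for x in range(len(img[0]))]
             for y in range(len(img))]
            for b1, b2 in zip(blurred, blurred[1:])]
    pts = []
    for s in range(1, len(dogs) - 1):
        for y in range(1, len(img) - 1):
            for x in range(1, len(img[0]) - 1):
                v = dogs[s][y][x]
                nbrs = [dogs[s + ds][y + dy][x + dx]
                        for ds in (-1, 0, 1) for dy in (-1, 0, 1)
                        for dx in (-1, 0, 1) if (ds, dy, dx) != (0, 0, 0)]
                if v > max(nbrs) or v < min(nbrs):
                    pts.append((x, y, s))
    return pts
```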
Basic Steps (recap)
1. Scale-space extrema detection
2. Keypoint localization: sub-pixel localization of potential feature points; filter edge and low-contrast responses
3. Orientation assignment
4. Build keypoint descriptors
2a. Sub-Pixel Localization of Potential Feature Points
Problem: DoG extrema are found only at discrete pixel and scale positions, so the true extremum generally lies between samples. Lowe fits a 3D quadratic to the local DoG values and takes its extremum x̂ as the interpolated keypoint location.
2b. Filter Edge and Low-Contrast Responses
Remove low-contrast points (sensitive to noise): discard a keypoint if |D(x̂)| < 0.03, where D(x̂) = D + ½ (∂D/∂x)ᵀ x̂ is the DoG value at the interpolated offset x̂.
Also remove keypoints with a strong edge response in only one direction (how?):
2b. Filter Edge and Low-Contrast Responses (cont.)
By using the Hessian matrix: the eigenvalues of the Hessian are proportional to the principal curvatures, and both must be large for a corner-like keypoint. (The Hessian is the square matrix of second-order partial derivatives of a function; it describes the local curvature of a function of many variables. An eigenvector of a square matrix A is a non-zero vector v that, when multiplied by A, yields v multiplied by a single number λ: Av = λv.)
Use the trace and determinant:
Tr(H) = Dxx + Dyy = α + β
Det(H) = Dxx·Dyy − Dxy² = αβ
Keep a keypoint only if Tr(H)²/Det(H) < (r+1)²/r, with r = 10; this takes only about 20 floating-point operations per keypoint.
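The trace/determinant test is cheap to implement. A sketch, with the Hessian estimated by finite differences on a DoG image stored as a list of rows (the function name is mine):

```python
def passes_edge_test(D, x, y, r=10.0):
    """Keep a keypoint only if Tr(H)^2 / Det(H) < (r+1)^2 / r, where H is the
    2x2 Hessian of the DoG image D at (x, y), from finite differences."""
    dxx = D[y][x + 1] - 2 * D[y][x] + D[y][x - 1]
    dyy = D[y + 1][x] - 2 * D[y][x] + D[y - 1][x]
    dxy = (D[y + 1][x + 1] - D[y + 1][x - 1]
           - D[y - 1][x + 1] + D[y - 1][x - 1]) / 4.0
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:   # principal curvatures differ in sign: reject
        return False
    return tr * tr / det < (r + 1) ** 2 / r
```

A blob-like point (both curvatures large) passes; a ridge (curvature in one direction only) drives Det(H) toward zero and fails.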
"A picture is worth a 1000 keypoints"
Original image → initial features (832) → low contrast removed (729) → low curvature removed (536). *Distinctive Image Features from Scale-Invariant Keypoints, David G. Lowe
Basic Steps (recap)
1. Scale-space extrema detection
2. Keypoint localization (low contrast and low curvature responses removed)
3. Orientation assignment
4. Build keypoint descriptors
3. Orientation Assignment
Compute the gradient magnitude and orientation for each SIFT point (x, y, σ):
m(x,y) = √[(L(x+1,y) − L(x−1,y))² + (L(x,y+1) − L(x,y−1))²]
θ(x,y) = tan⁻¹[(L(x,y+1) − L(x,y−1)) / (L(x+1,y) − L(x−1,y))]
Create a gradient histogram weighted by a Gaussian window with σ′ = 1.5σ, and use a parabola fit to interpolate a more accurate peak location.
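These formulas translate directly to code. A sketch: atan2 is used instead of a plain tan⁻¹ so the quadrant is resolved; the histogram helper and its defaults are illustrative, and the parabola refinement of the peak is omitted.

```python
import math

def grad_mag_ori(L, x, y):
    """Gradient magnitude and orientation (radians) from central differences,
    following the slide's formulas."""
    dx = L[y][x + 1] - L[y][x - 1]
    dy = L[y + 1][x] - L[y - 1][x]
    return math.sqrt(dx * dx + dy * dy), math.atan2(dy, dx)

def dominant_orientation(L, cx, cy, radius=4, sigma_w=1.5, bins=36):
    """Gradient histogram around (cx, cy), weighted by a Gaussian window;
    returns the center angle of the strongest bin (no parabola refinement)."""
    hist = [0.0] * bins
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            m, th = grad_mag_ori(L, x, y)
            w = math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma_w ** 2))
            b = int(((th + math.pi) / (2 * math.pi)) * bins) % bins
            hist[b] += w * m
    b = max(range(bins), key=lambda i: hist[i])
    return (b + 0.5) * 2 * math.pi / bins - math.pi
```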
Basic Steps (recap): 3. Orientation assignment, 4. Build keypoint descriptors.
4. Build Keypoint Descriptors
4x4 gradient windows, taken relative to the keypoint orientation.
Histogram of 4x4 samples per window, in 8 directions.
Gaussian weighting around the center (σ equal to one half the width of the descriptor window).
4x4x8 = 128-dimensional feature vector.
Normalize to unit length to remove contrast dependence, threshold the values at 0.2, and normalize again. *Image from: Jonas Hurrelmann
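The final normalize, threshold, renormalize step might look like this (a sketch; the 0.2 clip value is from Lowe's paper):

```python
import math

def normalize_descriptor(vec, clip=0.2):
    """Normalize to unit length (removes affine contrast changes), clip large
    values at `clip` (reduces the influence of large gradient magnitudes,
    i.e. non-linear illumination effects), then renormalize."""
    def unit(v):
        n = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / n for x in v]
    v = unit(vec)
    v = [min(x, clip) for x in v]
    return unit(v)
```

Because the first normalization removes any global scaling, the descriptor of a contrast-scaled patch is identical.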
Live Demo
And next… RANSAC
What is RANSAC? RANdom SAmple Consensus
First published by Fischler and Bolles at SRI International in 1981.
From Wikipedia: an iterative method to estimate parameters of a mathematical model from a set of observed data which contains outliers.
Non-deterministic: it outputs a "reasonable" result only with a certain probability.
What is RANSAC? (cont.)
Left: a data set with many outliers for which a line has to be fitted. Right: the line fitted with RANSAC; the outliers have no influence on the result.
RANSAC Input & Output
The procedure is iterated k times.
Input:
- A set of observed data values
- A parameterized model which can explain or be fitted to the observations
- Confidence parameters:
  n - the minimum number of data points required to fit the model
  k - the number of iterations performed by the algorithm
  t - a threshold value for determining when a datum fits a model
  d - the number of close data values required to assert that a model fits the data well
Output:
- Best model: the model parameters which best fit the data (or nil if no good model is found)
- Best consensus set: the data points from which this model has been estimated
- Best error: the error of this model relative to the data
Basic Steps
1. Select a random subset of the original data, called the hypothetical inliers.
2. Fit the model's free parameters to the hypothetical inliers, producing a suggested model.
3. Test all remaining points against the suggested model; if a point fits well, also consider it a hypothetical inlier.
4. Check that the suggested model has sufficiently many points classified as hypothetical inliers.
5. Re-estimate the free parameters from the full set of hypothetical inliers.
6. Evaluate the error of the inliers relative to the model.
Basic Steps – Line Fitting Example
1. Select a random subset of the original data (the hypothetical inliers): y = ax + b
2. Fit the free parameters to the hypothetical inliers, producing a suggested model: y = −2x + 3
3. Test all remaining points against the model; points that fit well are also considered hypothetical inliers: y = −2x + 3
4. Check that the suggested model has sufficient points classified as hypothetical inliers: y = −2x + 3, C = 3
5. Re-estimate the free parameters from the new set of hypothetical inliers: y = ax + b = ?
6. Evaluate the error of the inliers relative to the model: C = 3
Repeat: y = 0.5x + 1, C = 3
Best model: y = 3x + 2, C = 15
*copied from Hagit Hel-Or ppt
An Example from Image Mosaicing
Estimate the transformation by taking pairs of points from the 2 images and testing them against a transformation model.
Model: direct linear transformation. Set size: r = 4. Repeats: n = 500.
p(H is correct) = 1 − (1 − pᵢʳ)ⁿ
Thus for an inlier ratio pᵢ = 0.5, the probability that the correct transformation is not found after 500 trials is (1 − 0.5⁴)⁵⁰⁰ ≈ 1×10⁻¹⁴.
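The slide's number is easy to verify (a sketch, assuming r = 4 and n = 500 as above):

```python
def p_success(p_i, r, n):
    """Probability that RANSAC draws at least one all-inlier sample of size r
    in n trials, given inlier ratio p_i: 1 - (1 - p_i**r)**n."""
    return 1.0 - (1.0 - p_i ** r) ** n

# Failure probability for p_i = 0.5, r = 4, n = 500: about 1e-14.
failure = 1.0 - p_success(0.5, 4, 500)
```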
Back to Image Mosaicing
How is it done? For each pair of images:
1. Extract features
2. Match features
3. Estimate transformation
4. Transform 2nd image
5. Blend the two images
…then repeat for the next pair.
*Automatic Panoramic Image Stitching using Invariant Features, M. Brown & D. G. Lowe
1. Extract Features – Challenges
We need to match points across images despite different orientations, different scales, and different illuminations.
1. Extract Features – Contenders for the crown
SIFT - David Lowe (1999)
PCA-SIFT - Y. Ke & R. Sukthankar (2004)
SURF - Bay & Tuytelaars (2006)
1. Extract Features – SIFT or PCA-SIFT
PCA (Principal Component Analysis) is used to lower the dimensionality of a dataset with minimal information loss. Compute (or load) a projection matrix from a set of images that match certain characteristics.
PCA-SIFT: computing the projection matrix
- Select a representative set of pictures and detect all keypoints in them (~20-40K keypoints)
- For each keypoint:
  - Extract a 41 x 41 pixel image patch around it
  - Calculate horizontal and vertical gradients, giving a vector of size 39 x 39 x 2 = 3042
- Put all these vectors into a k x 3042 matrix A, where k is the number of keypoints detected
- Calculate the covariance matrix of A
- Compute the eigenvectors and eigenvalues of the covariance matrix
- Select the first n eigenvectors; the projection matrix is the n x 3042 matrix composed of these eigenvectors
- n can be a fixed value determined empirically or set dynamically from the eigenvalues
- The projection matrix is computed only once and saved
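The covariance/eigenvector recipe above can be sketched on toy data. This pure-Python illustration keeps only the dominant eigenvector via power iteration, whereas real PCA-SIFT keeps the top n axes of a 3042-dimensional space:

```python
import random

def top_eigvec(cov, iters=200, seed=0):
    """Dominant eigenvector of a symmetric matrix, via power iteration."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in cov]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]
        n = sum(x * x for x in w) ** 0.5
        v = [x / n for x in w]
    return v

def pca_projection(patches):
    """Rows of `patches` are gradient vectors (3042-d in PCA-SIFT; tiny here).
    Returns the mean and the first principal axis."""
    k, dim = len(patches), len(patches[0])
    mean = [sum(p[i] for p in patches) / k for i in range(dim)]
    centered = [[p[i] - mean[i] for i in range(dim)] for p in patches]
    cov = [[sum(c[i] * c[j] for c in centered) / (k - 1) for j in range(dim)]
           for i in range(dim)]
    return mean, top_eigvec(cov)

def project(vec, mean, axis):
    """Project a centered vector onto one principal axis (one descriptor dim)."""
    return sum((v - m) * a for v, m, a in zip(vec, mean, axis))
```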
1. Extract Features – SIFT or PCA-SIFT (cont.)
PCA-SIFT descriptor:
- Detect keypoints in the image, same as SIFT
- Extract a 41 x 41 patch centered on each keypoint and compute its local image gradients
- Project the gradient vector through the projection matrix to derive a compact feature vector of size n < 20
SIFT: 128 dimensions. Drawbacks: high dimensionality; not fully affine invariant. Advantages: less empirical knowledge required; easier implementation.
PCA-SIFT: variable dimensions, recommended 20 or less. Drawback: the projection matrix needs a representative set of pictures and will then only work for pictures of that kind. Advantage: lower dimensionality while retaining distinctiveness, which greatly reduces computational cost.
1. Extract Features – Why SIFT?
*A Comparison of SIFT, PCA-SIFT and SURF - Luo Juan & Oubong Gwun
2. Match Features – General approach
Identify the k nearest neighbors of each keypoint (Lowe suggested k = 4), where "near" means minimum Euclidean distance between a descriptor in image A and the descriptors in image B. Brute force takes O(n²), so a k-d tree is used to get O(n log n).
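A brute-force version of the k-nearest-neighbor search is a few lines. This is the O(n²) baseline; a k-d tree (e.g. `scipy.spatial.cKDTree`) would replace the inner sort for the O(n log n) behavior mentioned above:

```python
import math

def knn_matches(desc_a, desc_b, k=4):
    """For each descriptor in A, the indices of the k nearest descriptors in B
    by Euclidean distance. Brute force: compares every pair."""
    def dist(u, v):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
    out = []
    for da in desc_a:
        ranked = sorted(range(len(desc_b)), key=lambda j: dist(da, desc_b[j]))
        out.append(ranked[:k])
    return out
```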
2. Match Features – Another approach
For each feature point, define a circle centered on the feature with r = 0.1 × image height.
Find the largest mutual-information value between the circle of a feature in image A and the circle of a feature in image B:
K(A,B) = H(A) + H(B) − H(A,B)
where H is the entropy of an image block; K measures the similarity of the intensity distributions in the two circles. *Image Mosaic Based On SIFT - Pengrui Qiu, Ying Liang and Hui Rong
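Entropy and mutual information over two pixel blocks can be computed directly from intensity histograms. A sketch over flattened blocks; binning pixels by raw intensity value is my simplification:

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of a list of discrete intensity values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def mutual_information(block_a, block_b):
    """K(A,B) = H(A) + H(B) - H(A,B) over two same-size flattened blocks;
    the joint entropy uses the histogram of (a, b) pixel pairs."""
    joint = list(zip(block_a, block_b))
    return entropy(block_a) + entropy(block_b) - entropy(joint)
```

Identical blocks give K = H(A); independent intensity patterns give K = 0.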
3. Estimate Transformation
Problem: outliers. Not all features have a match. Why?
- They are not in the overlapping area
- The same features were not extracted in both images
Solution: RANSAC
- Decide on the model which fits best
- Input the model, set size, number of repeats, threshold and tolerance
- Get back a fitted model and the inlier feature points
4. Transform 2nd Image
The warp applied to the 2nd image depends on the desired output (panorama, 360° view, etc.) and on the transformation found: cylindrical / linear / radial.
*Automatic Panoramic Image Stitching using Invariant Features, M. Brown & D. G. Lowe
5. Blend Two Images – Simple approach
Place the 2nd image on top of the reference image and apply a weighted average to the pixel values in the overlapping area:
P(i,j) = (1 − w)·P_A(i,j) + w·P_B(i,j)
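The weighted average above, with w ramping linearly across the overlap, is often called feathering. A sketch in which, for simplicity, the two images overlap across their whole width:

```python
def feather_blend(img_a, img_b):
    """P(i,j) = (1-w)*PA(i,j) + w*PB(i,j), with w ramping from 0 at the left
    edge to 1 at the right edge. Images are equal-size lists of rows with
    width >= 2."""
    h, w = len(img_a), len(img_a[0])
    return [[(1 - x / (w - 1)) * img_a[y][x] + (x / (w - 1)) * img_b[y][x]
             for x in range(w)] for y in range(h)]
```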
5. Blend Two Images – Pyramid blending
Create a Laplacian pyramid for each image, then combine the two images level by level, taking the partial images from each of them.
5. Blend Two Images – Multi-Band Blending (Burt and Adelson [BA83])
The idea behind multi-band blending is to blend low frequencies over a large spatial range, and high frequencies over a short range.
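A one-dimensional, two-band toy version shows the idea. This is a sketch: real multi-band blending uses Laplacian pyramids with several bands, not a single box blur.

```python
def box_blur(sig, radius):
    """Simple low-pass filter: mean over a (2*radius+1) window, edge-clamped."""
    n = len(sig)
    return [sum(sig[min(max(i + d, 0), n - 1)]
                for d in range(-radius, radius + 1)) / (2 * radius + 1)
            for i in range(n)]

def two_band_blend(a, b):
    """Split each signal into a low band (blur) and a high band (residual).
    The low bands are blended with a wide linear ramp, the high bands with a
    hard switch at the seam: low frequencies blend over a large spatial
    range, high frequencies over a short one."""
    n = len(a)
    lo_a, lo_b = box_blur(a, 3), box_blur(b, 3)
    hi_a = [x - y for x, y in zip(a, lo_a)]
    hi_b = [x - y for x, y in zip(b, lo_b)]
    out = []
    for i in range(n):
        ramp = i / (n - 1)                     # wide ramp for the low band
        step = 1.0 if i >= n // 2 else 0.0     # hard seam for the high band
        out.append((1 - ramp) * lo_a[i] + ramp * lo_b[i]
                   + (1 - step) * hi_a[i] + step * hi_b[i])
    return out
```

The hard high-band seam keeps edges sharp, while the wide low-band ramp hides exposure differences.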
Band 1 (scale 0 to σ). *Automatic Panoramic Image Stitching using Invariant Features, M. Brown & D. G. Lowe
Pipeline overview: 2 images → extract features → match and filter using RANSAC → transform and blend. *Automatic Panoramic Image Stitching using Invariant Features - M. Brown and D. G. Lowe
Idea – Millions of Images
Image matches → connected components of image matches → output panoramas.
That's it… Questions?
References – Articles
"Image Mosaic Based On SIFT", Yang Zhan-long and Guo Bao-long. International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2008.
"Image Mosaics Algorithm Based on SIFT Feature Point Matching and Transformation Parameters Automatically Recognizing", Pengrui Qiu, Ying Liang and Hui Rong.
"Image Alignment and Stitching: A Tutorial", Richard Szeliski.
"A Comparison of SIFT, PCA-SIFT and SURF", Luo Juan & Oubong Gwun.
References – Additional
"SIFT: Scale Invariant Feature Transform by David Lowe", presented by Jason Clemons.
"SIFT - The Scale Invariant Feature Transform", presented by Ofir Pele.