Robust Feature Matching and Fast GMS Solution


Robust Feature Matching and Fast GMS Solution
Singapore University of Technology and Design (SUTD)
Advanced Digital Sciences Center (ADSC)
Speaker: JiaWang Bian (边佳旺)
http://jwbian.net/

Content
Feature Matching Introduction
  Feature Detector & Descriptor
  Matching
  RANSAC-based Geometry Estimation (or Verification)
Recent Robust Matchers
  CODE (PAMI, 2016)
  RepMatch (ECCV, 2016)
Fast and Robust GMS Solution (CVPR, 2017)
  Video Demo
  Methodology
  Algorithm
Share (Material Links)

Feature Matching Introduction

Feature Matching Introduction
Pipeline: Detection → Description → Matching → Geometry

Feature Matching Introduction
Applications of correct correspondences:
  Geometry between two views: estimate camera pose, localization (SfM), tracking (SLAM), …
  Similarity (number of matches): image retrieval, object recognition, loop closing (SLAM), re-localization (SLAM), …

Feature Matching Introduction
Feature detector & descriptor:
  SIFT (the baseline)
  Faster: SURF, ORB, AKAZE, …
  Better: PCA-SIFT, ASIFT, LIFT, …

Feature Matching Introduction
Matching algorithms:
  Nearest-Neighbor: Brute-Force, Approximate (FLANN)
  Optimization: Graph Matching
  Others: CODE, RepMatch, GMS, …
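Brute-force nearest-neighbor matching with Lowe's ratio test can be sketched in a few lines. This is a minimal NumPy sketch, not any library's implementation; the descriptor values and the 0.7 ratio are illustrative assumptions, and FLANN would replace the exhaustive distance computation with an approximate index.

```python
import numpy as np

def match_ratio_test(desc1, desc2, ratio=0.7):
    """Brute-force NN matching with the ratio test.
    desc1: (N, D) array, desc2: (M, D) array.
    Returns (i, j) index pairs whose best match is clearly
    better than the second best."""
    # Pairwise squared Euclidean distances, shape (N, M).
    d = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(axis=2)
    matches = []
    for i in range(d.shape[0]):
        order = np.argsort(d[i])
        best, second = order[0], order[1]
        # Ratio test on squared distances: d_best < ratio^2 * d_second.
        if d[i, best] < ratio**2 * d[i, second]:
            matches.append((i, int(best)))
    return matches
```

A feature with two near-equally good candidates (a repeated pattern, for instance) fails the test and produces no match, which is exactly the distinctiveness filter the ratio test is meant to be.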

Feature Matching Introduction
RANSAC-based Geometry Estimation (or Verification)
An example of the RANSAC framework: fitting a line.
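The line-fitting example above can be sketched directly: repeatedly fit a line to a minimal sample of two points, count how many points agree with it, and keep the best hypothesis. The data, iteration count, and threshold here are illustrative assumptions.

```python
import numpy as np

def ransac_line(points, n_iters=200, thresh=0.1, seed=0):
    """Fit y = a*x + b to points, robust to outliers.
    Returns (a, b) of the hypothesis with the most inliers."""
    rng = np.random.default_rng(seed)
    best, best_inliers = None, -1
    for _ in range(n_iters):
        # 1. Sample a minimal set (two points) and fit a line.
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:
            continue
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # 2. Count inliers within the residual threshold.
        residuals = np.abs(points[:, 1] - (a * points[:, 0] + b))
        inliers = int((residuals < thresh).sum())
        # 3. Keep the hypothesis with the most support.
        if inliers > best_inliers:
            best, best_inliers = (a, b), inliers
    return best

# Illustrative data: y = 2x + 1 with a third of the points corrupted.
xs = np.linspace(0, 1, 30)
pts = np.column_stack([xs, 2 * xs + 1])
pts[::3, 1] += np.linspace(3, 6, 10)   # gross outliers
a, b = ransac_line(pts)
```

A least-squares fit on the same data would be dragged toward the outliers; RANSAC recovers the true line because the winning hypothesis is voted on only by its inliers.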

Feature Matching Introduction
RANSAC-based Geometry Estimation (or Verification)
  Fundamental matrix (for 3D scenes): point-to-line constraint (weak, but general)
  Homography (for 2D scenes): point-to-point constraint (strong, but applies only to a narrow range of scenes)
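The homography case can be made concrete: with at least four correspondences, H can be estimated up to scale by the direct linear transform (DLT). This is a bare sketch in NumPy; a production solver (e.g. inside a RANSAC loop) would also normalize coordinates. The test points and the example homography are illustrative assumptions.

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate H (3x3, up to scale) with dst ~ H @ src
    from >= 4 point correspondences via the DLT."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence gives two linear equations in h.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # h is the null vector of A: the smallest right singular vector.
    _, _, vt = np.linalg.svd(np.asarray(rows, float))
    return vt[-1].reshape(3, 3)

def apply_h(H, pt):
    """Apply a homography to a 2D point (homogeneous divide)."""
    p = H @ np.array([pt[0], pt[1], 1.0])
    return p[:2] / p[2]
```

The fundamental matrix, by contrast, only gives x2ᵀ F x1 = 0: each point is constrained to a line in the other image, which is why it generalizes to 3D scenes but verifies matches more weakly.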

Recent Robust Matchers

Recent Robust Matchers
CODE [1]: for wide-baseline matching.
RepMatch [2]: based on CODE [1]; solves the repeated-structure problem.
[1] CODE: Coherence Based Decision Boundaries for Feature Correspondence, IEEE TPAMI, 2016, Lin et al.
[2] RepMatch: Robust Feature Matching and Pose for Reconstructing Modern Cities, ECCV, 2016, Lin et al.

Recent Robust Matchers (CODE) Wide-baseline matching

Recent Robust Matchers (CODE) Idea

Recent Robust Matchers (CODE)
Regression models:
  Likelihood regression
  Affine motion regression → x
  Affine motion regression → y

Recent Robust Matchers (CODE)
Likelihood regression:
  Train data: distinctive correspondences selected by the ratio test.
  Test data: all feature correspondences.
  Features of a correspondence: X_i = [x, y, dx, dy, T1, T2, T3, T4], where T is the transformation matrix mapping [s1, r1] to [s2, r2] (s: scale, r: rotation).
  Labels: 1 for all training correspondences.
  Cost function: Huber function.
  Non-linear optimization: construct a Gaussian similarity matrix X (n × n) and a label vector Y (n × 1, all ones), where n is the number of training samples.

Recent Robust Matchers (CODE)
Affine motion regression:
  Train data: the inliers of the likelihood model's training data.
  Test data: correspondences that pass the likelihood model.
  Feature space: same as the likelihood model.
  Labels: x2 and y2 (x, y is the pixel position; 2 denotes the second image).
  Cost function: Huber function.
  Non-linear optimization: same as before (Gaussian similarity matrix).
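Both regressions above use the Huber cost, which is quadratic for small residuals and linear for large ones, so outliers pull on the fit only linearly. A standard definition as a sketch; the threshold delta is a hypothetical parameter, not a value from the talk:

```python
import numpy as np

def huber(residual, delta=1.0):
    """Huber cost: quadratic near zero, linear in the tails."""
    r = np.abs(residual)
    quad = 0.5 * r**2                  # used where |r| <= delta
    lin = delta * (r - 0.5 * delta)    # used where |r| >  delta
    return np.where(r <= delta, quad, lin)
```

Compared with a pure squared loss, a residual of 10 contributes cost ~9.5 instead of 50, which is why the CODE regressions can be trained on correspondence sets that still contain wrong matches.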

Recent Robust Matchers (CODE) Insight (likelihood model)

Recent Robust Matchers (CODE) Matching samples

Recent Robust Matchers (CODE)
Structure from Motion
C. Wu, "VisualSfM: A visual structure from motion system," 2011 [Online]. Available: http://ccwu.me/vsfm/

Recent Robust Matchers (CODE) Run time comparison

Recent Robust Matchers (RepMatch)

Recent Robust Matchers (RepMatch) Repetitive Structure

Recent Robust Matchers (RepMatch) Idea

Recent Robust Matchers (RepMatch) Structure from Motion

Recent Robust Matchers (RepMatch) Structure from Motion

Recent Robust Matchers (RepMatch) Structure from Motion

Recent Robust Matchers (RepMatch) Structure from Motion

Fast and Robust GMS Solution

Video Demo
ORB with GMS vs. SIFT with the ratio test

Motivation: the trade-off between quality and speed
Current methods:
  Nearest-neighbor matching + ratio test: popular, fast, not robust
  Graph matching (optimization): slow, robust
  GMS: fast, robust

Methodology: Motion Smoothness
Observation: true matches (green) are visually smooth, while false matches (cyan) are not.

Methodology: Key Idea
Inference: by Bayes' rule, since true matches are smooth in motion space, motion-consistent matches are more likely to be true.
Key idea: find smooth matches in the noisy data and use them as proposals.
Method: motion statistics, a grid framework, and motion kernels.

Methodology: Motion Statistics Motion Statistics Model

Methodology: Motion Statistics
Distribution
Let f_a be one of the n supporting features in region a. Let p_t and p_f be the probabilities that feature f_a's nearest neighbor lies in region b, given that {a, b} view the same location or different locations, respectively.

Methodology: Motion Statistics
Event Assumption
Here, m is the number of features in region b and M is the number of features in the second image. β is a factor added to accommodate violations of the assumption caused by repeated patterns.

Methodology: Motion Statistics
Probability
Explanation: if {a, b} view the same location, the event (f_a's nearest neighbor lies in region b) occurs when f_a matches correctly, or when it matches wrongly but coincidentally lands in region b. If {a, b} view different locations, the event occurs only when f_a matches wrongly and coincidentally lands in region b.

Methodology: Motion Statistics Multi-region Generalization

Methodology: Motion Statistics Distribution Mean & Variance

Methodology: Motion Statistics
Analysis:
  Partitionability
  Quantity-quality equivalence
  Relationship to descriptors
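The partitionability claim can be checked numerically: the support score of a region pair is binomially distributed, B(n, p_t) for true pairs and B(n, p_f) for false ones, and the gap between the two distributions, measured in standard deviations, grows like √n. A small sketch; the probability values p_t = 0.5 and p_f = 0.1 are illustrative assumptions, not numbers from the talk:

```python
import math

def binom_mean_std(n, p):
    """Mean and standard deviation of a binomial B(n, p)."""
    return n * p, math.sqrt(n * p * (1 - p))

def separation(n, p_t, p_f):
    """Partitionability proxy: distance between the true and
    false score distributions in summed standard deviations."""
    mean_t, std_t = binom_mean_std(n, p_t)
    mean_f, std_f = binom_mean_std(n, p_f)
    return (mean_t - mean_f) / (std_t + std_f)
```

For these values the separation grows from about 1.6 at n = 10 to about 7.1 at n = 200: this is the quantity-quality equivalence, since adding more supporting features makes true and false region pairs easier to tell apart even with a weak descriptor.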

Methodology: Motion Statistics
Experiments on real data: the model is evaluated on the Oxford Affine dataset. We run SIFT matching, label each match as an inlier or outlier according to the ground truth, and count the supporting score for each match in a small region.

Algorithm: Grid Framework
Both images are divided by a predefined grid. The motion statistics are computed for cell pairs instead of for each feature correspondence, reducing the per-match cost from O(N) to O(1).
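A minimal sketch of the grid scoring step, assuming matches are given as pixel-coordinate pairs. The 20 × 20 grid follows the talk; the threshold of the form α√n and the α value are assumptions here, and this simplified score counts only a match's own cell pair, whereas the full GMS algorithm also sums support over the neighboring cell pairs of its motion kernel.

```python
import math

def cell_of(pt, img_w, img_h, g=20):
    """Map a pixel coordinate to a cell index in a g x g grid."""
    return int(pt[0] * g / img_w) + g * int(pt[1] * g / img_h)

def gms_filter(matches, img_w, img_h, g=20, alpha=6.0):
    """Keep a match if its cell pair gathers enough support.
    matches: list of ((x1, y1), (x2, y2)) pixel pairs."""
    # Count how many matches share each (cell in image 1, cell in image 2) pair.
    support = {}
    for p1, p2 in matches:
        pair = (cell_of(p1, img_w, img_h, g), cell_of(p2, img_w, img_h, g))
        support[pair] = support.get(pair, 0) + 1
    kept = []
    for p1, p2 in matches:
        pair = (cell_of(p1, img_w, img_h, g), cell_of(p2, img_w, img_h, g))
        n = support[pair]
        # Score excludes the match itself; threshold grows like sqrt(n).
        if n - 1 > alpha * math.sqrt(n):
            kept.append((p1, p2))
    return kept
```

True matches concentrate in a few cell pairs and pass easily, while random false matches scatter across cell pairs with support near one, which is why the whole filter costs only about 1 ms.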

Algorithm: Motion Kernels Basic Motion Kernel

Algorithm: Motion Kernels
Generalized motion kernels (extension):
  Rotation
  Scale: vary the cell size of the second image by a scale factor.

Algorithm: Empirical Parameters
How many grid cells should be used?
  Too fine: weak statistics and low efficiency.
  Too coarse: low accuracy.
  Empirical results show that a 20 × 20 grid is a good choice.
How to set the threshold?

Algorithm: GMS Grid Motion Statistics Algorithm

Algorithm: Full Feature Matching
The full feature matching pipeline

Algorithm: Run Time
Run time on image pairs:
  ORB feature extraction: about 35 ms per image on CPU
  Nearest-neighbor matching: 106 ms on CPU, 25 ms on GPU
  GMS: 1 ms on CPU
  Overall (two images, GPU matching): 1000 / (2 × 35 + 25 + 1) ≈ 10.42 fps
Real time on video data:
  ORB extraction and NN matching can run in parallel on a video sequence.
  Overall: 1000 / 35 ≈ 28.57 fps

Evaluation Dataset Capture of TUM dataset

Evaluation Capture of Strecha dataset Capture of VGG dataset

Evaluation Matching ability

Evaluation Pose Estimation

Evaluation
Wide-baseline matching
In both figures, the first row shows the initial matches and the second row shows the matches after RANSAC.

Evaluation
GMS on Images with Repetitive Structures
Images are selected from [1], where many state-of-the-art matchers fail and SIFT fails on all of them.
[1] Epipolar Geometry Estimation for Urban Scenes with Repetitive Structures, IEEE TPAMI, 2014, Kushnir et al.

Evaluation Non-rigid object

Evaluation
Video Demo (screenshot)

Share
JiaWang's home page: http://jwbian.net/
Project page: http://jwbian.net/gms/
Code on GitHub: https://github.com/JiawangBian/GMS-Feature-Matcher
Videos on YouTube: https://youtu.be/3SlBqspLbxI
Links to CODE and RepMatch: http://www.kind-of-works.com/

Q&A