Visual Object Recognition Tutorial (Chicago)
Bastian Leibe, Computer Vision Laboratory, ETH Zurich
Kristen Grauman, Department of Computer Sciences, University of Texas at Austin
Outline
1. Detection with Global Appearance & Sliding Windows
2. Local Invariant Features: Detection & Description
3. Specific Object Recognition with Local Features
   ― Coffee Break ―
4. Visual Words: Indexing, Bags of Words Categorization
5. Matching Local Features
6. Part-Based Models for Categorization
7. Current Challenges and Research Directions
Recognition with Local Features
Image content is transformed into local features (e.g. SIFT) that are invariant to translation, rotation, and scale.
Goal: verify whether they belong to a consistent configuration.
Slide credit: David Lowe
Finding Consistent Configurations
Global spatial models:
- Generalized Hough Transform [Lowe99]
- RANSAC [Obdrzalek02, Chum05, Nister06]
Basic assumption: the object is planar.
- The assumption is often justified in practice
- Valid for many structures on buildings
- Sufficient for small viewpoint variations on 3D objects
Hough Transform
Origin: detection of straight lines in clutter.
Basic idea: each candidate point votes for all lines that it is consistent with.
- Votes are accumulated in a quantized array
- Local maxima correspond to candidate lines
Representation of a line:
- The usual form y = ax + b has a singularity around 90°.
- Better parameterization: x cos(θ) + y sin(θ) = ρ
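As a concrete illustration of the voting scheme, here is a minimal sketch of line detection with the (ρ, θ) parameterization. The edge-point input format and the bin resolutions are assumptions made for the example, not part of the tutorial.

```python
import numpy as np

def hough_lines(edge_points, img_diag, n_theta=180, n_rho=200):
    """Accumulate votes in a quantized (rho, theta) array.

    edge_points: iterable of (x, y) coordinates of edge pixels.
    img_diag:    image diagonal length, used to bound rho.
    """
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    rhos = np.linspace(-img_diag, img_diag, n_rho)
    accumulator = np.zeros((n_rho, n_theta), dtype=int)

    for x, y in edge_points:
        # Each point votes for every line x*cos(theta) + y*sin(theta) = rho
        rho_vals = x * np.cos(thetas) + y * np.sin(thetas)
        rho_idx = np.clip(np.searchsorted(rhos, rho_vals), 0, n_rho - 1)
        accumulator[rho_idx, np.arange(n_theta)] += 1

    # Local maxima in the accumulator correspond to candidate lines
    return accumulator, rhos, thetas
```

A real detector would additionally smooth the accumulator and apply non-maximum suppression before reading out the peaks.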
Examples
Hough transform for a square (left) and a circle (right).
Hough Transform: Noisy Line
Problem: finding the true maximum.
Figure: tokens (input points, left) and votes in the (θ, ρ) accumulator (right).
Slide credit: David Lowe
Hough Transform: Noisy Input
Problem: lots of spurious maxima.
Figure: tokens (input points, left) and votes in the (θ, ρ) accumulator (right).
Slide credit: David Lowe
Generalized Hough Transform [Ballard81]
Generalization to an arbitrary contour or shape:
- Choose a reference point for the contour (e.g. its center).
- For each point on the contour, remember where it is located w.r.t. the reference point: store the radius r and the angle relative to the contour tangent.
Recognition: whenever you find a contour point, compute the tangent angle and vote for all possible reference points.
- Instead of a reference point, one can also vote for a transformation.
- The same idea can be used with local features!
Slide credit: Bernt Schiele
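A minimal sketch of the R-table idea follows, under the simplifying assumption that the tangent/gradient orientation at each contour point is already known; the orientation quantization into bins is an assumption of the example.

```python
import math
from collections import defaultdict
import numpy as np

def build_r_table(contour_points, reference_point, n_bins=36):
    """Training: store displacements to the reference point, indexed by
    quantized contour (tangent) orientation."""
    rx, ry = reference_point
    r_table = defaultdict(list)
    for (x, y, orientation) in contour_points:   # orientation in radians
        b = int((orientation % (2 * math.pi)) / (2 * math.pi) * n_bins) % n_bins
        r_table[b].append((rx - x, ry - y))
    return r_table

def ght_vote(edge_points, r_table, acc_shape, n_bins=36):
    """Recognition: every edge point votes for all reference points that
    are consistent with its orientation."""
    acc = np.zeros(acc_shape, dtype=int)
    for (x, y, orientation) in edge_points:
        b = int((orientation % (2 * math.pi)) / (2 * math.pi) * n_bins) % n_bins
        for dx, dy in r_table[b]:
            cx, cy = int(round(x + dx)), int(round(y + dy))
            if 0 <= cx < acc_shape[1] and 0 <= cy < acc_shape[0]:
                acc[cy, cx] += 1
    return acc   # peaks correspond to likely reference-point locations
```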
Gen. Hough Transform with Local Features
For every feature, store possible "occurrences":
- Object identity
- Pose
- Relative position
For a new image, let the matched features vote for possible object positions.
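The same voting idea with local features, as a rough sketch. The match record layout (each match carries the image feature's location, scale, and orientation plus the model feature's stored offset to the object center) and the bin sizes are assumptions for illustration only.

```python
import numpy as np
from collections import defaultdict

def vote_object_center(matches, pos_bin=16, scale_bins=(0.5, 1.0, 2.0, 4.0)):
    """Each matched feature votes for a coarse (x_center, y_center, scale) bin."""
    votes = defaultdict(int)
    for m in matches:
        rel_scale = m["img_scale"] / m["model_scale"]
        dtheta = m["img_orientation"] - m["model_orientation"]
        # Rotate and scale the stored offset into the test-image frame
        dx, dy = m["model_offset"]
        c, s = np.cos(dtheta), np.sin(dtheta)
        cx = m["img_x"] + rel_scale * (c * dx - s * dy)
        cy = m["img_y"] + rel_scale * (s * dx + c * dy)
        key = (int(cx // pos_bin), int(cy // pos_bin),
               int(np.searchsorted(scale_bins, rel_scale)))
        votes[key] += 1
    return votes   # bins with many votes are object hypotheses
```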
When is the Hough Transform Useful?
Textbooks wrongly imply that it is useful mostly for finding lines; in fact, it can be very effective for recognizing arbitrary shapes or objects.
The key to efficiency is to have each feature (token) determine as many parameters as possible.
- For example, lines can be detected much more efficiently from small edge elements (or points with local gradients) than from just points.
- For object recognition, each token should predict location, scale, and orientation (a 4D voting array).
Bottom line: the Hough transform can extract feature groupings from clutter in linear time!
Slide credit: David Lowe
3D Object Recognition with the Gen. Hough Transform [Lowe99]
- Typically only 3 feature matches are needed for recognition.
- Extra matches provide robustness.
- An affine model can be used for planar objects.
Slide credit: David Lowe
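To make the affine-model step concrete, here is a hedged sketch of estimating a 2D affine transformation from three or more point correspondences by linear least squares; this is the standard formulation, not code from the tutorial.

```python
import numpy as np

def fit_affine(model_pts, image_pts):
    """Estimate a 2D affine transform mapping model_pts -> image_pts.

    model_pts, image_pts: (N, 2) arrays with N >= 3 correspondences.
    Returns a 2x3 matrix [A | t] minimizing the least-squares error.
    """
    model_pts = np.asarray(model_pts, dtype=float)
    image_pts = np.asarray(image_pts, dtype=float)
    n = model_pts.shape[0]
    # Each correspondence gives two linear equations in the six unknowns
    # (a11, a12, a21, a22, tx, ty).
    A = np.zeros((2 * n, 6))
    b = image_pts.reshape(-1)
    A[0::2, 0:2] = model_pts   # x-equation: a11*mx + a12*my + tx = ix
    A[0::2, 4] = 1.0
    A[1::2, 2:4] = model_pts   # y-equation: a21*mx + a22*my + ty = iy
    A[1::2, 5] = 1.0
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    a11, a12, a21, a22, tx, ty = params
    return np.array([[a11, a12, tx], [a21, a22, ty]])
```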
View Interpolation [Lowe01]
Training:
- Training views from similar viewpoints are clustered based on feature matches.
- Matching features between adjacent views are linked.
Recognition:
- Feature matches may be spread over several training viewpoints.
- Use the known links to "transfer votes" to other viewpoints.
Slide credit: David Lowe
Recognition Using View Interpolation [Lowe01]
Slide credit: David Lowe
Location Recognition [Lowe04]
Figure: training views.
Slide credit: David Lowe
Applications
Sony Aibo (Evolution Robotics), SIFT usage:
- Recognize the docking station
- Communicate with visual cards
Other uses:
- Place recognition
- Loop closure in SLAM
Slide credit: David Lowe
RANSAC (RANdom SAmple Consensus) [Fischler81]
- Randomly choose a minimal subset of data points necessary to fit a model (a sample).
- Points within some distance threshold t of the model form its consensus set; the size of the consensus set is the model's support.
- Repeat for N samples; the model with the biggest support is the most robust fit.
  - Points within distance t of the best model are inliers.
  - Fit the final model to all inliers.
Slide credit: David Lowe
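A generic RANSAC loop as a minimal sketch; fit_fn, error_fn, and the parameter defaults are placeholders introduced for the example, not part of the tutorial.

```python
import numpy as np

def ransac(data, fit_fn, error_fn, n_min, n_iters=1000, threshold=3.0):
    """Generic RANSAC loop.

    data:     (N, d) array of observations.
    fit_fn:   fits a model to a set of points, returns model parameters.
    error_fn: returns per-point residuals for a model.
    n_min:    minimal number of points needed to fit the model.
    """
    rng = np.random.default_rng()
    best_model, best_inliers = None, np.zeros(len(data), dtype=bool)
    for _ in range(n_iters):
        sample = data[rng.choice(len(data), size=n_min, replace=False)]
        model = fit_fn(sample)
        inliers = error_fn(model, data) < threshold   # consensus set
        if inliers.sum() > best_inliers.sum():
            best_model, best_inliers = model, inliers
    # Refit on all inliers of the best hypothesis
    if best_model is not None and best_inliers.any():
        best_model = fit_fn(data[best_inliers])
    return best_model, best_inliers
```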
Slide credit: David Forsyth
RANSAC: How Many Samples?
How many samples are needed?
- Suppose w is the fraction of inliers (points on the line).
- n points are needed to define a hypothesis (n = 2 for lines).
- k samples are chosen.
Probability that a single sample of n points is all inliers: w^n
Probability that all k samples fail: (1 − w^n)^k
Choose k high enough to keep this below the desired failure rate.
Slide credit: David Lowe
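Solving (1 − w^n)^k ≤ 1 − p for k gives k ≥ log(1 − p) / log(1 − w^n). A small sketch of that calculation; the p = 0.99 default mirrors the table on the next slide.

```python
import math

def ransac_num_samples(inlier_fraction, sample_size, p_success=0.99):
    """Smallest k such that (1 - w^n)^k <= 1 - p_success."""
    fail_prob_per_sample = 1.0 - inlier_fraction ** sample_size
    return math.ceil(math.log(1.0 - p_success) / math.log(fail_prob_per_sample))

# Example: 50% outliers, line fitting (n = 2) -> 17 samples
print(ransac_num_samples(inlier_fraction=0.5, sample_size=2))
```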
RANSAC: Computed k (p = 0.99)
Table: required number of samples k as a function of the sample size n and the proportion of outliers (5%, 10%, 20%, 25%, 30%, 40%, 50%).
Slide credit: David Lowe
After RANSAC
- RANSAC divides the data into inliers and outliers and yields an estimate computed from a minimal set of inliers.
- Improve this initial estimate by estimating over all inliers (e.g. with standard least-squares minimization).
- But this may change the set of inliers, so alternate fitting with re-classification as inlier/outlier.
Slide credit: David Lowe
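A hedged sketch of that refinement loop, reusing the hypothetical fit_fn/error_fn interface from the RANSAC sketch above (data is assumed to be a NumPy array).

```python
def refine_with_all_inliers(data, model, fit_fn, error_fn,
                            threshold=3.0, max_rounds=10):
    """Alternate least-squares refitting with inlier re-classification."""
    inliers = error_fn(model, data) < threshold
    for _ in range(max_rounds):
        model = fit_fn(data[inliers])            # refit on all current inliers
        new_inliers = error_fn(model, data) < threshold
        if (new_inliers == inliers).all():       # converged: inlier set stable
            break
        inliers = new_inliers
    return model, inliers
```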
Example: Finding Feature Matches
Find the best stereo match within a square search window (here 300 pixels²).
Global transformation model: epipolar geometry (from Hartley & Zisserman).
Slide credit: David Lowe
Example: Finding Feature Matches (cont'd)
Find the best stereo match within a square search window (here 300 pixels²).
Global transformation model: epipolar geometry (from Hartley & Zisserman).
Figure: matches before RANSAC (left) and after RANSAC (right).
Slide credit: David Lowe
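For instance, with OpenCV one can fit the epipolar geometry robustly and keep only the RANSAC-consistent matches. This is a generic illustration rather than the tutorial's own code; pts1/pts2 and the input files are assumed, hypothetical matched keypoint coordinates.

```python
import numpy as np
import cv2

# pts1, pts2: (N, 2) float arrays of putative matches between the two views
pts1 = np.load("matches_left.npy")    # hypothetical input files
pts2 = np.load("matches_right.npy")

# Robustly fit the fundamental matrix; the mask marks RANSAC inliers
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC,
                                 ransacReprojThreshold=3.0, confidence=0.99)
inliers1 = pts1[mask.ravel() == 1]
inliers2 = pts2[mask.ravel() == 1]
print(f"{len(inliers1)} of {len(pts1)} matches survive the epipolar check")
```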
Comparison: Gen. Hough Transform vs. RANSAC

Gen. Hough Transform
- Advantages:
  - Very effective for recognizing arbitrary shapes or objects
  - Can handle a high percentage of outliers (>95%)
  - Extracts groupings from clutter in linear time
- Disadvantages:
  - Quantization issues
  - Only practical for a small number of dimensions (up to 4)
- Improvements available:
  - Probabilistic extensions
  - Continuous voting space

RANSAC
- Advantages:
  - General method suited to a large range of problems
  - Easy to implement
  - Independent of the number of dimensions
- Disadvantages:
  - Only handles a moderate number of outliers (<50%)
- Many variants available, e.g.:
  - PROSAC: Progressive Sample Consensus [Chum05]
  - Preemptive RANSAC [Nister05]

[Leibe08]
Example Applications
Mobile tourist guide:
- Self-localization
- Object/building recognition
- Photo/video augmentation
Example: Aachen Cathedral
[Quack, Leibe, Van Gool, CIVR'08]
Web Demo: Movie Poster Recognition
- 50,000 movie posters indexed
- Query-by-image from a mobile phone, available in Switzerland
Application: Large-Scale Retrieval [Philbin CVPR'07]
Figure: query (left) and results from 5k Flickr images (right); a demo is available for the 100k set.
Application: Image Auto-Annotation [Quack CIVR'08]
Left: Wikipedia image; right: closest match from Flickr.
Examples: Moulin Rouge, Tour Montparnasse, Colosseum, Viktualienmarkt, Maypole, Old Town Square (Prague).
Outline
1. Detection with Global Appearance & Sliding Windows
2. Local Invariant Features: Detection & Description
3. Specific Object Recognition with Local Features
   ― Coffee Break ―
4. Visual Words: Indexing, Bags of Words Categorization
5. Matching Local Features
6. Part-Based Models for Categorization
7. Current Challenges and Research Directions