Lecture 08, 27/12/2011, Shai Avidan. Clarification: the binding material is what is taught in class, not what does or does not appear in the slides.

Presentation transcript:

Today
– Hough Transform
– Generalized Hough Transform
– Implicit Shape Model
– Video Google

Hough Transform & Generalized Hough Transform

Hough Transform
Origin: Detection of straight lines in clutter
– Basic idea: each candidate point votes for all lines that it is consistent with.
– Votes are accumulated in quantized array
– Local maxima correspond to candidate lines
Representation of a line
– Usual form y = a x + b has a singularity around 90º.
– Better parameterization: x cos(θ) + y sin(θ) = ρ
[Figure: a line in the (x, y) plane with its parameters θ and ρ]
Slide credit: K. Grauman, B. Leibe
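
The voting procedure on this slide can be sketched in a few lines. This is a minimal illustration, not the lecture's implementation: the function names `hough_lines` and `strongest_line`, the 1-degree θ quantization and the unit ρ bin width are all assumptions chosen for the example.

```python
import math
from collections import Counter

def hough_lines(points, n_theta=180, rho_step=1.0):
    """Accumulate votes in a quantized (theta, rho) array.

    Each point votes for every line x*cos(theta) + y*sin(theta) = rho
    that passes through it (one vote per quantized theta).
    """
    acc = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta            # theta in [0, pi)
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[(t, round(rho / rho_step))] += 1
    return acc

def strongest_line(acc, n_theta=180, rho_step=1.0):
    """Return (theta, rho, votes) for the accumulator cell with most votes."""
    (t, r), votes = acc.most_common(1)[0]
    return math.pi * t / n_theta, r * rho_step, votes

# Ten points on the vertical line x = 5, plus two clutter points.
points = [(5, y) for y in range(10)] + [(1, 7), (8, 2)]
theta, rho, votes = strongest_line(hough_lines(points))
print(theta, rho, votes)
```

The vertical line is exactly the case where y = a x + b breaks down, yet in the (θ, ρ) form it is an ordinary accumulator cell. Note that several neighbouring θ cells can tie for the maximum, which is one reason practical implementations smooth the accumulator or apply non-maximum suppression.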

Examples: Hough transform for a square (left) and a circle (right)

Hough Transform: Noisy Line
Problem: Finding the true maximum
[Figure: tokens (left) and votes in (θ, ρ) space (right)]

Hough Transform: Noisy Input
Problem: Lots of spurious maxima
[Figure: tokens (left) and votes in (θ, ρ) space (right)]

Generalized Hough Transform [Ballard81]
Generalization for an arbitrary contour or shape
– Choose a reference point for the contour (e.g. center)
– For each point on the contour, remember where it is located w.r.t. the reference point
– Remember the radius r and the angle relative to the contour tangent
– Recognition: whenever you find a contour point, calculate the tangent angle and ‘vote’ for all possible reference points
– Instead of a reference point, can also vote for a transformation
The same idea can be used with local features!
Slide credit: Bernt Schiele
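
A rough sketch of the table-and-vote scheme described above, under simplifying assumptions: the table stores Cartesian offsets to the reference point (equivalent to the radius and angle on the slide), tangent angles are given in degrees and quantized into 10-degree bins, and only translation is handled. The names `angle_bin`, `build_r_table` and `vote_reference_points` are made up for this example.

```python
from collections import Counter

def angle_bin(angle_deg, bin_width=10):
    """Quantize a tangent angle (degrees) into a table index."""
    return int((angle_deg % 360) // bin_width)

def build_r_table(contour, reference, bin_width=10):
    """Training: for each contour point (x, y, tangent_angle), store the
    offset to the reference point, indexed by the quantized tangent angle."""
    rx, ry = reference
    table = {}
    for x, y, angle in contour:
        table.setdefault(angle_bin(angle, bin_width), []).append((rx - x, ry - y))
    return table

def vote_reference_points(edge_points, table, bin_width=10):
    """Recognition: every edge point votes for all reference-point
    positions that are compatible with its tangent angle."""
    acc = Counter()
    for x, y, angle in edge_points:
        for dx, dy in table.get(angle_bin(angle, bin_width), ()):
            acc[(x + dx, y + dy)] += 1
    return acc

# Toy model contour with reference point (2, 2).
model = [(0, 0, 0), (4, 0, 0), (4, 3, 90), (0, 3, 180), (2, 5, 45)]
table = build_r_table(model, reference=(2, 2))

# The same shape shifted by (10, 4), plus one clutter point.
scene = [(x + 10, y + 4, a) for x, y, a in model] + [(30, 1, 90)]
print(vote_reference_points(scene, table).most_common(1))  # [((12, 6), 5)]
```

Handling rotation or scale would need extra accumulator dimensions (or voting for a transformation, as the last bullet suggests), which this sketch leaves out.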

Implicit Shape Model

Gen. Hough Transform with Local Features
– For every feature, store possible “occurrences”
– For a new image, let the matched features vote for possible object positions

When is the Hough transform useful?
Textbooks wrongly imply that it is useful mostly for finding lines
– In fact, it can be very effective for recognizing arbitrary shapes or objects
The key to efficiency is to have each feature (token) determine as many parameters as possible
– For example, lines can be detected much more efficiently from small edge elements (or points with local gradients) than from just points
– For object recognition, each token should predict location, scale, and orientation (4D array)
Bottom line: The Hough transform can extract feature groupings from clutter in linear time!
Slide credit: David Lowe
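
To make the "4D array" bullet concrete, here is a hedged sketch in which every feature match predicts a full object pose (location, scale, orientation) and votes into a 4D accumulator. Features are assumed to be tuples `(x, y, scale, orientation_in_degrees)`; the bin sizes (20 pixels, a factor of 2 in scale, 30 degrees) and the names `predict_pose`, `pose_bin` and `vote_poses` are illustrative choices, not values taken from the lecture.

```python
import math
from collections import Counter

def predict_pose(model_feat, image_feat):
    """One match -> similarity transform (tx, ty, scale, rotation).

    (tx, ty) is where the model origin lands in the image."""
    xm, ym, sm, om = model_feat
    xi, yi, si, oi = image_feat
    scale = si / sm
    rot = (oi - om) % 360.0
    c, s = math.cos(math.radians(rot)), math.sin(math.radians(rot))
    tx = xi - scale * (c * xm - s * ym)
    ty = yi - scale * (s * xm + c * ym)
    return tx, ty, scale, rot

def pose_bin(pose, loc_step=20.0, rot_step=30.0):
    """Quantize a pose into a cell of the 4D accumulator."""
    tx, ty, scale, rot = pose
    return (math.floor(tx / loc_step), math.floor(ty / loc_step),
            round(math.log2(scale)), int(rot // rot_step))

def vote_poses(matches):
    """One vote per match, so the cost is linear in the number of matches."""
    acc = Counter(pose_bin(predict_pose(m, i)) for m, i in matches)
    return acc.most_common(1)[0]

# Three matches consistent with scale 2, rotation 90 degrees and
# translation (110, 55), plus one outlier match.
matches = [
    ((10, 0, 1, 0), (110, 75, 2, 90)),
    ((0, 5, 2, 30), (100, 55, 4, 120)),
    ((3, 4, 1.5, 350), (102, 61, 3, 80)),
    ((1, 1, 1, 0), (300, 300, 1, 200)),
]
print(vote_poses(matches))  # ((5, 2, 1, 3), 3)
```

Because a single match already fixes all four parameters, each token casts one vote instead of a whole curve of votes, which is the efficiency argument made on the slide. A pose near a bin boundary can fall into a neighbouring cell; voting into adjacent bins is a common remedy that is omitted here.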

3D Object Recognition
Gen. HT for Recognition
– Typically only 3 feature matches needed for recognition
– Extra matches provide robustness
– Affine model can be used for planar objects
Slide credit: David Lowe [Lowe99]
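
One way to see why 3 matches suffice for the affine model: an affine transform has 6 unknowns and each point correspondence gives 2 equations. The sketch below solves the resulting system exactly with Cramer's rule; the helper names `det3`, `solve3` and `fit_affine` are hypothetical, and with more than 3 matches one would typically switch to a least-squares fit, which is not shown.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(m, rhs):
    """Solve the 3x3 linear system m * u = rhs with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("the three source points are collinear")
    solution = []
    for col in range(3):
        mc = [list(row) for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        solution.append(det3(mc) / d)
    return solution

def fit_affine(src, dst):
    """Affine map from 3 correspondences:
    x' = a*x + b*y + tx,  y' = c*x + d*y + ty."""
    m = [(x, y, 1.0) for x, y in src]
    a, b, tx = solve3(m, [p[0] for p in dst])
    c, d, ty = solve3(m, [p[1] for p in dst])
    return (a, b, tx), (c, d, ty)

src = [(0, 0), (1, 0), (0, 1)]
dst = [(3, -1), (5, -1), (4, 2)]
print(fit_affine(src, dst))  # ((2.0, 1.0, 3.0), (0.0, 3.0, -1.0))
```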

View Interpolation
Training
– Training views from similar viewpoints are clustered based on feature matches.
– Matching features between adjacent views are linked.
Recognition
– Feature matches may be spread over several training viewpoints.
– Use the known links to “transfer votes” to other viewpoints.
[Lowe01]

Recognition Using View Interpolation

Location Recognition Training

Applications
Sony Aibo (Evolution Robotics)
SIFT usage
– Recognize docking station
– Communicate with visual cards
Other uses
– Place recognition
– Loop closure in SLAM
Slide credit: David Lowe

Video Google

Indexing local features
Each patch / region has a descriptor, which is a point in some high-dimensional feature space (e.g., SIFT).

Indexing local features
Points that are close in feature space have similar descriptors, which indicates similar local content.
Figure credit: A. Zisserman

Indexing local features
We saw in the previous section how to use voting and pose clustering to identify objects using local features.
Figure credit: David Lowe

Indexing local features
With potentially thousands of features per image, and hundreds to millions of images to search, how to efficiently find those that are relevant to a new image?
– Low-dimensional descriptors: can use standard efficient data structures for nearest neighbor search
– High-dimensional descriptors: approximate nearest neighbor search methods more practical
– Inverted file indexing schemes

Indexing local features: inverted file index
For text documents, an efficient way to find all pages on which a word occurs is to use an index…
We want to find all images in which a feature occurs.
To use this idea, we’ll need to map our features to “visual words”.

Visual words
More recently used for describing scenes and objects for the sake of indexing or classification.
Sivic & Zisserman 2003; Csurka, Bray, Dance, & Fan 2004; many others.

Inverted file index for images comprised of visual words
[Figure: table mapping each word number to a list of image numbers. Image credit: A. Zisserman]
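
The word-number-to-image-list table can be sketched as a dictionary of sets. This is only an illustration: the function names `build_inverted_index` and `candidate_images`, and the toy image and word ids, are invented for the example.

```python
from collections import defaultdict

def build_inverted_index(image_words):
    """image_words maps image id -> iterable of visual-word ids.
    Returns word id -> set of image ids that contain that word."""
    index = defaultdict(set)
    for image_id, words in image_words.items():
        for w in words:
            index[w].add(image_id)
    return index

def candidate_images(index, query_words):
    """All images that share at least one visual word with the query."""
    hits = set()
    for w in set(query_words):
        hits |= index.get(w, set())
    return hits

index = build_inverted_index({1: [7, 7, 2], 2: [2, 9], 3: [5]})
print(candidate_images(index, [2]))     # {1, 2}
print(candidate_images(index, [5, 7]))  # {1, 3}
```

The lookup touches only the lists of the words present in the query, so images with no word in common are never visited.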

Bags of visual words
Summarize entire image based on its distribution (histogram) of word occurrences.
Analogous to bag of words representation commonly used for documents.
Image credit: Fei-Fei Li
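
As a small sketch of this representation, the code below maps each descriptor to its nearest vocabulary entry and then builds a normalized histogram of word occurrences. The 2D descriptors and the three-word vocabulary are toy assumptions (real SIFT descriptors are 128-dimensional), and `nearest_word` and `bow_histogram` are hypothetical helper names.

```python
from collections import Counter

def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to the descriptor
    (squared Euclidean distance)."""
    return min(
        range(len(vocabulary)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(descriptor, vocabulary[k])),
    )

def bow_histogram(descriptors, vocabulary):
    """Normalized histogram of visual-word occurrences for one image."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in range(len(vocabulary))]

vocabulary = [(0, 0), (10, 10), (0, 10)]
descriptors = [(1, 1), (9, 9), (0, 9), (11, 10)]
print(bow_histogram(descriptors, vocabulary))  # [0.25, 0.5, 0.25]
```

The histogram discards where in the image each word occurred, which is exactly the "bag" simplification borrowed from text documents.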

Video Google System
1. Collect all words within query region
2. Inverted file index to find relevant frames
3. Compare word counts
4. Spatial verification
Sivic & Zisserman, ICCV 2003
Demo online at : research/vgoogle/index.html
[Figure: query region and retrieved frames]
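
Steps 1 to 3 of the pipeline can be sketched as follows. This is a simplified illustration and not the published system: word counts are compared with plain cosine similarity (Sivic & Zisserman weight the counts with tf-idf), step 4 (spatial verification) is omitted, and the data layout and the names `cosine` and `search` are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def cosine(a, b):
    """Cosine similarity between two word-count dictionaries."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query_features, region, frames):
    """query_features: list of (x, y, word); region: (x0, y0, x1, y1);
    frames: frame id -> list of visual words. Returns ranked (frame, score)."""
    x0, y0, x1, y1 = region
    # Step 1: collect the words that fall inside the query region.
    query = Counter(w for x, y, w in query_features
                    if x0 <= x <= x1 and y0 <= y <= y1)
    # Step 2: inverted file index -> frames sharing at least one word.
    index = defaultdict(set)
    for frame_id, words in frames.items():
        for w in words:
            index[w].add(frame_id)
    candidates = set()
    for w in query:
        candidates |= index.get(w, set())
    # Step 3: compare word counts and rank the candidate frames.
    scored = [(f, cosine(query, Counter(frames[f]))) for f in candidates]
    return sorted(scored, key=lambda fs: (-fs[1], fs[0]))

frames = {"a": [1, 1, 2], "b": [3, 4], "c": [1, 2, 2, 5]}
query_features = [(5, 5, 1), (6, 6, 2), (50, 50, 3)]
print(search(query_features, (0, 0, 10, 10), frames))
```

In this toy run frame "b" is never scored, because the only query feature carrying one of its words lies outside the query region.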

Visual vocabulary formation
Issues:
– Sampling strategy
– Clustering / quantization algorithm
– What corpus provides features (universal vocabulary?)
– Vocabulary size, number of words

Sampling strategies
– Dense, uniformly
– Sparse, at interest points
– Randomly
– Multiple interest operators
To find specific, textured objects, sparse sampling from interest points often more reliable.
Multiple complementary interest operators offer more image coverage.
For object categorization, dense sampling offers better coverage.
[See Nowak, Jurie & Triggs, ECCV 2006]
Image credits: F-F. Li, E. Nowak, J. Sivic

Clustering / quantization methods
– k-means (typical choice), agglomerative clustering, mean-shift, …
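
Since k-means is named as the typical choice, here is a minimal pure-Python sketch of it (Lloyd's iterations) for building a visual vocabulary: the cluster centers become the visual words. The random seeding, the fixed iteration cap and the names `sq_dist` and `kmeans` are assumptions for illustration; a real vocabulary would be clustered from a large set of high-dimensional descriptors with an optimized library.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Return k cluster centers (the visual words)."""
    rng = random.Random(seed)
    centers = [tuple(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[j].append(p)
        # Update step: each center moves to the mean of its points
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
print(sorted(kmeans(points, k=2)))  # [(0.5, 0.5), (10.5, 10.5)]
```

Quantizing a new descriptor then amounts to finding its nearest center, and the number of centers k is the vocabulary size raised as an issue on the vocabulary formation slide.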