A Thousand Words in a Scene
P. Quelhas, F. Monay, J. Odobez, D. Gatica-Perez and T. Tuytelaars
PAMI, Sept. 2006

Outline
Introduction
Image Representation
– Bag-of-Visterms (BOV) Representation
– Probabilistic Latent Semantic Analysis (PLSA)
Scene Classification
Experiments
– Classification
– Image Ranking
Conclusion

Introduction
Main work
– Scene modeling and classification
What's new?
– Combines text modeling methods with local invariant features to represent an image:
  a text-like bag-of-visterms representation (a histogram of quantized local visual features), and
  Probabilistic Latent Semantic Analysis (PLSA)
– Scene classification is based on this image representation
– Scenes can be ranked via PLSA

Introduction
Framework (pipeline): an image → interest point detector → local descriptors (low-level feature extraction) → quantization → bag-of-visterms (BOV, text-like representation) → PLSA (text-modeling method) → SVM classification / ranking

Image Representation
Local invariant features
– Interest point detection
Extract characteristic points, and more generally regions, from the images.
Invariant to geometric and photometric transformations: given an image and a transformed version of it, the same points are extracted.
Employ the Difference of Gaussians (DoG) point detector: D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * I(x, y)
– Compare each point with its neighbors in the DoG scale space (eight at the same scale, nine at each adjacent scale) to find local minima/maxima.
– Invariant to translation, scale, rotation and illumination variations.
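To make the DoG idea concrete, here is a minimal NumPy/SciPy sketch of a single-scale DoG response (an illustration, not the authors' implementation; the image array and sigma value are placeholders):

```python
# Minimal sketch of a Difference-of-Gaussians (DoG) response at one scale,
# assuming NumPy and SciPy are available; `image` is a stand-in grayscale array.
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_response(image, sigma, k=1.6):
    """Difference of two Gaussian-blurred versions of the image (one DoG level)."""
    return gaussian_filter(image, k * sigma) - gaussian_filter(image, sigma)

# Candidate interest points are the local extrema of this response across
# space and across neighbouring scales (a stack of dog_response outputs).
image = np.random.rand(240, 320)          # stand-in image
response = dog_response(image, sigma=1.6)
```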

Image Representation
– Local descriptors
Compute a descriptor on the region around each interest point.
Use the Scale Invariant Feature Transform (SIFT) feature as the local descriptor.
– Low-level feature extraction example: each point is described by a 128-dimensional feature vector.
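A minimal sketch of this step, using OpenCV's SIFT implementation as a stand-in for the detector/descriptor used in the paper (the image path is a placeholder):

```python
# Minimal sketch: DoG keypoints + 128-D SIFT descriptors via OpenCV
# (cv2.SIFT_create requires opencv-python >= 4.4); "scene.jpg" is a placeholder path.
import cv2

img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()                           # DoG detector + SIFT descriptor
keypoints, descriptors = sift.detectAndCompute(img, None)
print(len(keypoints), descriptors.shape)           # descriptors: (num_points, 128)
```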

Image Representation
Quantization
– Quantize each local descriptor into a discrete symbol (visterm) via K-means clustering.
Bag-of-visterms (BOV) representation
– Histogram of the visterms in the image.
– Cons: discards the spatial relations between visterms.
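A minimal sketch of the quantization and histogram steps, assuming scikit-learn's K-means; the descriptor arrays and the vocabulary size are stand-ins:

```python
# Minimal sketch of visterm quantization and the bag-of-visterms histogram,
# assuming scikit-learn and NumPy; descriptor arrays and VOCAB_SIZE are stand-ins.
import numpy as np
from sklearn.cluster import KMeans

VOCAB_SIZE = 1000                                  # vocabulary size is a free parameter
train_descriptors = np.random.rand(20000, 128)     # stand-in for SIFT descriptors pooled over training images

vocabulary = KMeans(n_clusters=VOCAB_SIZE, n_init=1, random_state=0).fit(train_descriptors)

def bov_histogram(image_descriptors):
    """Assign each descriptor to its nearest cluster centre (visterm) and count occurrences."""
    visterms = vocabulary.predict(image_descriptors)
    return np.bincount(visterms, minlength=VOCAB_SIZE)

hist = bov_histogram(np.random.rand(300, 128))     # one BOV histogram per image
```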

Image Representation
Probabilistic Latent Semantic Analysis (PLSA)
– Introduce latent variables z_l, called aspects, and associate an aspect z_l with each observation (occurrence of a visterm v_j in image d_i).
– Build a joint probability model over images and visterms: P(v_j, d_i) = P(d_i) Σ_l P(v_j | z_l) P(z_l | d_i)
– The likelihood of the model parameters is L = Π_i Π_j P(v_j, d_i)^{n(d_i, v_j)}, where n(d_i, v_j) is the count of visterm v_j in image d_i; the parameters are estimated with the EM algorithm.
– Image representation: the vector of aspect probabilities P(z_l | d_i).
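For reference, a compact illustrative EM implementation of standard PLSA on an image-by-visterm count matrix (a sketch under stated assumptions, not the authors' code; array names and the number of aspects are placeholders):

```python
# Minimal PLSA sketch fitted by EM, assuming a NumPy count matrix n of shape
# (num_images, vocab_size). Dense (D, K, V) tensors are fine for a sketch but
# not for large vocabularies.
import numpy as np

def plsa(n, n_aspects, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    D, V = n.shape
    p_z_d = rng.random((D, n_aspects)); p_z_d /= p_z_d.sum(axis=1, keepdims=True)   # P(z|d)
    p_v_z = rng.random((n_aspects, V)); p_v_z /= p_v_z.sum(axis=1, keepdims=True)   # P(v|z)
    for _ in range(n_iter):
        # E-step: P(z | d, v) proportional to P(z|d) * P(v|z)
        joint = p_z_d[:, :, None] * p_v_z[None, :, :]            # shape (D, K, V)
        p_z_dv = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: re-estimate P(v|z) and P(z|d) from the weighted counts
        weighted = n[:, None, :] * p_z_dv                        # n(d,v) * P(z|d,v)
        p_v_z = weighted.sum(axis=0)
        p_v_z /= p_v_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_v_z   # P(z|d) is the per-image aspect representation used downstream

counts = np.random.randint(0, 5, size=(100, 1000))   # stand-in BOV counts
p_z_d, p_v_z = plsa(counts.astype(float), n_aspects=20)
```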

Image Representation
Polysemy and synonymy with visterms
– Polysemy: a single visterm may represent different scene content.
– Synonymy: several visterms may characterize the same image content.
– Example: image patches sampled from 3 randomly selected visterms of the vocabulary show that not all visterms have a clear semantic interpretation.
– Pros of PLSA: the aspects capture visterm co-occurrence, and can therefore handle the polysemy and synonymy issues.

Experiments
Classification
– BOV classification (three-class)
Dataset: indoor, city, landscape scenes.
Training & testing: the whole dataset is split into 10 parts; one part is used for training and the other 9 for testing.
Baseline methods: histograms of low-level features.
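A minimal sketch of the classification step, assuming scikit-learn's SVM; the data, labels and RBF kernel are placeholders, and standard 10-fold cross-validation is shown for simplicity rather than the 1-train/9-test protocol above:

```python
# Minimal sketch: an SVM on BOV histograms, assuming scikit-learn;
# X and y are stand-ins, not the paper's dataset.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

X = np.random.rand(300, 1000)              # stand-in BOV histograms (n_images x vocab size)
y = np.random.randint(0, 3, size=300)      # stand-in labels: 0=indoor, 1=city, 2=landscape

clf = SVC(kernel="rbf")                    # kernel choice is an assumption here
print(cross_val_score(clf, X, y, cv=10).mean())
```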

Experiments
– PLSA classification (three-class)
PLSA-I: the same part of the data is used both to learn the aspect models and to train the SVM.
PLSA-O: an auxiliary dataset is used to learn the aspect models.

Experiments
Aspect-based image ranking
– Given an aspect z, images d can be ranked according to P(z | d), the probability of the aspect given the image.
– Dataset: landscape/city.
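A minimal sketch of this ranking rule, assuming the P(z|d) matrix returned by the PLSA sketch above (names are illustrative):

```python
# Minimal sketch of aspect-based ranking: sort images by P(z|d) for one aspect.
import numpy as np

def rank_images_by_aspect(p_z_d, aspect_index):
    """Return image indices sorted by decreasing P(z = aspect_index | d)."""
    return np.argsort(-p_z_d[:, aspect_index])
```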

Conclusion
– The proposed scene modeling method is effective for scene classification.
– In PLSA modeling, a visual scene is represented as a mixture of aspects.