Generic object recognition

Generic object recognition Wed, April 6 Kristen Grauman

What does recognition involve? Source: Fei-Fei Li, Rob Fergus, Antonio Torralba.

Verification: is that a lamp? Source: Fei-Fei Li, Rob Fergus, Antonio Torralba.

Detection: are there people? Source: Fei-Fei Li, Rob Fergus, Antonio Torralba.

Identification: is that Potala Palace? Source: Fei-Fei Li, Rob Fergus, Antonio Torralba.

Object categorization: mountain, tree, building, banner, street lamp, vendor, people. Source: Fei-Fei Li, Rob Fergus, Antonio Torralba.

Scene and context categorization outdoor city … Source: Fei-Fei Li, Rob Fergus, Antonio Torralba.

Instance-level recognition problem John’s car

Generic categorization problem

Object Categorization Task Description: “Given a small number of training images of a category, recognize a-priori unknown instances of that category and assign the correct category label.” Which categories are feasible visually? (“Fido”, German shepherd, dog, animal, living being?) K. Grauman, B. Leibe

Visual Object Categories Basic level categories in human categorization [Rosch 76, Lakoff 87]: the highest level at which category members have similar perceived shape; the highest level at which a single mental image reflects the entire category; the level at which human subjects are usually fastest at identifying category members; the first level named and understood by children; the highest level at which a person uses similar motor actions for interaction with category members. K. Grauman, B. Leibe

Visual Object Categories Basic-level categories in humans seem to be defined predominantly visually. There is evidence that humans (usually) start with basic-level categorization before doing identification. Basic-level categorization is easier and faster for humans than object identification! How does this transfer to automatic classification algorithms? (Hierarchy figure: abstract levels, e.g. animal, quadruped; basic level, e.g. dog, cat, cow; individual level, e.g. German shepherd, Doberman, “Fido”.) K. Grauman, B. Leibe

How many object categories are there? ~10,000 to 30,000 [Biederman 1987] Source: Fei-Fei Li, Rob Fergus, Antonio Torralba.

Other Types of Categories Functional categories, e.g. chairs = “something you can sit on”. K. Grauman, B. Leibe

Other Types of Categories Ad-hoc categories, e.g. “something you can find in an office environment”. K. Grauman, B. Leibe

Why recognition? Recognition is a fundamental part of perception, e.g., for robots and autonomous agents. Organize and give access to visual content. Connect to information. Detect trends and themes.

Posing visual queries Yeh et al., MIT Belhumeur et al. Kooaba, Bay & Quack et al.

Autonomous agents able to detect objects http://www.darpa.mil/grandchallenge/gallery.asp

Finding visually similar objects

Discovering visual patterns: objects (Sivic & Zisserman), categories (Lee & Grauman), actions (Wang et al.). Kristen Grauman

Auto-annotation Gammeter et al. T. Berg et al. Kristen Grauman

Challenges: robustness. Illumination, object pose, clutter, intra-class appearance, occlusions, viewpoint. Kristen Grauman

Challenges: robustness. Realistic scenes are crowded and cluttered, with overlapping objects.

Challenges: importance of context slide credit: Fei-Fei, Fergus & Torralba

Challenges: importance of context

Challenges: complexity. Thousands to millions of pixels in an image; 3,000-30,000 human-recognizable object categories; 30+ degrees of freedom in the pose of articulated objects (humans); billions of images indexed by Google Image Search; 18 billion+ prints produced from digital camera images in 2004; 295.5 million camera phones sold in 2005. About half of the cerebral cortex in primates is devoted to processing visual information [Felleman and van Essen 1991]. Kristen Grauman

Challenges: learning with minimal supervision. A spectrum from less to more supervision: unlabeled, multiple objects; classes labeled, some clutter; cropped to object, parts and classes labeled. Kristen Grauman

What works most reliably today Reading license plates, zip codes, checks; fingerprint recognition; face detection; recognition of flat textured objects (CD covers, book covers, etc.). Source: Lana Lazebnik

Generic category recognition: basic framework. 1) Build/train the object model: choose a representation, then learn or fit the parameters of the model/classifier. 2) Generate candidates in a new image. 3) Score the candidates. A rough code sketch follows. Kristen Grauman
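As a rough illustration only, the framework above might be sketched as below; every helper name here (extract_features, fit_classifier, generate_windows) is a hypothetical placeholder, not any particular library's API.

```python
# Minimal sketch of the train/recognize framework above.
# All helper names are hypothetical placeholders.

def train(images, labels, extract_features, fit_classifier):
    """Build/train the object model: choose a representation
    (extract_features) and learn/fit the classifier's parameters."""
    features = [extract_features(im) for im in images]
    return fit_classifier(features, labels)

def recognize(image, extract_features, generate_windows, classifier):
    """Generate candidate regions in a new image and score each one."""
    candidates = generate_windows(image)  # e.g., sliding windows
    scores = [classifier.score(extract_features(c)) for c in candidates]
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best], scores[best]
```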

Generic category recognition: representation choice Window-based Part-based Kristen Grauman

Supervised classification Given a collection of labeled examples, come up with a function that will predict the labels of new examples. How good is some function we come up with to do the classification? It depends on the mistakes made and the cost associated with those mistakes. (Figure: training examples labeled “four” and “nine”, and a novel input to classify.) Kristen Grauman

Supervised classification Given a collection of labeled examples, come up with a function that will predict the labels of new examples. Consider the two-class (binary) decision problem. L(4→9): loss of classifying a 4 as a 9; L(9→4): loss of classifying a 9 as a 4. The risk of a classifier strategy s is its expected loss: R(s) = Pr(4→9 | using s)·L(4→9) + Pr(9→4 | using s)·L(9→4). We want to choose a classifier so as to minimize this total risk. Kristen Grauman
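As a hedged numeric illustration (the loss values and error rates below are invented for this example, not taken from the lecture):

```latex
% Suppose misclassifying a 9 as a 4 is five times as costly:
%   L(4 \to 9) = 1, \qquad L(9 \to 4) = 5.
% Then a classifier s with Pr(4 \to 9) = 0.02 and Pr(9 \to 4) = 0.01 has risk
R(s) = 0.02 \cdot 1 + 0.01 \cdot 5 = 0.07.
```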

Supervised classification The optimal classifier will minimize total risk. At the decision boundary, either choice of label yields the same expected loss. If we choose class “four” at the boundary, the expected loss is P(9 | x)·L(9→4); if we choose class “nine”, it is P(4 | x)·L(4→9). So the best decision boundary is at the point x where P(4 | x)·L(4→9) = P(9 | x)·L(9→4). To classify a new point, choose the class with the lowest expected loss; i.e., choose “four” if P(4 | x)·L(4→9) > P(9 | x)·L(9→4). How do we evaluate these probabilities? Kristen Grauman
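A minimal sketch of this minimum-risk rule, assuming the posteriors P(4 | x) and P(9 | x) are already in hand (estimating them is the subject of the next slides):

```python
def min_risk_label(p4, p9, loss_4_as_9=1.0, loss_9_as_4=1.0):
    """Two-class minimum-risk decision for the 4-vs-9 problem.

    p4, p9       -- posteriors P(4 | x) and P(9 | x) for the input x
    loss_4_as_9  -- L(4 -> 9), loss of classifying a true 4 as a 9
    loss_9_as_4  -- L(9 -> 4), loss of classifying a true 9 as a 4
    """
    # Answering "nine" costs P(4|x) * L(4->9) in expectation;
    # answering "four" costs P(9|x) * L(9->4). Pick the cheaper answer.
    return "four" if p4 * loss_4_as_9 > p9 * loss_9_as_4 else "nine"
```

For example, min_risk_label(0.7, 0.3) returns "four" under equal losses.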

Probability Basic probability: X is a random variable; P(X) is the probability that X takes a certain value (a distribution over values for discrete X, a probability distribution/density function, PDF, for continuous X). Conditional probability: P(X | Y) is the probability of X given that we already know Y. Source: Steve Seitz

Example: learning skin colors We can represent a class-conditional density using a histogram (a “non-parametric” distribution) over the feature x = hue: P(x | skin) is the percentage of skin pixels in each hue bin, and P(x | not skin) likewise for non-skin pixels. Now we get a new image and want to label each pixel as skin or non-skin. What is the probability we care about for skin detection? Kristen Grauman
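A sketch of the histogram estimate, assuming labeled skin and non-skin pixel hues are available as arrays (numpy assumed; the bin count is an arbitrary choice):

```python
import numpy as np

def hue_histogram(hues, n_bins=32):
    """Estimate a class-conditional density P(x | class) as a
    normalized histogram over hue values in [0, 1)."""
    counts, edges = np.histogram(hues, bins=n_bins, range=(0.0, 1.0))
    probs = counts / max(counts.sum(), 1)  # fraction of pixels per bin
    return probs, edges

# Hypothetical usage, given arrays of labeled pixel hues:
# p_x_given_skin, edges = hue_histogram(skin_pixel_hues)
# p_x_given_notskin, _ = hue_histogram(notskin_pixel_hues)
```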

Bayes rule: posterior ∝ likelihood × prior, i.e. P(skin | x) = P(x | skin) · P(skin) / P(x). Where does the prior come from? Why use a prior?

Example: classifying skin pixels Now for every pixel in a new image, we can estimate the probability that it is generated by skin. Brighter pixels indicate a higher probability of being skin. Classify pixels based on these probabilities. Kristen Grauman
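Continuing the earlier sketch, per-pixel classification via Bayes rule might look as follows; the prior P(skin) and the threshold are made-up values, and hue_histogram is the hypothetical helper defined above:

```python
import numpy as np

def classify_skin(pixel_hues, p_x_given_skin, p_x_given_notskin, edges,
                  prior_skin=0.3, threshold=0.5):
    """Label each pixel skin / non-skin by thresholding the posterior
    P(skin | x) computed with Bayes rule from the hue histograms."""
    # Map each pixel's hue to its histogram bin.
    bins = np.clip(np.digitize(pixel_hues, edges) - 1,
                   0, len(p_x_given_skin) - 1)
    joint_skin = p_x_given_skin[bins] * prior_skin            # P(x|skin)P(skin)
    joint_not = p_x_given_notskin[bins] * (1 - prior_skin)    # P(x|not skin)P(not skin)
    posterior = joint_skin / np.maximum(joint_skin + joint_not, 1e-12)
    return posterior > threshold  # boolean skin mask
```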

Example: classifying skin pixels Gary Bradski, 1998 Kristen Grauman

Example: classifying skin pixels Using skin color-based face detection and pose estimation as a video-based interface Gary Bradski, 1998 Kristen Grauman

Supervised classification We want to minimize the expected misclassification. Two general strategies: use the training data to build a representative probability model, separately modeling the class-conditional densities and priors (generative); or directly construct a good decision boundary, modeling the posterior (discriminative).

Coming up Pset 4 is posted, due in two weeks. Next week: face detection; categorization with local features and part-based models.