Classifiers for Recognition Reading: Chapter 22 (skip 22.3) Slide credits for this chapter: Frank Dellaert, Forsyth & Ponce, Paul Viola, Christopher Rasmussen.

Classifiers for Recognition Reading: Chapter 22 (skip 22.3) Slide credits for this chapter: Frank Dellaert, Forsyth & Ponce, Paul Viola, Christopher Rasmussen. Examine each window of an image; classify the object class within each window based on a training set of images.

Example: A Classification Problem Categorize images of fish, say "Atlantic salmon" vs. "Pacific salmon". Use features such as length, width, lightness, fin shape & number, mouth position, etc. Steps: 1. Preprocessing (e.g., background subtraction) 2. Feature extraction 3. Classification. Example from Duda & Hart.

Bayes Risk Some errors may be inevitable: the minimum risk (shaded area) is called the Bayes risk. Figure: probability density functions (the area under each curve is 1).

Discriminative vs. Generative Models Finding a decision boundary (discriminative) is not the same as modeling a conditional density (generative).

Loss functions in classifiers Loss: some errors may be more expensive than others. E.g., for a fatal disease that is easily cured by a cheap medicine with no side effects, false positives in diagnosis are better than false negatives. We discuss two-class classification: L(1->2) is the loss incurred by calling a true class-1 example a 2. Total risk of using classifier s:
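
As a sketch of that total risk in the standard two-class form (with L(i->j) the loss of calling a class-i example a j, matching the notation above):

```latex
R(s) = \Pr(1 \to 2 \mid \text{using } s)\, L(1 \to 2) + \Pr(2 \to 1 \mid \text{using } s)\, L(2 \to 1)
```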

Histogram-based classifiers Use a histogram to represent each class-conditional density (i.e. p(x|1), p(x|2), etc.). Advantage: the estimates converge to the correct densities given enough data. Disadvantage: the histogram becomes huge in high dimensions, so it requires too much data; but maybe we can assume the features are independent?
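
A minimal sketch of such a classifier on a one-dimensional feature, assuming two classes, normalized histograms for the class-conditional densities, and priors estimated by counting; the function names, bin count, and toy data are illustrative:

```python
import numpy as np

def fit_histogram(samples, bins, value_range):
    """Estimate a class-conditional density p(x|class) with a normalized histogram."""
    counts, edges = np.histogram(samples, bins=bins, range=value_range, density=True)
    return counts, edges

def classify(x, hists, priors):
    """Assign x to the class maximizing p(x|class) * prior."""
    scores = []
    for (counts, edges), prior in zip(hists, priors):
        idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, len(counts) - 1)
        scores.append(counts[idx] * prior)
    return int(np.argmax(scores))

# toy data: class 0 centered at 2, class 1 centered at 5
rng = np.random.default_rng(0)
x0, x1 = rng.normal(2, 1, 500), rng.normal(5, 1, 500)
hists = [fit_histogram(x, bins=20, value_range=(-2, 10)) for x in (x0, x1)]
print(classify(4.8, hists, priors=[0.5, 0.5]))   # expected: 1
```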

Example Histograms

Kernel Density Estimation Parzen windows: approximate a probability density by estimating the local density of points (same idea as a histogram). Convolve the points with a window/kernel function (e.g., a Gaussian) whose scale parameter (e.g., sigma) controls the smoothing. From Hastie et al.
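
A minimal Parzen-window sketch with a Gaussian kernel, where sigma is the scale parameter discussed above; the data and sigma values are illustrative:

```python
import numpy as np

def parzen_density(x, samples, sigma):
    """Parzen-window estimate of p(x): average of Gaussian kernels centered on the samples."""
    samples = np.asarray(samples, dtype=float)
    kernels = np.exp(-0.5 * ((x - samples) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return kernels.mean()

data = [1.0, 1.2, 2.5, 3.1, 3.3]      # a handful of points, in the spirit of the 5-point example
for sigma in (0.1, 0.5, 1.0):          # small sigma: spiky estimate; large sigma: smooth estimate
    print(sigma, parzen_density(2.0, data, sigma))
```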

Density Estimation at Different Scales Example: density estimates for 5 data points with differently-scaled kernels. Scale trades off accuracy vs. generality (overfitting). From Duda et al.

Example: Kernel Density Estimation Decision Boundaries Figure panels show a smaller vs. a larger kernel scale. From Duda et al.

Application: Skin Colour Histograms Skin has a very small range of (intensity-independent) colours, and little texture. Compute a colour measure, check if the colour is in this range, and check that there is little texture (median filter). Get the class-conditional densities (histograms) and priors from data (by counting). The classifier then thresholds the resulting posterior: declare skin when p(skin|x) is large enough.
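
A rough sketch of that recipe, assuming a normalized-rg chrominance measure and a simple posterior threshold (the texture check is omitted); the bin count, prior, and threshold are illustrative, not the settings of any published skin detector:

```python
import numpy as np

BINS = 32

def chrominance(rgb):
    """Intensity-independent colour: normalized r, g coordinates in [0, 1]."""
    rgb = rgb.astype(float)
    total = rgb.sum(axis=-1, keepdims=True) + 1e-8
    return (rgb / total)[..., :2]

def hist2d(rg):
    """2-D chrominance histogram, normalized to a probability over bins."""
    h, _, _ = np.histogram2d(rg[:, 0], rg[:, 1], bins=BINS, range=[[0, 1], [0, 1]])
    return h / h.sum()

def skin_posterior(rg, h_skin, h_not, prior_skin=0.3):
    """Per-pixel p(skin | colour) from the two class histograms and a prior."""
    idx = np.clip((rg * BINS).astype(int), 0, BINS - 1)
    p_skin = h_skin[idx[:, 0], idx[:, 1]] * prior_skin
    p_not = h_not[idx[:, 0], idx[:, 1]] * (1 - prior_skin)
    return p_skin / (p_skin + p_not + 1e-8)

# usage: label pixels whose posterior exceeds a threshold chosen from the ROC curve, e.g.
# mask = skin_posterior(chrominance(pixels), h_skin, h_not) > 0.5
```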

Skin Colour Models Skin chrominance points; smoothed, [0,1]-normalized. Courtesy of G. Loy.

Skin Colour Classification courtesy of G. Loy

Results Figure from "Statistical color models with application to skin detection," M.J. Jones and J. Rehg, Proc. Computer Vision and Pattern Recognition, 1999; copyright 1999, IEEE.

ROC Curves (Receiver operating characteristics) Plot the trade-off between the false positive rate and the detection (true positive) rate as the decision threshold varies. Figure from "Statistical color models with application to skin detection," M.J. Jones and J. Rehg, Proc. Computer Vision and Pattern Recognition, 1999; copyright 1999, IEEE.
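
A short sketch of how such a curve can be traced, assuming per-pixel skin scores (e.g., posteriors as above) and ground-truth labels; the names are illustrative:

```python
import numpy as np

def roc_points(scores, labels, num_thresholds=100):
    """Sweep a threshold over the scores and report (false positive rate, true positive rate) pairs."""
    scores, labels = np.asarray(scores), np.asarray(labels).astype(bool)
    points = []
    for t in np.linspace(scores.min(), scores.max(), num_thresholds):
        detected = scores >= t
        tpr = (detected & labels).sum() / max(labels.sum(), 1)
        fpr = (detected & ~labels).sum() / max((~labels).sum(), 1)
        points.append((fpr, tpr))
    return points
```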

Nearest Neighbor Classifier Assign label of nearest training data point to each test data point Voronoi partitioning of feature space for 2-category 2-D and 3-D data from Duda et al.

K-Nearest Neighbors For a new point, find the k closest points in the training data; the labels of those k points "vote" to classify it. Avoids a fixed scale choice by using the data itself (can be very important in practice). A simple method that works well if the distance measure correctly weights the various dimensions. Figure (from Duda et al.): example density estimate with k = 5.
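
A minimal k-nearest-neighbours sketch with Euclidean distance; the names and default k are illustrative:

```python
import numpy as np
from collections import Counter

def knn_classify(x, train_X, train_y, k=5):
    """Label x by majority vote among the k nearest training points (Euclidean distance)."""
    dists = np.linalg.norm(np.asarray(train_X) - np.asarray(x), axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(np.asarray(train_y)[nearest])
    return votes.most_common(1)[0][0]
```

Because the neighbourhood is defined by the data around the query point, dense regions are effectively examined at a finer scale than sparse ones, which is the "avoids a fixed scale choice" point above.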

Neural networks Compose layered classifiers: use a weighted sum of the elements at the previous layer to compute the results at the next layer, and apply a smooth threshold function between layers (this introduces the non-linearity). Initialize the network with small random weights and learn all the weights by gradient descent (i.e., make small adjustments to improve the results).

Figure: a feed-forward network with input units, hidden units, and output units.

Training Adjust the parameters to minimize the error on the training set. Perform gradient descent, making small changes in the direction opposite the derivative of the error with respect to each parameter. Stop when the error is low and hasn't changed much. The network architecture itself is designed by hand to suit the problem, so only the weights are learned.
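
A tiny two-layer network trained by gradient descent, as a hedged sketch of the recipe above (sigmoid threshold function, squared error, small random initial weights); the layer sizes, learning rate, and iteration count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# toy problem: XOR of two binary inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# small random initial weights for a 2 -> 4 -> 1 network
W1, b1 = 0.5 * rng.standard_normal((2, 4)), np.zeros(4)
W2, b2 = 0.5 * rng.standard_normal((4, 1)), np.zeros(1)
lr = 1.0

for step in range(10000):
    # forward pass: weighted sums followed by the smooth threshold
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # backward pass: derivatives of the squared error w.r.t. each weight
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # gradient descent: small steps opposite the derivative
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())   # typically approaches [0, 1, 1, 0]
```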

The vertical face-finding part of Rowley, Baluja and Kanade’s system Figure from “Rotation invariant neural-network based face detection,” H.A. Rowley, S. Baluja and T. Kanade, Proc. Computer Vision and Pattern Recognition, 1998, copyright 1998, IEEE

Architecture of the complete system: they use another neural net to estimate orientation of the face, then rectify it. They search over scales to find bigger/smaller faces. Figure from “Rotation invariant neural-network based face detection,” H.A. Rowley, S. Baluja and T. Kanade, Proc. Computer Vision and Pattern Recognition, 1998, copyright 1998, IEEE

Face Finder: Training Positive examples: preprocess ~1,000 example face images into 20 x 20 inputs, and generate 15 "clones" of each with small random rotations, scalings, translations, and reflections. Negative examples: test the net on 120 known "no-face" images. From Rowley et al.

Face Finder: Results 79.6% of true faces detected with few false positives over a complex test set (from Rowley et al.). Figure: 135 true faces, 125 detected, 12 false positives.

Face Finder Results: Examples of Misses from Rowley et al.

Find the face! The human visual system needs to apply serial attention to detect faces (context often helps to predict where to look)

Convolutional neural networks Template matching using NN classifiers seems to work. Low-level features are linear filters, so why not learn the filter kernels, too?

Figure from "Gradient-Based Learning Applied to Document Recognition", Y. LeCun et al., Proc. IEEE, 1998; copyright 1998, IEEE. A convolutional neural network, LeNet; the layers filter, subsample, filter, subsample, and finally classify based on the outputs of this process.
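
A minimal sketch of the two building blocks repeated in such a network, a linear filtering step and a subsampling step; in LeNet the kernel values are learned by gradient descent, whereas here a random stand-in kernel is used just to show the data flow:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Correlate a single-channel image with one (learned) kernel, 'valid' region only."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def subsample(fmap, factor=2):
    """LeNet-style subsampling: average over non-overlapping factor x factor blocks."""
    H, W = fmap.shape[0] // factor * factor, fmap.shape[1] // factor * factor
    blocks = fmap[:H, :W].reshape(H // factor, factor, W // factor, factor)
    return blocks.mean(axis=(1, 3))

# filter -> subsample -> filter -> subsample; the result would then feed a classifier
image = np.random.rand(28, 28)
kernel = np.random.rand(5, 5) - 0.5          # stand-in for a learned kernel
features = subsample(conv2d_valid(subsample(conv2d_valid(image, kernel)), kernel))
print(features.shape)                         # (4, 4)
```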

LeNet is used to classify handwritten digits. Notice that the test error rate is not the same as the training error rate, because the learning “overfits” to the training data. Figure from “Gradient-Based Learning Applied to Document Recognition”, Y. Lecun et al Proc. IEEE, 1998 copyright 1998, IEEE

Support Vector Machines Try to obtain the decision boundary directly. Potentially easier, because we need to encode only the geometry of the boundary, not any irrelevant wiggles in the posterior. Not all points affect the decision boundary: only the support vectors do.

Support Vectors
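
A brief sketch of fitting a linear SVM and inspecting which training points end up as support vectors, using scikit-learn on toy data; the data and parameters are illustrative:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# two roughly linearly separable clusters
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# only these points determine the decision boundary; the rest could be removed
print(len(clf.support_vectors_), "support vectors out of", len(X), "training points")
print(clf.predict([[1.5, 1.5]]))
```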

Pedestrian Detection with SVMs

Figure from "A general framework for object detection," by C. Papageorgiou, M. Oren and T. Poggio, Proc. Int. Conf. Computer Vision, 1998; copyright 1998, IEEE.