Presentation transcript:

Slide 1: Category Recognition (Ellen L. Walker)
- Associating information extracted from images with categories (classes) of objects
- Requires prior knowledge about the objects (models)
- Requires compatible representation of model with data
- Requires appropriate reasoning techniques, including:
  - Classification (supervised & unsupervised)
  - Graph matching
  - Rule-based processing
  - Hybrid techniques

Slide 2: Knowledge Representation
- Syntax = symbols and how they are used
- Semantics = meanings of symbols & their arrangement
- Representation = syntax + semantics

Slide 3: Types of Representations
- Feature vectors: [area=200, eccentricity=1, ...]
- Grammars: person => head + trunk + legs
- Predicate logic: long(x) and thin(x) -> road(x)
- Production rules: if R is long and R is thin, then R is a road segment
- Graphs

Slide 4: Classification
- Feature-based object recognition
- An unknown object is represented by a feature vector (e.g., height, weight)
- Known objects are also represented by feature vectors, grouped into classes
- Class = set of objects that share important properties
- Reject class = generic class for all unidentifiable objects
- Classification = assigning the unknown object the label of the appropriate class

Slide 5: Types of Classification
- Discriminant classification (supervised): create dividing lines (discriminants) to separate classes, based on (positive and negative) examples
- Distance classification (unsupervised): create clusters in feature space to collect items of the same class
- A priori knowledge = prespecified discriminant functions or cluster centers

Slide 6: Classification Systems
- Pre-production (training data):
  - Extract relevant features (feature vectors) from training examples of each class
  - Construct (by hand) or use machine learning to develop discrimination functions that correctly classify the training examples
- Production (test data and real data):
  - Extract a feature vector from the image
  - Apply the discrimination functions determined in pre-production to find the closest class to the object
  - Report the resulting class label of the object
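A minimal sketch of this two-stage structure (not from the original slides), in Python with NumPy. Here extract_features is a hypothetical stub standing in for whatever image measurements a real system would compute, and the "learned" discrimination function is simply a stored mean per class:

```python
import numpy as np

def extract_features(image):
    # Hypothetical stub: a real system would compute measurements
    # such as area, eccentricity, mean intensity, etc.
    return np.asarray(image, dtype=float)

def train(examples):
    """Pre-production: examples maps class label -> list of training images.
    The stored per-class mean acts as the discrimination function."""
    return {c: np.mean([extract_features(im) for im in ims], axis=0)
            for c, ims in examples.items()}

def predict(image, model):
    """Production: extract features, score each class, report the best label."""
    x = extract_features(image)
    return min(model, key=lambda c: np.linalg.norm(x - model[c]))
```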

Slide 7: Evaluating Classification Systems
- Classification error = object classified into the wrong class
  - False positive = item identified as class, should be not-class
  - False negative = item identified as not-class, should be class
- Increasing sensitivity to true positives often increases false positives as well
- True positive rate (desired value: 1) = number of true positives / total number of positives
- False positive rate (desired value: 0) = number of false positives / total number of negatives
- Errors are measured on independent test data: these data have known classifications, but are not used in any way in the development (pre-production stage) of the system
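As a concrete illustration (the counts are invented), the two rates computed from results on an independent test set:

```python
def true_positive_rate(tp, fn):
    return tp / (tp + fn)   # true positives / total actual positives

def false_positive_rate(fp, tn):
    return fp / (fp + tn)   # false positives / total actual negatives

# Hypothetical test run: 40 actual positives (35 found, 5 missed),
# 60 actual negatives (6 wrongly flagged, 54 correctly rejected).
print(true_positive_rate(35, 5))   # 0.875 -- desired value: 1
print(false_positive_rate(6, 54))  # 0.1   -- desired value: 0
```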

Slide 8: Discrimination Functions
- Let g(x) be the "goodness" of x as a member of class g
- The discrimination function between g1 and g2 is simply g1(x) - g2(x) = 0 (i.e., both classes are equally good on the dividing line)
- An object's class is the g that gives the largest value for its feature vector x
- Linear functions are often used for g(x)
  - With one example per class, this reduces to nearest mean
  - Perceptrons represent linear discrimination functions (see NN notes)
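A sketch with two hand-picked linear goodness functions (the weights and offsets are invented for illustration); the sign of g1(x) - g2(x) tells which side of the dividing line a point falls on:

```python
import numpy as np

w1, b1 = np.array([1.0, 2.0]), -1.0   # g1(x) = w1 . x + b1
w2, b2 = np.array([-0.5, 1.0]), 0.5   # g2(x) = w2 . x + b2

def g1(x): return w1 @ x + b1
def g2(x): return w2 @ x + b2

def classify(x):
    # The boundary is where g1(x) - g2(x) = 0; the larger g wins.
    return "class 1" if g1(x) > g2(x) else "class 2"

x = np.array([1.0, 0.5])
print(classify(x), g1(x) - g2(x))  # -> class 1, positive difference
```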

Slide 9: Nearest Mean
- Let g(x) be the distance of x from the average of all training objects in class g
- Compute the Euclidean distance: ||x2 - x1|| = sqrt(sum over all dimensions d of (x2[d] - x1[d])^2), e.g., sqrt((height difference)^2 + (weight difference)^2)
- Works beautifully if classes are well separated and compact
- But consider a "horizontal" class or a "vertical" class!
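Making the distance computation explicit (the class means and the [height, weight] numbers are invented):

```python
import numpy as np

def euclidean(a, b):
    # ||a - b|| = sqrt(sum over dimensions d of (a[d] - b[d])^2)
    return np.sqrt(np.sum((np.asarray(a) - np.asarray(b)) ** 2))

def nearest_mean_label(x, class_means):
    return min(class_means, key=lambda c: euclidean(x, class_means[c]))

# Per-class mean [height, weight] vectors from some training set.
means = {"cat": np.array([30.0, 4.0]), "lion": np.array([120.0, 180.0])}
print(nearest_mean_label([35.0, 6.0], means))  # -> "cat"
```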

Slide 10: Scaled Distance
- Scaling the distance based on the "shape" of the class can help (using the variance in each dimension)
- Variance is the average of the squared distances of all related points from the mean
- In one dimension, we can measure distance in "standard deviations," i.e., (x - mean) / standard deviation
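A sketch of a per-dimension, variance-scaled distance (the class parameters are invented). With a "horizontal" class, a point far along the long axis can be a more plausible member than a point only slightly off along the short axis:

```python
import numpy as np

def scaled_distance(x, mean, std):
    """Distance measured in standard deviations per dimension."""
    z = (np.asarray(x) - np.asarray(mean)) / np.asarray(std)
    return np.sqrt(np.sum(z ** 2))

# "Horizontal" class: large spread in dimension 0, small in dimension 1.
mean, std = np.array([10.0, 5.0]), np.array([4.0, 0.5])
print(scaled_distance([18.0, 5.0], mean, std))  # 2.0 -- plausible member
print(scaled_distance([10.0, 7.0], mean, std))  # 4.0 -- unlikely member
```

Note that plain Euclidean distance would rank these two points the other way around (8.0 vs. 2.0).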

Slide 11: Mahalanobis Distance
- In multiple dimensions, we have a covariance matrix: a square matrix describing the relationships among the features in a feature vector
- The Mahalanobis distance effectively multiplies by the inverse of the covariance matrix: d(x) = sqrt((x - mean)^T C^{-1} (x - mean))
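A direct transcription of that formula (the training vectors used to estimate the mean and covariance are invented):

```python
import numpy as np

def mahalanobis(x, mean, cov):
    # d(x) = sqrt((x - mean)^T  cov^{-1}  (x - mean))
    d = np.asarray(x) - np.asarray(mean)
    return np.sqrt(d @ np.linalg.inv(cov) @ d)

# Covariance estimated from a few hypothetical training vectors of one class.
X = np.array([[1.0, 2.0], [2.0, 3.1], [3.0, 3.9], [4.0, 5.2]])
mean = X.mean(axis=0)
cov = np.cov(X, rowvar=False)
print(mahalanobis([2.5, 3.5], mean, cov))
```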

Slide 12: Nearest Neighbor
- Save the vectors for all the training examples (instead of just the mean of each class)
- The classification of a test vector is the class of its nearest neighbor in the training set
- Extension: let the k nearest neighbors "vote"
- Easily accommodates overlapping and oddly shaped classes (e.g., a dumbbell shape)
- More costly than nearest mean because of the extra comparisons (tree data structures can help)
- Highly dependent on the choice and number of training examples
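A small k-NN sketch with voting (the training data are invented, and lookup is brute force rather than tree-accelerated):

```python
import numpy as np
from collections import Counter

def knn_label(x, X_train, y_train, k=3):
    """Label x by a majority vote of its k nearest training vectors."""
    dists = np.linalg.norm(X_train - np.asarray(x), axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

X_train = np.array([[1.0, 1.0], [1.2, 0.9], [5.0, 5.0], [5.1, 4.8], [0.9, 1.1]])
y_train = ["a", "a", "b", "b", "a"]
print(knn_label([1.1, 1.0], X_train, y_train, k=3))  # -> "a"
```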

Slide 13: Statistical Method
- Minimum error criterion: minimize the probability that a new element will be misclassified (requires knowing the prior probabilities of feature-vector elements & combinations)
- The correct class is the one that maximizes P(class|vector) over all classes
- Bayes' rule: P(class|vector) = P(vector|class) P(class) / P(vector)
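A toy application of Bayes' rule (the likelihoods and priors are made-up numbers for illustration):

```python
likelihood = {"cat": 0.20, "lion": 0.05}   # P(vector|class)
prior      = {"cat": 0.70, "lion": 0.30}   # P(class)

evidence = sum(likelihood[c] * prior[c] for c in likelihood)  # P(vector)
posterior = {c: likelihood[c] * prior[c] / evidence for c in likelihood}
print(max(posterior, key=posterior.get), posterior)  # "cat" wins; posteriors sum to 1
```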

Slide 14: Decision Trees
- Each node is a question
- Each leaf is a decision
[Figure: an example decision tree with question nodes "hair?", "legs?", and "Pet?", and leaf decisions snake, frog, Cat, and Lion]

Slide 15: Decision Trees (continued)
- Build a classification tree to classify the training set
- Each branch in the tree denotes a comparison & decision process
- Each leaf of the tree is a classification
- Make the tree as "balanced" as possible!
- The branches in the tree represent (parts of) discriminant functions: you can classify an unknown object by walking the tree!
- Trees can be constructed by hand or by algorithm
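One way to walk such a tree in code, using a hand-built tree in the spirit of the figure two slides back (the exact branch arrangement is an assumption):

```python
# Each internal node is (question, yes-branch, no-branch); each leaf is a label.
tree = ("hair?",
        ("pet?", "cat", "lion"),
        ("legs?", "frog", "snake"))

def classify(node, features):
    if isinstance(node, str):              # leaf: a class label
        return node
    question, yes_branch, no_branch = node
    return classify(yes_branch if features[question] else no_branch, features)

print(classify(tree, {"hair?": True, "pet?": False}))   # -> "lion"
print(classify(tree, {"hair?": False, "legs?": True}))  # -> "frog"
```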

Slide 16: Automatic Construction of Decision Trees
- Use the idea of information content: which feature gives the most information to divide the data at that node?
- At the root: which feature contributes the most information about the class? If each value of a feature leads directly to a single class, that feature carries the most information possible
- At an interior node: given the subset of features remaining after the decisions already made, which contributes the most information?
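A sketch of the entropy / information-gain computation behind that choice (the four-example dataset is invented):

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, feature_values):
    """Node entropy minus the weighted entropy of the subsets produced by
    splitting on the feature; pick the feature with the largest gain."""
    gain, n = entropy(labels), len(labels)
    for v in set(feature_values):
        subset = [l for l, fv in zip(labels, feature_values) if fv == v]
        gain -= (len(subset) / n) * entropy(subset)
    return gain

labels = ["cat", "cat", "snake", "frog"]
hair   = [True,  True,  False,   False]
print(information_gain(labels, hair))  # 1.0 bit of the 1.5 bits at the node
```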

Slide 17: Clustering
- No training set needed!
- Hierarchical clustering: recursively divide the data into the most different (non-overlapping) subsets
- Non-hierarchical methods: divide the data directly among some (given?) number of clusters
  - K-means clustering
  - Fuzzy C-means clustering
  - Clustering to special shapes, e.g., shell clustering
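A compact k-means sketch (plain Lloyd-style iteration; the data points are invented):

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Alternate assigning points to the nearest center and moving each
    center to the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

X = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.9]])
centers, labels = kmeans(X, k=2)
print(labels)  # the two low points form one cluster, the two high points the other
```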