Presentation transcript:

Quadratic boundaries in N-N classifiers with dissimilarity-based representations
Yo Horikawa, Faculty of Engineering, Kagawa University, Takamatsu 761-0396, Japan (horikawa@eng.kagawa-u.ac.jp)
Nearest neighbor (N-N) classifiers with dissimilarity-based representations yield quadratic decision boundaries. The dissimilarity-based representations are effective for the classification of high-dimensional patterns with different within-class variances.

Dissimilarity-based representations
A pattern x in the feature space is represented in the dissimilarity space by the vector of its dissimilarities to the m prototypes x1, …, xm:
  d(x) = (d(x, x1), d(x, x2), …, d(x, xm))
[Figure: a pattern x in the feature space and its image d(x) in the dissimilarity space, whose axes are d(x, x1), d(x, x2), …, d(x, xm).]
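To make the mapping concrete, here is a minimal NumPy sketch (the function name and the choice of the Euclidean distance as the dissimilarity measure are assumptions, not given on the slide):

```python
import numpy as np

def dissimilarity_representation(X, prototypes):
    """Map patterns X (N x n) to the dissimilarity space spanned by the
    prototypes (m x n): each row becomes (d(x, x1), ..., d(x, xm))."""
    # Euclidean distance from every pattern to every prototype
    diffs = X[:, None, :] - prototypes[None, :, :]      # shape (N, m, n)
    return np.sqrt((diffs ** 2).sum(axis=2))            # shape (N, m)

# Example with three prototypes in a 2-D feature space
prototypes = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
x = np.array([[0.3, 0.4]])
print(dissimilarity_representation(x, prototypes))      # approx. [[0.5, 0.806, 0.806]]
```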

EX. 1
Prototypes and their dissimilarity representations:
  x1 = (0, 0) ∈ C1 → d1 = (0, 1, 1)
  x2 = (1, 0) ∈ C2 → d2 = (1, 0, 0)
  x3 = (1, 0) ∈ C2 → d3 = (1, 0, 0)
For a test pattern x = (x, y), the dissimilarity representation is
  d = (√(x² + y²), √((x−1)² + y²), √((x−1)² + y²)).
The N-N boundary in the feature space (the perpendicular bisector of x1 and x2) is a linear discriminant function.
The N-N boundary in the dissimilarity space is given by
  h(x, y) = d(d, d1)² − d(d, d2)² = −4√(x² − 2x + y² + 1) + 2√(x² + y²) + 1 = 0,
a quadratic discriminant function.
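The identity above can be checked numerically; a small sketch (helper names are mine, not from the slide) that evaluates both sides of h(x, y) = d(d, d1)² − d(d, d2)² at a few test points:

```python
import numpy as np

prototypes = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 0.0]])   # x1 in C1; x2, x3 in C2
d1 = np.array([0.0, 1.0, 1.0])   # dissimilarity representation of x1
d2 = np.array([1.0, 0.0, 0.0])   # dissimilarity representation of x2 (= d3)

def h_closed_form(x, y):
    return -4 * np.sqrt(x**2 - 2*x + y**2 + 1) + 2 * np.sqrt(x**2 + y**2) + 1

def h_from_dissimilarities(x, y):
    # d = (d(x, x1), d(x, x2), d(x, x3)) for the test pattern (x, y)
    d = np.sqrt(((np.array([x, y]) - prototypes) ** 2).sum(axis=1))
    return ((d - d1) ** 2).sum() - ((d - d2) ** 2).sum()

for x, y in [(0.3, 0.4), (2.0, -1.0), (0.5, 0.0)]:
    print(h_closed_form(x, y), h_from_dissimilarities(x, y))   # the two values agree
```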

EX. 2
Prototypes: x1 = (−2, −2) ∈ C1, x2 = (0, 0) ∈ C1, x3 = (1, 0) ∈ C2, x4 = (1.5, 0.5) ∈ C2.
Decision boundary with the N-N classifier in the dissimilarity space:
  min(d(d, d1)², d(d, d2)²) − min(d(d, d3)², d(d, d4)²) = 0,
a combination of quadratic curves.
In general, the decision boundary obtained with the dissimilarity-based representations surrounds the prototypes of the class with small variance. → effective when the within-class variances differ.
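A minimal sketch of the classifier used in EX. 2 (variable and function names are assumptions): each test pattern is mapped to the dissimilarity space and assigned the class of its nearest prototype representation there, which is equivalent to comparing the two minima above.

```python
import numpy as np

prototypes = np.array([[-2.0, -2.0], [0.0, 0.0],     # x1, x2 in C1
                       [1.0, 0.0], [1.5, 0.5]])      # x3, x4 in C2
labels = np.array([0, 0, 1, 1])

def to_dissimilarity(X, P):
    """Row i is the dissimilarity representation (d(x_i, p_1), ..., d(x_i, p_m))."""
    return np.sqrt(((X[:, None, :] - P[None, :, :]) ** 2).sum(axis=2))

D_proto = to_dissimilarity(prototypes, prototypes)    # d1, ..., d4

def nn_in_dissimilarity_space(X):
    D_test = to_dissimilarity(X, prototypes)          # representations of the test patterns
    # Euclidean distance, in the dissimilarity space, to every prototype representation;
    # the argmin realizes "min over C1 vs. min over C2" as in the boundary equation above.
    dist = np.sqrt(((D_test[:, None, :] - D_proto[None, :, :]) ** 2).sum(axis=2))
    return labels[dist.argmin(axis=1)]

print(nn_in_dissimilarity_space(np.array([[0.5, 0.0], [-1.0, -1.0]])))
```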

Effects of the dimensionality of the patterns
EX. 1: correct classification ratio of C1 with the N-N classifier.
Original patterns (n-dimensional): x1, x2 ∈ C1 ~ Un(−0.5, 0.5); x3, x4 (= 0) ∈ C2 ~ δn(0).
Dissimilarity representations:
  d1 = (0, |x1 − x2|, |x1|, |x1|)
  d2 = (|x1 − x2|, 0, |x2|, |x2|)
  d3 (= d4) = (|x1|, |x2|, 0, 0)
Fig. A1. Correct classification ratio for n-dimensional data C1 (Un(−0.5, 0.5)) and C2 (δn(0)) with the dissimilarity representations and with the original patterns.

EX. 2: correct classification ratio of C1 with the N-N classifier.
Original patterns (n-dimensional): 10 prototypes ∈ C1 ~ Nn(0, 1²); 10 prototypes ∈ C2 ~ Nn(0.5, 0.5²).
Dissimilarity representations: 20-dimensional space.
Fig. A2. Correct classification ratio for n-dimensional data C1 (Nn(0, 1²)) and C2 (Nn(0.5, 0.5²)) with the dissimilarity representations and with the original patterns.
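A sketch of how the comparison behind Fig. A2 could be reproduced; the number of test patterns, the tested dimensionalities, and the random seed are assumptions, since they are not given on the slide:

```python
import numpy as np

rng = np.random.default_rng(0)

def to_dissimilarity(X, P):
    return np.sqrt(((X[:, None, :] - P[None, :, :]) ** 2).sum(axis=2))

def nn_predict(X_test, X_proto, y_proto):
    dist = np.sqrt(((X_test[:, None, :] - X_proto[None, :, :]) ** 2).sum(axis=2))
    return y_proto[dist.argmin(axis=1)]

def trial(n, n_test=1000):
    # 10 prototypes per class: C1 ~ Nn(0, 1^2), C2 ~ Nn(0.5, 0.5^2)
    P = np.vstack([rng.normal(0.0, 1.0, (10, n)), rng.normal(0.5, 0.5, (10, n))])
    y = np.array([0] * 10 + [1] * 10)
    X1 = rng.normal(0.0, 1.0, (n_test, n))            # test patterns drawn from C1
    # N-N on the original patterns vs. on the 20-dimensional dissimilarity representations
    acc_orig = (nn_predict(X1, P, y) == 0).mean()
    acc_diss = (nn_predict(to_dissimilarity(X1, P), to_dissimilarity(P, P), y) == 0).mean()
    return acc_orig, acc_diss

for n in (2, 10, 50, 200):
    print(n, trial(n))   # correct classification ratio of C1: original vs. dissimilarity
```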

Texture classification
The dissimilarity-based representations are applied to the classification of texture images with bispectrum-based features.
5 prototypes for each of Bark.0000 and Bark.0001: the images are randomly shifted in [−10, 10], rotated in [0º, 360º], scaled in [×0.5, ×1.0], and corrupted with additive noise N(0, 100²).
100 test patterns per image, subjected to the same kinds of random transformations and noise, are classified with the N-N method using the bispectrum-based invariant features (dimensionality 108) and using their dissimilarity representations.
Correct classification ratio over ten trials: 0.82 with the original features, 0.99 with the dissimilarity representations.
(a) Bark.0000 (b) Bark.0001. Fig. 3. Texture image data in VisTex [12].
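A sketch of the experimental protocol only; the bispectrum-based invariant features and the VisTex image transformations are replaced by clearly marked placeholders, so the numbers it produces are not those of the slide:

```python
import numpy as np

rng = np.random.default_rng(0)

def invariant_features(image):
    # Placeholder for the 108-dimensional bispectrum-based invariant features of the slide;
    # a random stand-in is used here so the protocol below runs end to end.
    return rng.normal(size=108)

def random_transform(image):
    # Placeholder for the random shift in [-10, 10], rotation in [0, 360 deg],
    # scaling in [x0.5, x1.0] and additive N(0, 100^2) noise.
    return image

def run_trial(bark0, bark1, n_proto=5, n_test=100):
    # 5 prototypes per class, each represented by its dissimilarities to all 10 prototypes
    P = np.array([invariant_features(random_transform(img))
                  for img in (bark0, bark1) for _ in range(n_proto)])
    y = np.repeat([0, 1], n_proto)
    D_proto = np.sqrt(((P[:, None, :] - P[None, :, :]) ** 2).sum(axis=2))
    correct = 0
    for label, img in enumerate((bark0, bark1)):
        for _ in range(n_test):                        # 100 test patterns per image
            f = invariant_features(random_transform(img))
            d = np.sqrt(((f - P) ** 2).sum(axis=1))    # dissimilarity representation
            dist = np.sqrt(((d - D_proto) ** 2).sum(axis=1))
            correct += int(y[dist.argmin()] == label)  # N-N in the dissimilarity space
    return correct / (2 * n_test)
```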

Summary
The dissimilarity-based representations make the decision boundaries of the N-N classifiers quadratic; these boundaries are close to those of the optimal Bayes rule when the patterns are normally distributed with different within-class variances. Further, the dissimilarity-based representations are effective for higher-dimensional patterns with a small number of prototypes. This is attributed to the fact that the quadratic decision boundaries correctly divide the regions opposite to the prototypes, which are dominant in a high-dimensional space.
Dissimilarity-based representation → quadratic boundary → effective for high-dimensional patterns