1
ECSE 6610 Pattern Recognition, Professor Qiang Ji, Spring 2011
2
Pattern Recognition Overview

- Feature extraction: extract the most discriminative features to concisely represent the original data, typically involving dimensionality reduction.
- Training/learning: learn a mapping function that maps input features to output values.
- Classification/regression: map the input to a discrete output value for classification, or to a continuous output value for regression.

[Diagram: a training pipeline (raw data → feature extraction → training → learned classifier/regressor) and a testing pipeline (raw data → feature extraction → learned classifier/regressor → output values). A sketch of the same pipeline in code follows below.]
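A minimal sketch of the train/test pipeline in Python, assuming scikit-learn is available; PCA stands in for feature extraction and logistic regression for the learned classifier (both are illustrative choices, not prescribed by the slides):

```python
# Feature extraction + classifier, trained and then applied to held-out data.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)                     # raw data and labels
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Feature extraction (dimensionality reduction) feeding a classifier.
model = make_pipeline(PCA(n_components=20), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)                             # training phase
print("test accuracy:", model.score(X_test, y_test))    # testing phase
```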
3
Pattern Recognition Overview (cont'd)

- Supervised learning: both the input (features) and the output (class labels) are provided.
- Unsupervised learning: only the input is given. Examples include clustering, dimensionality reduction, and density estimation.
- Semi-supervised learning: some inputs have output labels and others do not.
4
Examples of Pattern Recognition Applications

- Computer/machine vision: object recognition, activity recognition, image segmentation, inspection
- Medical imaging: cell classification
- Optical character recognition: machine- or hand-written character/digit recognition
- Brain-computer interfaces: classifying human brain states from EEG signals
- Speech recognition: speaker recognition, speech understanding, language translation
- Robotics: obstacle detection, scene understanding, navigation
5
Computer Vision Example: Facial Expression Recognition
6
Machine Vision Example
7
Example: Handwritten Digit Recognition
8
Probability Calculus

P(X ˅ Y) = P(X) + P(Y) − P(X ˄ Y)

U is the sample space, and an event X is a subset of the outcomes, i.e., X ⊆ U. If X and Y are mutually exclusive (X ˄ Y = ∅), the rule reduces to P(X ˅ Y) = P(X) + P(Y).
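A quick numeric check of the rule (an illustrative example of my own, not from the slides), using events on one roll of a fair die:

```python
from fractions import Fraction

U = set(range(1, 7))        # sample space for one die roll
X = {2, 4, 6}               # event: the roll is even
Y = {4, 5, 6}               # event: the roll is at least 4

p = lambda event: Fraction(len(event), len(U))    # uniform probability
assert p(X | Y) == p(X) + p(Y) - p(X & Y)         # P(X ˅ Y) = P(X) + P(Y) − P(X ˄ Y)
print(p(X | Y))             # 2/3
```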
9
Probability Calculus (cont'd)

Conditional independence: A and B are conditionally independent given C if
P(A ˄ B | C) = P(A | C) P(B | C), or equivalently P(A | B, C) = P(A | C).

The Chain Rule: given three events A, B, C,
P(A ˄ B ˄ C) = P(A | B, C) P(B | C) P(C)
10
The Rules of Probability

Sum Rule: p(X) = Σ_Y p(X, Y)

Product Rule: p(X, Y) = p(Y | X) p(X)
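Both rules can be illustrated numerically on a small joint table (my own example, assuming NumPy):

```python
import numpy as np

# Joint distribution p(X, Y) as a table: rows index X, columns index Y.
joint = np.array([[0.10, 0.30],
                  [0.20, 0.40]])

p_x = joint.sum(axis=1)              # sum rule: p(X) = Σ_Y p(X, Y)
p_y_given_x = joint / p_x[:, None]   # rearranged product rule: p(Y|X) = p(X, Y) / p(X)

# The product rule reassembles the joint: p(X, Y) = p(Y|X) p(X).
assert np.allclose(p_y_given_x * p_x[:, None], joint)
print("p(X) =", p_x)
```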
11
Bayes' Theorem

p(Y | X) = p(X | Y) p(Y) / p(X), where p(X) = Σ_Y p(X | Y) p(Y)

posterior ∝ likelihood × prior
12
13
Bayes Rule

Based on the definition of conditional probability:

p(A | B) = p(A, B) / p(B) = p(B | A) p(A) / p(B)

For events A_1, …, A_n that partition the sample space, and evidence E:

p(A_i | E) = p(E | A_i) p(A_i) / p(E) = p(E | A_i) p(A_i) / Σ_i p(E | A_i) p(A_i)

- p(A_i | E) is the posterior probability given evidence E
- p(A_i) is the prior probability
- p(E | A_i) is the likelihood of the evidence given A_i
- p(E) is the probability of the evidence

[Diagram: the sample space partitioned into A_1 through A_6, with the evidence E overlapping the partition.]
14
Bayesian Rule (cont'd)

p(H | E_1, E_2) = p(E_1, E_2 | H) p(H) / p(E_1, E_2)

Assume E_1 and E_2 are independent given H; the above equation may then be written as

p(H | E_1, E_2) = p(E_2 | H) p(H | E_1) / p(E_2 | E_1)

where p(H | E_1) is the prior and p(E_2 | H) is the likelihood of H given E_2.
15
A Simple Example

Consider two related variables:
1. Drug (D) with values y or n
2. Test (T) with values +ve or −ve

And suppose we have the following probabilities:

P(D = y) = 0.001
P(T = +ve | D = y) = 0.8
P(T = +ve | D = n) = 0.01

These probabilities are sufficient to define a joint probability distribution. Suppose an athlete tests positive. What is the probability that he has taken the drug? (A worked computation follows below.)
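Working the question out with Bayes rule, using only the numbers given on the slide:

```python
# P(D=y | T=+ve) = P(T=+ve | D=y) P(D=y) / P(T=+ve)
p_d = 0.001                 # P(D = y)
p_pos_given_d = 0.8         # P(T = +ve | D = y)
p_pos_given_not_d = 0.01    # P(T = +ve | D = n)

# Total probability of a positive test.
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)
posterior = p_pos_given_d * p_d / p_pos
print(f"P(D=y | T=+ve) = {posterior:.4f}")    # ≈ 0.0741
```

Despite the positive test, the posterior is only about 7.4%, because the prior P(D = y) is so small.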
16
Expectation (or Mean)

For a discrete RV X: E[X] = Σ_x x p(x)

For a continuous RV X: E[X] = ∫ x p(x) dx

Conditional expectation: E[X | Y = y] = Σ_x x p(x | y)
17
Expectations

Conditional expectation (discrete): E_x[f | y] = Σ_x p(x | y) f(x)

Approximate expectation (discrete and continuous): E[f] ≈ (1/N) Σ_{n=1}^N f(x_n), where the x_n are drawn from p(x)
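The approximate expectation is just a Monte Carlo average over samples; a quick illustration of my own (assuming NumPy), estimating E[x²] under a standard normal, whose exact value is 1:

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.standard_normal(100_000)    # x_n drawn from p(x) = N(0, 1)

# E[f] ≈ (1/N) Σ f(x_n), here with f(x) = x²; the exact answer is 1.
estimate = np.mean(samples ** 2)
print(f"Monte Carlo estimate of E[x^2]: {estimate:.4f}")
```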
18
Variance

The variance of a RV X: Var(X) = E[(X − E[X])²] = E[X²] − (E[X])²

Standard deviation: σ = √Var(X)

Covariance of RVs X and Y: Cov(X, Y) = E[(X − E[X])(Y − E[Y])]

Chebyshev inequality: P(|X − E[X]| ≥ kσ) ≤ 1/k² for any k > 0
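An empirical check of the variance identity and the Chebyshev bound (my own sketch, assuming NumPy; the exponential distribution is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=200_000)

var_def = np.mean((x - x.mean()) ** 2)       # E[(X − E[X])²]
var_alt = np.mean(x ** 2) - x.mean() ** 2    # E[X²] − (E[X])²
assert np.isclose(var_def, var_alt)          # the two forms agree

# Chebyshev: P(|X − μ| ≥ kσ) ≤ 1/k², checked at k = 2.
k, mu, sigma = 2, x.mean(), x.std()
tail = np.mean(np.abs(x - mu) >= k * sigma)
print(f"P(|X−μ| ≥ {k}σ) = {tail:.4f} ≤ 1/k² = {1/k**2:.4f}")
```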
19
Variances and Covariances

var[f] = E[(f(x) − E[f(x)])²] = E[f(x)²] − E[f(x)]²

cov[x, y] = E_{x,y}[(x − E[x])(y − E[y])] = E_{x,y}[x y] − E[x] E[y]

For random vectors x and y: cov[x, y] = E_{x,y}[x yᵀ] − E[x] E[yᵀ]
20
Independence

If X and Y are independent, then

p(x, y) = p(x) p(y)
E[XY] = E[X] E[Y]
Cov(X, Y) = 0
21
Probability Densities

p(x) is the density function, while P(x) is the cumulative distribution:

P(x) = ∫_{−∞}^{x} p(t) dt, so p(x) = dP(x)/dx

P(x) is a non-decreasing function, with p(x) ≥ 0 and ∫ p(x) dx = 1.
22
Transformed Densities

Under a change of variables x = g(y), the density picks up the Jacobian factor:

p_y(y) = p_x(g(y)) |g′(y)|
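A numerical check of the change-of-variables formula (my own illustration, assuming NumPy): sample x from an exponential distribution, transform to y = ln x (so x = g(y) = e^y), and compare a histogram of y against p_y(y) = p_x(e^y) · e^y:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(1.0, size=500_000)    # p_x(x) = exp(−x) for x > 0
y = np.log(x)                             # i.e. x = g(y) = exp(y)

# Change of variables: p_y(y) = p_x(g(y)) |g'(y)| = exp(−e^y) · e^y.
hist, edges = np.histogram(y, bins=100, range=(-6, 3), density=True)
centers = (edges[:-1] + edges[1:]) / 2
predicted = np.exp(-np.exp(centers)) * np.exp(centers)

print("max |histogram − predicted density|:", np.max(np.abs(hist - predicted)))
```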
23
The Gaussian Distribution

N(x | μ, σ²) = (1 / √(2πσ²)) exp(−(x − μ)² / (2σ²))
24
Gaussian Mean and Variance

E[x] = ∫ N(x | μ, σ²) x dx = μ

E[x²] = ∫ N(x | μ, σ²) x² dx = μ² + σ²

var[x] = E[x²] − E[x]² = σ²
25
The Multivariate Gaussian

N(x | μ, Σ) = (1 / ((2π)^{D/2} |Σ|^{1/2})) exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ))

where μ is the D-dimensional mean vector and Σ is the D × D covariance matrix.
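Evaluating this density directly and cross-checking against a library implementation (my own sketch, assuming NumPy and SciPy):

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([0.0, 1.0])             # mean vector μ
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])        # covariance matrix Σ
x = np.array([0.5, 0.5])
D = len(mu)

# N(x | μ, Σ) = (2π)^(−D/2) |Σ|^(−1/2) exp(−½ (x−μ)ᵀ Σ⁻¹ (x−μ))
diff = x - mu
mahalanobis = diff @ np.linalg.inv(Sigma) @ diff
density = np.exp(-0.5 * mahalanobis) / np.sqrt((2 * np.pi) ** D * np.linalg.det(Sigma))

assert np.isclose(density, multivariate_normal(mu, Sigma).pdf(x))
print(density)
```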
26
Minimum Misclassification Rate

Two types of mistakes:
- False positive (type 1)
- False negative (type 2)

p(mistake) = p(x ∈ R_1, C_2) + p(x ∈ R_2, C_1) = ∫_{R_1} p(x, C_2) dx + ∫_{R_2} p(x, C_1) dx

The above is called the Bayes error. It is minimized by assigning each x to the class with the larger posterior; in one dimension the minimum Bayes error is achieved by placing the decision boundary at the point x_0 where p(x, C_1) = p(x, C_2).
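For two Gaussian class-conditionals with equal priors, the boundary x_0 and the resulting Bayes error can be computed numerically; a small sketch of my own (assuming NumPy and SciPy):

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.optimize import brentq
from scipy.stats import norm

# Two classes with equal priors and Gaussian class-conditional densities.
p1 = lambda x: 0.5 * norm.pdf(x, loc=0.0, scale=1.0)   # p(x, C_1)
p2 = lambda x: 0.5 * norm.pdf(x, loc=2.0, scale=1.0)   # p(x, C_2)

# Decision boundary x_0: the point where p(x, C_1) = p(x, C_2).
x0 = brentq(lambda x: p1(x) - p2(x), -5.0, 5.0)
print("x0 =", x0)                                      # 1.0, by symmetry

# Bayes error: integrate the smaller of the two joint densities.
grid = np.linspace(-10.0, 10.0, 20_001)
bayes_error = trapezoid(np.minimum(p1(grid), p2(grid)), grid)
print("Bayes error ≈", bayes_error)                    # ≈ 0.1587 here
```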
27
Generative vs Discriminative

Generative approach: model the class-conditional density p(x | C_k) and the prior p(C_k) (or the joint p(x, C_k)), then use Bayes' theorem to obtain the posterior p(C_k | x).

Discriminative approach: model the posterior p(C_k | x) directly.
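The contrast is easy to see in code (my own sketch, assuming scikit-learn): Gaussian naive Bayes is a generative model, logistic regression a discriminative one, and both expose the same fit/score interface:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression   # models p(C_k | x) directly
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB            # models p(x | C_k) and p(C_k)

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

generative = GaussianNB().fit(X_tr, y_tr)             # posterior via Bayes' theorem
discriminative = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

print("generative accuracy:    ", generative.score(X_te, y_te))
print("discriminative accuracy:", discriminative.score(X_te, y_te))
```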