Pattern Recognition: Statistical and Neural. Lonnie C. Ludeman. Lecture 8, Sept 23, 2005. Nanjing University of Science & Technology.

1 Pattern Recognition: Statistical and Neural Lonnie C. Ludeman Lecture 8 Sept 23, 2005 Nanjing University of Science & Technology

2 (Figure slide; "May be Optimum" is the only recovered text.)

3 Review 2: Classifier performance measures
1. A posteriori probability (maximize)
2. Probability of error (minimize)
3. Bayes average cost (minimize)
4. Probability of detection (maximize, with fixed probability of false alarm) (Neyman-Pearson rule)
5. Losses (minimize the maximum)

4 Review 3: MAP, MPE, and Bayes classification rule
Likelihood ratio test: decide C1 if l(x) > N and decide C2 if l(x) < N, where l(x) = p(x | C1) / p(x | C2) is the likelihood ratio and N is the threshold:
N_BAYES = (C22 - C12) P(C2) / [ (C11 - C21) P(C1) ]   (C_ij = cost of deciding C_i when C_j is true)
N_MAP = P(C2) / P(C1)
N_MPE = P(C2) / P(C1)
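To make the test concrete, here is a minimal Python sketch of this likelihood ratio test (my own illustration, not from the lecture); the priors and cost values in the usage lines are made-up numbers.

```python
def classify(l_x, P_C1, P_C2, costs=None):
    """Likelihood ratio test: decide C1 if l(x) > N, otherwise C2.

    costs[i][j] = C_ij, the cost of deciding class i+1 when class j+1 is true
    (the convention assumed above). With costs=None the threshold is the
    MAP/MPE threshold P(C2)/P(C1).
    """
    if costs is None:
        N = P_C2 / P_C1                                   # N_MAP = N_MPE
    else:
        (C11, C12), (C21, C22) = costs
        N = (C22 - C12) * P_C2 / ((C11 - C21) * P_C1)     # N_BAYES
    return "C1" if l_x > N else "C2"

# Made-up usage: equal priors; with 0-1 costs the Bayes rule reduces to MPE.
print(classify(l_x=1.3, P_C1=0.5, P_C2=0.5))                          # -> C1
print(classify(l_x=1.3, P_C1=0.5, P_C2=0.5, costs=[[0, 2], [1, 0]]))  # -> C2
```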

5 Topics for Lecture 8
1. Two-dimensional problem
2. Solution in likelihood space
3. Solution in pattern space
4. Solution in feature space
5. Calculation of probability of error
6. Transformational Theorem

6 Example: 2 classes and 2 observations
C1: x = [x1, x2]^T ~ p(x1, x2 | C1), P(C1)
C2: x = [x1, x2]^T ~ p(x1, x2 | C2), P(C2)
Given: C1: x ~ N(M1, K1) and C2: x ~ N(M2, K2), with P(C1) = P(C2) = 1/2,
M1 = [0, 0]^T, M2 = [1, 1]^T, and K1 = K2 = I (the 2 x 2 identity matrix).
Find the optimum (MPE) decision rule.
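A minimal Python sketch (not part of the lecture, assuming scipy is available) that sets up these two class-conditional densities with the means and identity covariances stated above and draws a few labelled samples:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Class-conditional densities for the example: N(M1, I) and N(M2, I).
M1, M2 = np.array([0.0, 0.0]), np.array([1.0, 1.0])
K = np.eye(2)
p_x_C1 = multivariate_normal(mean=M1, cov=K)
p_x_C2 = multivariate_normal(mean=M2, cov=K)
P_C1 = P_C2 = 0.5

rng = np.random.default_rng(0)
samples_C1 = p_x_C1.rvs(size=5, random_state=rng)       # a few samples from C1
samples_C2 = p_x_C2.rvs(size=5, random_state=rng)       # a few samples from C2
print(p_x_C1.pdf([0.2, 0.3]), p_x_C2.pdf([0.2, 0.3]))   # density values at a test point
```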

7 (Equation slide: likelihood ratio for the Gaussian example; equations not recovered.)

8 (Equation slide: continuation of the likelihood ratio derivation; equations not recovered.)

9 Solution in different spaces
Taking the ln of both sides gives the equivalent rule: decide C1 if -(x1 + x2 - 1) > 0, decide C2 if -(x1 + x2 - 1) < 0.
In observation space: decide C1 if x1 + x2 < 1, decide C2 if x1 + x2 > 1.
In feature space, with y = g(x1, x2) = x1 + x2, rearranging gives: decide C1 if y < 1, decide C2 if y > 1.
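As a quick sanity check (my own sketch, assuming scipy is available), the feature-space rule comparing x1 + x2 to 1 can be tested against the full equal-prior MPE rule comparing p(x|C1) to p(x|C2) on random points; the two agree everywhere off the boundary line.

```python
import numpy as np
from scipy.stats import multivariate_normal

p1 = multivariate_normal(mean=[0, 0], cov=np.eye(2))   # p(x | C1)
p2 = multivariate_normal(mean=[1, 1], cov=np.eye(2))   # p(x | C2)

def decide_full(x):
    """MPE rule with equal priors: decide C1 if p(x|C1) > p(x|C2)."""
    return "C1" if p1.pdf(x) > p2.pdf(x) else "C2"

def decide_feature(x):
    """Equivalent feature-space rule: y = x1 + x2, decide C2 if y > 1."""
    return "C2" if x[0] + x[1] > 1 else "C1"

rng = np.random.default_rng(1)
points = rng.normal(scale=2.0, size=(1000, 2))          # arbitrary test points
assert all(decide_full(p) == decide_feature(p) for p in points)
print("Full rule and feature-space rule agree on all test points.")
```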

10 Decision regions
In observation space: the decision boundary is the line x1 + x2 = 1 (crossing the x1 and x2 axes at 1); decide C2 above the line and C1 below it.
In feature space: along the y axis, where y = x1 + x2 (a sufficient statistic for this problem), decide C2 for y > 1 and C1 for y < 1.

11 Calculation of P(error | C1) for the 2-dimensional example, in y space
Under C1: x1 and x2 are independent Gaussian random variables, each N(0,1), so y = x1 + x2 is distributed N(0, 2).
P(error | C1) = P(decide C2 | C1) = ∫_{R2} p(y | C1) dy = ∫_{1}^{∞} (1 / (2 sqrt(pi))) exp(-y^2/4) dy

12 Calculation of P(error | C2) for the 2-dimensional example, in y space
Under C2: x1 and x2 are independent Gaussian random variables, each N(1,1), so y = x1 + x2 is distributed N(2, 2).
P(error | C2) = P(decide C1 | C2) = ∫_{R1} p(y | C2) dy = ∫_{-∞}^{1} (1 / (2 sqrt(pi))) exp(-(y-2)^2/4) dy

13 Probability of error for the example
P(error) = P(error | C1) P(C1) + P(error | C2) P(C2)
         = P(C1) ∫_{1}^{∞} (1 / (2 sqrt(pi))) exp(-y^2/4) dy + P(C2) ∫_{-∞}^{1} (1 / (2 sqrt(pi))) exp(-(y-2)^2/4) dy
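Both conditional error probabilities reduce to Q(1/sqrt(2)), about 0.2398, so with equal priors P(error) is about 0.24. The sketch below (mine, not from the slides) evaluates the two integrals with scipy's normal distribution and cross-checks the result by Monte Carlo.

```python
import numpy as np
from scipy.stats import norm

# Under C1, y = x1 + x2 ~ N(0, 2); under C2, y ~ N(2, 2) (variance 2 in both cases).
p_err_C1 = norm.sf(1, loc=0, scale=np.sqrt(2))    # P(y > 1 | C1)
p_err_C2 = norm.cdf(1, loc=2, scale=np.sqrt(2))   # P(y < 1 | C2)
p_err = 0.5 * p_err_C1 + 0.5 * p_err_C2
print(p_err_C1, p_err_C2, p_err)                  # each conditional error ~ 0.2398

# Monte Carlo cross-check of the same rule applied in pattern space.
rng = np.random.default_rng(0)
n = 200_000
x_C1 = rng.normal(0.0, 1.0, size=(n, 2))          # samples from C1
x_C2 = rng.normal(1.0, 1.0, size=(n, 2))          # samples from C2
err = 0.5 * np.mean(x_C1.sum(axis=1) > 1) + 0.5 * np.mean(x_C2.sum(axis=1) < 1)
print(err)                                        # ~ 0.24
```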

14 Transformational Theorem
Given: X is a random variable with known probability density function pX(x), and y = g(x) is a real-valued function with no flat spots.
Define the random variable Y = g(X). Then the probability density function of Y is
pY(y) = sum over all xi of pX(xi) / | dg(x)/dx evaluated at x = xi |,
where the xi are all the real roots of y = g(x).
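The theorem translates directly into code. The sketch below is my own illustration (not from the lecture): given a point y, a function returning the real roots of y = g(x), and the derivative g'(x), it evaluates the sum above; the usage lines anticipate the Y = X^2 example worked on the following slides.

```python
import numpy as np
from scipy.stats import norm

def transform_density(y, real_roots, g_prime, p_X):
    """pY(y) = sum over the real roots xi of y = g(x) of p_X(xi) / |g'(xi)|."""
    roots = real_roots(y)
    if not roots:                       # no real roots: pY(y) = 0
        return 0.0
    return sum(p_X(xi) / abs(g_prime(xi)) for xi in roots)

# Usage, anticipating the next example: g(x) = x^2 with X ~ N(0,1).
value = transform_density(
    y=1.5,
    real_roots=lambda y: [-np.sqrt(y), np.sqrt(y)] if y > 0 else [],
    g_prime=lambda x: 2 * x,
    p_X=norm.pdf,
)
print(value)   # equals exp(-1.5/2) / sqrt(2*pi*1.5), about 0.154
```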

15 Example: Transformational Theorem
Given: X ~ N(0,1). Define the function y = x^2 and the random variable Y = X^2.
Find the probability density function pY(y).

16 Solution
For y > 0 there are two real roots of y = x^2, given by x1 = -sqrt(y) and x2 = sqrt(y).
For y < 0 there are no real roots of y = x^2, therefore pY(y) = 0 for those values of y.

17 Apply the Transformational (Fundamental) Theorem
pY(y) = sum over all xi of pX(xi) / | dg(x)/dx at x = xi | if y = g(x) has real roots, and pY(y) = 0 if it has no real roots.
Here dg(x)/dx = 2x, so for y > 0:
pY(y) = pX(-sqrt(y)) / |2(-sqrt(y))| + pX(sqrt(y)) / |2(sqrt(y))|
      = (1/sqrt(2 pi)) exp(-(-sqrt(y))^2/2) / (2 sqrt(y)) + (1/sqrt(2 pi)) exp(-(sqrt(y))^2/2) / (2 sqrt(y))

18 Final Answer
pY(y) = [ exp(-y/2) / sqrt(2 pi y) ] u(y)
where u(y) is the unit step function (the density is zero for y < 0).
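For reference, this is the chi-square density with one degree of freedom. A quick Monte Carlo check (my own sketch) compares a histogram of Y = X^2, with X ~ N(0,1), against the formula above.

```python
import numpy as np

rng = np.random.default_rng(0)
y_samples = rng.normal(0.0, 1.0, size=1_000_000) ** 2     # samples of Y = X^2

def p_Y(y):
    """Density derived above: exp(-y/2) / sqrt(2*pi*y), for y > 0."""
    return np.exp(-y / 2) / np.sqrt(2 * np.pi * y)

# Empirical density over bins away from y = 0 (where the density blows up),
# normalized by the total sample count so it estimates the true density.
counts, edges = np.histogram(y_samples, bins=np.linspace(0.5, 4.0, 36))
centers = 0.5 * (edges[:-1] + edges[1:])
empirical = counts / (len(y_samples) * np.diff(edges))
print(np.max(np.abs(empirical - p_Y(centers))))            # a few thousandths
```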

19 Summary for Lecture 8
1. Two-dimensional problem
2. Solution in likelihood space
3. Solution in pattern space
4. Solution in feature space
5. Calculation of probability of error
6. Transformational Theorem

20 End of Lecture 8