Bayes Classification: A practical approach
Alon Slapak, March 2006

Outline: Example, Discriminant function, Bayes theorem, Bayes discriminant function, Bibliography



Slide 2: Discriminant function

Definition: a discriminant function h(x) is defined on the n-dimensional feature space; its zero set h(x) = 0 is a hypersurface of dimension n - 1 that divides the feature space into two regions, each containing one class. In a 2-dimensional feature space the boundary is a curve; in a 1-dimensional feature space it is a point.

[Figure: a discriminant function in a 2-dimensional feature space and in a 1-dimensional feature space]

Slide 3: Discriminant function

Let h(x) be a discriminant function. A two-category classifier uses the following rule:

Decide ω1 if h(x) > 0 and ω2 if h(x) < 0.
If h(x) = 0, x may be assigned to either class.

[Figure: the discriminant function and the region h(x) < 0]
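The two-category rule can be sketched in a few lines of Python. The discriminant `h` below is a hypothetical 1-dimensional example chosen only for illustration, and resolving the tie h(x) = 0 in favour of class 1 is an arbitrary choice, since the rule allows either class.

```python
def classify(h_value):
    """Apply the two-category rule to a discriminant value h(x)."""
    if h_value > 0:
        return 1  # decide class omega_1
    if h_value < 0:
        return 2  # decide class omega_2
    return 1      # h(x) = 0: either class is allowed; 1 is an arbitrary tie-break


def h(x):
    """Hypothetical 1-D linear discriminant with its boundary at x = 3."""
    return x - 3.0


for x in (1.0, 3.0, 4.5):
    print(x, "-> class", classify(h(x)))
```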

Slide 4: Thomas Bayes

At the time of his death, Rev. Thomas Bayes (1702–1761) left behind two unpublished essays attempting to determine the probabilities of causes from observed effects. Forwarded to the British Royal Society, the essays had little impact and were soon forgotten. Several years later, when the French mathematician Laplace independently rediscovered a very similar concept, English scientists quickly reclaimed ownership of what is now known as Bayes' theorem.

Slide 5: Conditional Probability

Definition: Let A and B be events with P(B) > 0. The conditional probability of A given B, denoted by P(A|B), is defined as:

P(A|B) = P(A ∩ B)/P(B)

Example: given N(A) = 30 and N(A ∩ B) = 10,

P(B|A) = N(A ∩ B)/N(A) = 10/30 = 1/3

[Figure: Venn diagram of events A and B]
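The counting example can be checked with exact fractions. This is a minimal Python sketch; the helper name `conditional_from_counts` is an assumption made for illustration.

```python
from fractions import Fraction


def conditional_from_counts(n_joint, n_given):
    """P(B | A) = N(A and B) / N(A), defined only when N(A) > 0."""
    if n_given <= 0:
        raise ValueError("the conditioning event must have a positive count")
    return Fraction(n_joint, n_given)


p_b_given_a = conditional_from_counts(10, 30)
print(p_b_given_a)  # 1/3
```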

Slide 6: Bayes' theorem

Since P(A|B) = P(A ∩ B)/P(B), we have: P(A|B)P(B) = P(A ∩ B)
Symmetrically we have: P(B|A)P(A) = P(B ∩ A) = P(A ∩ B)
Therefore: P(A|B)P(B) = P(B|A)P(A)
And: P(B|A) = P(A|B)P(B)/P(A)

where P(A|B) is the conditional probability, P(A) and P(B) are the prior probabilities, and P(B|A) is the posterior probability.

Slide 7: Bayes' theorem in a pattern recognition notation

Given classes ωi and a pattern x:

P(ωi|x) = P(x|ωi)P(ωi)/P(x)

Prior probability, P(ωi): reflects knowledge of the relative frequency of instances of a class.
Likelihood, P(x|ωi): a measure of the probability that a measurement value occurs in a class.
Evidence, P(x): a scaling term.
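As a rough illustration of how prior, likelihood and evidence combine, the Python sketch below computes the posterior of each class. It assumes the evidence is obtained from the law of total probability, P(x) = Σ P(x|ωi)P(ωi), which is what makes it a scaling term. The numbers are hypothetical.

```python
def posteriors(priors, likelihoods):
    """Return P(omega_i | x) for every class from P(omega_i) and P(x | omega_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # P(x): scales the posteriors so that they sum to 1
    if evidence == 0:
        raise ValueError("x has zero probability under every class")
    return [j / evidence for j in joint]


# Hypothetical values: class 1 is rarer but explains the measurement better.
post = posteriors(priors=[0.3, 0.7], likelihoods=[0.5, 0.1])
print(post)
```

Here the larger likelihood of the first class outweighs its smaller prior, so its posterior is the larger of the two.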

Slide 8: Bayes classifier

The following rule assigns each pattern x to one of two classes:

Decide ω1 if P(ω1|x) > P(ω2|x); otherwise decide ω2.

or (since P(x) is common to both sides):

Decide ω1 if P(x|ω1)P(ω1) > P(x|ω2)P(ω2); otherwise decide ω2.

In terms of the likelihood ratio:

Decide ω1 if P(x|ω1)/P(x|ω2) > P(ω2)/P(ω1); otherwise decide ω2.

Slide 9: Bayes Discriminant function

Since a ratio of probabilities may yield very small values, it is common to use the log of the likelihood ratio:

Decide ω1 if ln[P(x|ω1)/P(x|ω2)] > ln[P(ω2)/P(ω1)]; otherwise decide ω2.

and the derived Bayes discriminant function is:

h(x) = ln P(x|ω1) - ln P(x|ω2) + ln[P(ω1)/P(ω2)]

Remember: decide ω1 if h(x) > 0 and ω2 if h(x) < 0. If h(x) = 0, x may be assigned to either class.
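The numerical motivation for working with logarithms can be seen in a small Python sketch. It assumes the sign convention stated on the slide (decide ω1 if h(x) > 0) and uses hypothetical log-likelihood values so small that the raw probabilities underflow in double precision.

```python
import math


def bayes_discriminant(log_lik1, log_lik2, prior1, prior2):
    """h(x) = ln P(x|w1) - ln P(x|w2) + ln(P(w1)/P(w2)); decide class 1 if h > 0."""
    return log_lik1 - log_lik2 + math.log(prior1 / prior2)


# Hypothetical log-likelihoods of one pattern under the two classes.
log_lik1, log_lik2 = -800.0, -805.0
print(math.exp(log_lik1), math.exp(log_lik2))  # both underflow to 0.0

h = bayes_discriminant(log_lik1, log_lik2, prior1=0.5, prior2=0.5)
print(h)
```

The ratio of the two raw probabilities would be 0/0, whereas the difference of the logarithms is well defined and positive, so the pattern is assigned to class 1.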

Slide 10: Example - Gaussian Distributions

A multidimensional Gaussian distribution is:

P(x) = exp(-0.5 (x - M)' E^(-1) (x - M)) / ((2π)^(n/2) |E|^(1/2))

where M is the mean vector, E is the covariance matrix and n is the dimension of x.

[Figure: sample data labelled "Females"]
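A minimal Python sketch of the Gaussian density for the 2-dimensional case follows, written without external libraries. The function name is an assumption. The mean and covariance values are those of the first class in the MATLAB example on the later slides.

```python
import math


def gaussian_pdf_2d(x, mean, cov):
    """Density of a 2-D Gaussian with the given mean and 2x2 covariance at point x."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mean[0], x[1] - mean[1]]
    # Quadratic form (x - mean)' * inv(cov) * (x - mean)
    q = sum(dx[r] * inv[r][k] * dx[k] for r in range(2) for k in range(2))
    n = 2  # dimension of the feature space
    norm = (2 * math.pi) ** (n / 2) * math.sqrt(det)
    return math.exp(-0.5 * q) / norm


mean, cov = [30, 55], [[50, 40], [40, 50]]
peak = gaussian_pdf_2d(mean, mean, cov)        # density at the mean
off_peak = gaussian_pdf_2d([40, 45], mean, cov)
print(peak, off_peak)
```

The density is largest at the mean and falls off fastest in the direction in which the covariance matrix has the least variance.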


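Slide 12: Example - Gaussian Distributions

Assume two Gaussian distributed classes with

E1 = E2 = [50 40; 40 50]

and

M1 = [30, 55]', M2 = [60, 40]'

    clear all
    N1 = 150; N2 = 150;                     % number of samples in each class
    E1 = [50 40; 40 50];                    % covariance matrix of class 1
    E2 = [50 40; 40 50];                    % covariance matrix of class 2
    M1 = [30,55]'; M2 = [60,40]';           % class means (column vectors)
    %
    % Classes drawing
    %
    [P1,A1] = eig(E1); [P2,A2] = eig(E2);   % decomposition E = P*A*P'
    y1 = randn(2,N1); y2 = randn(2,N2);     % standard normal samples
    x1 = zeros(2,N1); x2 = zeros(2,N2);     % preallocate the class samples
    for i=1:N1,
        x1(:,i) = P1*sqrt(A1)*y1(:,i) + M1; % gives covariance E1 and mean M1
    end;
    for i=1:N2,
        x2(:,i) = P2*sqrt(A2)*y2(:,i) + M2; % gives covariance E2 and mean M2
    end;
    figure;
    plot(x1(1,:),x1(2,:),'^',x2(1,:),x2(2,:),'or');
    axis([ ]);                              % limits are elided in the source
    xlabel('x1')
    ylabel('x2')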

Slide 13: Example - Gaussian Distributions

    %
    % Classifier drawing
    %
    ep = 1.2;   % half-width of the band around h(x) = 0 that is plotted
    k = 1;      % grid step
    for i=0:k:100,
        for j=0:k:100,
            x = [i;j];
            % log-likelihood ratio of the two Gaussians (equal priors),
            % signed so that h > 0 on the class 1 side, as in the decision rule
            h = -0.5*(x-M1)'*inv(E1)*(x-M1) + 0.5*(x-M2)'*inv(E2)*(x-M2) - 0.5*log(det(E1)/det(E2));
            if (abs(h)<ep),
                hold on; plot(i,j,'*k'); hold off;
            end;
        end;
    end;

[Figure: the two classes with the plotted boundary between the regions h(x) > 0 and h(x) < 0]
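The expression evaluated inside the MATLAB loop corresponds, up to an overall sign and with equal priors, to the Bayes discriminant function with Gaussian densities substituted for the likelihoods. The derivation below is a sketch under the convention "decide ω1 if h(x) > 0". The symbols Σ1 and Σ2 stand for the matrices E1 and E2 of the code, and the prior term is included for generality.

```latex
\begin{aligned}
h(\mathbf{x}) &= \ln\frac{P(\mathbf{x}\mid\omega_1)}{P(\mathbf{x}\mid\omega_2)}
               + \ln\frac{P(\omega_1)}{P(\omega_2)} \\
&= -\tfrac{1}{2}(\mathbf{x}-M_1)^{T}\Sigma_1^{-1}(\mathbf{x}-M_1)
   + \tfrac{1}{2}(\mathbf{x}-M_2)^{T}\Sigma_2^{-1}(\mathbf{x}-M_2)
   - \tfrac{1}{2}\ln\frac{|\Sigma_1|}{|\Sigma_2|}
   + \ln\frac{P(\omega_1)}{P(\omega_2)} \\[4pt]
\text{If } \Sigma_1 = \Sigma_2 = \Sigma:\quad
h(\mathbf{x}) &= (M_1-M_2)^{T}\Sigma^{-1}\mathbf{x}
   - \tfrac{1}{2}\left(M_1^{T}\Sigma^{-1}M_1 - M_2^{T}\Sigma^{-1}M_2\right)
   + \ln\frac{P(\omega_1)}{P(\omega_2)}
\end{aligned}
```

With identical covariance matrices the quadratic terms cancel and the boundary h(x) = 0 is a straight line; with different covariance matrices the boundary is in general a quadratic curve. The boundary itself does not depend on the overall sign chosen for h.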

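Slide 14: Example - Gaussian Distributions

Assume two Gaussian distributed classes with

E1 = [50 40; 40 50], E2 = [50 -40; ]

and

M1 = [30, 55]', M2 = [60, 40]'

    clear all
    N1 = 150; N2 = 150;                     % number of samples in each class
    E1 = [50 40; 40 50];                    % covariance matrix of class 1
    E2 = [50 -40; ];                        % second row is elided in the source
    M1 = [30,55]'; M2 = [60,40]';           % class means (column vectors)
    %
    % Classes drawing
    %
    [P1,A1] = eig(E1); [P2,A2] = eig(E2);   % decomposition E = P*A*P'
    y1 = randn(2,N1); y2 = randn(2,N2);     % standard normal samples
    x1 = zeros(2,N1); x2 = zeros(2,N2);     % preallocate the class samples
    for i=1:N1,
        x1(:,i) = P1*sqrt(A1)*y1(:,i) + M1; % gives covariance E1 and mean M1
    end;
    for i=1:N2,
        x2(:,i) = P2*sqrt(A2)*y2(:,i) + M2; % gives covariance E2 and mean M2
    end;
    figure;
    plot(x1(1,:),x1(2,:),'^',x2(1,:),x2(2,:),'or');
    axis([ ]);                              % limits are elided in the source
    xlabel('x1')
    ylabel('x2')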

Slide 15: Example - Gaussian Distributions

    %
    % Classifier drawing
    %
    ep = 1;     % half-width of the band around h(x) = 0 that is plotted
    k = 1;      % grid step
    for i=0:k:100,
        for j=0:k:100,
            x = [i;j];
            % log-likelihood ratio of the two Gaussians (equal priors),
            % signed so that h > 0 on the class 1 side, as in the decision rule
            h = -0.5*(x-M1)'*inv(E1)*(x-M1) + 0.5*(x-M2)'*inv(E2)*(x-M2) - 0.5*log(det(E1)/det(E2));
            if (abs(h)<ep),
                hold on; plot(i,j,'*k'); hold off;
            end;
        end;
    end;

[Figure: the two classes with the plotted boundary between the regions h(x) > 0 and h(x) < 0]

Slide 16: Exercise

1. Synthesize two classes with different a priori probabilities. Show how the probabilities influence the discriminant function.
2. Synthesize three classes and plot the discriminant functions.
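As a starting point for the first exercise, the effect of the priors can be seen in closed form in one dimension. The Python sketch below is a simplified illustration, not a solution of the 2-dimensional exercise: it assumes two 1-dimensional Gaussian classes sharing one standard deviation, for which setting h(x) = 0 gives the boundary x* = (μ1 + μ2)/2 + σ² ln(P(ω2)/P(ω1)) / (μ1 - μ2). All numeric values are hypothetical.

```python
import math


def boundary_1d(mu1, mu2, sigma, prior1, prior2):
    """Point x* where h(x*) = 0 for two 1-D Gaussians with a common sigma."""
    if mu1 == mu2:
        raise ValueError("the class means must differ")
    midpoint = 0.5 * (mu1 + mu2)
    shift = sigma ** 2 * math.log(prior2 / prior1) / (mu1 - mu2)
    return midpoint + shift


for p1 in (0.5, 0.7, 0.9):
    x_star = boundary_1d(mu1=30.0, mu2=60.0, sigma=7.0, prior1=p1, prior2=1 - p1)
    print(p1, x_star)
```

With equal priors the boundary sits midway between the means. As the prior of class 1 grows, the boundary moves towards the mean of class 2, so the region assigned to the more probable class widens.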

Slide 17: Summary

Steps for Building a Bayesian Classifier

1. Collect class exemplars.
2. Estimate class a priori probabilities.
3. Estimate class means.
4. Form covariance matrices; find the inverse and determinant for each.
5. Form the discriminant function for each class.
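The steps above can be sketched end to end in Python for 2-dimensional features. This is a minimal illustration under several assumptions: the exemplars are hypothetical, priors are estimated as relative class frequencies, the covariance is the unbiased sample estimate, and each class gets a score ln P(x|ωi) + ln P(ωi) with the constant common to all classes dropped. The pattern is assigned to the class with the largest score.

```python
import math


def fit_class(samples, n_total):
    """Steps 2-4: prior, mean, inverse covariance and determinant of one class."""
    n = len(samples)
    mean = [sum(p[k] for p in samples) / n for k in range(2)]
    # Unbiased sample covariance (divides by n - 1)
    cov = [[sum((p[r] - mean[r]) * (p[c] - mean[c]) for p in samples) / (n - 1)
            for c in range(2)] for r in range(2)]
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    inv = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]
    return {"prior": n / n_total, "mean": mean, "inv": inv, "det": det}


def log_score(x, c):
    """Step 5: ln P(x | class) + ln P(class), without the shared constant."""
    d = [x[0] - c["mean"][0], x[1] - c["mean"][1]]
    q = sum(d[r] * c["inv"][r][k] * d[k] for r in range(2) for k in range(2))
    return -0.5 * q - 0.5 * math.log(c["det"]) + math.log(c["prior"])


def classify(x, classes):
    """Return the index of the class with the largest score."""
    scores = [log_score(x, c) for c in classes]
    return scores.index(max(scores))


# Step 1: hypothetical exemplars for two classes
class_a = [(28, 52), (31, 57), (33, 54), (27, 58), (30, 50)]
class_b = [(58, 41), (62, 38), (61, 43), (57, 37), (63, 42)]
n_total = len(class_a) + len(class_b)
classes = [fit_class(class_a, n_total), fit_class(class_b, n_total)]

print(classify((32, 55), classes), classify((60, 39), classes))
```

For two classes, the difference between the two scores plays the role of the discriminant function h(x), so comparing scores is equivalent to checking the sign of h(x).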

Slide 18: Bibliography

1. K. Fukunaga, Introduction to Statistical Pattern Recognition, 2nd ed., Academic Press, San Diego.
2. L. I. Kuncheva, J. C. Bezdek and R. P. W. Duin, "Decision Templates for Multiple Classifier Fusion: An Experimental Comparison", Pattern Recognition, 34(2).
3. R. O. Duda, P. E. Hart and D. G. Stork, Pattern Classification, 2nd ed., John Wiley & Sons.