Pattern Classification

Chapter 2 (Part 3)

All materials in these slides were taken from Pattern Classification (2nd ed.) by R. O. Duda, P. E. Hart and D. G. Stork, John Wiley & Sons, 2000, with the permission of the authors and the publisher.
Chapter 2 (Part 3): Bayesian Decision Theory (Sections 2.6, 2.9)
- Discriminant Functions for the Normal Density
- Bayes Decision Theory – Discrete Features
2.6 Discriminant Functions for the Normal Density

We saw that minimum error-rate classification can be achieved by the discriminant function

g_i(x) = ln p(x | ω_i) + ln P(ω_i)

For the multivariate normal density, p(x | ω_i) ~ N(μ_i, Σ_i), this becomes

g_i(x) = -(1/2)(x - μ_i)^t Σ_i^{-1} (x - μ_i) - (d/2) ln 2π - (1/2) ln |Σ_i| + ln P(ω_i)
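A minimal numerical sketch of this discriminant, using scipy to evaluate ln p(x | ω_i); the means, covariances, and priors below are made-up illustration values, not from the text:

```python
import numpy as np
from scipy.stats import multivariate_normal

def g(x, mu, sigma, prior):
    # g_i(x) = ln p(x | w_i) + ln P(w_i)
    return multivariate_normal(mean=mu, cov=sigma).logpdf(x) + np.log(prior)

# made-up two-class, two-feature problem
mu = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
sigma = [np.eye(2), np.array([[2.0, 0.5], [0.5, 1.0]])]
prior = [0.6, 0.4]

x = np.array([1.0, 2.0])
scores = [g(x, mu[i], sigma[i], prior[i]) for i in range(2)]
print(np.argmax(scores))   # decide the class with the largest g_i(x)
```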
Case 1: Σ_i = σ²I (I stands for the identity matrix)

The features are statistically independent with the same variance σ²; the discriminant functions reduce to linear functions:

g_i(x) = w_i^t x + w_{i0}, where
w_i = μ_i / σ²
w_{i0} = -μ_i^t μ_i / (2σ²) + ln P(ω_i)    (w_{i0} is called the threshold or bias for the i-th category)
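A short sketch of these weight formulas, assuming the true μ_i, σ², and P(ω_i) are given (the numbers are made up):

```python
import numpy as np

def case1_weights(mu, sigma_sq, prior):
    # w_i = mu_i / sigma^2
    # w_i0 = -mu_i^t mu_i / (2 sigma^2) + ln P(w_i)
    w = mu / sigma_sq
    w0 = -(mu @ mu) / (2 * sigma_sq) + np.log(prior)
    return w, w0

# g_i(x) = w @ x + w0; assign x to the class with the largest g_i(x)
w, w0 = case1_weights(np.array([1.0, 2.0]), sigma_sq=0.5, prior=0.5)
print(w @ np.array([0.5, 1.5]) + w0)
```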
A classifier that uses linear discriminant functions is called a "linear machine."

The decision surfaces for a linear machine are pieces of hyperplanes defined by:

g_i(x) = g_j(x)
In this case, the hyperplane separating R_i and R_j is always orthogonal to the line linking the means! It can be written as

w^t (x - x_0) = 0, where
w = μ_i - μ_j
x_0 = (1/2)(μ_i + μ_j) - [σ² / ||μ_i - μ_j||²] ln [P(ω_i) / P(ω_j)] (μ_i - μ_j)

If P(ω_i) = P(ω_j), then x_0 is halfway between the means.
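A sketch of the boundary point x_0 under the same Case 1 assumptions; with equal priors it reduces to the midpoint of the two means (the inputs here are illustrative values):

```python
import numpy as np

def case1_boundary_point(mu_i, mu_j, sigma_sq, prior_i, prior_j):
    # x0 = (mu_i + mu_j)/2 - sigma^2/||mu_i - mu_j||^2 * ln(P_i/P_j) * (mu_i - mu_j)
    diff = mu_i - mu_j            # w, the normal of the separating hyperplane
    shift = (sigma_sq / (diff @ diff)) * np.log(prior_i / prior_j)
    return 0.5 * (mu_i + mu_j) - shift * diff

print(case1_boundary_point(np.array([3.0, 6.0]), np.array([3.0, -2.0]),
                           sigma_sq=1.0, prior_i=0.5, prior_j=0.5))  # midpoint (3, 2)
```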
Case 2: Σ_i = Σ (the covariance matrices of all classes are identical but otherwise arbitrary!)

The discriminant functions are again linear:

g_i(x) = w_i^t x + w_{i0}, where
w_i = Σ^{-1} μ_i
w_{i0} = -(1/2) μ_i^t Σ^{-1} μ_i + ln P(ω_i)

The hyperplane separating R_i and R_j is

w^t (x - x_0) = 0, where
w = Σ^{-1} (μ_i - μ_j)
x_0 = (1/2)(μ_i + μ_j) - [ln [P(ω_i) / P(ω_j)] / ((μ_i - μ_j)^t Σ^{-1} (μ_i - μ_j))] (μ_i - μ_j)

(The hyperplane separating R_i and R_j is generally not orthogonal to the line between the means!)
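A sketch of the Case 2 weights, assuming a given shared covariance (the matrix below is made up):

```python
import numpy as np

def case2_weights(mu, sigma, prior):
    # w_i = Sigma^{-1} mu_i
    # w_i0 = -(1/2) mu_i^t Sigma^{-1} mu_i + ln P(w_i)
    sigma_inv = np.linalg.inv(sigma)
    w = sigma_inv @ mu
    w0 = -0.5 * mu @ sigma_inv @ mu + np.log(prior)
    return w, w0

# shared (made-up) covariance; g_i(x) = w @ x + w0 as before
sigma = np.array([[1.0, 0.3], [0.3, 2.0]])
w, w0 = case2_weights(np.array([1.0, 2.0]), sigma, prior=0.5)
print(w @ np.array([0.5, 1.5]) + w0)
```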
Case 3: Σ_i = arbitrary

The covariance matrices are different for each category. The discriminant functions are quadratic:

g_i(x) = x^t W_i x + w_i^t x + w_{i0}, where
W_i = -(1/2) Σ_i^{-1}
w_i = Σ_i^{-1} μ_i
w_{i0} = -(1/2) μ_i^t Σ_i^{-1} μ_i - (1/2) ln |Σ_i| + ln P(ω_i)

The decision surfaces are hyperquadrics: hyperplanes, pairs of hyperplanes, hyperspheres, hyperellipsoids, hyperparaboloids, or hyperhyperboloids.
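A sketch of the quadratic discriminant for Case 3, directly transcribing the formulas above:

```python
import numpy as np

def case3_params(mu, sigma, prior):
    # W_i = -(1/2) Sigma_i^{-1}
    # w_i = Sigma_i^{-1} mu_i
    # w_i0 = -(1/2) mu_i^t Sigma_i^{-1} mu_i - (1/2) ln |Sigma_i| + ln P(w_i)
    sigma_inv = np.linalg.inv(sigma)
    W = -0.5 * sigma_inv
    w = sigma_inv @ mu
    w0 = (-0.5 * mu @ sigma_inv @ mu
          - 0.5 * np.log(np.linalg.det(sigma))
          + np.log(prior))
    return W, w, w0

def g(x, W, w, w0):
    # quadratic discriminant g_i(x) = x^t W_i x + w_i^t x + w_i0
    return x @ W @ x + w @ x + w0
```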
Example

Samples from ω_1: (3,8), (3,4), (2,6), (4,6); samples from ω_2: (3,0), (3,-4), (1,-2), (5,-2)

The sample means and covariance matrices are

μ_1 = (3, 6)^t, Σ_1 = diag(1/2, 2)
μ_2 = (3, -2)^t, Σ_2 = diag(2, 2)

Assuming equal priors, setting g_1(x) = g_2(x) gives the decision boundary

x_2 = 3.514 - 1.125 x_1 + 0.1875 x_1²

(a parabola separating R_1 from R_2).
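A sketch that recomputes this example numerically, assuming equal priors and the 1/n (maximum-likelihood) covariance estimate:

```python
import numpy as np

R1 = np.array([[3, 8], [3, 4], [2, 6], [4, 6]], dtype=float)
R2 = np.array([[3, 0], [3, -4], [1, -2], [5, -2]], dtype=float)

mu1, mu2 = R1.mean(axis=0), R2.mean(axis=0)      # (3, 6) and (3, -2)
S1 = np.cov(R1, rowvar=False, bias=True)         # diag(0.5, 2)
S2 = np.cov(R2, rowvar=False, bias=True)         # diag(2, 2)

def g(x, mu, S):
    # g_i(x) up to terms common to both classes (equal priors assumed)
    diff = x - mu
    return -0.5 * diff @ np.linalg.inv(S) @ diff - 0.5 * np.log(np.linalg.det(S))

# a point on the claimed boundary should give g1(x) == g2(x) (up to rounding)
x1 = 1.0
x = np.array([x1, 3.514 - 1.125 * x1 + 0.1875 * x1**2])
print(g(x, mu1, S1) - g(x, mu2, S2))             # approximately 0
```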
2.9 Bayes Decision Theory – Discrete Features

The components of x are binary or integer valued; x can take only one of m discrete values v_1, v_2, …, v_m.
Case of independent binary features in a two-category problem

Let x = [x_1, x_2, …, x_d]^t, where each x_i is either 0 or 1, with probabilities

p_i = P(x_i = 1 | ω_1)
q_i = P(x_i = 1 | ω_2)
The discriminant function in this case is

g(x) = Σ_{i=1..d} w_i x_i + w_0, where
w_i = ln [ p_i (1 - q_i) / (q_i (1 - p_i)) ],  i = 1, …, d
w_0 = Σ_{i=1..d} ln [(1 - p_i) / (1 - q_i)] + ln [P(ω_1) / P(ω_2)]

Decide ω_1 if g(x) > 0 and ω_2 if g(x) ≤ 0.
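A sketch of this discriminant with made-up probabilities (three features, equal priors):

```python
import numpy as np

def binary_discriminant(x, p, q, prior1, prior2):
    # w_i = ln[p_i (1 - q_i) / (q_i (1 - p_i))]
    w = np.log(p * (1 - q) / (q * (1 - p)))
    # w_0 = sum_i ln[(1 - p_i) / (1 - q_i)] + ln[P(w1) / P(w2)]
    w0 = np.sum(np.log((1 - p) / (1 - q))) + np.log(prior1 / prior2)
    return w @ x + w0          # decide w1 if positive, w2 otherwise

p = np.array([0.8, 0.8, 0.8])  # P(x_i = 1 | w1), made-up values
q = np.array([0.5, 0.5, 0.5])  # P(x_i = 1 | w2), made-up values
print(binary_discriminant(np.array([1, 0, 1]), p, q, 0.5, 0.5))
```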
Assignment: 2.6.25