Pattern Recognition

Ku-Yaw Chang
canseco@mail.dyu.edu.tw

Assistant Professor
Department of Computer Science and Information Engineering
Da-Yeh University
Outline
- Introduction
- Features and Classes
- Supervised vs. Unsupervised
- Statistical vs. Structural (Syntactic)
- Statistical Decision Theory
Supervised vs. Unsupervised
- Supervised learning
  - Uses a training set of patterns of known class to classify additional, similar samples.
- Unsupervised learning
  - Divides samples into groups or clusters based on measures of similarity, without any prior knowledge of class membership.
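A minimal sketch of the contrast in Python, under illustrative assumptions (the heights, labels, and function names below are hypothetical, not from the slides): the supervised classifier uses the training labels, while the unsupervised routine groups the same values using only their similarity.

heights = [158.0, 162.0, 165.0, 176.0, 180.0, 185.0]   # hypothetical samples (cm)
labels  = ["F", "F", "F", "M", "M", "M"]               # known classes (supervised case only)

# Supervised: nearest-class-mean classifier built from labeled training data.
def class_means(xs, ys):
    means = {}
    for c in set(ys):
        vals = [x for x, y in zip(xs, ys) if y == c]
        means[c] = sum(vals) / len(vals)
    return means

def classify(x, means):
    return min(means, key=lambda c: abs(x - means[c]))

print(classify(170.0, class_means(heights, labels)))   # closest class mean wins

# Unsupervised: 2-means clustering, using no labels at all.
def two_means(xs, iters=20):
    c1, c2 = min(xs), max(xs)   # crude initialization at the extremes
    for _ in range(iters):
        g1 = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in xs if abs(x - c1) > abs(x - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return c1, c2

print(two_means(heights))   # two cluster centers discovered from similarity alone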
Supervised vs. Unsupervised
- Example: dividing a class of students into two groups
  - Supervised learning: train on known male features and female features, then classify new students.
  - Unsupervised learning: the two clusters may form along any measure of similarity, e.g.
    - Male vs. Female
    - Tall vs. Short
    - With vs. Without glasses
    - ...
Statistical vs. Structural
- Statistical PR
  - Obtains features by manipulating the measurements as purely numerical (or Boolean) variables.
- Structural (Syntactic) PR
  - Designs features in some intuitive way corresponding to human perception of the objects.
Statistical vs. Structural
- Example: Optical Character Recognition (OCR)
[Figure: an OCR example contrasting the statistical PR and structural PR approaches]
Statistical Decision Theory
- An automated classification system is built from:
  - Classified data sets
  - Selected features
Statistical Decision Theory
- Hypothetical Basketball Association (HBA)
  - apg: average number of points scored per game
- Goal: predict the winner of a game, based on the difference between the home team's apg and the visiting team's apg in previous games.
- Training set
  - Scores of previously played games
  - Each home team classified as a winner or a loser
Statistical Decision Theory
- Given a game to be played, predict whether the home team will win or lose using the feature:
  dapg = Home Team apg - Visiting Team apg
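The feature is a single subtraction; a minimal Python sketch (the function name is ours, and the apg values are the ones used in the worked example a few slides below):

def dapg(home_apg, visiting_apg):
    # Difference in average points per game, home minus visiting.
    return home_apg - visiting_apg

print(dapg(103.4, 102.1))   # 1.3, as in the worked example below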
Statistical Decision Theory

Game    dapg   Home Team      Game    dapg   Home Team
  1      1.3   Won              16    -3.1
  2     -2.7   Lost             17     1.7
  3     -0.5                    18     2.8
  4     -3.2                    19     4.6
  5      2.3                    20     3.0
  6      5.1                    21     0.7
  7     -5.4                    22    10.1
  8      8.2                    23     2.5
  9    -10.8                    24     0.8
 10     -0.4                    25    -5.0
 11     10.5                    26     8.1
 12     -1.1                    27    -7.1
 13       --                    28     2.7
 14     -4.2                    29   -10.0
 15     -3.4                    30    -6.5
Statistical Decision Theory
[Figure: histogram of dapg for the training games, grouped by home-team outcome (won/lost)]
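A sketch of how such a per-class histogram can be tallied in Python; since most Won/Lost labels in the table above are not legible, the labels used here are illustrative assumptions only:

import math
from collections import Counter

# Illustrative split of dapg values by outcome (assumed labels, not from the table).
won  = [1.3, 2.3, 5.1, 8.2, 10.5, 4.6, 10.1, 2.8]
lost = [-2.7, -3.2, -5.4, -10.8, -7.1, -6.5, -0.5]

def histogram(values, width=2.0):
    # Count how many samples fall into each interval of the given width.
    return Counter(math.floor(v / width) * width for v in values)

for name, vals in (("Won", won), ("Lost", lost)):
    counts = histogram(vals)
    for lo in sorted(counts):
        print(f"{name:4s} [{lo:6.1f}, {lo + 2.0:6.1f}): {'*' * counts[lo]}")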
Statistical Decision Theory
- The classification cannot be performed perfectly using the single feature dapg.
  - Estimate the probability of membership in each class and choose the class with the smallest expected penalty.
- Decision boundary or threshold
  - A value T such that the home team is classified as:
    - Won: dapg is greater than T
    - Lost: dapg is less than or equal to T
Statistical Decision Theory
- Worked example:
  - Home team's apg = 103.4, visiting team's apg = 102.1
  - dapg = 103.4 - 102.1 = 1.3, and 1.3 > T
  - Prediction: the home team will win the game.
- Candidate thresholds: T = 0.8 or T = -6.5
  - T = 0.8 achieves the minimum error rate on the training set.
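Choosing T can be done by brute force over candidate thresholds, minimizing the number of training misclassifications. A sketch under the same assumed labels as before (with the slides' full labeled data, this search would return T = 0.8):

def threshold_errors(T, won, lost):
    # Errors for the rule: predict Won iff dapg > T.
    return sum(1 for d in won if d <= T) + sum(1 for d in lost if d > T)

def best_threshold(won, lost):
    # Candidates: midpoints between adjacent sorted feature values.
    xs = sorted(won + lost)
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return min(candidates, key=lambda T: threshold_errors(T, won, lost))

won  = [1.3, 2.3, 5.1, 8.2, 10.5, 4.6, 10.1, 2.8]   # assumed labels, as above
lost = [-2.7, -3.2, -5.4, -10.8, -7.1, -6.5, -0.5]
T = best_threshold(won, lost)
print(T, threshold_errors(T, won, lost))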
Statistical Decision Theory
- Add a second feature to increase the accuracy of classification:
  dwp = Home Team wp - Visiting Team wp
  where wp denotes the winning percentage.
Statistical Decision Theory

Game    dapg     dwp   Home Team      Game    dapg     dwp   Home Team
  1      1.3    25.0   Won              16    -3.1     9.4
  2     -2.7   -16.9   Lost             17     1.7     6.8
  3     -0.5     5.3                    18     2.8    17.0
  4     -3.2   -27.5                    19     4.6    13.3
  5      2.3   -18.0                    20     3.0   -24.0
  6      5.1    31.2                    21     0.7   -17.8
  7     -5.4     5.8                    22    10.1    44.6
  8      8.2    34.3                    23     2.5   -22.4
  9    -10.8   -56.3                    24     0.8    12.3
 10     -0.4      --                    25    -5.0    -3.8
 11     10.5    16.3                    26     8.1    36.0
 12     -1.1   -17.6                    27    -7.1   -20.6
 13       --     5.7                    28     2.7    23.2
 14     -4.2    16.0                    29   -10.0   -46.9
 15     -3.4      --                    30    -6.5    19.7
Statistical Decision Theory
- Feature vector (dapg, dwp), presented as a scatterplot
[Figure: scatterplot of the 30 training games in the (dapg, dwp) plane, each point labeled W (home team won) or L (home team lost)]
Statistical Decision Theory
- The feature space can be divided into two decision regions by a straight line.
  - Linear decision boundary
- If a feature space cannot be perfectly separated by a straight line, a more complex boundary might be used.
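With two features, the decision rule becomes a sign test against a line a*dapg + b*dwp + c = 0. A sketch with illustrative coefficients (our assumption, not the boundary fitted in the slides):

def linear_classify(dapg, dwp, a, b, c):
    # Predict Won iff the point lies on the positive side of a*dapg + b*dwp + c = 0.
    return "Won" if a * dapg + b * dwp + c > 0 else "Lost"

a, b, c = 1.0, 0.1, -0.5                      # illustrative coefficients, not fitted
print(linear_classify(1.3, 25.0, a, b, c))    # game 1's features from the table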
Exercise One
- The values of a feature x for nine samples from class A are 1, 2, 3, 3, 4, 4, 6, 6, 8. Nine samples from class B have x values of 4, 6, 7, 7, 8, 9, 9, 10, 12.
- Make a histogram (with an interval width of 1) for each class and find a decision boundary (threshold) that minimizes the total number of misclassifications for this training data set.
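A brute-force sketch for checking the misclassification count at each integer threshold, assuming the rule "class A iff x <= T" (the direction of the rule is our choice, not stated in the exercise):

A = [1, 2, 3, 3, 4, 4, 6, 6, 8]
B = [4, 6, 7, 7, 8, 9, 9, 10, 12]

def errors(T):
    # Misclassifications for the rule: class A iff x <= T.
    return sum(1 for x in A if x > T) + sum(1 for x in B if x <= T)

for T in range(min(A + B), max(A + B) + 1):
    print(T, errors(T))   # pick the T with the smallest count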
Exercise Two
- Can the feature vectors (x, y) = (2,3), (3,5), (4,2), (2,7) from class A be separated from four samples from class B located at (6,2), (5,4), (5,6), (3,7) by a linear decision boundary?
- If so, give the equation of one such boundary and plot it. If not, find a boundary that separates them as well as possible.
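A small checker for testing any candidate line ax + by + c = 0 against the exercise's points; the coefficients tried at the end are one guess found by inspection, offered only as a way to verify an answer, not as the unique solution:

A = [(2, 3), (3, 5), (4, 2), (2, 7)]
B = [(6, 2), (5, 4), (5, 6), (3, 7)]

def separates(a, b, c):
    # True if every A point is on the negative side and every B point on the positive side.
    return (all(a * x + b * y + c < 0 for x, y in A) and
            all(a * x + b * y + c > 0 for x, y in B))

print(separates(2, 1, -12))   # tests the candidate line 2x + y = 12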