Pattern Recognition and Training
Pattern: a set of values (of known size) that describes things
The general problem
Approaches to the decision-making process
1. Simple comparison
2. Common property
3. Clusters (using a distance measurement such as |X1-u1| + |X2-u2| + … + |Xn-un|)
4. Combination of 1, 2 and 3
06/07/62 240-373 Image Processing, Lecture #13
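The distance measurement listed under approach 3 can be sketched in a few lines. This is a minimal illustration only: the function name `city_block_distance` and the sample values are assumptions, not part of the lecture.

```python
def city_block_distance(x, u):
    """Sum of absolute differences |x_i - u_i| between a pattern x and a cluster centre u."""
    if len(x) != len(u):
        raise ValueError("pattern and centre must have the same size")
    return sum(abs(xi - ui) for xi, ui in zip(x, u))


# Hypothetical values chosen only for illustration.
pattern = [3, 5, 7, 0]
centre = [3, 6, 6, 1]
d = city_block_distance(pattern, centre)
print(d)
```

A pattern would then be assigned to whichever cluster centre gives the smallest value.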
Decision Functions
Decision function: w = (w1, w2, w3, …, wn, wn+1)
If the pattern vector is x = [x1, x2, x3, …, xn, 1]^T, then
  the unknown pattern is in group B if w^T x > 0
  the unknown pattern is in group A if w^T x <= 0
Example: (8,4) is in group B because
  [1.5, -1.0, -3.5] [8, 4, 1]^T = 8 x 1.5 - 4 - 3.5 = 4.5 and 4.5 > 0
How about (4,4)?
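The rule above can be checked with a short sketch. The function name `decide` is an assumption; the weights and the two test points come from the slide. For (4,4) the same arithmetic gives 4 x 1.5 - 4 - 3.5 = -1.5, which is not greater than 0.

```python
def decide(w, point):
    """Evaluate w^T x for the augmented pattern x = [x1, ..., xn, 1] and pick a group."""
    x = list(point) + [1]  # append the constant 1 so the last weight acts as an offset
    score = sum(wi * xi for wi, xi in zip(w, x))
    group = "B" if score > 0 else "A"
    return score, group


w = [1.5, -1.0, -3.5]  # weights from the slide's example

score_84, group_84 = decide(w, (8, 4))
score_44, group_44 = decide(w, (4, 4))
print(score_84, group_84)  # 4.5 B
print(score_44, group_44)  # -1.5 A
```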
Decision Functions (Cont’d)
The number of groups can be more than 2
Decision table
  Result of w1    Result of w2    Implication
  < 0             < 0             no group
  < 0             > 0             group A
  > 0             < 0             group C
  > 0             > 0             group B
A decision function need not be a linear function
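As a rough sketch, the decision table can be written as a lookup on the signs of the two results. The function name `implication` is an assumption, and so is the handling of a result of exactly 0, which the table does not cover.

```python
def implication(r1, r2):
    """Map the results of two decision functions to a group, following the decision table."""
    if r1 == 0 or r2 == 0:
        return None  # assumption: the table gives no rule for a result of exactly 0
    table = {
        (False, False): "no group",
        (False, True): "group A",
        (True, False): "group C",
        (True, True): "group B",
    }
    return table[(r1 > 0, r2 > 0)]


print(implication(-2.0, 3.5))  # group A
print(implication(1.0, 1.0))   # group B
```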
Cluster Means
If the cluster consists of [3,4,8,2], [2,9,5,1] and [5,7,7,1], then the mean is [3.33, 6.67, 6.67, 1.33].
This represents the center of the four-dimensional cluster.
The Euclidean distance from the center to a new pattern can be calculated as follows:
  new vector [3, 5, 7, 0]
  squared Euclidean distance = (3-3.33)^2 + (5-6.67)^2 + (7-6.67)^2 + (0-1.33)^2 = 4.78
  Euclidean distance = sqrt(4.78) = 2.19
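The cluster mean and the distance to the new pattern can be reproduced with a small sketch. The helper names are assumptions; the data are the slide's. The sketch reports both the sum of squared differences (4.78) and its square root, which is the Euclidean distance proper.

```python
import math


def cluster_mean(patterns):
    """Component-wise mean of a list of equal-length pattern vectors."""
    n = len(patterns)
    return [sum(column) / n for column in zip(*patterns)]


def squared_distance(a, b):
    """Sum of squared component differences between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


cluster = [[3, 4, 8, 2], [2, 9, 5, 1], [5, 7, 7, 1]]
new_vector = [3, 5, 7, 0]

mean = cluster_mean(cluster)
sq = squared_distance(new_vector, mean)
dist = math.sqrt(sq)

print([round(m, 2) for m in mean])  # [3.33, 6.67, 6.67, 1.33]
print(round(sq, 2), round(dist, 2))  # 4.78 2.19
```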
Automatic Clustering
Technique 1: K-means clustering
USE: To automatically find the best groupings and means of K clusters.
OPERATION: The pattern vectors of K different items are given to the system, which classifies them as best it can (without knowing which vector belongs to which item).
Let the pattern vectors be X1, …, Xn
1. Take the first K points as the initial estimates of the cluster means: M1 = X1, M2 = X2, …, Mk = Xk
2. Allocate each pattern vector to the nearest group (minimum distance)
3. Calculate the new cluster centers
4. If they are the same as the old centers, then STOP; otherwise go to step 2
K-means clustering example
M1 = (2, 5.0)
M2 = (2, 5.5)
Allocating each pattern vector to the nearest center gives
  1  (2, 5.0)  group 1
  2  (2, 5.5)  group 2
  3  (6, 2.5)  group 1
  4  (7, 2.0)  group 1
  5  (7, 3.0)  group 1
  6  (3, 4.5)  group 1
The group means now become
  group 1: M1 = (5, 3.4)
  group 2: M2 = (2, 5.5)
This gives new groupings as follows:
  1  (2, 5.0)  group 2
  2  (2, 5.5)  group 2
  3  (6, 2.5)  group 1
  4  (7, 2.0)  group 1
  5  (7, 3.0)  group 1
  6  (3, 4.5)  group 2
And the group means become
  group 1: M1 = (6.67, 2.5)
  group 2: M2 = (2.33, 5.0)
Groupings now stay the same and the processing stops.
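The K-means procedure and the worked example can be traced with a minimal sketch. It assumes squared Euclidean distance as the "minimum distance" measure and keeps the old center if a group ever becomes empty; neither choice is stated in the lecture, and the function names are illustrative. Group indices are 0-based in the code, so index 0 is group 1.

```python
def squared_distance(a, b):
    """Sum of squared component differences (assumed distance measure)."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(patterns, k):
    """Take the first k patterns as initial means, then iterate until the means stop changing."""
    means = [tuple(p) for p in patterns[:k]]
    while True:
        # Allocate each pattern to the nearest current mean.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, means[j]))
            for p in patterns
        ]
        # Recompute each group mean from its members.
        new_means = []
        for j in range(k):
            members = [p for p, label in zip(patterns, labels) if label == j]
            if members:
                new_means.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_means.append(means[j])  # assumption: an empty group keeps its old center
        if new_means == means:
            return labels, means
        means = new_means


patterns = [(2, 5.0), (2, 5.5), (6, 2.5), (7, 2.0), (7, 3.0), (3, 4.5)]
labels, means = k_means(patterns, 2)
rounded_means = [tuple(round(c, 2) for c in m) for m in means]

print(labels)         # [1, 1, 0, 0, 0, 1]
print(rounded_means)  # [(6.67, 2.5), (2.33, 5.0)]
```

The result depends on the initial means; a different ordering of the pattern vectors may give different groupings.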
Optical Character Recognition
Technique: Isolation of a character in an OCR document
USE: To create a window containing only one character onto an array containing a text image
OPERATION:
1. Assume that the image is correctly oriented and the text is dark on a white background.
2. Calculate row sums of the pixel gray-level values. High row sums indicate a space between the rows of text.
3. Calculate column sums of the pixel gray-level values. High column sums indicate a space between the characters.
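The row-sum and column-sum steps can be sketched as follows. This is an illustration under several assumptions not stated in the lecture: white is gray level 255, a row or column counts as a space only when every pixel in it is white, and column sums are taken within each text row separately. The names `ink_runs` and `character_windows` are hypothetical.

```python
WHITE = 255  # assumed gray level of the background


def ink_runs(sums, blank_sum):
    """Return (start, end) pairs, end exclusive, of consecutive entries whose sum is below blank_sum."""
    runs, start = [], None
    for i, s in enumerate(sums):
        if s < blank_sum and start is None:
            start = i
        elif s >= blank_sum and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(sums)))
    return runs


def character_windows(image):
    """Return (top, bottom, left, right) windows, one per isolated character."""
    width = len(image[0])
    row_sums = [sum(row) for row in image]
    windows = []
    for top, bottom in ink_runs(row_sums, WHITE * width):
        band = image[top:bottom]  # one row of text
        col_sums = [sum(col) for col in zip(*band)]
        for left, right in ink_runs(col_sums, WHITE * (bottom - top)):
            windows.append((top, bottom, left, right))
    return windows


# Tiny made-up image: '#' is a dark pixel, '.' is white.
rows = [
    ".......",
    ".#..##.",
    ".#..#..",
    ".......",
    "..###..",
]
image = [[0 if ch == "#" else WHITE for ch in row] for row in rows]
windows = character_windows(image)
print(windows)  # [(1, 3, 1, 2), (1, 3, 4, 6), (4, 5, 2, 5)]
```

On a real scan the all-white test would likely be replaced by a threshold, since noise keeps blank rows from summing to the exact maximum.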
Technique: Creating the pattern vector (feature extraction)
USE: To create the pattern vector for a character so that it can be compared with the library
OPERATION:
1. Assume that the character has been isolated.
2. Place a 4x4 grid over the image and count the number of “ink” pixels in each grid cell.
3. Divide each count by the total number of pixels in that cell.
4. Compare the resulting numbers with the library.
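The grid-based feature extraction can be sketched as below. The sketch assumes a binary character image (1 = ink) whose height and width are multiples of 4, normalizes each count by the pixels in its own cell, and compares against a made-up two-entry library using the sum of absolute differences. The library contents and all function names are hypothetical.

```python
def grid_features(char, grid=4):
    """Fraction of ink pixels in each cell of a grid x grid overlay, listed row by row."""
    rows, cols = len(char), len(char[0])
    cell_h, cell_w = rows // grid, cols // grid  # assumes dimensions divisible by grid
    features = []
    for gr in range(grid):
        for gc in range(grid):
            ink = sum(
                char[r][c]
                for r in range(gr * cell_h, (gr + 1) * cell_h)
                for c in range(gc * cell_w, (gc + 1) * cell_w)
            )
            features.append(ink / (cell_h * cell_w))
    return features


def closest_match(features, library):
    """Return the library key whose vector has the smallest sum of absolute differences."""
    return min(
        library,
        key=lambda name: sum(abs(f - g) for f, g in zip(features, library[name])),
    )


# Made-up 8x8 character: a one-pixel-wide vertical bar in the left column.
char = [[1 if c == 0 else 0 for c in range(8)] for _ in range(8)]
features = grid_features(char)

# Hypothetical library of 16-element pattern vectors.
library = {
    "I": [0.5, 0, 0, 0] * 4,
    "blank": [0] * 16,
}
match = closest_match(features, library)
print(features[:4], match)  # [0.5, 0.0, 0.0, 0.0] I
```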