1
Classification Adriano Joaquim de O Cruz ©2002 NCE/UFRJ adriano@nce.ufrj.br
2
Classification
Technique that associates samples with previously known classes. May be crisp or fuzzy.
Supervised: MLP (trained).
Unsupervised: K-NN and fuzzy K-NN (not trained).
3
K-NN and Fuzzy K-NN Classification Methods
Classes are identified by patterns.
A sample is classified by its k nearest neighbours.
Requires previous knowledge about the problem classes.
Not restricted to a specific distribution of the samples.
4
Classification Crisp K-NN
5
Crisp K-NN
Supervised clustering (classification) method: classes are defined beforehand.
Classes are characterized by sets of elements; the number of elements may differ among classes.
The main idea is to assign the sample to the class containing the most of its k nearest neighbours.
6
Crisp K-NN
[Figure: fourteen labelled patterns w1-w14 spread over five classes. With k = 3 nearest neighbours, sample s is closest to pattern w6 in class 5.]
7
Crisp K-NN
Consider W = {w_1, w_2, ..., w_t} a set of t labelled data.
Each object w_i is described by l characteristics: w_i = (w_i1, w_i2, ..., w_il).
y is an unclassified input element.
k is the number of nearest neighbours of y.
E is the set of k nearest neighbours (NN).
8
Crisp K-NN
Let t be the number of elements that identify the classes.
Let c be the number of classes.
Let W be the set containing the t elements.
Each cluster is represented by a subset of elements from W.
9
Crisp K-NN algorithm
set k
{Calculating the NN}
for i = 1 to t
    calculate the distance from y to w_i
    if i <= k then
        add w_i to E
    else if w_i is closer to y than any previous NN then
        delete the farthest neighbour and include w_i in the set E
10
Crisp K-NN algorithm (cont.)
Determine the majority class represented in the set E and include y in this class.
if there is a draw then
    calculate, for each class in the draw, the sum of the distances from y to its neighbours in E
    if the sums are different then
        assign y to the class with the smallest sum
    else
        assign y to the class where the last minimum was found
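A minimal sketch of the crisp K-NN procedure above in Python/numpy, assuming Euclidean distance (the slides fix no metric); the function name `crisp_knn` and the array layout are illustrative, not from the slides.

```python
import numpy as np

def crisp_knn(W, labels, y, k):
    """Assign sample y to the majority class among its k nearest patterns in W."""
    W, labels, y = np.asarray(W), np.asarray(labels), np.asarray(y)
    d = np.linalg.norm(W - y, axis=1)        # distance from y to every labelled pattern
    nn = np.argsort(d)[:k]                   # indexes of the k nearest neighbours (the set E)
    classes, votes = np.unique(labels[nn], return_counts=True)
    tied = classes[votes == votes.max()]     # classes involved in a draw, if any
    if len(tied) == 1:
        return tied[0]
    # Draw: sum the distances from y to each tied class's neighbours; smallest sum wins
    sums = {c: d[nn][labels[nn] == c].sum() for c in tied}
    return min(sums, key=sums.get)
```

For the situation in the figure above, `crisp_knn(W, labels, s, k=3)` would return the class winning the majority vote among the three patterns nearest to s.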
11
Classification Fuzzy K-NN
12
Fuzzy K-NN
The basis of the algorithm is to assign membership as a function of the object's distance from its k nearest neighbours and of those neighbours' memberships in the possible classes.
J. Keller, M. Gray, J. Givens. A Fuzzy K-Nearest Neighbor Algorithm. IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-15, no. 4, July/August 1985.
13
Fuzzy K-NN
[Figure: labelled patterns w1-w14 grouped into five classes (Class 1 to Class 5).]
14
Fuzzy K-NN
Consider W = {w_1, w_2, ..., w_t} a set of t labelled data.
Each object w_i is described by l characteristics: w_i = (w_i1, w_i2, ..., w_il).
y is an unclassified input element.
k is the number of nearest neighbours of y.
E is the set of k nearest neighbours (NN).
u_i(y) is the membership of y in class i.
u_ij is the membership in the ith class of the jth vector of the labelled set (how much labelled pattern w_j belongs to class i).
15
Fuzzy K-NN
Let t be the number of elements that identify the classes.
Let c be the number of classes.
Let W be the set containing the t elements.
Each cluster is represented by a subset of elements from W.
16
Fuzzy K-NN algorithm
set k
{Calculating the NN}
for i = 1 to t
    calculate the distance from y to w_i
    if i <= k then
        add w_i to E
    else if w_i is closer to y than any previous NN then
        delete the farthest neighbour and include w_i in the set E
17
Fuzzy K-NN algorithm
for i = 1 to c (the number of classes), calculate u_i(y) using
u_i(y) = [ Σ_{j=1}^{k} u_ij (1 / ||y − w_j||^{2/(m−1)}) ] / [ Σ_{j=1}^{k} (1 / ||y − w_j||^{2/(m−1)}) ]
where w_j runs over the k nearest neighbours of y and m > 1 is the fuzziness constant.
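In code, this update is a weighted average of the neighbours' class memberships with inverse-distance weights; the small epsilon is an added safeguard (not on the slide) against y coinciding with a neighbour.

```python
import numpy as np

def fuzzy_memberships(y, neighbours, u_nn, m, eps=1e-12):
    """Keller-style u_i(y) for every class i, from the k nearest neighbours of y.

    neighbours: (k, l) array with the k nearest labelled patterns;
    u_nn: (c, k) array, u_nn[i, j] = membership of neighbour j in class i;
    m > 1 is the fuzziness constant.
    """
    d = np.linalg.norm(neighbours - y, axis=1)        # distances to the k neighbours
    w = 1.0 / np.maximum(d, eps) ** (2.0 / (m - 1))   # inverse-distance weights
    return (u_nn * w).sum(axis=1) / w.sum()           # weighted mean per class
```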
18
Computing u_ij
u_ij can be assigned in several ways, as sketched below:
They can be given complete membership in their known class and no membership in all others.
Assign membership based on the distance from their class mean.
Assign membership based on the distance from labelled samples of their own class and from those of other classes.
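Two of these schemes in Python: the crisp scheme is exactly the first bullet, while the class-mean variant is one plausible reading of the second (the slide names the idea but gives no formula), so treat its form as an assumption.

```python
import numpy as np

def crisp_labels(labels, c):
    """First scheme: complete membership in the known class, none elsewhere."""
    labels = np.asarray(labels)                      # integer class labels 0..c-1
    u = np.zeros((c, len(labels)))
    u[labels, np.arange(len(labels))] = 1.0
    return u

def mean_distance_labels(W, labels, c, m=2.0, eps=1e-12):
    """Second scheme, assumed FCM-style: membership decays with the distance
    from each class mean."""
    W, labels = np.asarray(W), np.asarray(labels)
    means = np.stack([W[labels == i].mean(axis=0) for i in range(c)])
    d = np.maximum(np.linalg.norm(W[None, :, :] - means[:, None, :], axis=2), eps)
    u = d ** (-2.0 / (m - 1))
    return u / u.sum(axis=0)                         # normalise over classes
```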
19
Classification ICC-KNN System
20
ICC-KNN System
A non-parametric statistical pattern recognition system.
Combines FCM, fuzzy K-NN and ICC.
Handles data arranged in classes of several shapes.
21
ICC-KNN System
Divided into two modules.
The first module (training) chooses the best patterns to use with K-NN, the best fuzziness constant (m) and the best number of neighbours (k).
The second module (classification) uses fuzzy K-NN to classify.
22
ICC-KNN First Module: Classification Module
Finds structure in the data sample. Divided into two phases.
First phase of training: finds the best patterns for fuzzy K-NN.
FCM: applied to each class for several numbers of categories.
ICC: finds the best number of categories to represent each class.
23
ICC-KNN First Phase
Results of applying FCM and ICC:
the patterns for K-NN, which are the centres of the chosen FCM run;
the number of centres, i.e. all the centres for the numbers of categories selected after applying ICC to all FCM runs.
24
ICC-KNN Second Phase
Second phase of training: evaluates the best fuzziness constant and the best number of neighbours so as to achieve the best K-NN performance.
Tests several m and k values.
Finds the m and k with the maximum rate of crisp hits.
25
ICC-KNN Pattern Recognition Module
Assigns each datum to its class.
Uses the chosen patterns, m and k, to classify the data.
26
ICC-KNN block diagram
[Block diagram: each class 1..s of the training data goes through FCM (for c = cmin..cmax, producing membership matrices U and centre sets) and ICC, which selects the patterns w_1..w_s forming the set W; fuzzy K-NN in the classification module then fixes m and k; the pattern recognition module applies fuzzy K-NN with W, m and k to unclassified data.]
27
ICC-KNN
Let R = {r_1, r_2, ..., r_n} be the set of samples; each sample r_i belongs to one of s known classes.
Let U_ic be the membership matrix for class i with c categories.
Let V_ic be the centre matrix for class i with c categories.
Let w_i be the best V_ic of each class.
Let W be the set of sets of centres w_i.
28
ICC-KNN algorithm: Classification Module
First phase of training
Step 1. Set m.
Step 2. Set cmin and cmax.
Step 3. For each known class s:
    Generate the set R_s with the points from R belonging to class s.
    For each number of categories c in the interval [cmin, cmax]:
        Run FCM for c on the set R_s, generating U_sc and V_sc.
        Calculate ICC for R_s and U_sc.
    Define the patterns w_s of class s as the matrix V_sc that maximizes ICC.
Step 4. Generate the set W = {w_1, ..., w_s}.
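A sketch of this phase, with fuzzy c-means in compact form. The slides do not define the ICC criterion, so the `validity` function below substitutes Bezdek's partition coefficient as a stand-in for the category-selection step; everything else follows steps 1-4.

```python
import numpy as np

def fcm(X, c, m=1.25, iters=100, tol=1e-5, seed=0):
    """Compact fuzzy c-means: returns memberships U (c, n) and centres V (c, l)."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, len(X)))
    U /= U.sum(axis=0)
    for _ in range(iters):
        Um = U ** m
        V = Um @ X / Um.sum(axis=1, keepdims=True)                 # centre update
        d = np.maximum(np.linalg.norm(X[None] - V[:, None], axis=2), 1e-12)
        U_new = d ** (-2.0 / (m - 1))
        U_new /= U_new.sum(axis=0)                                 # membership update
        if np.abs(U_new - U).max() < tol:
            return U_new, V
        U = U_new
    return U, V

def validity(U):
    """Stand-in for ICC (undefined on the slides): Bezdek's partition coefficient."""
    return (U ** 2).sum() / U.shape[1]

def first_phase(R, y, cmin, cmax, m=1.25):
    """Steps 1-4: per class, run FCM for each c and keep the centres that maximise
    the validity criterion; W collects one centre set w_s per class."""
    W = []
    for s in np.unique(y):
        Rs = R[y == s]                                 # points of class s
        runs = [fcm(Rs, c, m) for c in range(cmin, cmax + 1)]
        _, V_best = max(runs, key=lambda uv: validity(uv[0]))
        W.append(V_best)                               # patterns w_s
    return W
```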
29
ICC-KNN algorithm
Second phase of training
Step 5. Set mmin and mmax.
Step 6. Set kmin and kmax.
For each m in [mmin, mmax]:
    For each k in [kmin, kmax]:
        Run fuzzy K-NN with the patterns from the set W, generating U_mk.
        Calculate the number of crisp hits for U_mk.
Step 7. Choose the m and k that yield the best crisp-hit figures.
Step 8. If there is a draw:
    if the k's are different, choose the smaller k;
    else choose the smaller m.
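The grid search of steps 5-8 as a sketch, reusing `fuzzy_memberships` and `crisp_labels` from the earlier snippets; a crisp hit is counted when the largest membership lands on the true class, and the key tuple encodes the tie-breaking of step 8.

```python
import numpy as np
from itertools import product

def crisp_hits(patterns, pat_labels, X, y_true, c, m, k):
    """Number of samples whose top fuzzy K-NN membership is their true class."""
    u_pat = crisp_labels(pat_labels, c)        # patterns fully belong to their class
    hits = 0
    for x, true in zip(X, y_true):
        d = np.linalg.norm(patterns - x, axis=1)
        nn = np.argsort(d)[:k]
        u = fuzzy_memberships(x, patterns[nn], u_pat[:, nn], m)
        hits += int(np.argmax(u) == true)
    return hits

def second_phase(patterns, pat_labels, X, y_true, c, ms, ks):
    """Steps 5-8: the (m, k) pair with the most crisp hits wins; on a draw,
    the smaller k is preferred, then the smaller m."""
    return max(product(ms, ks),
               key=lambda mk: (crisp_hits(patterns, pat_labels, X, y_true, c, *mk),
                               -mk[1], -mk[0]))
```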
30
ICC-KNN algorithm: Pattern Recognition Module
Step 9. Apply fuzzy K-NN, using the patterns from the set W and the chosen parameters m and k, to the data to be classified.
31
ICC-KNN results
32
ICC-KNN results
2000 samples, 4 classes, 500 samples in each class.
Classes 1 and 4: concave classes.
Classes 2 and 3: convex classes with an elliptic shape.
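The slides do not say how the classes were generated; the snippet below is a purely illustrative way to produce a data set of this shape (two concave, arc-shaped classes and two convex, elliptic ones), with every constant invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500                                   # samples per class, as on the slide

# Convex, elliptic classes (2 and 3): anisotropic Gaussians (illustrative parameters)
class2 = rng.normal([0.0, 0.0], [2.0, 0.5], size=(n, 2))
class3 = rng.normal([6.0, 0.0], [0.5, 2.0], size=(n, 2))

# Concave classes (1 and 4): noisy arcs ("banana" shapes, illustrative parameters)
t1 = rng.uniform(0, np.pi, n)
class1 = np.c_[3 + 6 * np.cos(t1), 4 + 2 * np.sin(t1)] + rng.normal(0, 0.3, (n, 2))
t4 = rng.uniform(np.pi, 2 * np.pi, n)
class4 = np.c_[3 + 6 * np.cos(t4), -4 + 2 * np.sin(t4)] + rng.normal(0, 0.3, (n, 2))

X = np.vstack([class1, class2, class3, class4])   # 2000 samples
y = np.repeat([0, 1, 2, 3], n)                    # classes 1-4 coded as 0-3
```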
33
ICC-KNN results
First phase of training: FCM applied to each class.
Training data: 80%, i.e. 400 samples from each class.
c = 3..7 and m = 1.25.
ICC applied to the results:
Classes 1 and 4: 4 categories.
Classes 2 and 3: 3 categories.
34
ICC-KNN results
Second phase of training: running fuzzy K-NN with the patterns from the first phase and with random patterns.
k = 3 to 7 neighbours.
m = {1.1, 1.25, 1.5, 2}.
35
ICC-KNN results
Conclusion: K-NN is more stable with respect to the value of m when using the patterns from the first training phase (PFT).
36
ICC-KNN results: Training data
Confusion matrices (rows: classes; columns: classification).

PFT patterns (m = 1.5 and k = 3): 96.25% correct
Class |   1 |   2 |   3 |   4
  1   | 388 |  10 |   0 |   2
  2   |  14 | 379 |   0 |   7
  3   |   0 |   0 | 376 |  24
  4   |   0 |   1 |   2 | 397

Random patterns (m = 1.1 and k = 3): 79.13% correct
Class |   1 |   2 |   3 |   4
  1   | 213 |  66 |   0 | 121
  2   |  19 | 380 |   0 |   1
  3   |   3 |   0 | 324 |  73
  4   |   4 |  46 |   1 | 349
37
ICC-KNN results: Test data
Confusion matrices (rows: classes; columns: classification).

PFT patterns: 94.75% correct
Class |  1 |  2 |  3 |  4
  1   | 97 |  2 |  0 |  1
  2   |  4 | 93 |  0 |  3
  3   |  0 |  0 | 90 | 10
  4   |  0 |  0 |  1 | 99

Random patterns: 79% correct
Class |  1 |  2 |  3 |  4
  1   | 53 | 27 |  0 | 20
  2   |  4 | 96 |  0 |  0
  3   |  0 |  0 | 82 | 18
  4   |  0 | 15 |  0 | 85
38
ICC-KNN x Others
Compared against FCM, FKCN, GG and GK.
Training phase (FTr): training data, c = 4 and m = {1.1, 1.25, 1.5, 2}.
Categories are associated with classes by the membership-sum criterion:
compute the sum of the membership degrees of the points of each class in each category;
a class may be represented by more than one category.
39
ICC-KNN x Others
Test phase: test data.
The methods are initialized with the centres from the training phase.
Compute the membership degree of the points in each category.
When a class is represented by more than one category, its membership degree is the sum of the membership degrees of the points in the categories that represent the class.
40
ICC-KNN x Others
GK: 84% for m = 2.
FCM and FKCN: 66% for m = 1.1 and m = 1.25.
GG-FCM: 69% for m = 1.1 and 1.25.
GG with random initialization: 57.75% for m = 1.1 and 25% for m = 1.5.

R (results on the test data): ICC-KNN 94.75%; random KNN 79%; FCM 66%; FKCN 66%; GG 69%; GK 84%.
N: ICC-KNN 95.75%; random KNN 83%; FCM/FKCN 70.75%; GG 69%; GK 89.5%.
T (time): ICC-KNN 36.5s; random KNN 23.11s; FCM 2.91s; FKCN 2.59s; GG 22.66s; GK 18.14s.
41
GK
42
GK
Confusion matrix on the test data (rows: classes; columns: classification):
Classes |  1 |  2 |  3 |  4
   1    | 77 |  6 |  0 | 17
   2    |  6 | 94 |  0 |  0
   3    |  0 |  0 | 97 |  3
   4    |  0 |  0 | 32 | 68
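As a check, the diagonal of this matrix gives GK's hit rate on the 400 test samples: (77 + 94 + 97 + 68) / 400 = 336 / 400 = 84%, matching the GK entry in the comparison table above.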
43
Classification KNN+Fuzzy C-Means System
44
KNN+Fuzzy C-Means algorithm
The idea is a two-layer clustering algorithm.
First, an unsupervised tracking of cluster centres is made using K-NN rules.
The second layer involves one iteration of fuzzy c-means to compute the membership degrees and the new fuzzy centres.
Ref. N. Zahid et al., Fuzzy Sets and Systems 120 (2001) 239-247.
45
First Layer (K-NN)
Let X = {x_1, ..., x_n} be a set of n unlabelled objects and c the number of clusters.
The first layer consists of partitioning X into c cells using the first part of K-NN.
Each cell E_i (1 <= i <= c) is represented as E_i(y_i, K-NN of y_i, G_i).
G_i is the centre of cell E_i, defined as the mean of its k + 1 members:
G_i = (1 / (k + 1)) (y_i + Σ_{x_j ∈ K-NN(y_i)} x_j)
46
KNN-1FCMA settings
Let X = {x_1, ..., x_n} be a set of n unlabelled objects.
Fix c, the number of clusters.
Choose m > 1 (the fuzziness factor).
Set k = integer(n/c − 1).
Let I = {1, 2, ..., n} be the set of all indexes of X.
47
KNN-1FCMA algorithm, step 1
Calculate G_0
for i = 1 to c
    search in I for the index of the object y_i farthest from G_{i-1}
    for j = 1 to n
        calculate the distance from y_i to x_j
        if j <= k then
            add x_j to E_i
        else if x_j is closer to y_i than any previous NN then
            delete the farthest neighbour and include x_j in the set E_i
48
KNN-1FCMA algorithm, cont.
Include y_i in the set E_i. Calculate G_i.
Delete the index of y_i and the indexes of the K-NN of y_i from I.
if I ≠ ∅ then
    for each remaining object x, determine the minimum distance to any centre G_i of E_i
    classify x to the nearest centre
    update all centres
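A sketch of the whole first layer, assuming Euclidean distance throughout: take the unassigned object farthest from the previous centre, collect its k nearest unassigned neighbours into a cell, and finally attach leftover objects to the nearest centre.

```python
import numpy as np

def knn_layer(X, c):
    """First layer of KNN-1FCMA: partition X into c cells, return centres and cells."""
    n = len(X)
    k = n // c - 1                            # k = integer(n/c - 1)
    I = set(range(n))                         # indexes still unassigned
    G = X.mean(axis=0)                        # G_0: centre of the whole data set
    centres, cells = [], []
    for _ in range(c):
        rest = np.fromiter(I, dtype=int)
        yi = rest[np.argmax(np.linalg.norm(X[rest] - G, axis=1))]  # farthest from G_{i-1}
        others = np.fromiter(I - {int(yi)}, dtype=int)
        nn = others[np.argsort(np.linalg.norm(X[others] - X[yi], axis=1))[:k]]
        cell = np.append(nn, yi)              # E_i = y_i plus its k nearest neighbours
        G = X[cell].mean(axis=0)              # G_i: centre of the cell
        centres.append(G)
        cells.append(cell.tolist())
        I -= set(cell.tolist())
    for j in sorted(I):                       # leftovers: nearest centre, then update it
        i = int(np.argmin([np.linalg.norm(X[j] - g) for g in centres]))
        cells[i].append(j)
        centres[i] = X[cells[i]].mean(axis=0)
    return np.stack(centres), cells
```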
49
KNN-1FCMA algorithm, step 2
Compute the matrix U according to
u_ij = 1 / Σ_{l=1}^{c} ( ||x_j − G_i|| / ||x_j − G_l|| )^{2/(m−1)}
and calculate all the fuzzy centres using
v_i = Σ_{j=1}^{n} (u_ij)^m x_j / Σ_{j=1}^{n} (u_ij)^m
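The second layer in code: one fuzzy c-means step seeded with the layer-1 centres, computing U from the distances and then the new fuzzy centres, exactly the two updates above (a small epsilon is added against zero distances).

```python
import numpy as np

def one_fcm_iteration(X, G, m):
    """Single FCM step: memberships U (c, n) from centres G (c, l), then new centres."""
    d = np.maximum(np.linalg.norm(X[None] - G[:, None], axis=2), 1e-12)
    U = d ** (-2.0 / (m - 1))
    U /= U.sum(axis=0)                           # u_ij = 1 / sum_l (d_ij/d_lj)^(2/(m-1))
    Um = U ** m
    V = Um @ X / Um.sum(axis=1, keepdims=True)   # v_i = sum_j u_ij^m x_j / sum_j u_ij^m
    return U, V
```

Used end to end: `G, _ = knn_layer(X, c)` followed by `U, V = one_fcm_iteration(X, G, m)`.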
50
Results: KNN-1FCMA
Data   | Elem | c | Misclassifications (FCMA) | Misclassifications (KNN-1FCMA) | Avg iterations (FCMA)
S1     |  20  | 2 |  0 |  0 | 11
S2     |  60  | 3 |  1 |  0 |  8
S3     |  80  | 4 |  2 |  0 | 10
S4     | 120  | 6 |  3 |  1 | 19
IRIS23 | 150  | 2 | 14 | 13 | 10
IRIS   | 100  | 3 | 16 | 17 | 12