Clustering Validity Adriano Joaquim de O Cruz ©2006 NCE/UFRJ


Adriano Cruz, NCE e IM - UFRJ
Clustering Validity
The number of clusters is not always known in advance.
In many problems the number of classes is known, but it is not necessarily the best configuration.
Methods are therefore needed to indicate and/or validate the number of classes.

Clustering Validity - Example 1
Consider the problem of digit recognition.
It is known that there are 10 classes (the 10 digits).
The number of clusters, however, may be greater than 10.
This happens because the same digit can be handwritten in different ways.

Clustering Validity - Example 2
Consider the problem of segmenting a thermal image of a room.
It is known that there are 2 classes of temperature: body temperature and room temperature.
This is a problem in which the number of classes is well defined.

Clustering Validity - Problem
First, the data is partitioned into different numbers of clusters.
It is also important to try different initial conditions for the same number of clusters.
Validity measures are then applied to these partitions to estimate their quality.
Quality must be estimated both when the number of clusters changes and, for a fixed number, when the initial conditions differ.

Clustering Validity L-Clusters

Initial Definitions
d(e_i, e_k) is the dissimilarity between elements e_i and e_k.
Euclidean distance is an example of a measure of dissimilarity.

L-Cluster Definition
C is an L-cluster if, for each object e_i belonging to C:
\max_{e_k \in C} d(e_i, e_k) < \min_{e_h \notin C} d(e_i, e_h)
That is, the maximum distance between e_i and any element e_k of C is smaller than the minimum distance between e_i and any element e_h from another cluster.
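The L-cluster condition can be checked directly from its definition. The following is a minimal sketch, assuming Euclidean distance as the dissimilarity and points stored as tuples; the function name `is_l_cluster` and the sample data are hypothetical and not part of the slides.

```python
import math

def is_l_cluster(cluster, others):
    """Return True if `cluster` satisfies the L-cluster condition.

    cluster: elements of C; others: all elements outside C (must be non-empty).
    Euclidean distance is assumed as the dissimilarity d.
    """
    for e_i in cluster:
        # Largest dissimilarity from e_i to a member of its own cluster.
        max_inside = max(math.dist(e_i, e_k) for e_k in cluster)
        # Smallest dissimilarity from e_i to any element outside the cluster.
        min_outside = min(math.dist(e_i, e_h) for e_h in others)
        if not max_inside < min_outside:
            return False
    return True

C = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(is_l_cluster(C, [(5.0, 5.0), (6.0, 5.0)]))  # far-away outsiders
print(is_l_cluster(C, [(1.5, 0.0)]))              # an outsider close to (1, 0)
```

The first call prints True and the second False, because the outsider at (1.5, 0) is closer to (1, 0) than (1, 0) is to (0, 1).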

[Figure: L-cluster C]

L*-Cluster Definition
C is an L*-cluster if, for each object e_i belonging to C:
\max_{e_k \in C} d(e_i, e_k) < \min_{e_l \in C,\, e_h \notin C} d(e_l, e_h)
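Because the right-hand side of the L* condition no longer depends on e_i, the test reduces to comparing the largest within-cluster distance with the smallest distance between a member and a non-member. A minimal sketch under the same assumptions as before (Euclidean distance, hypothetical function name `is_l_star_cluster`):

```python
import math

def is_l_star_cluster(cluster, others):
    """Return True if `cluster` satisfies the L*-cluster condition.

    The largest distance between two members of C must be smaller than
    the smallest distance between any member of C and any outside element.
    """
    max_inside = max(math.dist(a, b) for a in cluster for b in cluster)
    min_between = min(math.dist(e_l, e_h) for e_l in cluster for e_h in others)
    return max_inside < min_between

C = [(0.0, 0.0), (1.0, 0.0)]
print(is_l_star_cluster(C, [(3.0, 0.0)]))  # True: diameter 1 < gap 2
print(is_l_star_cluster(C, [(1.5, 0.0)]))  # False: diameter 1 > gap 0.5
```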

[Figure: L*-cluster C]

Clustering Validity Silhouettes

Introduction
P.J. Rousseeuw, "Silhouettes: a graphical aid to the interpretation and validation of cluster analysis", Journal of Computational and Applied Mathematics, 1987.
Each cluster is represented by one silhouette, showing which objects lie well within the cluster.
The user can compare the quality of the clusters.

Method - I
Consider a cluster A.
For each element e_i in A, calculate the average dissimilarity to all other objects of A: a(e_i) = d(e_i, A).
Therefore, A cannot be a singleton.
Euclidean distance is an example of a dissimilarity.

Method - II
Consider all clusters C_k different from A.
Calculate d_k(e_i, C_k), the average dissimilarity of e_i to all elements of C_k.
Select b(e_i) = \min_k d_k(e_i, C_k).
Let B be the cluster whose average dissimilarity to e_i is b(e_i).
B is the second-best choice for e_i.

Method - III
The silhouette s(e_i) is equal to:
s(e_i) = 1 - a(e_i)/b(e_i) if a(e_i) < b(e_i)
s(e_i) = 0 if a(e_i) = b(e_i)
s(e_i) = b(e_i)/a(e_i) - 1 if a(e_i) > b(e_i)
or, equivalently,
s(e_i) = [b(e_i) - a(e_i)] / \max(a(e_i), b(e_i))
so that -1 <= s(e_i) <= +1.

Understanding s(e_i)
s(e_i) ≈ 1: the within-cluster dissimilarity a(e_i) << b(e_i), so e_i is well classified.
s(e_i) ≈ 0: a(e_i) ≈ b(e_i), so e_i may belong to either cluster.
s(e_i) ≈ -1: the within-cluster dissimilarity a(e_i) >> b(e_i), so e_i is misclassified and should belong to B.

Silhouette
The silhouette of cluster A is the plot of all s(e_i), for e_i in A, ranked in decreasing order.
The average of s(e_i) over all elements of the cluster is called the average silhouette.
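The three steps of the method (a, b, then s) can be sketched in a few lines. This is an illustrative pure-Python version, assuming Euclidean distance, at least two clusters, and no singleton clusters; the function name `silhouette_values` and the toy data are hypothetical.

```python
import math

def silhouette_values(points, labels):
    """Return s(e_i) for every point, in the order given."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    values = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) < 2:
            raise ValueError("a(e_i) is undefined for a singleton cluster")
        # a(e_i): average dissimilarity to the other members of the own
        # cluster; d(p, p) = 0, so p adds nothing to the sum.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b(e_i): smallest average dissimilarity to any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        values.append((b - a) / max(a, b))
    return values

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
labels = [0, 0, 1, 1]
s = silhouette_values(points, labels)
print([round(v, 3) for v in s])  # by symmetry, all four values are equal
print(sum(s) / len(s))           # average silhouette over all points
```

For this symmetric toy set every point has a = 1 and b of about 5.05, so each silhouette value is about 0.80, which indicates well-separated clusters.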

Example of use I
QTY = 100;
% Two Gaussian clouds shifted by +0.5 and -0.5 in both coordinates
X = [randn(QTY,2) + 0.5*ones(QTY,2); randn(QTY,2) - 0.5*ones(QTY,2)];
opts = statset('Display','final');
% k-means with 2 clusters, city-block distance, 5 replicates
[cidx, ctrs] = kmeans(X, 2, 'Distance','cityblock', ...
    'Replicates',5, 'Options',opts);
figure;
plot(X(cidx==1,1), X(cidx==1,2), 'r.', ...
     X(cidx==2,1), X(cidx==2,2), 'b.', ...
     ctrs(:,1), ctrs(:,2), 'kx');
figure;
% Silhouette plot computed with squared Euclidean distance
[s, h] = silhouette(X, cidx, 'sqeuclid');

[Figure: Ex Silhouette 1]

[Figure: Ex Silhouette 2]

Example of use II
QTY = 100;
% Two Gaussian clouds shifted by +2 and -2 in both coordinates
X = [randn(QTY,2) + 2*ones(QTY,2); randn(QTY,2) - 2*ones(QTY,2)];
opts = statset('Display','final');
% k-means with 2 clusters, city-block distance, 5 replicates
[cidx, ctrs] = kmeans(X, 2, 'Distance','cityblock', ...
    'Replicates',5, 'Options',opts);
figure;
plot(X(cidx==1,1), X(cidx==1,2), 'r.', ...
     X(cidx==2,1), X(cidx==2,2), 'b.', ...
     ctrs(:,1), ctrs(:,2), 'kx');
figure;
% Silhouette plot computed with squared Euclidean distance
[s, h] = silhouette(X, cidx, 'sqeuclid');

[Figure: Ex silhouette 3]

[Figure: Ex silhouette 4]

Cluster Validity Partition Coefficient

Partition Coefficient
This coefficient is defined as [equation not preserved in the transcript].

Partition Coefficient - Comments
F decreases as the number of clusters increases.
F is therefore not appropriate for finding the best number of clusters.
F is best suited to selecting the best partition among those with the same number of clusters.

Partition Coefficient
When F = 1/c the partition is entirely fuzzy, since every element belongs to all clusters with the same degree of membership.
When F = 1 the partition is rigid (crisp) and membership values are either 1 or 0.
This measure can only be applied to fuzzy partitions.
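The two limiting cases can be checked numerically. The sketch below assumes the standard Bezdek form of the partition coefficient, F = (1/n) \sum_i \sum_k u_{ik}^2, which is consistent with the limits F = 1/c and F = 1 stated above; the slide's own equation is not available in this transcript. The layout `U[i][k]` (membership of element k in cluster i) and the function name are assumptions made for illustration.

```python
def partition_coefficient(U):
    """Partition coefficient of a c x n membership matrix U.

    Assumes the standard Bezdek definition: mean over elements of the
    sum of squared memberships.
    """
    n = len(U[0])  # number of elements (columns)
    return sum(u * u for row in U for u in row) / n

# Entirely fuzzy partition with c = 2: every membership equals 1/c.
print(partition_coefficient([[0.5, 0.5], [0.5, 0.5]]))   # 0.5, i.e. 1/c
# Rigid partition: memberships are 0 or 1.
print(partition_coefficient([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]))  # 1.0
```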

Partition Coefficient Example
The partition matrix is [matrix not preserved in the transcript; figure labels: w1, w2, w3].

Partition Coefficient Example
The partition matrix is [matrix not preserved in the transcript; figure labels: w1, w2, w3, w4].

Partition Coefficient Example
The partition matrix is [matrix not preserved in the transcript; figure labels: X1, X2, X3, X4, X5, X6].

Cluster Validity Partition Entropy

Partition Entropy
Partition entropy is defined as [equation not preserved in the transcript].
When H = 0 the partition is rigid.
When H = log(c) the fuzziness is maximum.
0 <= 1 - F <= H

Partition Entropy - Comments
Partition entropy (H) increases with the number of clusters.
H is more appropriate for selecting the best partition among several runs of an algorithm.
H is strictly a fuzzy measure.
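The stated limits H = 0 and H = log(c), and the bound 0 <= 1 - F <= H, can be illustrated with a short sketch. It assumes the standard Bezdek definitions H = -(1/n) \sum_i \sum_k u_{ik} \log u_{ik} (natural logarithm, with 0 log 0 taken as 0) and F = (1/n) \sum_i \sum_k u_{ik}^2; the slide's own equations are not available in this transcript, and the function names and the matrix layout `U[i][k]` are hypothetical.

```python
import math

def partition_coefficient(U):
    """Assumed standard form: mean sum of squared memberships."""
    n = len(U[0])
    return sum(u * u for row in U for u in row) / n

def partition_entropy(U):
    """Assumed standard form; zero memberships contribute nothing."""
    n = len(U[0])
    return -sum(u * math.log(u) for row in U for u in row if u > 0) / n

rigid = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
uniform = [[0.5, 0.5], [0.5, 0.5]]
mixed = [[0.9, 0.2, 0.5], [0.1, 0.8, 0.5]]

print(partition_entropy(rigid))                  # zero for a rigid partition
print(partition_entropy(uniform), math.log(2))   # both equal log(c) for c = 2
F, H = partition_coefficient(mixed), partition_entropy(mixed)
print(0 <= 1 - F <= H)                           # the bound holds for this example
```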

Cluster Validity Compactness and Separation

Compactness and Separation
CS is defined as [equation not preserved in the transcript], where:
J_m is the objective function minimized by the FCM algorithm;
n is the number of elements;
d_min is the minimum Euclidean distance between the centres of two clusters.

Compactness and Separation
The minimum distance is defined as [equation not preserved in the transcript].
The complete formula is [equation not preserved in the transcript].

Compactness and Separation
This is a very complete validation measure.
It validates the number of clusters and checks the separation among clusters.
In our experiments it works well even when the degree of overlap is high.
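The CS equation itself is missing from this transcript, so the following is only a sketch under an explicit assumption: that CS has the form of the Xie-Beni index, CS = J_m / (n * d_min^2), which combines exactly the three ingredients the slides name (J_m, n and d_min). The squared d_min, the function names and the data layout (`X` as points, `V` as centres, `U[i][k]` as memberships) are assumptions and may differ from the author's formula.

```python
import math
from itertools import combinations

def fcm_objective(X, V, U, m=2.0):
    """J_m: sum over clusters i and elements k of u_ik^m * ||x_k - v_i||^2."""
    return sum(
        (U[i][k] ** m) * math.dist(X[k], V[i]) ** 2
        for i in range(len(V))
        for k in range(len(X))
    )

def min_centre_distance(V):
    """d_min: minimum Euclidean distance between two cluster centres."""
    return min(math.dist(a, b) for a, b in combinations(V, 2))

def compactness_separation(X, V, U, m=2.0):
    """Assumed Xie-Beni form: J_m / (n * d_min^2)."""
    return fcm_objective(X, V, U, m) / (len(X) * min_centre_distance(V) ** 2)

X = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
V = [(0.0, 1.0), (10.0, 1.0)]
U = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
print(compactness_separation(X, V, U))  # J_m = 4, n = 4, d_min = 10
```

Under this assumed form, smaller values correspond to compact clusters with well-separated centres.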

Cluster Validity Fuzzy Linear Discriminant

Fisher Linear Discriminant
Fisher's Linear Discriminant (FLD) is an important technique used in pattern recognition problems to evaluate the compactness and separation of the partitions produced by crisp clustering techniques.

Fisher Linear Discriminant
It is easier to handle classification problems in which the sampled data has few features.
So it is important to reduce the dimensionality of the problem.
When FLD is applied to a crisply partitioned space it produces an operator (W) that maps the original space (R^p) into a new space (R^k), where k < p.

Fisher Linear Discriminant
[Figure: Projection of samples arranged in 2 classes onto a line by Fisher's Linear Discriminant; labels: W, x1, x2]

FLD
FLD measures the compactness and separation of all categories when crisp partitions are created.
FLD uses two matrices:
S_B: between-classes scatter matrix
S_W: within-classes scatter matrix

FLD - S_B Matrix
Measures the quality of the separation between classes.

FLD - S_B Matrix
m is the average of all samples.
m_i is the average of all samples belonging to cluster i.
n is the number of samples.
n_i is the number of samples belonging to cluster i.

FLD - S_W Matrix
Measures the compactness of all classes.
It is the sum of all internal scattering.

Total Scattering
The total scattering is the sum of the internal scattering and the scattering between the classes:
S_T = S_W + S_B
In an optimal partition the separation between classes (S_B) must be maximum and the scattering within the classes (S_W) minimum.
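The identity S_T = S_W + S_B can be verified on a small crisp partition. The sketch below assumes the standard scatter-matrix definitions, S_B = \sum_i n_i (m_i - m)(m_i - m)^T and S_W = \sum_i \sum_{x \in C_i} (x - m_i)(x - m_i)^T, which use the symbols m, m_i and n_i introduced above; the slides' own equations are not available in this transcript, and all function names are hypothetical.

```python
def mean(points):
    """Component-wise average of a list of points."""
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(len(points[0]))]

def scatter(points, centre, weight=1.0):
    """Sum over points of weight * (x - centre)(x - centre)^T."""
    p = len(centre)
    S = [[0.0] * p for _ in range(p)]
    for x in points:
        diff = [x[j] - centre[j] for j in range(p)]
        for r in range(p):
            for s in range(p):
                S[r][s] += weight * diff[r] * diff[s]
    return S

def add(A, B):
    """Element-wise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scatter_matrices(clusters):
    """Return (S_B, S_W, S_T) for a crisp partition given as lists of points."""
    all_points = [x for c in clusters for x in c]
    m = mean(all_points)
    p = len(m)
    S_B = [[0.0] * p for _ in range(p)]
    S_W = [[0.0] * p for _ in range(p)]
    for c in clusters:
        m_i = mean(c)
        S_W = add(S_W, scatter(c, m_i))                    # internal scattering
        S_B = add(S_B, scatter([m_i], m, weight=len(c)))   # n_i-weighted centre spread
    return S_B, S_W, scatter(all_points, m)

clusters = [[(0.0, 0.0), (2.0, 0.0)], [(4.0, 4.0), (6.0, 4.0)]]
S_B, S_W, S_T = scatter_matrices(clusters)
print(S_B)            # [[16.0, 16.0], [16.0, 16.0]]
print(S_W)            # [[4.0, 0.0], [0.0, 0.0]]
print(S_T)            # [[20.0, 16.0], [16.0, 16.0]]
print(add(S_W, S_B))  # equals S_T
```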

J Criterion
Fisher defined the J criterion, which must be maximized.
A simplified way to evaluate J is [equation not preserved in the transcript].

J - Comments
J may vary in the interval 0 <= J <= [upper bound not preserved in the transcript].
J is strictly rigid.
J loses precision as the overlap between samples increases.

EFLD
EFLD measures the compactness and separation of all categories when fuzzy partitions are created.
EFLD uses two matrices:
S_Be: between-classes scatter matrix
S_We: within-classes scatter matrix

EFLD - S_Be Matrix
Measures the quality of the separation between classes.

EFLD - S_We Matrix
Measures the compactness of all classes.
It is the sum of all internal scattering.

Total Scattering
The total scattering is the sum of the internal scattering and the scattering between the classes:
S_Te = S_We + S_Be
In an optimal partition the separation between classes (S_Be) must be maximum and the scattering within the classes (S_We) minimum.

J_e Criterion
J_e is the criterion that must be maximised.
A simplified way to evaluate J_e is [equation not preserved in the transcript].

Simplifying the J_e Criterion
A simplified way to evaluate J_e: [equation not preserved in the transcript].
It can be proved that S_T is constant and equal to [equation not preserved in the transcript].

J_e - Comments
J_e may vary in the interval 0 <= J_e <= [upper bound not preserved in the transcript].
J_e is strictly rigid.
J_e loses precision as the overlap between samples increases.

Applying EFLD
EFLD by number of categories (the column headings giving the number of categories were not preserved in the transcript):
Samples X1    4.6815    4.9136    0.2943    0.2559    0.3157
Samples X2    0.3271    0.8589    0.8757    0.9608    1.0674

Cluster Validity Inter Class Contrast

Comments
EFLD:
Increases as the number of clusters rises.
Increases when classes have a high degree of overlap.
Reaches its maximum for a wrong number of clusters.

ICC
Evaluates crisp and fuzzy clustering algorithms.
Measures:
Partition compactness
Partition separation
ICC must be maximized.

ICC
s_Be estimates the quality of the placement of the centres.
1/n is a scale factor that compensates for the influence of the number of points on s_Be.

ICC - 2
D_min is the minimum Euclidean distance between all pairs of centres.
It neutralizes the tendency of s_Be to grow, preventing the maximum from being reached for a number of clusters greater than the ideal value.
When 2 or more clusters represent the same class, D_min decreases abruptly.

ICC Fuzzy Application
Five classes with 500 points each.
No class overlap.
X1 - (1, 2), (6, 2), (1, 6), (6, 6), (3.5, 9), std 0.3
Apply FCM for m = 2 and c = [values not preserved in the transcript].

[Figure: ICC Fuzzy Application Results]

ICC Fuzzy Application - Time
Time by number of categories ([...] marks a value not preserved in the transcript; units are not stated):
Measure     5         4         3         2
FPI         [...]     0.0049    0.0045    0.0061
F           [...]     0.0049    0.0045    0.0044
NFI         [...]     0.0058    0.0056    0.0061
CS          0.0476    0.0382    0.0261    0.0226
EFLD Det    2.0160    1.5510    1.1392    0.7800
EFLD Tra    1.8982    1.4780    1.0870    0.7678
EFLD        [...]     [...]     [...]     [...]
ICC Det     0.0132    0.0110    0.0088    0.0110
ICC Tra     0.0110    0.0088    0.0060    0.0078
ICC         [...]     0.0082    0.0069    0.0061

Application with Overlapping
Five classes with 500 points each.
High cluster overlap.
X1 - (1, 2), (6, 2), (1, 6), (6, 6), (3.5, 9), std 0.3
Apply FCM for m = 2 and c = [values not preserved in the transcript].

[Figure: Application Overlapping Results]

Application Time Results
Time by number of clusters ([...] marks a value not preserved in the transcript; units are not stated):
Measure     5         4         3         2
MPE         [...]     0.0319    0.0271    0.0167
F           0.0164    0.0061    0.0121    0.0112
CS          [...]     0.0362    0.0283    0.0220
EFLD Det    1.8450    1.6090    1.2580    0.9720
EFLD Tra    2.2584    1.7598    2.1038    0.7930
EFLD        [...]     [...]     [...]     [...]
ICC Det     0.0120    0.0110    0.0078    0.0110
ICC Tra     0.0110    0.0098    0.0060    0.0066
ICC         [...]     0.0077    0.0064    0.0060

ICC Conclusions
Fast and efficient.
Works with fuzzy and crisp partitions.
Efficient even with highly overlapping clusters.
High rate of correct results.