Cluster Algorithms Adriano Joaquim de O Cruz ©2006 UFRJ

Slides:



Advertisements
Similar presentations
EcoTherm Plus WGB-K 20 E 4,5 – 20 kW.
Advertisements

Números.
Trend for Precision Soil Testing % Zone or Grid Samples Tested compared to Total Samples.
Trend for Precision Soil Testing % Zone or Grid Samples Tested compared to Total Samples.
AGVISE Laboratories %Zone or Grid Samples – Northwood laboratory
Trend for Precision Soil Testing % Zone or Grid Samples Tested compared to Total Samples.
5.1 Rules for Exponents Review of Bases and Exponents Zero Exponents
PDAs Accept Context-Free Languages
ALAK ROY. Assistant Professor Dept. of CSE NIT Agartala
Cluster Algorithms Adriano Joaquim de O Cruz ©2006 UFRJ
EuroCondens SGB E.
Worksheets.
Slide 1Fig 26-CO, p.795. Slide 2Fig 26-1, p.796 Slide 3Fig 26-2, p.797.
Slide 1Fig 25-CO, p.762. Slide 2Fig 25-1, p.765 Slide 3Fig 25-2, p.765.
& dding ubtracting ractions.
Sequential Logic Design
Addition and Subtraction Equations
1 When you see… Find the zeros You think…. 2 To find the zeros...
Create an Application Title 1Y - Youth Chapter 5.
Add Governors Discretionary (1G) Grants Chapter 6.
CALENDAR.
CHAPTER 18 The Ankle and Lower Leg
2.11.
The 5S numbers game..
突破信息检索壁垒 -SciFinder Scholar 介绍
A Fractional Order (Proportional and Derivative) Motion Controller Design for A Class of Second-order Systems Center for Self-Organizing Intelligent.
Numerical Analysis 1 EE, NCKU Tien-Hao Chang (Darby Chang)
Stationary Time Series
The basics for simulations
Numerical Analysis 1 EE, NCKU Tien-Hao Chang (Darby Chang)
Factoring Quadratics — ax² + bx + c Topic
EE, NCKU Tien-Hao Chang (Darby Chang)
MM4A6c: Apply the law of sines and the law of cosines.
Briana B. Morrison Adapted from William Collins
Regression with Panel Data
Dynamic Access Control the file server, reimagined Presented by Mark on twitter 1 contents copyright 2013 Mark Minasi.
TCCI Barometer March “Establishing a reliable tool for monitoring the financial, business and social activity in the Prefecture of Thessaloniki”
Numerical Analysis 1 EE, NCKU Tien-Hao Chang (Darby Chang)
Numerical Analysis 1 EE, NCKU Tien-Hao Chang (Darby Chang)
Progressive Aerobic Cardiovascular Endurance Run
Biology 2 Plant Kingdom Identification Test Review.
Chapter 1: Expressions, Equations, & Inequalities
CSE 6007 Mobile Ad Hoc Wireless Networks
MaK_Full ahead loaded 1 Alarm Page Directory (F11)
Facebook Pages 101: Your Organization’s Foothold on the Social Web A Volunteer Leader Webinar Sponsored by CACO December 1, 2010 Andrew Gossen, Senior.
TCCI Barometer September “Establishing a reliable tool for monitoring the financial, business and social activity in the Prefecture of Thessaloniki”
When you see… Find the zeros You think….
2011 WINNISQUAM COMMUNITY SURVEY YOUTH RISK BEHAVIOR GRADES 9-12 STUDENTS=1021.
Before Between After.
2011 FRANKLIN COMMUNITY SURVEY YOUTH RISK BEHAVIOR GRADES 9-12 STUDENTS=332.
Slide R - 1 Copyright © 2009 Pearson Education, Inc. Publishing as Pearson Prentice Hall Active Learning Lecture Slides For use with Classroom Response.
Subtraction: Adding UP
Numeracy Resources for KS2
1 Non Deterministic Automata. 2 Alphabet = Nondeterministic Finite Accepter (NFA)
Static Equilibrium; Elasticity and Fracture
12 System of Linear Equations Case Study
Numerical Analysis 1 EE, NCKU Tien-Hao Chang (Darby Chang)
Resistência dos Materiais, 5ª ed.
& dding ubtracting ractions.
Lial/Hungerford/Holcomb/Mullins: Mathematics with Applications 11e Finite Mathematics with Applications 11e Copyright ©2015 Pearson Education, Inc. All.
16. Mean Square Estimation
9. Two Functions of Two Random Variables
A Data Warehouse Mining Tool Stephen Turner Chris Frala
1 Dr. Scott Schaefer Least Squares Curves, Rational Representations, Splines and Continuity.
Chart Deception Main Source: How to Lie with Charts, by Gerald E. Jones Dr. Michael R. Hyman, NMSU.
1 Non Deterministic Automata. 2 Alphabet = Nondeterministic Finite Accepter (NFA)
úkol = A 77 B 72 C 67 D = A 77 B 72 C 67 D 79.
Schutzvermerk nach DIN 34 beachten 05/04/15 Seite 1 Training EPAM and CANopen Basic Solution: Password * * Level 1 Level 2 * Level 3 Password2 IP-Adr.
Cluster Analysis Adriano Joaquim de O Cruz ©2002 NCE/UFRJ
Presentation transcript:

Cluster Algorithms Adriano Joaquim de O Cruz ©2006 UFRJ

K-means

Adriano Cruz *NCE e IM - UFRJ Cluster 3 K-means algorithm Based on the Euclidean distances among elements of the cluster Based on the Euclidean distances among elements of the cluster Centre of the cluster is the mean value of the objects in the cluster. Centre of the cluster is the mean value of the objects in the cluster. Classifies objects in a hard way. Each object belongs to a single cluster. Classifies objects in a hard way. Each object belongs to a single cluster.

Adriano Cruz *NCE e IM - UFRJ Cluster 4 K-means algorithm Consider n (X={x 1, x 2,..., x n }) objects and k clusters. Consider n (X={x 1, x 2,..., x n }) objects and k clusters. Each object x i is defined by l characteristics x i =(x i1, x i2,..., x il ). Each object x i is defined by l characteristics x i =(x i1, x i2,..., x il ). Consider A a set of k clusters (A={A 1, A 2,..., A k }). Consider A a set of k clusters (A={A 1, A 2,..., A k }).

Adriano Cruz *NCE e IM - UFRJ Cluster 5 K-means properties The union of all clusters makes the Universe The union of all clusters makes the Universe No element belongs to more than one cluster No element belongs to more than one cluster There is no empty cluster There is no empty cluster

Adriano Cruz *NCE e IM - UFRJ Cluster 6 Membership function

Adriano Cruz *NCE e IM - UFRJ Cluster 7 Membership matrix U Matrix containing the values of inclusion of each element into each cluster (0 or 1). Matrix containing the values of inclusion of each element into each cluster (0 or 1). Matrix has c (clusters) lines and n (elements) columns. Matrix has c (clusters) lines and n (elements) columns. The sum of all elements in the column must be equal to one (element belongs only to one cluster The sum of all elements in the column must be equal to one (element belongs only to one cluster The sum of each line must be less than n e grater than 0. No empty cluster, or cluster containing all elements. The sum of each line must be less than n e grater than 0. No empty cluster, or cluster containing all elements.

Adriano Cruz *NCE e IM - UFRJ Cluster 8 Matrix examples X1X2X3 X4X5X6 Two examples of clustering. What do the clusters represent?

Adriano Cruz *NCE e IM - UFRJ Cluster 9 Matrix examples cont. X1X2X3 X4X5X6 U1 and U2 are the same matrices.

Adriano Cruz *NCE e IM - UFRJ Cluster 10 How many clusters? The cardinality of any hard k-partition of n elements is The cardinality of any hard k-partition of n elements is

Adriano Cruz *NCE e IM - UFRJ Cluster 11 How many clusters (example)? Consider the matrix U2 (k=3, n=6) Consider the matrix U2 (k=3, n=6)

Adriano Cruz *NCE e IM - UFRJ Cluster 12 K-means inputs and outputs Inputs: the number of clusters c and a database containing n objects with l characteristics each. Inputs: the number of clusters c and a database containing n objects with l characteristics each. Output: A set of k clusters that minimises the square-error criterion. Output: A set of k clusters that minimises the square-error criterion.

Adriano Cruz *NCE e IM - UFRJ Cluster 13 Number of Clusters

Adriano Cruz *NCE e IM - UFRJ Cluster 14 K-means algorithm v1 Arbitrarily assigns each object to a cluster (matrix U). Repeat Update the cluster centres; Update the cluster centres; Reassign objects to the clusters to which the objects are most similar; Reassign objects to the clusters to which the objects are most similar; Until no change;

Adriano Cruz *NCE e IM - UFRJ Cluster 15 K-means algorithm v2 Arbitrarily choose c objects as the initial cluster centres. Repeat Reassign objects to the clusters to which the objects are most similar. Reassign objects to the clusters to which the objects are most similar. Update the cluster centres. Until no change

Adriano Cruz *NCE e IM - UFRJ Cluster 16 Algorithm details The algorithm tries to minimise the function The algorithm tries to minimise the function d ie is the distance between the element x e (m characteristics) and the centre of the cluster i (v i ) d ie is the distance between the element x e (m characteristics) and the centre of the cluster i (v i )

Adriano Cruz *NCE e IM - UFRJ Cluster 17 Cluster Centre The centre of the cluster i (v i ) is an l characteristics vector. The centre of the cluster i (v i ) is an l characteristics vector. The jth co-ordinate is calculated as The jth co-ordinate is calculated as

Adriano Cruz *NCE e IM - UFRJ Cluster 18 Detailed Algorithm Choose c (number of clusters). Choose c (number of clusters). Set error ( > 0) and step (r=0). Set error ( > 0) and step (r=0). Arbitrarily set matrix U (r). Do not forget, each element belongs to a single cluster, no empty cluster and no cluster has all elements. Arbitrarily set matrix U (r). Do not forget, each element belongs to a single cluster, no empty cluster and no cluster has all elements.

Adriano Cruz *NCE e IM - UFRJ Cluster 19 Detailed Algorithm cont. Repeat Calculate the centre of the clusters v i (r) Calculate the distance d i (r) of each point to the centre of the clusters Generate U (r+1) recalculating all characteristic functions using the equations Until ||U (r+1) -U (r) || < Until ||U (r+1) -U (r) || <

Adriano Cruz *NCE e IM - UFRJ Cluster 20 Matrix norms Consider a matrix U of n lines and n columns: Consider a matrix U of n lines and n columns: Column norm Column norm Line norm Line norm

Adriano Cruz *NCE e IM - UFRJ Cluster 21 K-means problems? Suitable when clusters are compact clouds well separated. Suitable when clusters are compact clouds well separated. Scalable because computational complexity is O(nkr). Scalable because computational complexity is O(nkr). Necessity of choosing c is disadvantage. Necessity of choosing c is disadvantage. Not suitable for nonconvex shapes. Not suitable for nonconvex shapes. It is sensitive to noise and outliers because they influence the means. It is sensitive to noise and outliers because they influence the means. Depends on the initial allocation. Depends on the initial allocation.

Adriano Cruz *NCE e IM - UFRJ Cluster 22 Examples of results

Adriano Cruz *NCE e IM - UFRJ Cluster 23 K-means: Actual Data

Adriano Cruz *NCE e IM - UFRJ Cluster 24 K-means: Results

Adriano Cruz *NCE e IM - UFRJ Cluster 25 K-medoids

Adriano Cruz *NCE e IM - UFRJ Cluster 26 K-medoids methods K-means is sensitive to outliers since an object with an extremely large value may distort the distribution of data. K-means is sensitive to outliers since an object with an extremely large value may distort the distribution of data. Instead of taking the mean value the most centrally object (medoid) is used as reference point. Instead of taking the mean value the most centrally object (medoid) is used as reference point. The algorithm minimizes the sum of dissimilarities between each object and the medoid (similar to k-means) The algorithm minimizes the sum of dissimilarities between each object and the medoid (similar to k-means)

Adriano Cruz *NCE e IM - UFRJ Cluster 27 K-medoids strategies Find k-medoids arbitrarily. Find k-medoids arbitrarily. Each remaining object is clustered with the medoid to which is the most similar. Each remaining object is clustered with the medoid to which is the most similar. Then iteratively replaces one of the medoids by a non-medoid as long as the quality of the clustering is improved. Then iteratively replaces one of the medoids by a non-medoid as long as the quality of the clustering is improved. The quality is measured using a cost function that measures the average dissimilarity between the objects and the medoid of its cluster. The quality is measured using a cost function that measures the average dissimilarity between the objects and the medoid of its cluster.

Adriano Cruz *NCE e IM - UFRJ Cluster 28 Reassignment costs Each time a reassignment occurs a difference in square-error J is contributed. Each time a reassignment occurs a difference in square-error J is contributed. The cost function J calculates the total cost of replacing a current medoid by a non-medoid. The cost function J calculates the total cost of replacing a current medoid by a non-medoid. If the total cost is negative then m j is replaced by m random, otherwise the replacement is not accepted. If the total cost is negative then m j is replaced by m random, otherwise the replacement is not accepted.

Adriano Cruz *NCE e IM - UFRJ Cluster 29 Replacing medoids case 1 Object p belongs to medoid m j. If m j is replaced by m random and p is closest to one of m i (i<>j), then reassigns p to m i Object p belongs to medoid m j. If m j is replaced by m random and p is closest to one of m i (i<>j), then reassigns p to m i mimi mjmj m random p

Adriano Cruz *NCE e IM - UFRJ Cluster 30 m random Replacing medoids case 2 Object p belongs to medoid m j. If m j is replaced by m random and p is closest to m random, then reassigns p to m random Object p belongs to medoid m j. If m j is replaced by m random and p is closest to m random, then reassigns p to m random mimi mjmj p

Adriano Cruz *NCE e IM - UFRJ Cluster 31 m random Replacing medoids case 3 Object p belongs to medoid m i (i<>j). If m j is replaced by m random and p is still close to m i, then does not change. Object p belongs to medoid m i (i<>j). If m j is replaced by m random and p is still close to m i, then does not change. mimi mjmj p

Adriano Cruz *NCE e IM - UFRJ Cluster 32 m random Replacing medoids case 4 Object p belongs to medoid m i (i<>j). If m j is replaced by m random and p is closest to m random,then reassigns p to m random. Object p belongs to medoid m i (i<>j). If m j is replaced by m random and p is closest to m random,then reassigns p to m random. mimi mjmj p

Adriano Cruz *NCE e IM - UFRJ Cluster 33 K-medoid algorithm Arbitrarily choose k objects as the initial medoids. Repeat Assign each remaining object to the cluster with the nearest medoid; Randomly select a nonmedoid object, m random ; Compute the total cost J of swapping m j with m random ; If J<0 then swap o j with o random ; Until no change

Adriano Cruz *NCE e IM - UFRJ Cluster 34Comparisons? K-medoids is more robust than k-means in presence of noise and outliers. K-medoids is more robust than k-means in presence of noise and outliers. K-means is less costly in terms of processing time. K-means is less costly in terms of processing time.

Adriano Cruz *NCE e IM - UFRJ Cluster 35 Fuzzy C-means

Adriano Cruz *NCE e IM - UFRJ Cluster 36 Fuzzy C-means Fuzzy version of K-means Fuzzy version of K-means Elements may belong to more than one cluster Elements may belong to more than one cluster Values of characteristic function range from 0 to 1. Values of characteristic function range from 0 to 1. It is interpreted as the degree of membership of an element to a cluster relative to all other clusters. It is interpreted as the degree of membership of an element to a cluster relative to all other clusters.

Adriano Cruz *NCE e IM - UFRJ Cluster 37 Fuzzy C-means setup Consider n (X={x 1, x 2,..., x n }) objects and c clusters. Consider n (X={x 1, x 2,..., x n }) objects and c clusters. Each object x i is defined by l characteristics x i =(x i1, x i2,..., x il ). Each object x i is defined by l characteristics x i =(x i1, x i2,..., x il ). Consider A a set of k clusters (A={A 1, A 2,..., A k }). Consider A a set of k clusters (A={A 1, A 2,..., A k }).

Adriano Cruz *NCE e IM - UFRJ Cluster 38 Fuzzy C-means properties The union of all clusters makes the Universe The union of all clusters makes the Universe There is no empty cluster There is no empty cluster

Adriano Cruz *NCE e IM - UFRJ Cluster 39 Membership function

Adriano Cruz *NCE e IM - UFRJ Cluster 40 Problems of probabilistic clusters Points representing circle lines (C1 e C2) Points representing circle lines (C1 e C2) Due to normalization strange results may emerge Due to normalization strange results may emerge C1 C2 {c1=0.5,c2= 0.5} C1 C2 {c1=0.5,c2= 0.5} {c1=0.7,c2= 0.3} {c1=0.3,c2= 0.7}

Adriano Cruz *NCE e IM - UFRJ Cluster 41 Membership matrix U Matrix containing the values of inclusion of each element into each cluster [0,1]. Matrix containing the values of inclusion of each element into each cluster [0,1]. Matrix has c (clusters) lines and n (elements) columns. Matrix has c (clusters) lines and n (elements) columns. The sum of all elements in the column must be equal to one. The sum of all elements in the column must be equal to one. The sum of each line must be less than n e grater than 0. No empty cluster, or cluster containing all elements. The sum of each line must be less than n e grater than 0. No empty cluster, or cluster containing all elements.

Adriano Cruz *NCE e IM - UFRJ Cluster 42 Matrix examples X1X2X3 X4X5X6 Two examples of clustering. What do the clusters represent?

Adriano Cruz *NCE e IM - UFRJ Cluster 43 Fuzzy C-means algorithm v1 Arbitrarily assigns each object to a cluster (matrix U). Repeat Update the cluster centres; Update the cluster centres; Reassign objects to the clusters to which the objects are most similar; Reassign objects to the clusters to which the objects are most similar; Until no change;

Adriano Cruz *NCE e IM - UFRJ Cluster 44 Fuzzy C-means algorithm v2 Arbitrarily choose c objects as the initial cluster centres. Repeat Reassign objects to the clusters to which the objects are most similar. Reassign objects to the clusters to which the objects are most similar. Update the cluster centres. Until no change

Adriano Cruz *NCE e IM - UFRJ Cluster 45 Algorithm details The algorithm tries to minimise the function, m is the nebulisation factor. The algorithm tries to minimise the function, m is the nebulisation factor. d ie is the distance between the element x e (l characteristics) and the centre of the cluster i (v i ) d ie is the distance between the element x e (l characteristics) and the centre of the cluster i (v i )

Adriano Cruz *NCE e IM - UFRJ Cluster 46 Nebulisation factor m is the nebulisation factor. m is the nebulisation factor. This value has a range [1, ) This value has a range [1, ) If m=1 the the system is crisp. If m=1 the the system is crisp. If m the all the membership values tend to 1/c If m the all the membership values tend to 1/c The most common values are 1.25 and 2.0 The most common values are 1.25 and 2.0

Adriano Cruz *NCE e IM - UFRJ Cluster 47 Cluster Centre The centre of the cluster i (v i ) is a l characteristics vector. The centre of the cluster i (v i ) is a l characteristics vector. The jth co-ordinate is calculated as The jth co-ordinate is calculated as

Adriano Cruz *NCE e IM - UFRJ Cluster 48 Detailed Algorithm Choose c (number of clusters). Choose c (number of clusters). Set error ( > 0), nebulisation factor (m) and step (r=0). Set error ( > 0), nebulisation factor (m) and step (r=0). Arbitrarily set matrix U (r). Do not forget, each element belongs to a single cluster, no empty cluster and no cluster has all elements. Arbitrarily set matrix U (r). Do not forget, each element belongs to a single cluster, no empty cluster and no cluster has all elements.

Adriano Cruz *NCE e IM - UFRJ Cluster 49 Detailed Algorithm cont. Repeat Calculate the centre of the clusters v i (r) Calculate the distance d i (r) of each point to the centre of the clusters Generate U (r+1) recalculating all characteristic functions(How?) Until ||U (r+1) -U (r) || < Until ||U (r+1) -U (r) || <

Adriano Cruz *NCE e IM - UFRJ Cluster 50 How to recalculate? If there is any distance greater than zero then membership grade is the weighted average of the distances to all centers. If there is any distance greater than zero then membership grade is the weighted average of the distances to all centers. else the element belongs to this cluster and no one else.

Adriano Cruz *NCE e IM - UFRJ Cluster 51 Example of clustering result

Adriano Cruz *NCE e IM - UFRJ Cluster 52 Example of clustering result

Adriano Cruz *NCE e IM - UFRJ Cluster 53 Possibilistic Clustering

Adriano Cruz *NCE e IM - UFRJ Cluster 54 Membership function

Adriano Cruz *NCE e IM - UFRJ Cluster 55 Membership function The membership degree is the representativity of typicality of the datum x for the cluster i. The membership degree is the representativity of typicality of the datum x for the cluster i.

Adriano Cruz *NCE e IM - UFRJ Cluster 56 Algorithm details The algorithm tries to minimise the function, m is the nebulisation factor. The algorithm tries to minimise the function, m is the nebulisation factor. The first sum is the usual and the second rewards high memberships. The first sum is the usual and the second rewards high memberships.

Adriano Cruz *NCE e IM - UFRJ Cluster 57 How to calculate?

Adriano Cruz *NCE e IM - UFRJ Cluster 58 Detailed Algorithm Choose c (number of clusters) and m. Choose c (number of clusters) and m. Set error ( > 0) and step (r=0). Set error ( > 0) and step (r=0). Execute FCM Execute FCM

Adriano Cruz *NCE e IM - UFRJ Cluster 59 Detailed Algorithm cont. For 2 times Initialize U (0) and the centre of the clusters v i (0) with previous results Initialize k and r=0 Repeat Calculate the distance d i (r) of each point to the centre of the clusters Generate U (r+1) recalculating all characteristic functions using the equations Until ||U (r+1) -U (r) || < Until ||U (r+1) -U (r) || < End FOR

Adriano Cruz *NCE e IM - UFRJ Cluster 60 Gustafson-Kessel Algorithm

Adriano Cruz *NCE e IM - UFRJ Cluster 61 Gustafson-Kessel method This method (GK) is fuzzy clustering method similar to the Fuzzy C-means (FCM). This method (GK) is fuzzy clustering method similar to the Fuzzy C-means (FCM). The difference is the way the distance is calculated. The difference is the way the distance is calculated. FCM uses Euclidean distances FCM uses Euclidean distances GK uses Mahalanobis distances GK uses Mahalanobis distances

Adriano Cruz *NCE e IM - UFRJ Cluster 62 Gustafson-Kessel method Mahalobis distance is calculated as Mahalobis distance is calculated as The matrices A i are given by The matrices A i are given by

Adriano Cruz *NCE e IM - UFRJ Cluster 63 Gustafson-Kessel method The Fuzzy Covariance Matrix is The Fuzzy Covariance Matrix is

Adriano Cruz *NCE e IM - UFRJ Cluster 64 GK comments The clusters are hyperellipsoids on the l. The clusters are hyperellipsoids on the l. The hyperellipsoids have aproximately the same size. The hyperellipsoids have aproximately the same size. In order to be possible to calculate S -1 the number of samples n must be at least equal to the number of dimensions l plus 1. In order to be possible to calculate S -1 the number of samples n must be at least equal to the number of dimensions l plus 1.

Adriano Cruz *NCE e IM - UFRJ Cluster 65 Results of GK

Adriano Cruz *NCE e IM - UFRJ Cluster 66 Gath-Geva method It is also known as Gaussian Mixture Decomposition. It is also known as Gaussian Mixture Decomposition. It is similar to the FCM method It is similar to the FCM method The Gauss distance is used instead of Euclidean distance. The Gauss distance is used instead of Euclidean distance. The clusters do not have a definite shape anymore and have various sizes. The clusters do not have a definite shape anymore and have various sizes.

Adriano Cruz *NCE e IM - UFRJ Cluster 67 Gath-Geva Method Gauss distance is given by Gauss distance is given by A i =S i -1 A i =S i -1

Adriano Cruz *NCE e IM - UFRJ Cluster 68 Gath-Geva Method The term P i is the probability of a sample belong to a cluster. The term P i is the probability of a sample belong to a cluster.

Adriano Cruz *NCE e IM - UFRJ Cluster 69 Gath-Geva Comments Pi is a parameter that influences the size of a cluster. Pi is a parameter that influences the size of a cluster. Bigger clusters attract more elements. Bigger clusters attract more elements. The exponential term makes more difficult to avoid local minima. The exponential term makes more difficult to avoid local minima. Usually another clustering method is used to initialise the partition matrix U. Usually another clustering method is used to initialise the partition matrix U.

Adriano Cruz *NCE e IM - UFRJ Cluster 70 GG Results – Random Centers

Adriano Cruz *NCE e IM - UFRJ Cluster 71 GG Results – Centers FCM

Adriano Cruz *NCE e IM - UFRJ Cluster 72 Clustering based on Equivalence Relations A relation crisp R on a universe X can be thought as a relation from X to X A relation crisp R on a universe X can be thought as a relation from X to X R is an equivalence relation if it has the following three properties: R is an equivalence relation if it has the following three properties: Reflexivity (x i, x i ) R Reflexivity (x i, x i ) R Symmetry (x i, x j ) R (x j, x i ) R Symmetry (x i, x j ) R (x j, x i ) R Transitivity (x i, x j ) R and (x j, x k ) R (x i, x k ) R Transitivity (x i, x j ) R and (x j, x k ) R (x i, x k ) R

Adriano Cruz *NCE e IM - UFRJ Cluster 73 Crisp tolerance relation R is a tolerance relation if it has the following two properties: R is a tolerance relation if it has the following two properties: Reflexivity (x i, x i ) R Reflexivity (x i, x i ) R Symmetry (x i, x j ) R (x j, x i ) R Symmetry (x i, x j ) R (x j, x i ) R

Adriano Cruz *NCE e IM - UFRJ Cluster 74 Composition of Relations XYZ RS T=R°S

Adriano Cruz *NCE e IM - UFRJ Cluster 75 Composition of Crisp Relations The operation ° is similar to matrix multiplication.

Adriano Cruz *NCE e IM - UFRJ Cluster 76 Transforming Relations A tolerance relation can be transformed into a equivalence relation by at most (n-1) compositions with itself. A tolerance relation can be transformed into a equivalence relation by at most (n-1) compositions with itself. n is the cardinality of the set R. n is the cardinality of the set R.

Adriano Cruz *NCE e IM - UFRJ Cluster 77 Example of crisp classification Let X={1,2,3,4,5,6,7,8,9,10} Let X={1,2,3,4,5,6,7,8,9,10} Let R be defined as the relation for the identical remainder after dividing each element by 3. Let R be defined as the relation for the identical remainder after dividing each element by 3. This relation is an equivalence relation This relation is an equivalence relation

Adriano Cruz *NCE e IM - UFRJ Cluster 78 Relation Matrix

Adriano Cruz *NCE e IM - UFRJ Cluster 79 Crisp Classification Consider equivalent columns. Consider equivalent columns. It is possible to group the elements in the following classes It is possible to group the elements in the following classes R 0 = {3, 6, 9} R 0 = {3, 6, 9} R 1 = {1, 4, 7, 10} R 1 = {1, 4, 7, 10} R 2 = {2, 5, 8} R 2 = {2, 5, 8}

Adriano Cruz *NCE e IM - UFRJ Cluster 80 Clustering and Fuzzy Equivalence Relations A relation fuzzy R on a universe X can be thought as a relation from X to X A relation fuzzy R on a universe X can be thought as a relation from X to X R is an equivalence relation if it has the following three properties: R is an equivalence relation if it has the following three properties: Reflexivity: (x i, x i ) R or (x i, x i ) = 1 Reflexivity: (x i, x i ) R or (x i, x i ) = 1 Symmetry: (x i, x j ) R (x j, x i ) R or Symmetry: (x i, x j ) R (x j, x i ) R or (x i, x j ) = (x j, x i ) (x i, x j ) = (x j, x i ) Transitivity: (x i, x j ) and (x j, x k ) R (x i, x k ) R or if (x i, x j ) = 1 and (x j, x k ) = 2 then (x i, x k ) = and >=min( 1, 2 ) Transitivity: (x i, x j ) and (x j, x k ) R (x i, x k ) R or if (x i, x j ) = 1 and (x j, x k ) = 2 then (x i, x k ) = and >=min( 1, 2 )

Adriano Cruz *NCE e IM - UFRJ Cluster 81 Fuzzy tolerance relation R is a tolerance relation if it has the following two properties: R is a tolerance relation if it has the following two properties: Reflexivity (x i, x i ) R Reflexivity (x i, x i ) R Symmetry (x i, x j ) R (x j, x i ) R Symmetry (x i, x j ) R (x j, x i ) R

Adriano Cruz *NCE e IM - UFRJ Cluster 82 Composition of Fuzzy Relations The operation ° is similar to matrix multiplication.

Adriano Cruz *NCE e IM - UFRJ Cluster 83 Distance Relation Let X be a set of data on l. Let X be a set of data on l. The distance function is a tolerance relation that can be transformed into a equivalence. The distance function is a tolerance relation that can be transformed into a equivalence. The relation R can be defined by the Minkowski distance formula. The relation R can be defined by the Minkowski distance formula. is a constant that ensures that R [0,1] and is equal to the inverse of the largest distance in X. is a constant that ensures that R [0,1] and is equal to the inverse of the largest distance in X.

Adriano Cruz *NCE e IM - UFRJ Cluster 84 Example of Fuzzy classification Let X={(0,0),(1,1),(2,3),(3,1),(4,0)} be a set of points in 2. Let X={(0,0),(1,1),(2,3),(3,1),(4,0)} be a set of points in 2. Set q=2, Euclidean distances. Set q=2, Euclidean distances. The largest distance is 4 ( x 1, x 5 ), so =0.25. The largest distance is 4 ( x 1, x 5 ), so =0.25. The relation R can be calculated by the equation The relation R can be calculated by the equation

Adriano Cruz *NCE e IM - UFRJ Cluster 85 Points to be classified

Adriano Cruz *NCE e IM - UFRJ Cluster 86 Tolerance matrix The matrix calculated by the equation is The matrix calculated by the equation is The is a tolerance relation that needs to be transformed into a equivalence relation The is a tolerance relation that needs to be transformed into a equivalence relation

Adriano Cruz *NCE e IM - UFRJ Cluster 87 Equivalence matrix The matrix transformed is The matrix transformed is

Adriano Cruz *NCE e IM - UFRJ Cluster 88 Results of clustering Taking -cuts of fuzzy equivalent relation at various values of =0.44, 0.5, 0.65 and 1.0 we get the following classes: Taking -cuts of fuzzy equivalent relation at various values of =0.44, 0.5, 0.65 and 1.0 we get the following classes: R.44 =[{x 1,x 2,x 3,x 4,x 5 }] R.44 =[{x 1,x 2,x 3,x 4,x 5 }] R.55 =[{x 1,x 2,x 4,x 5 }{x 3 }] R.55 =[{x 1,x 2,x 4,x 5 }{x 3 }] R.65 =[{x 1,x 2 },{x 3 },{x 4,x 5 }] R.65 =[{x 1,x 2 },{x 3 },{x 4,x 5 }] R 1.0 =[{x 1 },{x 2 },{x 3 },{x 4 },{x 5 }] R 1.0 =[{x 1 },{x 2 },{x 3 },{x 4 },{x 5 }]