Tree Clustering & COBWEB

Remember: k-Means Clustering

k-Means Example (K=2): pick seeds → reassign clusters → compute centroids → reassign clusters → compute centroids → reassign clusters → converged!
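As a reminder of how those steps look in code, here is a minimal k-means sketch (NumPy only; the two-blob data, K = 2, and the fixed iteration count are illustrative assumptions, and there is no empty-cluster guard):

    import numpy as np

    def k_means(X, k, n_iter=20, seed=0):
        # Lloyd's algorithm: alternately reassign points to the nearest
        # centroid and recompute each centroid as its cluster mean.
        rng = np.random.default_rng(seed)
        centroids = X[rng.choice(len(X), size=k, replace=False)]  # pick seeds
        for _ in range(n_iter):
            dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
            labels = dists.argmin(axis=1)                  # reassign clusters
            centroids = np.array([X[labels == j].mean(axis=0)
                                  for j in range(k)])      # compute centroids
        return labels, centroids

    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
    labels, centroids = k_means(X, k=2)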

EM-Clustering

Outline:
Tree clustering
Linkage rules
Conceptual clustering: COBWEB
Category utility

Tree Clustering
Tree clustering algorithms allow us to reveal the internal similarities of a given pattern set and to structure these similarities hierarchically. They are typically applied to a small set of typical patterns. For n patterns, such an algorithm generates a sequence of 1 to n clusters.

The sequence of 1 to n clusters has the form of a binary tree (two branches at each tree node). The tree can be built in two ways: bottom-up, by a merging algorithm that starts with the individual patterns, or top-down, by a splitting algorithm that starts with a single cluster composed of all patterns.

Merging Algorithm
Given n patterns x_i, initially consider k = n singleton clusters C_i = {x_i}   /* every cluster has only one element */
while k > 1 do {
    determine the two nearest clusters C_i and C_j using an appropriate similarity rule;
    merge C_i and C_j: C_ij = {C_i, C_j}, obtaining a solution with k-1 clusters;
    k = k - 1;
}
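A direct Python transcription of this merging loop (a sketch: the brute-force O(n^3) search, Euclidean distance, and the complete_link rule defined below are illustrative choices; linkage rules are discussed on the following slides):

    import numpy as np
    from itertools import combinations

    def merge_clustering(X, cluster_dist):
        # k = n singleton clusters, each holding one pattern index.
        clusters = {i: [i] for i in range(len(X))}
        merges = []
        while len(clusters) > 1:                     # while k > 1
            # Determine the two nearest clusters under the linkage rule.
            i, j = min(combinations(clusters, 2),
                       key=lambda p: cluster_dist(X[clusters[p[0]]],
                                                  X[clusters[p[1]]]))
            merges.append((clusters[i], clusters[j]))
            clusters[i] = clusters[i] + clusters[j]  # merge C_i and C_j
            del clusters[j]                          # k = k - 1
        return merges

    # Complete linkage: distance of the furthest pair, one from each cluster.
    def complete_link(A, B):
        return max(np.linalg.norm(a - b) for a in A for b in B)

    X = np.random.default_rng(0).normal(size=(8, 2))
    for step in merge_clustering(X, complete_link):
        print(step)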

The determination of the nearest clusters depends on two choices: the similarity measure between patterns, and the rule used to assess the similarity of clusters.

Example: the similarity between two clusters is assessed by measuring the similarity of the furthest pair of patterns (one from each cluster). This is the so-called complete-linkage rule.

As the merging process evolves, the similarity of the merged clusters decreases

A schedule graph may be of help for selecting the best solution; the best solution corresponds to a plateau before a high jump in the merge distance. Solutions with very small or even singleton clusters are rather suspicious.

Linkage rules
Complete linkage (FN, furthest neighbor) evaluates the dissimilarity between two clusters as the greatest distance between any two patterns, one from each cluster. This rule performs well when the clusters are compact and of equal size, but is inadequate for filamentary clusters.

Complete Link Example

Single linkage (NN, nearest neighbor) evaluates the dissimilarity between two clusters as the dissimilarity of the nearest pair of patterns, one from each cluster. It produces a chaining effect and works well with filamentary shapes.

Figure: (a) globular data; (b) filamentary data.

Single Link Example

Average linkage between groups, also known as UPGMA (unweighted pair-group method using arithmetic averages), assesses the distance between two clusters as the average of the distances between all pairs of patterns, one from each cluster.

Impact of cluster distance measures:
Single-link: inter-cluster distance = distance between the closest pair of points
Complete-link: inter-cluster distance = distance between the farthest pair of points
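These linkage rules are available off the shelf; a short sketch using SciPy's hierarchical-clustering routines (the two-blob data and the cut into 2 clusters are assumptions for illustration):

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(5, 1, (20, 2))])

    # Build the merge tree under each linkage rule discussed above.
    for method in ("single", "complete", "average"):     # NN, FN, UPGMA
        Z = linkage(X, method=method)                    # (n-1) x 4 merge schedule
        labels = fcluster(Z, t=2, criterion="maxclust")  # cut tree into 2 clusters
        print(method, labels)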

Conceptual Clustering - COBWEB
Conceptual clustering begins with a collection of unclassified objects and some means of measuring the similarity of objects. Numeric taxonomy: objects are represented as collections of features, each of which may have a numerical value, so that a distance function can treat an object as a vector of n features.

A bird is defined by the following features: flies, sings, lays eggs, nests in trees, eats insects. A bat is defined by the following features: flies, gives milk, eats insects.

Humans distinguish degrees of category membership: we generally think of a robin as a better example of a bird than a chicken, and of an oak as a more typical example of a tree than a palm.

Family resemblance theory (Wittgenstein 1953): categories are defined by a complex system of similarities between members; a category need not have any property shared by all of its members. Games:
Not all games require two or more players (solitaire/patience)
Not all games are fun for the players (football)
Not all games involve competition (jumping rope)
Yet the game category is well defined.

Logic, feature vectors, and decision trees do not account for these effects. COBWEB (Fisher 1987) addresses these issues: it models base-level categorization and degrees of category membership, represents categories probabilistically instead of defining membership by a set of values that must be present, and builds up a hierarchy (tree) of categories.

COBWEB represents the probability with which each feature value is present in an object: p(f_i = v_ij | c_k) is the conditional probability that feature f_i has value v_ij, given that the object is in category c_k.

Example: COBWEB forms a taxonomy (tree, hierarchy) of categories. Here, the task is the categorization of four single-celled animals.

Each animal is defined by a number of features: number of tails, color, and number of nuclei. For example, members of category C3 have a 1.0 probability of having 2 tails, a 0.5 probability of having light color, and a 1.0 probability of having 2 nuclei.

When given a new example, COBWEB considers the overall quality of either placing the example in an existing category or modifying the hierarchy. The criterion COBWEB uses for evaluating the quality of a classification is called category utility.

Category utility was developed in research on human categorization (Gluck and Corter 1985). It attempts to maximize both the probability that two objects in the same category have feature values in common and the probability that objects in different categories have different property values.

Category utility:

CU = Σ_k Σ_i Σ_j p(f_i = v_ij) p(c_k | f_i = v_ij) p(f_i = v_ij | c_k)

The sum is taken across all categories c_k, all features f_i, and all feature values v_ij.

p(f_i = v_ij | c_k) is called predictability: the probability that an object has value v_ij for feature f_i, given that the object belongs to category c_k. The higher this probability, the more likely two objects in a category share the same feature values.
p(c_k | f_i = v_ij) is called predictiveness: the probability that an object belongs to category c_k, given that it has value v_ij for feature f_i. The greater this probability, the less likely objects outside the category will have those values.
p(f_i = v_ij) serves as a weight: frequent feature values exert a stronger influence.

By combining these values, high category utility measures indicate a high likelihood that objects in the same category will share properties, while decreasing the likelihood of objects in different categories having properties in common
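A small sketch of this computation in Python, assuming a flat partition of objects into categories and discrete features stored as tuples (the data layout, the helper name, and the example values are assumptions, echoing the tails/color/nuclei features above):

    from collections import Counter

    def category_utility(categories):
        # categories: a partition, each category a list of feature tuples.
        objects = [obj for cat in categories for obj in cat]
        n = len(objects)
        n_features = len(objects[0])
        cu = 0.0
        for i in range(n_features):
            overall = Counter(obj[i] for obj in objects)  # counts of each v_ij
            for cat in categories:
                within = Counter(obj[i] for obj in cat)
                for v, c in within.items():
                    weight = overall[v] / n           # p(f_i = v_ij)
                    predictiveness = c / overall[v]   # p(c_k | f_i = v_ij)
                    predictability = c / len(cat)     # p(f_i = v_ij | c_k)
                    cu += weight * predictiveness * predictability
        return cu

    # Features: (number of tails, color, number of nuclei) -- values invented.
    cats = [[(1, "light", 1), (1, "dark", 1)],
            [(2, "light", 2), (2, "dark", 2)]]
    print(category_utility(cats))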

COBWEB performs a hill-climbing search of the space of possible taxonomies (trees) using category utility to evaluate and select possible categorizations

COBWEB initializes the taxonomy to a single category whose features are those of the first example. For each subsequent example, the algorithm begins at the root category and moves through the tree. At each level it uses category utility to evaluate the taxonomies that result from:
1. Placing the example in the best existing category
2. Adding a new category containing the example
3. Merging two existing categories and adding the example to the merged category
4. Splitting an existing category and placing the example in the best category of the resulting tree
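A simplified, runnable sketch of this search in Python. It implements only operations 1 and 2 (placing the example in the best existing category vs. creating a new one) and hill-climbs on the category utility defined above; merging and splitting are omitted and the Node layout is an assumption, so treat this as an outline of the search rather than a full COBWEB:

    from collections import Counter, defaultdict

    class Node:
        # A concept: object count, per-feature value counts, child concepts.
        def __init__(self, obj=None):
            self.count = 0
            self.counts = defaultdict(Counter)  # feature index -> value -> count
            self.children = []
            if obj is not None:
                self.add(obj)

        def add(self, obj):
            self.count += 1
            for i, v in enumerate(obj):
                self.counts[i][v] += 1

        def remove(self, obj):
            self.count -= 1
            for i, v in enumerate(obj):
                self.counts[i][v] -= 1

    def partition_cu(children):
        # CU = sum_k sum_i sum_j p(v) p(c_k|v) p(v|c_k); using
        # p(v) * p(c_k|v) = p(v, c_k) = c/n avoids divisions by zero.
        n = sum(ch.count for ch in children)
        return sum((c / n) * (c / ch.count)
                   for ch in children
                   for ctr in ch.counts.values()
                   for c in ctr.values())

    def insert(node, obj):
        if not node.children:                  # a leaf concept
            if node.count > 0:                 # split into old concept + new example
                old = Node()
                old.count = node.count
                for i, ctr in node.counts.items():
                    old.counts[i] = Counter(ctr)
                node.children = [old, Node(obj)]
            node.add(obj)
            return
        node.add(obj)
        # Evaluate obj in each existing child vs. obj in a brand-new child.
        scores = []
        for child in node.children:
            child.add(obj)
            scores.append((partition_cu(node.children), child))
            child.remove(obj)
        new_child = Node(obj)
        scores.append((partition_cu(node.children + [new_child]), None))
        _, best = max(scores, key=lambda s: s[0])
        if best is None:
            node.children.append(new_child)    # operation 2: new category
        else:
            insert(best, obj)                  # operation 1: descend into best category

    root = Node()
    for animal in [(1, "light", 1), (2, "light", 2), (1, "dark", 1), (2, "dark", 2)]:
        insert(root, animal)
    print(len(root.children), "top-level categories")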

COBWEB is efficient and produces trees with a reasonable number of classes. Because it allows probabilistic membership, its categories are flexible and robust.

Summary:
Tree clustering
Linkage rules
Conceptual clustering: COBWEB
Category utility

Assessment Cluster validation