SEL3053: Analyzing Geordie Lecture 13. Hierarchical cluster analysis 2 - cluster tree construction


Lecture 12 introduced hierarchical cluster analysis and observed that construction of a hierarchical cluster tree is a two-step process: creation of a vector-distance table, and construction of the tree on the basis of that table. It outlined the first of these two steps. This lecture deals with the second step.

There are two main ways of constructing a cluster tree, which the literature on the subject generally refers to as 'top-down' and 'bottom-up'. These terms won't be explained here since an explanation would take us too far afield. Suffice it to say that this module confines itself to the 'bottom-up' approach, and that nothing further is said about 'top-down'.

As noted, construction of a cluster tree for a data matrix is based on the distance table abstracted from the matrix. What follows uses the distance table constructed in the last lecture, but only a 6 x 6 subset of the original 30 x 30 table. This makes it possible to show the whole table rather than just a fragment, thereby making the discussion clearer.

To further simplify the presentation, it is observed that the table is symmetrical on either side of the diagonal of zero-values. This is because the distance between any pair of vectors is the same in either direction: the distance between vector 2 and vector 3 is the same as that between vector 3 and vector 2. Since the upper-right triangle simply duplicates the lower-left triangle, one of the two can be deleted without losing any information; the upper-right one is deleted.
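The symmetry argument can be checked with a short Python sketch. The table itself is not reproduced in this transcript, so the values below are reconstructed from the pairwise distances quoted in the walkthrough that follows. The names `PAIRS`, `full_table` and `lower_triangle` are illustrative and not part of the lecture.

```python
# Pairwise distances for vectors 1-6, collected from the values quoted in
# this lecture. The (2, 6) entry is inferred from the stated minimum 46.87.
PAIRS = {
    (1, 2): 2.83, (1, 3): 5.00, (1, 4): 7.07, (1, 5): 10.63, (1, 6): 46.40,
    (2, 3): 2.24, (2, 4): 4.24, (2, 5): 7.81, (2, 6): 46.87,
    (3, 4): 2.25, (3, 5): 5.66, (3, 6): 48.02,
    (4, 5): 3.61, (4, 6): 47.89,
    (5, 6): 49.66,
}


def full_table(pairs, n=6):
    """Build the symmetric n x n table with zeros on the diagonal."""
    table = [[0.0] * n for _ in range(n)]
    for (i, j), d in pairs.items():
        table[i - 1][j - 1] = d
        table[j - 1][i - 1] = d
    return table


def lower_triangle(table):
    """Keep the diagonal and everything below it; row i has i + 1 entries."""
    return [row[: i + 1] for i, row in enumerate(table)]


table = full_table(PAIRS)

# The distance is the same in either direction, so the table is symmetric.
assert all(table[i][j] == table[j][i] for i in range(6) for j in range(6))

# Print the lower-left triangle, one row per vector.
for row in lower_triangle(table):
    print("  ".join(f"{d:6.2f}" for d in row))
```

Dropping the upper-right triangle leaves 15 distinct distances for 6 vectors, which is all the tree construction needs.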

A cluster tree for the first 6 rows of the original data matrix will now be constructed step-by-step, showing how the distance table is used to do this. The procedure is based on the principle that a set of vectors has a cluster structure if it can be divided into two or more groups in which the members of any given group are close to one another in the data space, and far from members of other clusters in the space. At each step in tree construction, therefore, one looks for the clusters that are closest to one another and amalgamates them into a superordinate cluster, and this continues until all the vectors have been assigned to one of the clusters. The following example will demonstrate this.

Step 1. Initially, each vector is taken to be a cluster on its own, that is, a cluster with only one member. The distance table is now searched to find the smallest distance between clusters. This is the distance between clusters (2) and (3): 2.24. Clusters (2) and (3) are now combined into a superordinate cluster (2,3) by drawing the tree, as below, and then emending the distance table to incorporate the new cluster.
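The search for the smallest distance can be sketched in a few lines of Python. The dictionary `PAIRS` is an assumed representation of the lower-left triangle, filled with the distances quoted in this lecture; it is not the lecture's own notation.

```python
# Lower-left triangle of the 6 x 6 distance table, keyed by vector pair.
# Values are those quoted in the lecture; (2, 6) is inferred from the text.
PAIRS = {
    (1, 2): 2.83, (1, 3): 5.00, (1, 4): 7.07, (1, 5): 10.63, (1, 6): 46.40,
    (2, 3): 2.24, (2, 4): 4.24, (2, 5): 7.81, (2, 6): 46.87,
    (3, 4): 2.25, (3, 5): 5.66, (3, 6): 48.02,
    (4, 5): 3.61, (4, 6): 47.89,
    (5, 6): 49.66,
}

# Scan every pair and keep the one with the smallest distance.
closest_pair = min(PAIRS, key=PAIRS.get)
print(closest_pair, PAIRS[closest_pair])
```

With these values the closest pair is (2, 3) at 2.24, narrowly ahead of (3, 4) at 2.25.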

Emendation of the distance table takes a bit of understanding, so it is described in detail. Remove the rows and columns 2 and 3 from the table, and replace them with a single blank row and column to represent the new (2,3) cluster. Note that 0 is inserted as the distance between (2,3) and itself for the self-evident reason that the distance of any object to itself is always 0.

Insert into the blank cells of the (2,3) row and column the minimum distance from (2,3) to the remaining clusters (1), (4), (5), and (6). What does this mean? Referring to the original distance table above, the distance between (2) and (1) is 2.83 and between (3) and (1) it is 5.00; the minimum here is 2.83, and it is inserted into the relevant cell:

The distance between (2) and (4) in the original distance table is 4.24 and between (3) and (4) it is 2.25; the minimum here is 2.25, and it is inserted into the relevant cell.

The distance between (2) and (5) in the original distance table is 7.81 and between (3) and (5) it is 5.66; the minimum here is 5.66, and it is inserted into the relevant cell.

The distance between (2) and (6) in the original distance table is 46.87 and between (3) and (6) it is 48.02; the minimum here is 46.87, and it is inserted into the relevant cell. Emendation of the distance table is now complete, and the result is the basis for the next step in the construction of the cluster tree. Note that the table has shrunk by one row/column. This shrinkage will continue as we proceed.
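The emendation just described can be sketched as a small Python function. This is one possible encoding rather than the lecture's own: clusters are integers or nested tuples, each table cell is keyed by an unordered pair (`frozenset`) of clusters, and the function name `merge` is hypothetical.

```python
# Distances quoted in the lecture; (2, 6) is inferred from the stated minimum.
PAIRS = {
    (1, 2): 2.83, (1, 3): 5.00, (1, 4): 7.07, (1, 5): 10.63, (1, 6): 46.40,
    (2, 3): 2.24, (2, 4): 4.24, (2, 5): 7.81, (2, 6): 46.87,
    (3, 4): 2.25, (3, 5): 5.66, (3, 6): 48.02,
    (4, 5): 3.61, (4, 6): 47.89,
    (5, 6): 49.66,
}


def merge(dist, a, b):
    """Return a new table in which clusters a and b are replaced by (a, b)."""
    new = (a, b)
    # Every cluster in the table other than the two being merged.
    others = {c for pair in dist for c in pair} - {a, b}
    # Drop all cells in the rows/columns of a and b.
    out = {pair: d for pair, d in dist.items() if a not in pair and b not in pair}
    # Fill the new row/column with the minimum of the two old distances.
    for c in others:
        out[frozenset((new, c))] = min(dist[frozenset((a, c))],
                                       dist[frozenset((b, c))])
    return out


table0 = {frozenset(pair): d for pair, d in PAIRS.items()}
table1 = merge(table0, 2, 3)

for c in (1, 4, 5, 6):
    print(f"(2,3) to ({c}): {table1[frozenset(((2, 3), c))]:.2f}")
```

The new table holds 10 distances instead of 15, which corresponds to the loss of one row/column. The zero distance from (2,3) to itself is implicit in this encoding, since only distances between different clusters are stored.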

Step 2. The distance table created in Step 1 is searched to find the smallest distance between clusters. This is the distance between clusters (2,3) and (4): 2.25. Clusters (2,3) and (4) are now combined into a superordinate cluster ((2,3),4) by drawing the tree, as below, and then emending the distance table to incorporate the new cluster. Emendation of the distance table proceeds as in Step 1.

Remove the rows and columns (2,3) and 4 from the table, and replace them with a single blank row and column to represent the new ((2,3),4) cluster.

Insert into the blank cells of the ((2,3),4) row and column the minimum distance from ((2,3),4) to the remaining clusters (1), (5), and (6). The distance between (2,3) and (1) is 2.83 and between (4) and (1) it is 7.07; the minimum here is 2.83, and it is inserted into the relevant cell.

The distance between (2,3) and (5) is 5.66 and between (4) and (5) it is 3.61; the minimum here is 3.61, and it is inserted into the relevant cell.

The distance between (2,3) and (6) is 46.87 and between (4) and (6) it is 47.89; the minimum here is 46.87, and it is inserted into the relevant cell. Emendation of the distance table is now complete, and the result is the basis for Step 3 below. Note that the table has again shrunk by one row/column.

Step 3. The distance table created in Step 2 is searched to find the smallest distance between clusters. This is the distance between clusters ((2,3),4) and (1): 2.83. Clusters ((2,3),4) and (1) are now combined into a superordinate cluster (((2,3),4),1) by drawing the tree, as below, and then emending the distance table to incorporate the new cluster. Emendation of the distance table proceeds as in Steps 1 and 2.

Remove the rows and columns ((2,3),4) and 1 from the table, and replace them with a single blank row and column to represent the new (((2,3),4),1) cluster.

Insert into the blank cells of the (((2,3),4),1) column the minimum distance from (((2,3),4),1) to the remaining clusters (5) and (6). The distance between ((2,3),4) and (5) is 3.61 and between (1) and (5) it is 10.63; the minimum here is 3.61, and it is inserted into the relevant cell.

The distance between ((2,3),4) and (6) is 46.87 and between (1) and (6) it is 46.40; the minimum here is 46.40, and it is inserted into the relevant cell. Emendation of the distance table is now complete, and the result is the basis for Step 4 below. Note that the table has again shrunk by one row/column.

Step 4. The distance table created in Step 3 is searched to find the smallest distance between clusters. This is the distance between clusters (((2,3),4),1) and (5): 3.61. Clusters (((2,3),4),1) and (5) are now combined into a superordinate cluster ((((2,3),4),1),5) by drawing the tree and then emending the distance table to incorporate the new cluster. Emendation of the distance table proceeds as in Steps 1-3.

Remove the rows and columns (((2,3),4),1) and 5 from the table, and replace them with a single blank row and column to represent the new ((((2,3),4),1),5) cluster.

Insert into the blank cell of the ((((2,3),4),1),5) column the minimum distance from ((((2,3),4),1),5) to the remaining cluster (6). The distance between (((2,3),4),1) and (6) in Table 4 is 46.40 and between (5) and (6) it is 49.66; the minimum here is 46.40, and it is inserted into the relevant cell.

Step 5. The distance table created in Step 4 is searched to find the smallest distance between clusters. There is only one remaining value: 46.40, the distance between ((((2,3),4),1),5) and (6). Clusters ((((2,3),4),1),5) and (6) are now combined into a superordinate cluster (((((2,3),4),1),5),6) by drawing the tree and then emending the distance table to incorporate the new cluster.

Remove the rows and columns ((((2,3),4),1),5) and 6 from the table, and replace them with a single blank row and column to represent the new (((((2,3),4),1),5),6) cluster. All 6 vectors have now been incorporated into the cluster tree, and tree construction stops.
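The whole procedure (find the smallest distance, merge, emend the table, repeat) can be put together in one short Python sketch. It is an illustration under stated assumptions, not the module's own software. Clusters are encoded as nested tuples, the helper names `members` and `build_tree` are hypothetical, and the distances are those quoted in this lecture. The minimum rule used for emendation is what the clustering literature generally calls single linkage, or nearest neighbour, a term the lecture itself does not use.

```python
# Distances quoted in the lecture; (2, 6) is inferred from the stated minimum.
PAIRS = {
    (1, 2): 2.83, (1, 3): 5.00, (1, 4): 7.07, (1, 5): 10.63, (1, 6): 46.40,
    (2, 3): 2.24, (2, 4): 4.24, (2, 5): 7.81, (2, 6): 46.87,
    (3, 4): 2.25, (3, 5): 5.66, (3, 6): 48.02,
    (4, 5): 3.61, (4, 6): 47.89,
    (5, 6): 49.66,
}


def members(cluster):
    """Flatten a nested cluster label into a list of vector numbers."""
    if isinstance(cluster, int):
        return [cluster]
    return members(cluster[0]) + members(cluster[1])


def build_tree(pairs):
    """Merge the two closest clusters repeatedly until one cluster remains.

    Returns a list of (new_cluster, distance) tuples, one per step.
    """
    dist = {frozenset(pair): d for pair, d in pairs.items()}
    merges = []
    while dist:
        # Search the table for the smallest distance between clusters.
        pair = min(dist, key=dist.get)
        height = dist[pair]
        # Write the larger cluster first, as in the lecture's notation.
        a, b = sorted(pair, key=lambda c: (-len(members(c)), min(members(c))))
        new = (a, b)
        # Emend the table: drop rows/columns a and b, add one for (a, b).
        others = {c for p in dist for c in p} - {a, b}
        kept = {p: d for p, d in dist.items() if a not in p and b not in p}
        for c in others:
            kept[frozenset((new, c))] = min(dist[frozenset((a, c))],
                                            dist[frozenset((b, c))])
        dist = kept
        merges.append((new, height))
    return merges


merges = build_tree(PAIRS)
for step, (cluster, height) in enumerate(merges, start=1):
    print(f"Step {step}: merge at {height:.2f} -> {cluster}")
```

Run on the six vectors used here, the sketch merges at distances 2.24, 2.25, 2.83, 3.61 and 46.40 and ends with the single cluster (((((2,3),4),1),5),6). The sketch does not handle ties between equal smallest distances in any principled way, since none occur in this table.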