Hierarchical and K-means Clustering

Presentation on theme: "Hierarchical and K-means Clustering"— Presentation transcript:

1 Hierarchical and K-means Clustering

2 “closest” Build a tree

3

4

5

6 End up with a large binary tree (dendrogram)

7 Hierarchical Clustering
How to measure “closeness”? Each cluster is a set of points, so it is not clear how to define the distance between two clusters

8 Measuring Closeness
Complete-Linkage: use the greatest distance from any member of one cluster to any member of the other cluster
Compute the distance for all pairs and choose the pair with the maximum distance

9 Complete Linkage
Similarity of two clusters based on their least similar members
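A minimal sketch of the complete-linkage distance described above, assuming points are coordinate tuples and Euclidean distance is the underlying metric (the slides do not fix a metric). The function name `complete_linkage` is chosen here for illustration only.

```python
import math

def complete_linkage(cluster_g, cluster_h):
    """Greatest pairwise Euclidean distance between members of two clusters."""
    return max(math.dist(p, q) for p in cluster_g for q in cluster_h)

g = [(0.0, 0.0), (1.0, 0.0)]
h = [(4.0, 0.0), (6.0, 0.0)]
print(complete_linkage(g, h))  # 6.0, the distance between (0, 0) and (6, 0)
```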

10 Complete Linkage - Crowding

11 Complete Linkage - Crowding
How it should be clustered

12 Complete Linkage - Crowding
Figure labels: p1 p2 p3 p4 p5

13 Complete Linkage - Crowding
Figure labels: p1 p2 p3 p4 p5
Already clustered
Result?

14 Complete Linkage - Crowding
Max distances:
p1 to p2: 3
p1 to p5: 7
p2 to p5: 4
Figure labels: p1 p2 p3 p4 p5
Already clustered

15 Complete Linkage - Crowding
Max distances:
p1 to p2: 3
p1 to p5: 7
p2 to p5: 4
Figure labels: p1 p2 p3 p4 p5, min

16 Complete Linkage - Crowding
Max distances:
p1 to p2: 3
p1 to p5: 7
p2 to p5: 4
Figure labels: p1 p2 p3 p4 p5, min

17 Measuring Closeness
Average-Linkage: use the average distance from any member of one cluster to any member of the other cluster
Compute the distance for all pairs and use the average:
d(G, H) = (1 / (N_G · N_H)) · Σ_{i in G} Σ_{j in H} d(i, j)
N_G, N_H: number of points in G, H

18 Average Linkage
Similarity of two clusters based on average similarity of members (not all lines shown)
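Average linkage can be sketched the same way: sum the distance over every cross-cluster pair and divide by the number of pairs. As before, Euclidean distance and the name `average_linkage` are assumptions made for this example.

```python
import math

def average_linkage(cluster_g, cluster_h):
    """Mean pairwise Euclidean distance between members of two clusters."""
    total = sum(math.dist(p, q) for p in cluster_g for q in cluster_h)
    return total / (len(cluster_g) * len(cluster_h))

g = [(0.0, 0.0), (1.0, 0.0)]
h = [(4.0, 0.0), (6.0, 0.0)]
print(average_linkage(g, h))  # (4 + 6 + 3 + 5) / 4 = 4.5
```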

19

20 Merge the two closest clusters
New distance?

21 dist(X to MI/TO) = min{dist(X to MI), dist(X to TO)}
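The update rule on this slide takes the minimum of the two old distances, which is the single-linkage rule. The sketch below applies it to a small distance table; the table layout (a dict keyed by `frozenset` pairs), the function name, and the numeric distances are all hypothetical and not taken from the slides.

```python
def merge_single_link(dist, a, b):
    """Merge clusters a and b; the new distance to any other cluster is the min of the old two."""
    merged = f"{a}/{b}"
    new_dist = {}
    for pair, d in dist.items():
        if pair == frozenset({a, b}):
            continue  # the merged pair no longer needs an entry
        if a in pair or b in pair:
            (other,) = pair - {a, b}
            key = frozenset({merged, other})
            new_dist[key] = min(d, new_dist.get(key, float("inf")))
        else:
            new_dist[pair] = d
    return new_dist

# Made-up distances, for illustration only.
dist = {
    frozenset({"MI", "TO"}): 1.0,
    frozenset({"MI", "X"}): 5.0,
    frozenset({"TO", "X"}): 4.0,
}
print(merge_single_link(dist, "MI", "TO"))  # {frozenset({'MI/TO', 'X'}): 4.0} (set element order may vary)
```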

22 Next merge?

23 Recompute distances Next merge?

24 Next merge?

25 One more merge

26 Final Dendrogram (Tree/Hierarchy)
How to split into clusters? Ex.: want 3 clusters

27 Final Dendrogram (Tree/Hierarchy)
How to split into clusters? Ex.: want 3 clusters
For k clusters, cut the k-1 longest links

28 Final Dendrogram (Tree/Hierarchy)
How to split into clusters? Ex.: want 3 clusters
For k clusters, cut the k-1 longest links

29 Final Dendrogram (Tree/Hierarchy)
How to split into clusters? Ex.: want 3 clusters
For k clusters, cut the k-1 longest links

30 Final Dendrogram (Tree/Hierarchy)
How to split into clusters? Ex.: want 3 clusters
For k clusters, cut the k-1 longest links
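One way to read "cut the k-1 longest links" in code: keep every link except the k-1 longest and collect the groups of points that remain connected. The sketch assumes the dendrogram is given as n-1 links of the form (point_a, point_b, length), where the two points are representatives of the clusters that link joined. That representation and the name `cut_into_k` are assumptions for illustration.

```python
def cut_into_k(n_points, links, k):
    """Drop the k-1 longest links and return the remaining connected groups."""
    parent = list(range(n_points))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Keeping the n-k shortest links is the same as cutting the k-1 longest.
    kept = sorted(links, key=lambda link: link[2])[: n_points - k]
    for a, b, _ in kept:
        parent[find(a)] = find(b)

    clusters = {}
    for p in range(n_points):
        clusters.setdefault(find(p), []).append(p)
    return sorted(clusters.values())

links = [(0, 1, 1.0), (2, 3, 1.5), (1, 2, 4.0), (3, 4, 6.0)]
print(cut_into_k(5, links, 3))  # [[0, 1], [2, 3], [4]]
```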

31 Final Dendrogram (Tree/Hierarchy)
Intuitively what does this mean?

32 Final Dendrogram (Tree/Hierarchy)
These are the 3 groups of cities that are closest to each other.

33 Hierarchical Clustering Algorithm
Compute distance matrix
Let each example be its own cluster
while (# clusters > 1):
    Merge the two closest clusters
    Update distance matrix
Run time? (Simple implementation)
Assume n data points, d dimensions
How many iterations?
First iteration: compute distance between all pairs: O(n^2 d)
All other iterations: compute distance between most recently created cluster and all other clusters: O(nd)
Total: O(n^2 d)
Figure labels: k=2, k=4, k=3

34 Hierarchical Clustering Algorithm (for any k)
Compute distance matrix
Let each example be its own cluster
while (# clusters > 1):
    Merge the two closest clusters
    Update distance matrix
Run time? (Simple implementation)
Assume n data points, d dimensions
Using min-heap: deletions take O(log n^2)
How many iterations?
First iteration: compute distance between all pairs: O(n^2 d)
All other iterations: compute distance between most recently created cluster and all other clusters: O(nd)
Total: O(n^2 d) → Slow (does not scale well)

35 Hierarchical Clustering Algorithm
Compute distance matrix
Let each example be its own cluster
while (# clusters > 1):
    Merge the two closest clusters
    Update distance matrix
Run time? (Simple implementation)
Assume n data points, d dimensions
Using min-heap: deletions take O(log n^2)
n - 1 iterations
First iteration: compute distance between all pairs: O(n^2 d)
All other iterations: compute distance between most recently created cluster and all other clusters: O(nd)
Total: O(n^2 d) → Slow (does not scale well)
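The loop above can be sketched in a few lines of Python. This is a deliberately naive version under stated assumptions: it recomputes the linkage between every pair of clusters on each pass instead of maintaining a distance matrix or min-heap, so it is slower than the bound discussed on the slide. The name `agglomerative`, the `linkage` parameter, and the option to stop at k clusters are choices made for this example.

```python
import math

def agglomerative(points, k=1, linkage=max):
    """Merge the two closest clusters until k remain.

    linkage=max gives complete linkage, linkage=min gives single linkage.
    Returns the clusters (lists of point indices) and the merge distances.
    """
    clusters = [[i] for i in range(len(points))]  # each example starts as its own cluster
    heights = []
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a (b > a, so index a stays valid)
        del clusters[b]
        heights.append(d)
    return clusters, heights

points = [(0, 0), (0, 1), (5, 0), (5, 1), (20, 0)]
clusters, heights = agglomerative(points, k=2)
print(clusters)  # [[0, 1, 2, 3], [4]]
```

Running to k=1 and recording `heights` yields the information needed to draw the dendrogram.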

36 Hierarchical Clustering Pros/Cons
Pros: Simple; do not need to know k ahead of time
Cons: Sensitive to noise; slow (does not scale well); does not “learn” (can’t undo any steps)

37 K-Means Clustering
Input: data points; k (number of desired clusters)

38 K-Means Clustering
Choose k random points (“means”)

39 K-Means Clustering
Assign each data point to closest mean.

40 K-Means Clustering
Assign each data point to closest mean.
How can we adjust the means to be closer to their data points?

41 K-Means Clustering
Adjust each mean to be the center of its data points.
Reassign data points to closest means.

42 K-Means Clustering
Repeat until means no longer move.
(Convergence guaranteed, though possibly to a local optimum)

43 K-Means Algorithm Assume n data points, k clusters desired
Choose k random points as means
Repeat until means no longer move:
    Assign each data point to closest mean
    Move each mean to center of cluster of points that are assigned to it
Run time? (Simple implementation)
Assume n data points, d dimensions, i iterations
Total: O(i · 2ndk) = O(indk)

44 K-Means Algorithm Assume n data points, k clusters desired
Choose k random points as means
Repeat until means no longer move:
    Assign each data point to closest mean
    Move each mean to center of cluster of points that are assigned to it
Run time? (Simple implementation)
Assume n data points, d dimensions, i iterations
Total: O(i · 2ndk) = O(indk)

45 K-Means Algorithm Assume n data points, k clusters desired
Choose k random points as means
Repeat until means no longer move:
    Assign each data point to closest mean: O(ndk)
    Move each mean to center of cluster of points that are assigned to it: O(knd)
Run time? (Simple implementation)
Assume n data points, d dimensions, i iterations
Total: O(i · 2ndk) = O(indk)
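A minimal sketch of the algorithm above in plain Python. Two choices here are assumptions rather than something the slides specify: the initial means are sampled from the data points themselves, and a mean with no assigned points is left where it is. The names `k_means`, `max_iters`, and `seed` are illustrative.

```python
import math
import random

def k_means(points, k, max_iters=100, seed=0):
    """Return (means, groups) after repeating assign/move until the means stop moving."""
    rng = random.Random(seed)
    means = rng.sample(points, k)  # assumption: start from k distinct data points
    groups = []
    for _ in range(max_iters):
        # Assignment step: each point goes to its closest mean, O(ndk).
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, means[c]))
            groups[nearest].append(p)
        # Update step: move each mean to the center of its assigned points.
        new_means = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else means[c]
            for c, g in enumerate(groups)
        ]
        if new_means == means:  # means no longer move
            break
        means = new_means
    return means, groups

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
means, groups = k_means(points, k=2)
print(sorted(sorted(g) for g in groups))
```

On this well-separated toy data the two groups are recovered regardless of which points are drawn as initial means; on harder data the result can depend on the initialization.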

46 K-Means in action (k=3)

47 K-Means in action (k=3) All means shifted down

48 K-Means in action (k=3) Red points are converging

49 K-Means in action (k=3)

50 K-Means in action (k=3)

51 K-Means in action (k=3)

52 K-Means Pros/Cons
Pros: Efficient
Cons: Some knowledge of k is required

53 Knowledge in AI

54 Agent
Figure labels: agent, sensors, actuators, environment, ?

55 Agent
Figure labels: agent, sensors, actuators, environment, ?
Algorithms: Search, CSP, Probabilistic Inference, Learning

56 Agent
Figure labels: agent, sensors, actuators, environment, ?
Algorithms: Search, CSP, Probabilistic Inference, Learning + Knowledge Base

57 Recall 8-Puzzle
Board (from figure): 1 2 3 4 5 6 8 7
Turns out, some puzzles are not solvable
n x n board
Inversions: number of pairs of tiles i, j such that i < j but i appears after j in row-major order
If n is odd and the number of inversions is odd, then the puzzle is unsolvable

58 Recall 8-Puzzle
Board (from figure): 1 2 3 4 5 6 8 7
Turns out, some puzzles are not solvable
n x n board
Inversions: number of pairs of tiles i, j such that i < j but i appears after j in row-major order
If n is odd and the number of inversions is odd, then the puzzle is unsolvable
Agent should have this information
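The inversion rule can be checked directly. The sketch below counts inversions in row-major order and applies the odd-n rule from the slide. The board encoding (a flat list with 0 for the blank) and the blank's position in the example are assumptions, since the slide's figure is not preserved in the transcript.

```python
import math

def count_inversions(tiles):
    """Number of tile pairs (i, j) with i < j where i appears after j; the blank (0) is ignored."""
    seq = [t for t in tiles if t != 0]
    return sum(
        1
        for a in range(len(seq))
        for b in range(a + 1, len(seq))
        if seq[a] > seq[b]
    )

def is_solvable_odd_board(tiles):
    """For an n x n board with n odd: solvable only if the inversion count is even."""
    n = math.isqrt(len(tiles))
    if n * n != len(tiles) or n % 2 == 0:
        raise ValueError("this rule only covers n x n boards with n odd")
    return count_inversions(tiles) % 2 == 0

board = [1, 2, 3, 4, 5, 6, 8, 7, 0]  # blank assumed to be in the last cell
print(count_inversions(board))       # 1 (the pair 8, 7)
print(is_solvable_odd_board(board))  # False
```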

59 Knowledge Base
Set of sentences/facts (in logical language):
“n is odd”
“1 is odd”
“Number of inversions is odd”
Agent should be able to expand the knowledge base through inference:
“Puzzle is unsolvable”

60 Hunt the Wumpus
Invented in the early 70s
Originally command-line (think black screen with greenish text)

61 Wumpus World

62 Wumpus World
Environment:
4x4 grid of rooms
Agent starts in [1,1]
Gold in a random room
Wumpus in a different random room
Bottomless pits in some rooms
Wumpus can eat agent if in same room
Agent can shoot Wumpus with arrow
Actuators:
Move left
Move right
Move up
Move down
Grab gold
Shoot

63 Wumpus World Continued
Sensors:
Stench (adjacent square contains Wumpus)
Breeze (adjacent square contains pit)
Glitter (this square contains gold)
Scream (Wumpus killed)
Performance measures:
Gold: +1000
Death: (falling into pit or eaten by Wumpus)
-1 per step
-10 for using the arrow

64 Wumpus world environment
Fully Observable? No: unaware of environment until we explore
Deterministic (state of environment determined by current state and action)? Yes
Static?
Adversarial?

65 Wumpus world environment
Fully Observable? No: unaware of environment until we explore
Deterministic (state of environment determined by current state and action)? Yes
Static?
Adversarial?

66 Exploring a wumpus world
Language to represent knowledge:
A = Agent, B = Breeze, G = Glitter/Gold, P = Pit, S = Stench, W = Wumpus, OK = Safe Square

67 Exploring a wumpus world
Breeze = no, Glitter = no, Pit = no, Stench = no, Wumpus = no
A = Agent, B = Breeze, G = Glitter/Gold, OK = Safe Square, P = Pit, S = Stench, W = Wumpus

68 Exploring a wumpus world
Breeze = no, Glitter = no, Pit = no, Stench = no, Wumpus = no
A = Agent, B = Breeze, G = Glitter/Gold, P = Pit, S = Stench, W = Wumpus, OK = Safe Square
What can we infer?

69 Exploring a wumpus world
Breeze = no, Glitter = no, Pit = no, Stench = no, Wumpus = no
A = Agent, B = Breeze, G = Glitter/Gold, P = Pit, S = Stench, W = Wumpus, OK = Safe Square

70 Exploring a wumpus world
Breeze = yes, Glitter = no, Pit = no, Stench = no, Wumpus = no
A = Agent, B = Breeze, G = Glitter/Gold, P = Pit, S = Stench, W = Wumpus, OK = Safe Square
Grid marks: B
What can we infer?

71 Exploring a wumpus world
Breeze = yes, Glitter = no, Pit = no, Stench = no, Wumpus = no
A = Agent, B = Breeze, G = Glitter/Gold, P = Pit, S = Stench, W = Wumpus, OK = Safe Square
Grid marks: P?, B, P?

72 Exploring a wumpus world
Breeze = no, Glitter = no, Pit = no, Stench = yes, Wumpus = no
A = Agent, B = Breeze, G = Glitter/Gold, P = Pit, S = Stench, W = Wumpus, OK = Safe Square
Grid marks: P?, B, P?, S
What can we infer?

73 Exploring a wumpus world
Breeze = no, Glitter = no, Pit = no, Stench = yes, Wumpus = no
A = Agent, B = Breeze, G = Glitter/Gold, P = Pit, S = Stench, W = Wumpus, OK = Safe Square
Grid marks: P?, B, P?, OK, S, W?

74 Exploring a wumpus world
Breeze = no, Glitter = no, Pit = no, Stench = no, Wumpus = no
A = Agent, B = Breeze, G = Glitter/Gold, P = Pit, S = Stench, W = Wumpus, OK = Safe Square
Grid marks: P?, B, OK, S, W?

75 Wumpus with propositional logic
Using logic statements, we can determine all of the “safe” squares.
How to implement?
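One possible starting point, before any general logic machinery: a square cannot hold a pit if some visited neighbor reported no breeze, and cannot hold the Wumpus if some visited neighbor reported no stench. The sketch below hard-codes just those two rules. The percept encoding, the function names, and the example squares (breeze in [2,1], stench in [1,2], following the usual textbook walkthrough) are assumptions, since the slide figures are not preserved in the transcript.

```python
def neighbors(square, size=4):
    """Orthogonally adjacent squares inside a size x size grid with 1-based coordinates."""
    x, y = square
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(cx, cy) for cx, cy in candidates if 1 <= cx <= size and 1 <= cy <= size]

def safe_squares(percepts, size=4):
    """percepts maps each visited square to (breeze, stench); returns squares known to be safe."""
    safe = set(percepts)  # the agent survived every square it visited
    for x in range(1, size + 1):
        for y in range(1, size + 1):
            visited_adj = [n for n in neighbors((x, y), size) if n in percepts]
            no_pit = any(not percepts[n][0] for n in visited_adj)
            no_wumpus = any(not percepts[n][1] for n in visited_adj)
            if no_pit and no_wumpus:
                safe.add((x, y))
    return safe

percepts = {
    (1, 1): (False, False),  # no breeze, no stench
    (2, 1): (True, False),   # breeze
    (1, 2): (False, True),   # stench
}
print(sorted(safe_squares(percepts)))  # [(1, 1), (1, 2), (2, 1), (2, 2)]
```

Square (2, 2) comes out safe because (1, 2) rules out a pit there and (2, 1) rules out the Wumpus, which mirrors the kind of combined inference the slides walk through by hand.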

76 Propositional logic
Syntax: defines what makes a valid statement
Statements are constructed from propositions
A proposition can be either true or false
Proposition made up of symbols and connectives
Semantics: rules for determining the truth of a statement
<Later>

77 Propositional Logic - Syntax
Symbols: each represents a proposition that can be true or false
ex. Breeze in [2, 1] → B2,1
ex. Pit in [2, 2] → P2,2
ex. n is Odd → n_odd
Connectives: proposition operators
Negation: not, ¬, ~
Conjunction: and, ∧
Disjunction: or, ∨
Implication: implies, =>
Biconditional: iff, <=>

78 Propositional Logic - Syntax
Symbols: each represents a proposition that can be true or false
ex. Breeze in [2, 1] → B2,1
ex. Pit in [2, 2] → P2,2
ex. n is Odd → n_odd
Connectives: proposition operators
Negation: not, ¬, ~
Conjunction: and, ∧
Disjunction: or, ∨
Implication: implies, =>
Biconditional: iff, <=>

79 Propositional Logic - Syntax
Sentence: statement composed of symbols and operators
ex. P2,2 ∨ P1,3
Formally:
Sentence: True | False | Symbol | ¬Sentence | Sentence ∧ Sentence | Sentence ∨ Sentence | Sentence => Sentence | Sentence <=> Sentence
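The grammar above maps naturally onto a small recursive evaluator. In this sketch a sentence is a Python bool, a symbol name (string), or a tuple whose first element names the connective. That encoding, the connective names, and the function `evaluate` are assumptions made for illustration. A model is a dict assigning a truth value to each symbol.

```python
def evaluate(sentence, model):
    """Truth value of a sentence under a model (dict: symbol name -> bool)."""
    if isinstance(sentence, bool):  # True | False
        return sentence
    if isinstance(sentence, str):   # Symbol
        return model[sentence]
    op, *args = sentence
    values = [evaluate(arg, model) for arg in args]
    if op == "not":
        return not values[0]
    if op == "and":
        return values[0] and values[1]
    if op == "or":
        return values[0] or values[1]
    if op == "implies":
        return (not values[0]) or values[1]
    if op == "iff":
        return values[0] == values[1]
    raise ValueError(f"unknown connective: {op}")

model = {"P22": False, "P13": True}
print(evaluate(("or", "P22", "P13"), model))                # True
print(evaluate(("and", "P22", "P13"), model))               # False
print(evaluate(("implies", "P13", ("not", "P22")), model))  # True
```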

80 Propositional logic
Syntax: defines what makes a valid statement
Statements are constructed from propositions
A proposition can be either true or false
Proposition made up of symbols and connectives
Semantics: rules for determining the truth of a statement
Truth table
Rules of logic

81 Propositional Logic Semantics
Some Rules of Logic:
Modus Ponens: P => Q, P: can derive Q
De Morgan’s: ¬(A ∧ B): can derive ¬A ∨ ¬B
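Rules like these can be sanity-checked with a truth table: a rule is sound if the corresponding formula is true under every assignment. The helper names below (`is_tautology`, `implies`, and the rule functions) are illustrative, not part of the slides.

```python
from itertools import product

def is_tautology(formula, n_vars):
    """True if formula holds for every assignment of its n_vars boolean inputs."""
    return all(formula(*values) for values in product([False, True], repeat=n_vars))

def implies(a, b):
    return (not a) or b

def modus_ponens(p, q):
    # ((P => Q) and P) => Q
    return implies(implies(p, q) and p, q)

def de_morgan_and(a, b):
    # not (A and B)  <=>  (not A) or (not B)
    return (not (a and b)) == ((not a) or (not b))

def de_morgan_or(a, b):
    # not (A or B)  <=>  (not A) and (not B)
    return (not (a or b)) == ((not a) and (not b))

def affirming_the_consequent(p, q):
    # ((P => Q) and Q) => P, which is NOT a valid rule
    return implies(implies(p, q) and q, p)

print(is_tautology(modus_ponens, 2))              # True
print(is_tautology(de_morgan_and, 2))             # True
print(is_tautology(de_morgan_or, 2))              # True
print(is_tautology(affirming_the_consequent, 2))  # False
```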

82 Inference with Propositional Logic
Suppose we want to infer something: Wumpus in (2, 2)
Goal: given an initial knowledge base, use semantics to make inferences that expand the knowledge base and ultimately prove a proposition
Look familiar?

83 Inference with propositional logic
View it as a search problem:
starting state: initial knowledge base (KB)
actions: all ways of deriving new propositions from the current KB
result: add the new proposition to the KB
goal: end up with the proposition we want to prove
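The search formulation above can be sketched as a breadth-first search over knowledge bases. To keep the example small, the only inference action is Modus Ponens over rules of the form "all premises hold, so the conclusion holds". That restriction, the rule encoding, and the function name `prove` are assumptions of this sketch. The example facts reuse the 8-puzzle knowledge base from the earlier slide.

```python
from collections import deque

def prove(facts, rules, goal):
    """BFS over knowledge bases; returns the list of derived propositions, or None.

    rules is a list of (premises, conclusion) pairs.
    """
    start = frozenset(facts)           # starting state: initial KB
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        kb, derivation = frontier.popleft()
        if goal in kb:                 # goal test: the proposition is in the KB
            return derivation
        for premises, conclusion in rules:  # actions: every applicable rule
            if conclusion not in kb and all(p in kb for p in premises):
                new_kb = kb | {conclusion}  # result: add the new proposition
                if new_kb not in seen:
                    seen.add(new_kb)
                    frontier.append((new_kb, derivation + [conclusion]))
    return None

facts = {"n is odd", "number of inversions is odd"}
rules = [(("n is odd", "number of inversions is odd"), "puzzle is unsolvable")]
print(prove(facts, rules, "puzzle is unsolvable"))  # ['puzzle is unsolvable']
print(prove(facts, rules, "puzzle is solvable"))    # None
```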

