
1 Efficient Algorithms for Non-Parametric Clustering With Clutter (Weng-Keen Wong and Andrew Moore)

2 Problems From the Physical Sciences
Minefield detection (Dasgupta and Raftery 1998)
Earthquake faults (Byers and Raftery 1998)

3 Problems From the Physical Sciences (Pereira 2002) (Sloan Digital Sky Survey 2000)

4 A Simplified Example

5 Clustering with Traditional Algorithms: Single Linkage Clustering vs. Mixture of Gaussians with a Uniform Background Component

6 Clustering with CFF: Original Dataset vs. the Cuevas-Febrero-Fraiman (CFF) result

7 Related Work
(Dasgupta and Raftery 98): mixture model approach; a mixture of Gaussians for the features, a Poisson process for the clutter
(Byers and Raftery 98): k-nearest-neighbour distances for all points modeled as a mixture of two gamma distributions, one for clutter and one for the features; each data point is classified according to the component most likely to have generated it

8 Outline
1. Introduction: Clustering and Clutter
2. The Cuevas-Febrero-Fraiman (CFF) Algorithm
3. Optimizing Step One of CFF
4. Optimizing Step Two of CFF
5. Results

9 The CFF Algorithm, Step One: find the high-density datapoints

10 The CFF Algorithm, Step Two: cluster the high-density points using Single Linkage Clustering, stopping when the link length > ε
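Before any optimization, step two is plain single-linkage clustering with a cutoff. A minimal sketch using SciPy (not the paper's code; `X` and `eps` are our names for the high-density points and the cutoff): cutting a single-linkage dendrogram at height ε is equivalent to stopping the merging once the shortest remaining link exceeds ε.

```python
# Sketch of CFF step two, assuming X is an (n, d) array of high-density
# points from step one and eps is the link-length cutoff.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cff_step_two(X, eps):
    Z = linkage(X, method="single")                   # O(N^2): the bottleneck this talk attacks
    return fcluster(Z, t=eps, criterion="distance")   # one integer cluster label per point
```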

11 The CFF Algorithm
Originally intended to estimate the number of clusters
Can also be used to find clusters against a noisy background

12 Step One: Non-Parametric Density Estimator
A datapoint is a high-density datapoint if the number of datapoints within a hypersphere of radius h is > a threshold c
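A brute-force sketch of this test (the slide does not say whether the count includes the query point itself, so this version excludes it; all names are ours):

```python
import numpy as np

# Brute-force O(N^2) version of the CFF density test: keep a point if
# more than c other points lie within a hypersphere of radius h of it.
def high_density_points(X, h, c):
    dists = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    counts = (dists <= h).sum(axis=1) - 1   # -1 excludes the point itself
    return X[counts > c]
```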

13 Speeding up the Non-Parametric Density Estimator
Addressed in a separate paper (Gray and Moore 2001)
Two basic ideas:
1. Use a dual-tree algorithm (Gray and Moore 2000)
2. Cut the search off early without computing exact densities (Moore 2000)

14 Step Two: Euclidean Minimum Spanning Trees (EMSTs)
Traditional MST algorithms assume you are given all the pairwise distances
This implies O(N^2) memory usage
We want to use a Euclidean Minimum Spanning Tree algorithm instead

15 Optimizing the Clustering Step
Exploit recent results in computational geometry for efficient EMSTs
Involves a modification to the GeoMST2 algorithm of (Narasimhan et al. 2000)
GeoMST2 is based on Well-Separated Pairwise Decompositions (WSPDs) (Callahan 1995)
Our optimizations gain an order-of-magnitude speedup, especially in higher dimensions

16 Outline for Optimizing Step Two
1. High-level overview of GeoMST2
2. Example of a WSPD
3. More detailed description of GeoMST2
4. Our optimizations

17 Intuition behind GeoMST2

18 Intuition behind GeoMST2

19 High-Level Overview of GeoMST2
(A1,B1), (A2,B2), ..., (Am,Bm)
1. Create the Well-Separated Pairwise Decomposition

20 High-Level Overview of GeoMST2
(A1,B1), (A2,B2), ..., (Am,Bm)
Each pair (Ai,Bi) represents a possible edge in the MST
1. Create the Well-Separated Pairwise Decomposition

21 High-Level Overview of GeoMST2
(A1,B1), (A2,B2), ..., (Am,Bm)
1. Create the Well-Separated Pairwise Decomposition
2. Take the pair (Ai,Bi) that corresponds to the shortest edge
3. If the vertices of that edge are not in the same connected component, add the edge to the MST; repeat Step 2

22 A Well-Separated Pair (Callahan 1995)
Let A and B be point sets in R^d
Let R_A and R_B be their respective bounding hyper-rectangles
Define MargDistance(A,B) to be the minimum distance between R_A and R_B

23 A Well-Separated Pair (cont.)
The point sets A and B are considered well-separated if:
MargDistance(A,B) ≥ max{Diam(R_A), Diam(R_B)}
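A sketch of this test for two bounding hyper-rectangles, each given by its lower and upper corner vectors (names ours; note that Callahan's general definition carries a separation parameter s, which the criterion above fixes at 1):

```python
import numpy as np

def marg_distance(lo_a, hi_a, lo_b, hi_b):
    # Minimum distance between two axis-aligned hyper-rectangles:
    # the per-axis gap is zero wherever their projections overlap.
    gap = np.maximum(0.0, np.maximum(lo_a - hi_b, lo_b - hi_a))
    return float(np.sqrt((gap ** 2).sum()))

def diam(lo, hi):
    # Diameter of a hyper-rectangle: the length of its main diagonal.
    return float(np.sqrt(((hi - lo) ** 2).sum()))

def well_separated(lo_a, hi_a, lo_b, hi_b):
    return marg_distance(lo_a, hi_a, lo_b, hi_b) >= max(diam(lo_a, hi_a),
                                                        diam(lo_b, hi_b))
```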

24 A Well-Separated Pairwise Decomposition
Pair #1: ([0],[1])
Pair #2: ([0,1],[2])
Pair #3: ([0,1,2],[3,4])
Pair #4: ([3],[4])
The set of pairs {([0],[1]), ([0,1],[2]), ([0,1,2],[3,4]), ([3],[4])} forms a Well-Separated Pairwise Decomposition

25 The Size of a WSPD
If there are n points, a WSPD with O(n) pairs can be constructed using a fair split tree (Callahan 1995)
(A1,B1), (A2,B2), ..., (Am,Bm): a WSPD

26 High-Level Overview of GeoMST2
(A1,B1), (A2,B2), ..., (Am,Bm)
1. Create the Well-Separated Pairwise Decomposition
2. Take the pair (Ai,Bi) that corresponds to the shortest edge
3. If the vertices of that edge are not in the same connected component, add the edge to the MST; repeat Step 2

27 Bichromatic Closest Pair Distance
Given a pair (Ai,Bi), the Bichromatic Closest Pair (BCP) distance is the smallest distance from a point in Ai to a point in Bi
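By definition (a brute-force sketch, names ours; GeoMST2 computes these far more cheaply with tree-based search):

```python
import numpy as np

def bcp_distance(A, B):
    # Smallest distance between any point of A (n, d) and any point of B (m, d).
    dists = np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1))
    return float(dists.min())
```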

28 High-Level Overview of GeoMST2
(A1,B1), (A2,B2), ..., (Am,Bm)
1. Create the Well-Separated Pairwise Decomposition
2. Take the pair (Ai,Bi) with the shortest BCP distance
3. If Ai and Bi are not already connected, add the edge to the MST; repeat Step 2

29 GeoMST2 Example Start Current MST

30 GeoMST2 Example Iteration 1 Current MST

31 GeoMST2 Example Iteration 2 Current MST

32 GeoMST2 Example Iteration 3 Current MST

33 GeoMST2 Example Iteration 4 Current MST

34 High-Level Overview of GeoMST2
(A1,B1), (A2,B2), ..., (Am,Bm)
1. Create the Well-Separated Pairwise Decomposition
2. Take the pair (Ai,Bi) with the shortest BCP distance
3. If Ai and Bi are not already connected, add the edge to the MST; repeat Step 2
Modification for CFF: if the shortest BCP distance > ε, terminate
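A simplified Kruskal-style sketch of this loop with the CFF termination (our simplification: the real GeoMST2 computes BCP distances lazily from the WSPD rather than receiving them precomputed as assumed here):

```python
import heapq

class UnionFind:
    # Tracks the connected components of the growing spanning forest.
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x
    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[rb] = ra
        return True

def cff_geomst2(bcp_edges, n_points, eps):
    # bcp_edges: list of (distance, i, j), one per WSPD pair, where points
    # i and j realize that pair's bichromatic closest pair.
    heapq.heapify(bcp_edges)
    uf = UnionFind(n_points)
    mst_edges = []
    while bcp_edges:
        d, i, j = heapq.heappop(bcp_edges)
        if d > eps:              # CFF modification: every remaining edge is longer
            break
        if uf.union(i, j):       # ignore edges within one connected component
            mst_edges.append((i, j, d))
    return mst_edges, uf
```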

35 Optimizations
We don't need the full EMST
We just need to cluster all points that are within distance ε of each other
This allows two optimizations to the GeoMST2 code

36 High-Level Overview of GeoMST2
(A1,B1), (A2,B2), ..., (Am,Bm)
1. Create the Well-Separated Pairwise Decomposition
2. Take the pair (Ai,Bi) with the shortest BCP distance
3. If Ai and Bi are not already connected, add the edge to the MST; repeat Step 2
The optimizations take place in Step 1

37 Optimization 1 Illustration

38 Optimization 1: ignore all links longer than ε
Every pair (Ai,Bi) in the WSPD becomes an edge unless it joins two already-connected components
If MargDistance(Ai,Bi) > ε, then an edge of length ≤ ε cannot exist between a point in Ai and a point in Bi
Don't include such a pair in the WSPD

39 Optimization 2 Illustration

40 Optimization 2: join all elements within distance ε of each other
If the maximum distance separating the bounding hyper-rectangles of Ai and Bi is ≤ ε, then join all the points in Ai and Bi if they are not already connected
Do not add such a pair (Ai,Bi) to the WSPD
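Together, the two optimizations amount to classifying each candidate pair while the WSPD is being built; a sketch (reusing `marg_distance` from the earlier sketch; all names ours):

```python
import numpy as np

def max_distance(lo_a, hi_a, lo_b, hi_b):
    # Maximum distance between two axis-aligned hyper-rectangles:
    # per axis, the largest separation between any two coordinates.
    span = np.maximum(hi_a - lo_b, hi_b - lo_a)
    return float(np.sqrt((span ** 2).sum()))

def classify_pair(lo_a, hi_a, lo_b, hi_b, eps):
    if marg_distance(lo_a, hi_a, lo_b, hi_b) > eps:
        return "prune"   # Optimization 1: no link of length <= eps can exist
    if max_distance(lo_a, hi_a, lo_b, hi_b) <= eps:
        return "merge"   # Optimization 2: connect all points of the pair now
    return "keep"        # ordinary pair: goes into the WSPD as usual
```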

41 Implications of the Optimizations
Reduce the time spent creating the WSPD
Reduce the number of pairs in the WSPD, thereby speeding up GeoMST2 by shrinking its priority queue

42 Results
Ran the step-two algorithms on subsets of the Sloan Digital Sky Survey
Compared Kruskal, GeoMST2, and ε-clustering
7 attributes: 4 colors, 2 sky coordinates, 1 redshift value

43 Results (GeoMST2 vs ε-Clustering vs Kruskal in 4D)

44 Results (GeoMST2 vs ε-Clustering in 3D)

45 Results (GeoMST2 vs ε-Clustering in 4D)

46 Results (Change in Time as ε changes for 4D data)

47 Results (Increasing Dimensions vs Time)

48 Conclusions
ε-clustering outperforms GeoMST2 by nearly an order of magnitude in higher dimensions
Combining the optimizations in both steps will yield an efficient algorithm for clustering against clutter on massive data sets