Greedy clustering methods

Slides:



Advertisements
Similar presentations
Lindsey Bleimes Charlie Garrod Adam Meyerson
Advertisements

Lecture 24 Coping with NPC and Unsolvable problems. When a problem is unsolvable, that's generally very bad news: it means there is no general algorithm.
Point-and-Line Problems. Introduction Sometimes we can find an exisiting algorithm that fits our problem, however, it is more likely that we will have.
Discrete geometry Lecture 2 1 © Alexander & Michael Bronstein
ALADDIN Workshop on Graph Partitioning in Vision and Machine Learning Jan 9-11, 2003 Welcome! [Organizers: Avrim Blum, Jon Kleinberg, John Lafferty, Jianbo.
Clustering and greedy algorithms — Part 2 Prof. Noah Snavely CS1114
Prénom Nom Document Analysis: Data Analysis and Clustering Prof. Rolf Ingold, University of Fribourg Master course, spring semester 2008.
Stereo Computation using Iterative Graph-Cuts
Clustering and greedy algorithms Prof. Noah Snavely CS1114
Lecture 2 Geometric Algorithms. A B C D E F G H I J K L M N O P Sedgewick Sample Points.
Voronoi diagrams and applications Prof. Ramin Zabih
Lecture 8: 24/5/1435 Genetic Algorithms Lecturer/ Kawther Abas 363CS – Artificial Intelligence.
Image segmentation Prof. Noah Snavely CS1114
Greedy Algorithms and Matroids Andreas Klappenecker.
Optimization revisited Prof. Noah Snavely CS1114
Medians and blobs Prof. Ramin Zabih
CS654: Digital Image Analysis
Clustering Prof. Ramin Zabih
CS 361 – Chapter 10 “Greedy algorithms” It’s a strategy of solving some problems –Need to make a series of choices –Each choice is made to maximize current.
1 Algorithms CSCI 235, Fall 2015 Lecture 29 Greedy Algorithms.
1 CS 430: Information Discovery Lecture 24 Cluster Analysis.
What types of problems we study, Part 1: Statistical problemsHighlights of the theoretical results What types of problems we study, Part 2: ClusteringFuture.
Lecture 3: Uninformed Search
Optimization Problems
Optimization via Search
School of Computer Science & Engineering
Lightstick orientation; line fitting
CSCI 4310 Lecture 10: Local Search Algorithms
Approximation Algorithms
For Monday Chapter 6 Homework: Chapter 3, exercise 7.
Haim Kaplan and Uri Zwick
Spanning Trees Lecture 21 CS2110 – Fall 2016.
Lecture 16 CSE 331 Oct 5, 2016.
Spanning Trees, greedy algorithms
CS 3343: Analysis of Algorithms
CS330 Discussion 6.
Greedy Algorithms.
Objective of This Course
Lecture 20 CSE 331 Oct 14, 2016.
Lecture 17 CSE 331 Oct 7, 2016.
Prof. Ramin Zabih Least squares fitting Prof. Ramin Zabih
Optimization Problems
Data Structures Review Session
Lecture 31: Graph-Based Image Segmentation
Discrete Mathematics CMP-101 Lecture 12 Sorting, Bubble Sort, Insertion Sort, Greedy Algorithms Abdul Hameed
Autumn 2015 Lecture 10 Minimum Spanning Trees
Advanced Algorithms Analysis and Design
Lecture 16 CSE 331 Oct 4, 2017.
Graph Searching.
Lecture 6 Topics Greedy Algorithm
Prof. Ramin Zabih Beyond binary search Prof. Ramin Zabih
Lecture 18 CSE 331 Oct 12, 2011.
Sorting "There's nothing in your head the sorting hat can't see. So try me on and I will tell you where you ought to be." -The Sorting Hat, Harry Potter.
CS5112: Algorithms and Data Structures for Applications
More on Search: A* and Optimization
Clustering Wei Wang.
Lecture 18 CSE 331 Oct 9, 2017.
Algorithms CSCI 235, Spring 2019 Lecture 29 Greedy Algorithms
Dynamic Programming II DP over Intervals
Richard Anderson Winter 2019 Lecture 7
Clustering.
Local Search Algorithms
CSSE463: Image Recognition Day 23
Lecture 2 Geometric Algorithms.
Approximation Algorithms
Quicksort and selection
Spanning Trees, greedy algorithms
Getting to the bottom, fast
Algorithm Course Algorithms Lecture 3 Sorting Algorithm-1
Prims’ spanning tree algorithm
Presentation transcript:

Greedy clustering methods Prof. Ramin Zabih http://cs100r.cs.cornell.edu

Administrivia Guest lecture on Tuesday Prof. Charlie van Loan, on ellipse fitting Prelim 3: Thursday 11/29 (last lecture) Review session the day before A6 is due Friday 11/30 (LDOC) Final projects due Friday 12/7 Course evals! http://www.engineering.cornell.edu/courseeval/

Clustering Generally speaking, the goal is to maximize similarity within a cluster, and to minimize similarity between clusters We will assume that the space and the number of clusters are known Usually (but not always) a safe assumption Sometimes you only know distances between pairs of points May not be embeddable in a space Often assume a known number of clusters

Figure from Johan Everts Example Figure from Johan Everts

Applications of clustering Economics or politics Finding similar-minded or similar behaving groups of people (market segmentation) Find stocks that behave similarly Spatial clustering Earthquake centers cluster along faults Find epidemics (famous example: cholera) Classify documents for web search Automatic directory construction (like Yahoo!) Find more documents “like this one”

Clustering algorithms There are many types of methods How is a cluster is represented? Is a cluster represented by a data point, or by a point in the middle of the cluster? This is similar to an argument we saw before… An interesting class of methods uses graph partitioning Edge weights are distances There are many classes of algorithms

Greedy methods Many CS problems can be solved by repeatedly doing whatever seems best at the moment I.e., without needing a long-term plan These are called greedy algorithms Example: hill climbing for convex function minimization Example: sorting by swapping out-of-order pairs

K-center clustering Find K cluster centers that minimize the maximum distance between any point and its nearest center We want the worst point in the worst cluster to still be good (i.e., close to its center) This is a nice optimization problem Given a clustering, we can describe how good it is, so we have a figure of merit But to find the optimal clustering we have to try every one!

A greedy method Pick a random point to start with, this is your first cluster center Find the farthest point from the cluster center, this is a new cluster center Find the farthest point from any cluster center and add it

Example Figure from Jia Li

An amazing property This algorithm gives you a figure of merit that is no worse than twice the optimum Such results are very difficult to achieve, and the subject of much research Mostly in CS681, a bit in CS482 You can’t find the optimum, yet you can prove something about it!