Computational Geometry and Spatial Data Mining


Computational Geometry and Spatial Data Mining Marc van Kreveld Department of Information and Computing Sciences Utrecht University

Two-part presentation Morning: Introduction to computational geometry with examples from spatial data mining Afternoon: Geometric algorithms for spatial data mining (and spatio-temporal data mining)

Spatial data mining and computation “Geographic data mining involves the application of computational tools to reveal interesting patterns in objects and events distributed in geographic space and across time” (Miller & Han, 2001) [data analysis?] Large data sets → attempt to carefully define interesting patterns (to avoid finding non-interesting patterns) → advanced algorithms needed for efficiency

Introduction to CG Some words on algorithms and efficiency Computational geometry algorithms through examples from spatial data mining Voronoi diagrams and clustering Arrangements and largest clusters Approximation for the largest cluster

Algorithms and efficiency You may know it all already: Please look bored if you know all of this Please look bewildered if you haven’t got a clue what I’m talking about

Algorithms Computational problems have an input size, denoted by n A set of n numbers A set of n points in the plane (2n coordinates) A simple polygon with n vertices A planar subdivision with n vertices A computational problem defines desired output in terms of the input

Algorithms Examples of computational problems: Given a set of n numbers, put them in sorted order Given a set of n points, find the two that are closest Given a simple polygon P with n vertices and a point q, determine if q is inside P

Algorithms An algorithm is a scheme (sequence of steps) that always gives the desired output from the given input An algorithm solves a computational problem An algorithm is the basis of an implementation

Algorithms An algorithm can be analyzed for its running time efficiency Efficiency is expressed using O(..) notation, which gives the scaling behavior of the algorithm O(n) time: the running time doubles (roughly) if the input size doubles O(n^2) time: the running time quadruples (roughly) if the input size doubles
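The doubling and quadrupling behavior can be checked by counting elementary steps instead of measuring seconds. The following is a minimal sketch, not from the slides: the function names are illustrative, the linear example is a single pass that finds a maximum, and the quadratic example counts the pairs a brute-force closest-pair search would examine.

```python
from itertools import combinations

def linear_steps(values):
    """Count the comparisons made by one pass that finds the maximum: O(n)."""
    steps = 0
    best = values[0]
    for v in values[1:]:
        steps += 1
        if v > best:
            best = v
    return steps

def quadratic_steps(values):
    """Count the pairs a brute-force closest-pair search would examine: O(n^2)."""
    return sum(1 for _ in combinations(values, 2))

# Doubling n roughly doubles the first count and roughly quadruples the second.
for n in (1000, 2000):
    data = list(range(n))
    print(n, linear_steps(data), quadratic_steps(data))
```

Counting steps rather than timing keeps the comparison independent of machine, language, and compiler.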

Algorithms Why big-Oh notation? Because it is machine-independent Because it is programming language-independent Because it is compiler-independent unlike running time in seconds It is only algorithm/method-dependent

Algorithms Algorithms research is concerned with determining the most efficient algorithm for each computational problem Example, polygon triangulation: Until ~1978: O(n^2) time Until 1990: O(n log n) time Now: O(n) time

Algorithms For some problems, no efficient algorithms are known Approximation algorithms may be an option. E.g. TSP: Exact: exponential time 2-approx: O(n log n) time 1.5-approx: O(n^3) time (1+ε)-approx: O(n^(1/ε)) time
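One well-known way to obtain a 2-approximation for TSP on points in the plane is to walk a minimum spanning tree in preorder. Whether this is the method the slide has in mind is an assumption. The sketch below uses a simple O(n^2) version of Prim's algorithm, so it does not reach the O(n log n) bound quoted above; the names `mst_tour` and `tour_length` are illustrative.

```python
import math

def mst_tour(points):
    """Return a tour (list of point indices) from a preorder walk of an MST."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n          # cheapest known connection of each point to the tree
    parent = [-1] * n
    children = [[] for _ in range(n)]
    best[0] = 0.0
    for _ in range(n):
        # Prim's algorithm, O(n^2): take the closest point not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            children[parent[u]].append(u)
        for v in range(n):
            d = math.dist(points[u], points[v])
            if not in_tree[v] and d < best[v]:
                best[v], parent[v] = d, u
    # Preorder walk of the tree; each point appears exactly once in the tour.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

def tour_length(points, tour):
    """Length of the closed tour that returns to its starting point."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
tour = mst_tour(corners)
print(tour, tour_length(corners, tour))
```

The factor 2 follows because the tree weighs no more than an optimal tour, and the preorder walk with shortcuts is no longer than twice the tree.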

Voronoi diagrams and clustering A Voronoi diagram stores proximity among points in a set

Voronoi diagrams and clustering Single-link clustering attempts to maximize the distance between any two points in different sets

Voronoi diagrams and clustering


Voronoi diagrams and clustering Algorithm (point set P; desired: k clusters): 1. Compute Voronoi diagram of P 2. Take all O(n) neighbors and sort by distance 3. While #clusters > k do: take nearest neighbor pair p and q; if they are in different clusters, then merge them and decrement #clusters (else, do nothing)
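The merge loop can be sketched with a union-find structure. This sketch makes one loud assumption: it takes all pairs of points as candidates instead of only the O(n) Voronoi neighbor pairs, so it corresponds to a simpler O(n^2 log n) variant, not the O(n log n) algorithm of the slide. The function name `single_link` is illustrative.

```python
import math
from itertools import combinations

def single_link(points, k):
    """Merge closest pairs until k clusters remain; return one label per point."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Assumption: all pairs are candidates (the slide uses Voronoi neighbors only).
    pairs = sorted(combinations(range(n), 2),
                   key=lambda e: math.dist(points[e[0]], points[e[1]]))
    clusters = n
    for p, q in pairs:
        if clusters <= k:
            break
        rp, rq = find(p), find(q)
        if rp != rq:               # different clusters: merge and decrement
            parent[rp] = rq
            clusters -= 1
    return [find(i) for i in range(n)]

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
print(single_link(pts, 2))
```

Replacing the all-pairs list by the neighbor pairs of a Voronoi diagram (or Delaunay triangulation) would leave the merge loop unchanged.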

Voronoi diagrams and clustering Analysis; n points in P: Compute Voronoi diagram: O(n log n) time Sort by distance: O(n log n) time While loop that merges clusters: O(n log n) time (using union-find structure) Total: O(n log n) + O(n log n) + O(n log n) = O(n log n) time

Voronoi diagrams and clustering What would an “easy” algorithm have given? really easy: O(n^3) time slightly less easy: O(n^2 log n) time [Plot: time against n (up to 300) for n^3, 10 n^2 log n, and 1000 n log n]
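The three curves of the plot can be tabulated directly. A small sketch, assuming base-2 logarithms (the slide does not state the base) and taking the constants 10 and 1000 from the plot labels; the function name `costs` is illustrative.

```python
import math

def costs(n):
    """Return (n^3, 10 n^2 log n, 1000 n log n); the base-2 logarithm is an assumption."""
    log_n = math.log2(n)
    return n ** 3, 10 * n ** 2 * log_n, 1000 * n * log_n

for n in (100, 200, 300):
    cubic, quad_log, n_log = costs(n)
    print(f"n={n}: n^3={cubic:.3g}, 10 n^2 log n={quad_log:.3g}, "
          f"1000 n log n={n_log:.3g}")
```

With these constants the second and third curves cross at n = 100; beyond that point the O(n log n) method is the cheapest of the three despite its much larger constant.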

Computing Voronoi diagrams By plane sweep By randomized incremental construction By divide-and-conquer → all give O(n log n) time

Computing Voronoi diagrams Study the geometry, find properties 3-point empty circle → Voronoi vertex 2-point empty circle → Voronoi edge
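The empty-circle property already gives a brute-force way to list Voronoi vertices: try every triple of sites and keep the circumcenter if no site lies strictly inside the circle. This is a sketch for illustration only, with O(n^4) running time; the function names and the tolerance values are assumptions, and four cocircular sites would report the same vertex more than once.

```python
import math
from itertools import combinations

def circumcenter(a, b, c):
    """Center of the circle through a, b, c, or None if they are (nearly) collinear."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy

def voronoi_vertices(sites):
    """Circumcenters of all site triples whose circle contains no other site."""
    vertices = []
    for a, b, c in combinations(sites, 3):
        center = circumcenter(a, b, c)
        if center is None:
            continue
        radius = math.dist(center, a)
        # Empty circle test: no site may lie strictly inside the circle.
        if all(math.dist(center, p) >= radius - 1e-9 for p in sites):
            vertices.append(center)
    return vertices

print(voronoi_vertices([(0, 0), (2, 0), (0, 2), (5, 5)]))
```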

Computing Voronoi diagrams Some geometric properties are needed, regardless of the computational approach Other geometric properties are only needed for some approaches

Computing Voronoi diagrams Fortune’s sweep line algorithm (1987) An imaginary line moves from left to right The Voronoi diagram is computed while the known space expands (left of the line)

Computing Voronoi diagrams Beach line: boundary between known and unknown → sequence of parabolic arcs Geometric property: beach line is y-monotone → it can be stored in a balanced binary tree
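Each arc of the beach line is part of a parabola: the set of points equally far from a site and from the sweep line. For a vertical sweep line x = l moving from left to right and a site (sx, sy) with sx < l, solving (x - sx)^2 + (y - sy)^2 = (l - x)^2 for x gives x = (sx + l)/2 - (y - sy)^2 / (2 (l - sx)). The sketch below evaluates this formula; the function names are illustrative, and the beach line is computed naively as the rightmost arc at each height, not with the balanced tree of the actual algorithm.

```python
def arc_x(site, sweep_x, y):
    """x at height y on the parabola with focus `site` and directrix x = sweep_x."""
    sx, sy = site
    return (sx + sweep_x) / 2 - (y - sy) ** 2 / (2 * (sweep_x - sx))

def beach_line_x(sites, sweep_x, y):
    """x of the beach line at height y: the rightmost arc over all sites already passed."""
    passed = [s for s in sites if s[0] < sweep_x]   # must be non-empty
    return max(arc_x(s, sweep_x, y) for s in passed)

sites = [(0.0, 0.0), (1.0, 3.0)]
for y in (0.0, 1.0, 2.0, 3.0):
    print(y, beach_line_x(sites, 2.0, y))
```

Because the function returns exactly one x for every y, the beach line is y-monotone, which is what allows its arcs to be kept in sorted order in a balanced binary tree.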

Computing Voronoi diagrams Events: changes to the beach line = discovery of Voronoi diagram features Point events


Computing Voronoi diagrams Events: changes to the beach line = discovery of Voronoi diagram features Circle events


Computing Voronoi diagrams Events: changes to the beach line = discovery of Voronoi diagram features Only point events and circle events exist

Computing Voronoi diagrams For n points, there are n point events and at most 2n circle events

Computing Voronoi diagrams Handling an event takes O(log n) time due to the balanced binary tree that stores the beach line → in total O(n log n) time

Intermediate summary Voronoi diagrams are useful for clustering (among many other things) Voronoi diagrams can be computed efficiently in the plane, in O(n log n) time The approach is plane sweep (by Fortune) Figures from the on-line animation of Allan Odgaard & Benny Kjær Nielsen

Arrangements and largest clusters Suppose we want to identify the largest subset of points that is in some small region: formalize “region” to circle; formalize “small” to radius r Place circle to maximize point containment

Arrangements and largest clusters Bad idea: Try m = 1, 2, ... and test every subset of size m Not so bad idea: for every 3 points, compute the smallest enclosing circle, test the radius and test the other points for being inside

Arrangements and largest clusters Bad idea analysis: A set of n points has roughly (n choose m) = O(n^m) subsets of size m Not so bad idea analysis: n points give (n choose 3) = O(n^3) triples of points. Each can be tested in O(n) time → O(n^4) time algorithm

Arrangements and largest clusters The placement space of circles of radius r A circle C of radius r contains a point p if and only if the center of C lies inside a circle of radius r centered at p

Arrangements and largest clusters The placement space of circles of radius r Circles with center here contain 2 points of P Circles with center here contain 3 points of P

Arrangements and largest clusters Maximum point containment is obtained for circles whose center lies in the most covered cell of the placement space
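The placement-space view suggests a simple brute-force check: a most covered cell has a vertex of the circle arrangement on its boundary (or is a single isolated circle), so it suffices to test the intersection points of the radius-r circles centered at the points, plus the points themselves. The sketch below does this in roughly O(n^3) time. It is a swapped-in brute-force method for illustration, not the arrangement algorithm of the following slides; the function name and the tolerance are assumptions, and the point set must be non-empty.

```python
import math
from itertools import combinations

def max_containment(points, r):
    """Return (count, center) for a radius-r circle that contains the most points."""
    candidates = list(points)
    for p, q in combinations(points, 2):
        d = math.dist(p, q)
        if 0 < d <= 2 * r:
            # The circles of radius r around p and q intersect in two points.
            mx, my = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2
            h = math.sqrt(max(r * r - (d / 2) ** 2, 0.0))
            ux, uy = (q[1] - p[1]) / d, (p[0] - q[0]) / d   # unit normal to pq
            candidates.append((mx + h * ux, my + h * uy))
            candidates.append((mx - h * ux, my - h * uy))
    tol = 1e-9   # points on the circle boundary count as contained

    def covered(center):
        return sum(math.dist(center, p) <= r + tol for p in points)

    best = max(candidates, key=covered)
    return covered(best), best

print(max_containment([(0, 0), (1, 0), (0, 1), (5, 5)], 1.0))
```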

Computing the most covered cell Compute the circle arrangement in a topological data structure Fill the cells by the cover value by traversal of the arrangement The value to be assigned to a cell is the value of its (known) neighbor ± 1

Computing the most covered cell Compute the circle arrangement: by plane sweep: O(n log n + k log n) time by randomized incremental construction in O(n log n + k) time where k is the complexity of the arrangement; k = O(n^2) If the maximum coverage is denoted m, then k = O(nm) and the running time is O(n log n + nm)

Computing the most covered cell Randomized incremental construction: Put circles in random order “Glue” them into the topological structure for the arrangement with vertical extensions Every cell has ≤ 4 sides (2 vertical and 2 circular)

Computing the most covered cell Every cell has ≤ 4 sides (2 vertical and 2 circular) Trace a new circle from its leftmost point to glue it into the arrangement → the exit from any cell can be determined in O(1) time

Computing the most covered cell Randomized analysis can show that adding one circle C takes O(log n + k’) time, where k’ is the number of intersections with C The whole algorithm takes O(n log n + k) time, where k = Σ k’ is the arrangement size The O(n + k) vertical extensions can be removed in O(n + k) time

Computing the most covered cell Traverse the arrangement (e.g., depth-first search) to fill the cover numbers in O(n + k) time: stepping into a circle: +1; stepping out of a circle: -1

Intermediate summary The largest cluster for a circle of radius r can be computed in O(n log n + nm) time if it has m entities We use arrangement construction and traversal The technique for arrangement construction is randomized incremental construction (Mulmuley, 1990)

Largest cluster for approximate radius Suppose the specified radius r for a cluster is not so strict, e.g. it may be 10% larger Place circle to maximize point containment We may choose ε! [Figure: circles of radius r and (1+ε) r] If the largest cluster of radius r has m entities, we must guarantee to find a cluster of m entities and radius (1+ε) r

Approximate radius clustering The idea: snap the entity locations to grid points of a well-chosen grid Snapping should not move points too much: less than ε r/4 → grid spacing ε r/4 works

Approximate radius clustering The idea: snap the entity locations to grid points of a well-chosen grid For each grid point, collect and add the count of all grid points within distance (1+ε/2) r [Figure: grid with the number of snapped entities at each grid point and the collected counts of some grid points, e.g. collected count = 10]

Approximate radius clustering Claim: a largest approximate radius cluster is given by the highest count
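The snapping and collecting steps can be sketched with a dictionary of grid counts. This is a minimal illustration under stated assumptions: the grid spacing is ε r/4 and the collection distance is (1+ε/2) r as on the slides, entities are snapped to the nearest grid point, and the function name `approx_largest_cluster` and its return format are illustrative.

```python
import math
from collections import Counter

def approx_largest_cluster(points, r, eps):
    """Return (count, center) of the grid point with the highest collected count."""
    s = eps * r / 4                          # grid spacing
    # Snap every entity to its nearest grid point and count entities per grid point.
    snapped = Counter((math.floor(x / s + 0.5), math.floor(y / s + 0.5))
                      for x, y in points)
    reach = (1 + eps / 2) * r / s            # collection distance in grid units
    k = math.floor(reach)
    offsets = [(dx, dy)
               for dx in range(-k, k + 1) for dy in range(-k, k + 1)
               if dx * dx + dy * dy <= reach * reach]
    # Every occupied grid point gives its count to all grid points within reach.
    collected = Counter()
    for (gx, gy), c in snapped.items():
        for dx, dy in offsets:
            collected[(gx + dx, gy + dy)] += c
    (bx, by), count = max(collected.items(), key=lambda kv: kv[1])
    return count, (bx * s, by * s)

pts = [(0, 0), (0.5, 0), (0, 0.5), (10, 10)]
print(approx_largest_cluster(pts, r=1.0, eps=0.5))
```

The list `offsets` has O(1/ε^2) entries, so the loop does O(n/ε^2) work in total, independent of how large the largest cluster is.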

Approximate radius clustering Let Copt be a radius-r circle with the most entities inside Due to the grid spacing, we have a grid point within distance ε r/4 from the center of Copt that must have a count

Approximate radius clustering Let C be the circle of radius (1+ε/2) r centered at that grid point Snapping moves entities at most ε r/4 C and Copt differ in radius ε r/2 → no point in Copt can have moved outside C Snapped points inside C have their origins inside a circle of radius at most (1+ε) r → no points too far from C can have entered C

Approximate radius clustering Intuition: We use the ε in different places: snapping points; trying only circle centers on grid points ... and we guarantee to test a circle that contains all entities in the optimal circle, but not other entities too far away

Approximate radius clustering Efficiency analysis n entities: each gives a count to O(1/ε^2) grid cells; in O(n/ε^2) time we have all collected counts and hence the largest count

Exact or approximate? O(n log n + nm) versus O(n/ε^2) time In practice: What is larger: m or 1/ε^2? If the largest cluster is expected to be fairly small, then the exact algorithm is fine If the largest cluster may be large and we don’t care about the precise radius, the approximate radius algorithm is better

Concluding this session Basic computational geometry ... Voronoi diagrams, arrangements, ε-approximation techniques ... is already useful for spatial data mining Afternoon: spatial and spatio-temporal data mining and more geometric algorithms