I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis Pankaj K. Agarwal, Lars Arge, Ke Yi Duke University University of Aarhus.

Presentation transcript:

I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis Pankaj K. Agarwal, Lars Arge, Ke Yi Duke University University of Aarhus

The Union-Find Problem
A universe of N elements: x_1, x_2, …, x_N
Initially N singleton sets: {x_1}, {x_2}, …, {x_N}
Each set has a representative
Maintain the partition under
– Union(x_i, x_j): joins the sets containing x_i and x_j
– Find(x_i): returns the representative of the set containing x_i

The Solution
[Figure: forest of trees over the elements; the roots are the set representatives]
Union(d, h): link-by-rank
Find(n): path compression
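
The two heuristics named on this slide can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the authors' code; the class name `UnionFind` and its dictionary-based layout are assumptions made for the example.

```python
class UnionFind:
    """In-memory disjoint sets with link-by-rank and path compression."""

    def __init__(self, elements):
        # Every element starts as the representative of its own singleton set.
        self.parent = {x: x for x in elements}
        self.rank = {x: 0 for x in elements}

    def find(self, x):
        # First pass: walk up to the root (the representative).
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every node on the path at the root.
        while self.parent[x] != root:
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx  # redundant union: both elements are already in one set
        # Link-by-rank: the root of smaller rank becomes a child of the other.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return rx


uf = UnionFind("abcdefgh")
uf.union("d", "h")
print(uf.find("d") == uf.find("h"))
```

Both `find` and `union` follow pointers to arbitrary nodes, which is the kind of access pattern that stops being cheap once the structure no longer fits in memory.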

Complexity
O(N α(N)) for a sequence of N union and find operations [Tarjan 75]
– α(): inverse Ackermann function (grows very slowly)
– Optimal in the worst case [Tarjan 79, Fredman and Saks 89]
Batched (off-line) version
– Entire sequence known in advance
– Can be improved to linear on RAM [Gabow and Tarjan 85]
– Not possible on a pointer machine [Tarjan 79]

Simple and Good, as long as … The entire data structure fits in memory

The I/O Model
Main memory of size M
Disk of infinite size
One I/O transfers B items between memory and disk

Sources of “Non-Locality”
Two operands in a union
Nodes on a leaf-to-root path
Operands in consecutive operations
– Cannot be removed in the on-line case
All of them must be eliminated to get less than one I/O per operation!

Our Results
An I/O-efficient algorithm for the batched union-find problem using O(sort(N)) = O((N/B) log_{M/B}(N/B)) I/Os
– Same as sorting
– Optimal in the worst case
A practical algorithm using O(sort(N) log(N/M)) I/Os
– Implemented
Applications to terrain analysis
– Topological persistence: O(sort(N)) I/Os (implemented)
– Contour trees: O(sort(N)) I/Os
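
To get a feel for the sort(N) bound, the expression (N/B) log_{M/B}(N/B) can be evaluated for concrete parameters. The sketch below ignores constant factors, and the function name `sort_ios` and the sample values of N, M and B are assumptions chosen only for illustration.

```python
import math


def sort_ios(n, m, b):
    """Evaluate (N/B) * log_{M/B}(N/B), the sort(N) bound without constants.

    n: number of items, m: items that fit in memory, b: items per block.
    Assumes n >= b and m > b so that the logarithm is well defined.
    """
    blocks = n / b
    return blocks * math.log(blocks, m / b)


# Hypothetical machine: 2^30 items, memory for 2^24 items, blocks of 2^12 items.
N, M, B = 2**30, 2**24, 2**12
print("sort(N) bound:", round(sort_ios(N, M, B)))
print("one I/O per operation:", N)
```

With these sample values N/B is 2^18 and the logarithm is 18/12 = 1.5, so the bound is orders of magnitude below the N I/Os that one I/O per operation would cost.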

I/O-Efficient Batched Union-Find
Assumption: no redundant unions
– Each union must join two different sets
– The assumption will be removed later
Two-stage algorithm
– Convert to interval union-find: compute an order on the elements such that each union joins two adjacent sets
– Solve batched interval union-find

Union Tree
1: Union(d, g)
2: Union(a, c)
3: Union(r, b)
4: Union(a, e)
5: Union(e, i)
6: Union(r, a)
7: Union(a, d)
8: Union(d, h)
9: Union(b, f)
[Figure: two union trees over the elements r, a, b, c, d, e, f, g, h, i. Caption: Equivalent union trees]

Transforming the Union Tree
[Figure: the union tree over r, a, …, i at successive stages of the transformation]
Weights along root-to-leaf path decrease

Formulating as a Batched Problem
[Figure: two union trees over r, a, …, i]
For each edge, find the lowest ancestor edge with a higher weight
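
The query stated on this slide can be written down directly for a tree that fits in memory. The following sketch simply walks up from each edge. The representation (a `parent` map, and a `weight` map keyed by the lower endpoint of each edge) and the small example tree are assumptions for illustration, not taken from the slides.

```python
def lowest_heavier_ancestor_edge(parent, weight):
    """For each edge, find the lowest ancestor edge with a higher weight.

    parent[v] is the parent of node v; weight[v] is the weight of the edge
    (v, parent[v]). The root has no entry in `weight`. An edge is identified
    by its lower endpoint. Returns a dict mapping each edge to its answer,
    or to None if no ancestor edge is heavier.
    """
    result = {}
    for v in weight:
        u = parent[v]
        # Climb until an edge with a strictly higher weight is found or the root is reached.
        while u in weight and weight[u] <= weight[v]:
            u = parent[u]
        result[v] = u if u in weight else None
    return result


# Hypothetical tree: r is the root; a and c hang below r and a; d below a; g below d.
parent = {"a": "r", "c": "a", "d": "a", "g": "d"}
weight = {"a": 6, "c": 2, "d": 7, "g": 1}
print(lowest_heavier_ancestor_edge(parent, weight))
```

Each upward step here is a random access, so this is only a specification of the answer. The point of the batched formulation is to answer all of these queries together without such pointer chasing.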

Cast in a Geometry Setting
[Figure: union tree over r, a, …, i with its Euler tour]
Euler tour in O(sort(N)) I/Os [Chiang et al. 95]
x: weight
y: positions in the tour

Cast in a Geometry Setting
[Figure: union tree over r, a, …, i and the corresponding segments]
Tree version: for each edge, find the lowest ancestor edge with a higher weight
Geometric version: for each segment, find the shortest segment above and containing it

Distribution Sweeping
M/B vertical slabs
[Figure labels: checked here; checked recursively]
Total cost: O(sort(N))

In-Order Traversal
Weights along root-to-leaf path decrease
At u, with children u_1, …, u_k (in increasing order of weight):
1. Recursively visit the subtree at u_1
2. Return u
3. For i = 2, …, k: recursively visit the subtree at u_i
[Figure: transformed union tree over r, a, …, i; the letters in the figure text read b r a c e i g d h f]
Claim: this traversal produces the right order
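
What "the right order" means can be checked mechanically: in such an order, every union must join two sets that occupy adjacent intervals. The sketch below tests that property for the nine unions listed on the Union Tree slide. Reading the letters in this slide's figure text as the order b, r, a, c, e, i, g, d, h, f is an assumption, and the function name `joins_adjacent_sets` is hypothetical.

```python
def joins_adjacent_sets(order, unions):
    """Return True if every union joins two sets that are adjacent intervals of `order`."""
    pos = {x: i for i, x in enumerate(order)}
    # interval[i] = (lo, hi): positions spanned by the set currently containing position i.
    interval = [(i, i) for i in range(len(order))]
    for x, y in unions:
        (lo1, hi1), (lo2, hi2) = sorted((interval[pos[x]], interval[pos[y]]))
        if hi1 + 1 != lo2:
            return False  # the two sets are identical (redundant union) or not adjacent
        merged = (lo1, hi2)
        for i in range(lo1, hi2 + 1):
            interval[i] = merged
    return True


# Union sequence from the Union Tree slide.
unions = [("d", "g"), ("a", "c"), ("r", "b"), ("a", "e"), ("e", "i"),
          ("r", "a"), ("a", "d"), ("d", "h"), ("b", "f")]
print(joins_adjacent_sets(list("braceigdhf"), unions))
print(joins_adjacent_sets(list("abcdefghir"), unions))
```

The checker rewrites whole intervals on every union, which is fine for a ten-element example but is not meant as an efficient or I/O-aware procedure.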

Solving Interval Union-Find
Union: x: two operands; y: time stamp
Find: x: operand; y: time stamp
Four instances of batched ray shooting: O(sort(N)) I/Os
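
One way to read this slide: once the elements are ordered, a union at time t removes the boundary between two adjacent intervals, and a find at time t asks how far one can travel left and right from the operand before hitting a boundary that still exists at time t. The sketch below answers a single find by scanning. The array `removed_at`, the function name `interval_find`, and the choice to return the interval rather than a representative are all assumptions for illustration, and the scan stands in for the batched ray shooting the slide refers to.

```python
def interval_find(removed_at, x, t):
    """Return (lo, hi), the interval of positions in the set containing x at time t.

    removed_at[i] is the time stamp of the union that merged across the
    boundary between positions i and i + 1 (use float("inf") if it never is).
    A boundary is gone at time t if it was removed strictly before t.
    """
    lo = x
    while lo > 0 and removed_at[lo - 1] < t:
        lo -= 1  # the boundary to the left has already been removed
    hi = x
    while hi < len(removed_at) and removed_at[hi] < t:
        hi += 1  # the boundary to the right has already been removed
    return lo, hi


# Hypothetical instance: 5 positions, boundaries removed at times 2, 1, 4 and 3.
removed_at = [2, 1, 4, 3]
print(interval_find(removed_at, 1, 3))
print(interval_find(removed_at, 3, 3.5))
```

A representative could then be chosen by any fixed rule, for example the element at position `lo`; the slides do not say which rule is used.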

Handling Redundant Unions
Union tree becomes a graph
Compute the minimum spanning tree
– O(sort(N)) I/Os (randomized) [Chiang et al. 95]
– O(sort(N) log log B) I/Os (deterministic) [Arge et al. 04]
– Deterministic O(sort(N)) I/Os if the graph is planar
– Only MST edges are non-redundant

A Practical Algorithm
Previous algorithm too complicated
– 2 Euler tours
– 4 instances of batched ray shooting
– MST
A simple and practical algorithm
– Divide-and-conquer
– O(sort(N) log(N/M)) I/Os
– Implemented

Applications
1. Topological Persistence
2. Contour Trees

Topological Persistence

Formulated as Batched Union-Find
Terrain represented as a triangulated mesh
Consider minimum-saddle pairs
When the sweep reaches
– A minimum or maximum: do nothing
– A regular point u: issue union(u, v) for a lower neighbor v
– A saddle u: let v and w be nodes from u’s two connected pieces in its lower link; issue find(v), find(w), union(u, v), union(u, w)
[Figure label: lower link]
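
The sweep on this slide can be illustrated with a simplified in-memory version that works on a plain graph instead of a triangulated mesh and applies the unions immediately instead of batching them. It pairs each minimum with the vertex at which its component merges into one with a lower minimum. The function name `minimum_persistence_pairs` and the five-vertex example are assumptions, and this is the standard union-find sweep for components, not the authors' I/O-efficient algorithm.

```python
def minimum_persistence_pairs(height, edges):
    """Pair minima with the merge vertices that end their components.

    height: dict vertex -> elevation (assumed distinct); edges: iterable of (u, v).
    Returns a list of (minimum, merge_vertex) pairs in sweep order.
    """
    neighbors = {v: [] for v in height}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    parent = {}  # union-find over swept vertices; each root is its component's lowest vertex

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    for u in sorted(height, key=height.get):  # sweep from low to high
        parent[u] = u
        for v in neighbors[u]:
            if v not in parent:
                continue  # v is higher than u and has not been swept yet
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            if height[ru] < height[rv]:
                ru, rv = rv, ru  # make ru the root with the higher (younger) minimum
            if ru != u:
                pairs.append((ru, u))  # the component born at ru ends at u
            parent[ru] = rv  # the component with the lower minimum survives
    return pairs


# Hypothetical 1-D terrain: a path of five vertices with three local minima.
height = {0: 1, 1: 5, 2: 2, 3: 4, 4: 3}
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
for low, high in minimum_persistence_pairs(height, edges):
    print(low, high, "persistence", height[high] - height[low])
```

In the batched setting, the same sweep would only emit the find and union operations, and the answers to the finds would be computed afterwards for the whole sequence at once.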

Contour Trees

Previous Results
Directly maintain contours
– O(N log N) time [van Kreveld et al. 97]
– Needs union-split-find for circular lists
– Does not extend to higher dimensions
Two sweeps by maintaining components, then merge
– O(N log N) time [Carr et al. 03]
– Extends to arbitrary dimensions

Join Tree and Split Tree
[Figure: join tree and split tree, with qualified nodes marked]

Final Contour Tree
[Figure: join tree, split tree, and the resulting contour tree]
Hard to BATCH!

Another Characterization
[Figure: join tree, split tree, and contour tree with nodes u, v, w marked]
Let w be the highest node that is a descendant of v in the join tree and an ancestor of u in the split tree; then (u, w) is a contour tree edge
Now can BATCH!

Experiment 1: Random Union-Find

Experiment 2: Topological Persistence on Terrain Data
Neuse River Basin of NC

Experiment 2: Topological Persistence on Terrain Data

Summary
An I/O-efficient algorithm for the batched union-find problem using O(sort(N)) = O((N/B) log_{M/B}(N/B)) I/Os
– Optimal in the worst case
A practical algorithm using O(sort(N) log(N/M)) I/Os
Applications to terrain analysis
– Topological persistence: O(sort(N)) I/Os
– Contour trees: O(sort(N)) I/Os
Open question: on-line case
– Can we get below O(N α(N)) I/Os?

Thank you!