I/O-Algorithms Lars Arge University of Aarhus March 7, 2005.

Slides:



Advertisements
Similar presentations
I/O-Algorithms Lars Arge Spring 2012 April 17, 2012.
Advertisements

Computational Geometry
Orthogonal range searching. The problem (1-D) Given a set of points S on the line, preprocess them to build structure that allows efficient queries of.
Dynamic Planar Convex Hull Operations in Near- Logarithmic Amortized Time TIMOTHY M. CHAN.
I/O-Algorithms Lars Arge Fall 2014 September 25, 2014.
Lars Arge 1/43 Big Terrain Data Analysis Algorithms in the Field Workshop SoCG June 19, 2012 Lars Arge.
1 Persistent data structures. 2 Ephemeral: A modification destroys the version which we modify. Persistent: Modifications are nondestructive. Each modification.
Convex Hull obstacle start end Convex Hull Convex Hull
External Memory Geometric Data Structures
Query Processing in Databases Dr. M. Gavrilova.  Introduction  I/O algorithms for large databases  Complex geometric operations in graphical querying.
I/O-Algorithms Lars Arge January 31, Lars Arge I/O-algorithms 2 Random Access Machine Model Standard theoretical model of computation: –Infinite.
I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis Pankaj K. Agarwal, Lars Arge, Ke Yi Duke University University of Aarhus.
I/O-Algorithms Lars Arge University of Aarhus February 21, 2005.
I/O-Algorithms Lars Arge Aarhus University February 27, 2007.
I/O-Algorithms Lars Arge Spring 2011 March 8, 2011.
Lecture 12 : Special Case of Hidden-Line-Elimination Computational Geometry Prof. Dr. Th. Ottmann 1 Special Cases of the Hidden Line Elimination Problem.
Optimal Planar Point Enclosure Indexing Lars Arge, Vasilis Samoladas and Ke Yi Department of Computer Science Duke University Technical University of Crete.
I/O-Algorithms Lars Arge Aarhus University February 13, 2007.
I/O-Algorithms Lars Arge Aarhus University March 16, 2006.
I/O-Algorithms Lars Arge Spring 2009 February 2, 2009.
Special Cases of the Hidden Line Elimination Problem Computational Geometry, WS 2007/08 Lecture 16 Prof. Dr. Thomas Ottmann Algorithmen & Datenstrukturen,
I/O-Algorithms Lars Arge Spring 2009 January 27, 2009.
I/O-Algorithms Lars Arge Spring 2007 January 30, 2007.
I/O-Algorithms Lars Arge Aarhus University February 16, 2006.
I/O-Algorithms Lars Arge Aarhus University February 7, 2005.
I/O-Algorithms Lars Arge University of Aarhus February 13, 2005.
I/O-Algorithms Lars Arge Spring 2009 April 28, 2009.
I/O-Algorithms Lars Arge University of Aarhus March 1, 2005.
I/O-Algorithms Lars Arge Spring 2009 March 3, 2009.
I/O-Algorithms Lars Arge Aarhus University February 6, 2007.
I/O-Algorithms Lars Arge Spring 2006 February 2, 2006.
Lars Arge1, Mark de Berg2, Herman Haverkort3 and Ke Yi1
I/O-Algorithms Lars Arge Aarhus University March 5, 2008.
I/O-Efficient Structures for Orthogonal Range Max and Stabbing Max Queries Second Year Project Presentation Ke Yi Advisor: Lars Arge Committee: Pankaj.
I/O-Algorithms Lars Arge Aarhus University April 16, 2008.
I/O-Algorithms Lars Arge Aarhus University February 9, 2006.
I/O-Algorithms Lars Arge Aarhus University March 9, 2006.
I/O-Algorithms Lars Arge Aarhus University February 14, 2008.
I/O-Algorithms Lars Arge Aarhus University March 6, 2007.
I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis Pankaj K. Agarwal, Lars Arge, and Ke Yi Duke University University of Aarhus.
1 Geometric index structures April 15, 2004 Based on GUW Chapter , [Arge01] Sections 1, 2.1 (persistent B- trees), 3-4 (static versions.
1 Database Tuning Rasmus Pagh and S. Srinivasa Rao IT University of Copenhagen Spring 2007 February 8, 2007 Tree Indexes Lecture based on [RG, Chapter.
I/O-Algorithms Lars Arge Spring 2008 January 31, 2008.
1 Geometric Intersection Determining if there are intersections between graphical objects Finding all intersecting pairs Brute Force Algorithm Plane Sweep.
I/O-Algorithms Lars Arge Fall 2014 August 28, 2014.
Heavily based on slides by Lars Arge I/O-Algorithms Thomas Mølhave Spring 2012 February 9, 2012.
Bin Yao Spring 2014 (Slides were made available by Feifei Li) Advanced Topics in Data Management.
External Memory Algorithms for Geometric Problems Piotr Indyk (slides partially by Lars Arge and Jeff Vitter)
B-trees and kd-trees Piotr Indyk (slides partially by Lars Arge from Duke U)
Bin Yao Spring 2014 (Slides were made available by Feifei Li) Advanced Topics in Data Management.
Mehdi Mohammadi March Western Michigan University Department of Computer Science CS Advanced Data Structure.
Lars Arge Presented by Or Ozery. I/O Model Previously defined: N = # of elements in input M = # of elements that fit into memory B = # of elements per.
MotivationFundamental ProblemsProblems on Graphs Parallel processors are becoming common place. Each core of a multi-core processor consists of a CPU and.
Lecture 2: External Memory Indexing Structures CS6931 Database Seminar.
I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis Pankaj K. Agarwal, Lars Arge, Ke Yi Duke University University of Aarhus.
Bin Yao (Slides made available by Feifei Li) R-tree: Indexing Structure for Data in Multi- dimensional Space.
Lecture 3: External Memory Indexing Structures (Contd) CS6931 Database Seminar.
External Memory Geometric Data Structures Lars Arge Duke University June 27, 2002 Summer School on Massive Datasets.
Lecture 1: Basic Operators in Large Data CS 6931 Database Seminar.
Problem Definition I/O-efficient Rectangular Segment Search Gautam K. Das and Bradford G. Nickerson Faculty of Computer science, University of New Brunswick,
Optimal Planar Orthogonal Skyline Counting Queries Gerth Stølting Brodal and Kasper Green Larsen Aarhus University 14th Scandinavian Workshop on Algorithm.
ProblemData StructuresLower Bound Preprocess a set of N 3-dimensional points into an I/O-efficient data structure, such that all points inside an axis.
Internal Memory Pointer MachineRandom Access MachineStatic Setting Data resides in records (nodes) that can be accessed via pointers (links). The priority.
Michal Balas1 I/O-efficient Point Location using Persistent B-Trees Lars Arge, Andrew Danner, and Sha-Mayn Teh Department of Computer Science, Duke University.
Query Processing in Databases Dr. M. Gavrilova
Algorithm design techniques Dr. M. Gavrilova
Advanced Topics in Data Management
Advanced Topics in Data Management
8th Workshop on Massive Data Algorithms, August 23, 2016
Presentation transcript:

I/O-Algorithms Lars Arge University of Aarhus March 7, 2005

Lars Arge I/O-algorithms 2 I/O-Model Parameters N = # elements in problem instance B = # elements that fits in disk block M = # elements that fits in main memory T = # output size in searching problem I/O: Movement of block between memory and disk D P M Block I/O

Lars Arge I/O-algorithms 3 Fundamental Bounds [AV88] Internal External Scanning: N Sorting: N log N Permuting –List rank Searching:

Lars Arge I/O-algorithms 4 Until now: Data Structures B-trees –Trees with fanout B, balanced using split/fuse – query, space, update Weight-balanced B-trees –Weight balancing constraint rather than degree constraint –Ω(w(v)) updates below v between consecutive operations on v Persistent B-trees –Update current version (getting new version) –Query all versions Buffer-trees – amortized bounds using buffers and lazyness

Lars Arge I/O-algorithms 5 Until now: Data Structures Special cases of two-dimensional range search: –Diagonal corner queries: External interval tree –Three-sided queries: External priority search tree query, space, update Same bounds cannot be obtained for general planar range searching q3q3 q2q2 q1q1 q q

Lars Arge I/O-algorithms 6 Until now: Data Structures General planer range searching: –External range tree: query, space, update –O-tree: query, space, update q3q3 q2q2 q1q1 q4q4

Lars Arge I/O-algorithms 7 Techniques Tools: –B-trees –Persistent B-trees –Buffer trees –Logarithmic method –Weight-balanced B-trees –Global rebuilding Techniques: –Bootstrapping –Filtering q3q3 q2q2 q1q1 q3q3 q2q2 q1q1 q4q4 (x,x)

Lars Arge I/O-algorithms 8 Point Enclosure Queries Dual of 2d range searching problem –Report all rectangles containing query point (x,y) Internal memory: –Can be solved in O(N) space and O(log N + T) time x y

Lars Arge I/O-algorithms 9 Point Enclosure Queries Similarity between internal and external results (space, query) –in general tradeoff between space and query I/O InternalExternal 1d range search(N, log N + T)(N/B, log B N + T/B) 3-sided 2d range search(N, log N + T)(N/B, log B N + T/B) 2d range search 2d point enclosure(N, log N + T) (N/B, log N + T/B)? 2 B (N/B, log N+T/B) (N/B 1-ε, log B N+T/B)

Lars Arge I/O-algorithms 10 Rectangle Range Searching Report all rectangles intersecting query rectangle Q Often used in practice when handling complex geometric objects Q

Lars Arge I/O-algorithms 11 R-Trees Herman!

Lars Arge I/O-algorithms 12 Geometric Algorithms We will now (shortly) look at geometric algorithms –Solves problem on set of objects Example: Orthogonal line segment intersection –Given set of axis-parallel line segments, report all intersections In internal memory many problems is solved using sweeping

Lars Arge I/O-algorithms 13 Plane Sweeping Sweep plane top-down while maintaining search tree T on vertical segments crossing sweep line (by x-coordinates) –Top endpoint of vertical segment: Insert in T –Bottom endpoint of vertical segment: Delete from T –Horizontal segment: Perform range query with x-interval on T

Lars Arge I/O-algorithms 14 Plane Sweeping In internal memory algorithm runs in optimal O(Nlog N+T) time In external memory algorithm performs badly (>N I/Os) if |T|>M –Even if we implements T as B-tree  O(Nlog B N+T/B) I/Os Solution: Distribution sweeping

Lars Arge I/O-algorithms 15 Distribution Sweeping Divide plane into M/B-1 slabs with O(N/(M/B)) endpoints each Sweep plane top-down while reporting intersections between –part of horizontal segment spanning slab(s) and vertical segments Distribute data to M/B-1 slabs –vertical segments and non-spanning parts of horizontal segments Recurse in each slab

Lars Arge I/O-algorithms 16 Distribution Sweeping Sweep performed in O(N/B+T’/B) I/Os  I/Os Maintain active list of vertical segments for each slab ( <B in memory) –Top endpoint of vertical segment: Insert in active list –Horizontal segment: Scan through all relevant active lists *Removing “expired” vertical segments *Reporting intersections with “non-expired” vertical segments

Lars Arge I/O-algorithms 17 Distribution Sweeping Other example: Rectangle intersection –Given set of axis-parallel rectangles, report all intersections.

Lars Arge I/O-algorithms 18 Distribution Sweeping Divide plane into M/B-1 slabs with O(N/(M/B)) endpoints each Sweep plane top-down while reporting intersections between –part of rectangles spanning slab(s) and other rectangles Distribute data to M/B-1 slabs – Non-spanning parts of rectangles Rcurse in each slab

Lars Arge I/O-algorithms 19 Distribution Sweeping Seems hard to perform sweep in O(N/B+T’/B) I/Os Solution: Multislabs –Reduce fanout of distribution to –Recursion height still –Room for block from each multislab (activlist) in memory

Lars Arge I/O-algorithms 20 Distribution Sweeping Sweep while maintaining rectangle active list for each multisslab –Top side of spanning rectangle: Insert in active multislab list –Each rectangle: Scan through all relevant multislab lists *Removing “expired” rectangles *Reporting intersections with “non-expired” rectangles  I/Os

Lars Arge I/O-algorithms 21 Distribution Sweeping Distribution sweeping can relatively easily be used to solve a number of other problems in the plane I/O-efficiently By decreasing distribution fanout to for c≥1 a number of higher-dimensional problems can also be solved I/O-efficiently

Lars Arge I/O-algorithms 22 Other Results Other geometric algorithms results include: –Red blue line segment intersection (using distribution sweep, buffer trees/batched filtering, external fractional cascading) –General planar line segment intersection (as above and external priority queue) –2d and 2d Convex hull: (Complicated deterministic 3d and simpler randomized) –2d Delaunay triangulation