UNC Chapel Hill M. C. Lin Orthogonal Range Searching Reading: Chapter 5 of the Textbook Driving Applications –Querying a Database Related Application –Crystal.

Slides:



Advertisements
Similar presentations
Introduction to Algorithms
Advertisements

INTERVAL TREE & SEGMENTATION TREE
Heaps1 Part-D2 Heaps Heaps2 Recall Priority Queue ADT (§ 7.1.3) A priority queue stores a collection of entries Each entry is a pair (key, value)
Augmenting Data Structures Advanced Algorithms & Data Structures Lecture Theme 07 – Part I Prof. Dr. Th. Ottmann Summer Semester 2006.
Dynamic Planar Convex Hull Operations in Near- Logarithmic Amortized Time TIMOTHY M. CHAN.
I/O-Algorithms Lars Arge Fall 2014 September 25, 2014.
COSC 6114 Prof. Andy Mirzaian. References: [M. de Berge et al] chapter 5 Applications: Data Base GIS, Graphics: crop-&-zoom, windowing.
Searching on Multi-Dimensional Data
UNC Chapel Hill M. C. Lin Polygon Triangulation Chapter 3 of the Textbook Driving Applications –Guarding an Art Gallery –3D Morphing.
Orthogonal Range Searching Computational Geometry, WS 2006/07 Lecture 13 – Part II Prof. Dr. Thomas Ottmann Algorithmen & Datenstrukturen, Institut für.
Orthogonal Range Searching 3Computational Geometry Prof. Dr. Th. Ottmann 1 Orthogonal Range Searching 1.Linear Range Search : 1-dim Range Trees 2.2-dimensional.
I/O-Algorithms Lars Arge Aarhus University February 27, 2007.
Orthogonal Range Searching-1Computational Geometry Prof. Dr. Th. Ottmann 1 Orthogonal Range Searching 1.Linear Range Search : 1-dim Range Trees 2.2-dimensional.
Computational Geometry, WS 2007/08 Lecture 15 Prof. Dr. Thomas Ottmann Algorithmen & Datenstrukturen, Institut für Informatik Fakultät für Angewandte Wissenschaften.
Lecture 5: Orthogonal Range Searching Computational Geometry Prof. Dr. Th. Ottmann 1 Orthogonal Range Searching 1.Linear Range Search : 1-dim Range Trees.
Orthogonal Range Searching Computational Geometry, WS 2006/07 Lecture 13 – Part III Prof. Dr. Thomas Ottmann Algorithmen & Datenstrukturen, Institut für.
I/O-Algorithms Lars Arge University of Aarhus March 1, 2005.
I/O-Algorithms Lars Arge Spring 2009 March 3, 2009.
1 Geometric Solutions for the IP-Lookup and Packet Classification Problem (Lecture 12: The IP-LookUp & Packet Classification Problem, Part II) Advanced.
I/O-Algorithms Lars Arge Aarhus University March 5, 2008.
UNC Chapel Hill M. C. Lin Overview of Last Lecture About Final Course Project –presentation, demo, write-up More geometric data structures –Binary Space.
© 2006 Pearson Addison-Wesley. All rights reserved11 A-1 Chapter 11 Trees.
1 Geometric index structures April 15, 2004 Based on GUW Chapter , [Arge01] Sections 1, 2.1 (persistent B- trees), 3-4 (static versions.
Orthogonal Range Searching Computational Geometry, WS 2006/07 Lecture 13 - Part I Prof. Dr. Thomas Ottmann Algorithmen & Datenstrukturen, Institut für.
O RTHOGONAL R ANGE S EARCHING الهه اسلامی فروردین 92, 1.
CHAPTER 12 Trees. 2 Tree Definition A tree is a non-linear structure, consisting of nodes and links Links: The links are represented by ordered pairs.
Spatial Indexing I Point Access Methods. Spatial Indexing Point Access Methods (PAMs) vs Spatial Access Methods (SAMs) PAM: index only point data Hierarchical.
Fractional Cascading and Its Applications G. S. Lueker. A data structure for orthogonal range queries. In Proc. 19 th annu. IEEE Sympos. Found. Comput.
AALG, lecture 11, © Simonas Šaltenis, Range Searching in 2D Main goals of the lecture: to understand and to be able to analyze the kd-trees and.
Multimedia Databases Chapter 4.
Binary Trees Chapter 6.
 This lecture introduces multi-dimensional queries in databases, as well as addresses how we can query and represent multi- dimensional data.
Orthogonal Range Searching I Range Trees. Range Searching S = set of geometric objects Q = query object Report/Count objects in S that intersect Q Query.
Chapter Tow Search Trees BY HUSSEIN SALIM QASIM WESAM HRBI FADHEEL CS 6310 ADVANCE DATA STRUCTURE AND ALGORITHM DR. ELISE DE DONCKER 1.
Trees for spatial data representation and searching
UNC Chapel Hill M. C. Lin Point Location Reading: Chapter 6 of the Textbook Driving Applications –Knowing Where You Are in GIS Related Applications –Triangulation.
UNC Chapel Hill M. C. Lin Linear Programming Reading: Chapter 4 of the Textbook Driving Applications –Casting/Metal Molding –Collision Detection Randomized.
UNC Chapel Hill M. C. Lin Line Segment Intersection Chapter 2 of the Textbook Driving Applications –Map overlap problems –3D Polyhedral Morphing.
External Memory Algorithms for Geometric Problems Piotr Indyk (slides partially by Lars Arge and Jeff Vitter)
14/13/15 CMPS 3130/6130 Computational Geometry Spring 2015 Windowing Carola Wenk CMPS 3130/6130 Computational Geometry.
B-trees and kd-trees Piotr Indyk (slides partially by Lars Arge from Duke U)
Mehdi Mohammadi March Western Michigan University Department of Computer Science CS Advanced Data Structure.
Multi-dimensional Search Trees
P ARALLEL O RTHOGONAL R ANGE S EARCHING Project Presentation by Savitha Parur Venkitachalam.
2IL50 Data Structures Fall 2015 Lecture 9: Range Searching.
Orthogonal Range Search
Lecture 2: External Memory Indexing Structures CS6931 Database Seminar.
Computational Geometry Piyush Kumar (Lecture 5: Range Searching) Welcome to CIS5930.
Segment Trees Basic data structure in computational geometry. Computational geometry.  Computations with geometric objects.  Points in 1-, 2-, 3-, d-space.
Multi-dimensional Search Trees CS302 Data Structures Modified from Dr George Bebis.
Spatial Indexing Techniques Introduction to Spatial Computing CSE 5ISC Some slides adapted from Spatial Databases: A Tour by Shashi Shekhar Prentice Hall.
UNC Chapel Hill M. C. Lin Computing Voronoi Diagram For each site p i, compute the common inter- section of the half-planes h(p i, p j ) for i  j, using.
Interval S = [3,10]  {x | 3 ≤ x ≤ 10} Closed segment S = (3,10)  {x | 3 < x < 10} Opened segment S = [3,3]  {3} Point.
UNC Chapel Hill M. C. Lin Delaunay Triangulations Reading: Chapter 9 of the Textbook Driving Applications –Height Interpolation –Constrained Triangulation.
CMPS 3130/6130 Computational Geometry Spring 2015
February 17, 2005Lecture 6: Point Location Point Location (most slides by Sergi Elizalde and David Pritchard)
BINARY TREES Objectives Define trees as data structures Define the terms associated with trees Discuss tree traversal algorithms Discuss a binary.
UNC Chapel Hill M. C. Lin Geometric Data Structures Reading: Chapter 10 of the Textbook Driving Applications –Windowing Queries Related Application –Query.
May 2012Range Search Algorithms1 Shmuel Wimer Bar Ilan Univ. Eng. Faculty Technion, EE Faculty.
Geometric Data Structures
CMPS 3130/6130 Computational Geometry Spring 2017
Lecture 22 Binary Search Trees Chapter 10 of textbook
KD Tree A binary search tree where every node is a
Orthogonal Range Searching and Kd-Trees
Priority Search Trees Keys are pairs (x,y).
Reporting (1-D) Given a set of points S on the line, preprocess them to build structure that allows efficient queries of the from: Given an interval I=[x1,x2]
Orthogonal Range Searching Querying a Database
Shmuel Wimer Bar Ilan Univ., School of Engineering
CMPS 3130/6130 Computational Geometry Spring 2017
CMPS 3130/6130 Computational Geometry Spring 2017
Presentation transcript:

UNC Chapel Hill M. C. Lin Orthogonal Range Searching Reading: Chapter 5 of the Textbook Driving Applications –Querying a Database Related Application –Crystal Structure Determination

UNC Chapel Hill M. C. Lin Interprete DB Queries Geometrically Transform records in database into points in multi-dimensional space. Transform queries on d -fields of records in the database into queries on this set of points in d -dimensional space. A query asking to report all records whose fields lie between specific values trans- forms to a query asking for all points inside a d -dimensional axis-parallel box.

UNC Chapel Hill M. C. Lin 1-D Range Searching Let P := {p 1, p 2, …, p n } be a given set of points on the real line. A query asks for the points inside a 1-D query rectangle -- i.e. an interval [ x:x’ ] Use a balanced binary search tree T. The leaves of T store the points of P and the internal nodes of T store splitting values to guide the search. Let the splitting value at a node v be x v. The left subtree of v contains all points  x v and the right subtree of v contains all points > x v. To report points in [ x:x’ ], we search with x and x’ in T. Let u and u’ be the two leaves where the search ends resp. Then the points in [ x:x’ ] are the ones stored in leaves between u and u’, plus possibly points stored at u & u’.

UNC Chapel Hill M. C. Lin FindSplitNode(T, x, x’) Input: A tree T and two values x and x’ with x  x’ Output: The node v where the paths to x and x’ splits, or the leaf where both paths end. 1. v  root (T) 2. while v is not a leaf and ( x’  x v or x > x v ) 3. do if x’  x v 4. then v  lc ( v )(* left child of the node v *) 5. else v  rc ( v )(* right child of the node v *) 6. return v

UNC Chapel Hill M. C. Lin 1DRangeQuery(T, [x:x’]) Input: A range tree T and a range [ x:x’ ] Output: All points that lie in the range. 1. v split  FindSplitNode(T, x, x’) 2. if v split is a leaf 3. then Check if the point stored at v split must be reported 4. else (* Follow the path to x and report the points in subtrees right of the path *) 5. v  lc ( v split )

UNC Chapel Hill M. C. Lin 1DRangeQuery(T, [x:x’]) 6. while v is not a leaf 7. do if x  x v 8. then ReportSubTree( rc ( v )) 9. v  lc ( v ) 10. else v  rc ( v ) 11. Check if the point stored at leaf v must be reported 12. Similarly, follow the path to x’, report the points in subtrees left of the path, and check if point stored at the leaf where the path ends must be reported.

UNC Chapel Hill M. C. Lin 1D-Range Search Algorithm Analysis Let P be a set of n points in one-dimensional space. The set P can be stored in a balanced binary search tree, which uses O(n) storage and has O(n log n) construction time, s.t. the points in a query range can be reported in time O(k + log n), where k is the number of reported points. –The time spent in ReportSubtree is linear in the number of reported points, i.e. O(k). –The remaining nodes that are visited on the search path of x or x’. The length is O(log n) and time spent at each node is O(1), so the total time spent at these nodes is O(log n).

UNC Chapel Hill M. C. Lin Kd-Trees A 2d rectangular query on P asks for points from P lying inside a query rectangle [ x:x’ ] x [ y:y’ ]. A point p:= (p x, p y ) lies inside this rectangle iff p x  [ x:x’ ] and p y  [ y:y’ ]. In 2D case, there are two important values: its x - and y - coordinates. Thus, we first split on x -coordinate, next on y -coordinate, then again on x-coordinate, and so on. At the root, we split P with l into 2 sets and store l. Each set is then split into 2 subsets and stores its splitting line. We proceed recursively as such. In general, we split with vertical lines at nodes whose depth is even, and split with horizontal lines at nodes whose depth is odd.

UNC Chapel Hill M. C. Lin BuildKDTree(P, depth) Input: A set P of points and the current depth, depth. Output: The root of kd-tree storing P. 1. if P contains only 1 point 2. then return a leaf storing at this point 3. else if depth is even 4. then Split P into 2 subsets with a vertical line l thru median x -coordinate of points in P. Let P 1 and P 2 be the sets of points to the left or on and to the right of l respectively. 5. else Split P into 2 subsets with a horizontal line l thru median y -coordinate of points in P. Let P 1 and P 2 be the sets of points below or on l and above l respectively. 5. v left  BuildKDTree(P 1, depth + 1) 6. v right  BuildKDTree(P 2, depth + 1) 7. Create a node v storing l, make v left & v right lef/right children of v 8. return v

UNC Chapel Hill M. C. Lin Construction Time Analysis The most expensive step is median finding, which can be done in linear time. T(n) = O(1), if n = 1 T(n) = O(n) + 2 T(n/2), if n > 1  T(n) = O(n log n) A kd-tree for a set of n points uses O(n) storage and can be constructed in O(nlogn) time.

UNC Chapel Hill M. C. Lin Querying on a kd-Tree In general, region corresponding to a node v is a rectangle, or region(v), which can be unbounded on  1 sides. It’s bounded by splitting lines stored at ancestors of v. A point is stored in the sub-tree rooted at v if and only if it lies in region(v). We traverse the kd-tree, but visit only nodes whose region is intersected by the query rectangle. When a region is fully contained in the query rectangle, we report all points stored in its sub-trees. When the traversal reaches a leaf, we have to check whether the point stored at the leaf is contained in the query region.

UNC Chapel Hill M. C. Lin SearchKDTree(v, R) Input: The root of a (subtree of a) kd-tree and a range R. Output: All points at leaves below v that lie in the range. 1. if v is a leaf 2. then Report the point stored at v if it lies in R. 3. else if region ( lc ( v )) is fully contained in R 4. then ReportSubtree( lc ( v )) 5. else if region ( lc ( v )) intersects R 6. then SearchKDTree( lc ( v ), R ) 7. If region ( rc ( v )) is fully contained in R 8. then ReportSubtree( rc ( v )) 9. else if region ( rc ( v )) intersects R 10. then SearchKDTree( rc ( v ), R )

UNC Chapel Hill M. C. Lin 2D KD-Tree Query Time Analysis The time to traverse a subtree and report points in its leaves is linear in the number of reported points. To bound the number of other nodes visited, we bound the number of regions intersected by edges of the query rectangle. Q(n) = O(1), if n = 1 Q(n) = Q(n/4), if n > 1  Q(n) = O(n 1/2 )

UNC Chapel Hill M. C. Lin KD-Tree Query Time Analysis A rectangle range query on the kd-tree for a set of n points in the plane takes O(n 1/2 +k) time, where k is number of reported points. A d -dimensional kd-tree for a set of n points takes O(d·n) storage and O(d ·nlogn) time to construct. The query time is bounded by O(n 1-1/d + k).

UNC Chapel Hill M. C. Lin Basics of Range Trees Similar to 1D, to report points in [ x:x’ ] x [ y:y’ ], we search with x and x’ in T. Let u and u’ be the two leaves where the search ends resp. Then the points in [ x:x’ ] are the ones stored in leaves between u and u’, plus possibly points stored at u & u’. Call the subset of points stored in the leaves of the subtree rooted at v, the canonical subset of v. For the root, it’s whole set P ; for a leave, it’s the point stored there. We can perform the 2D range query similarly by only visiting the associated binary search tree on y - coordinate of the canonical subset of v, whose x- coordinate lies in the x- interval of the query rectangle.

UNC Chapel Hill M. C. Lin Range Trees The main tree is a balanced binary search tree T built on the x -coordinate of the points in P. For any internal or leaf node v in T, the canonical subset of P(v) is stored in a balanced binary search tree T assoc (v) on the y -coordinate of the points. The node v stores a pointer to the root of T assoc (v), which is called the “associated structure” of v.  This is called a “Range Tree”. The DS where nodes have pointers to associated structures are often called multi-level data structures. The main tree T is then called the first-level tree and the associated structures are second-level trees.

UNC Chapel Hill M. C. Lin Build2DRangeTree(P) Input: A set P of points in the plane. Output: The root of 2-dimensional tree. 1. Build a binary search tree T assoc on the set P y of y -coordinates of the points in P. Store at leaves of T assoc the points themselves. 2. if P contains only 1 point 3. then Create a leaf v storing this point and make T assoc the associated structure of v. 4. else Split P into 2 subsets: P left containing points with x -coordinate  x mid, the median x -coordinate, and P right containing points with x -coordinate  x mid 5. v left  Build2DRangeTree(P left ) 6. v right  Build2DRangeTree(P right ) 7. Create a node v storing x mid, make v left left child of v & v right right child of v, and make T assoc the associated structure of v 8. return v

UNC Chapel Hill M. C. Lin 2DRangeQuery(T, [x:x’] x [y:y’]) Input: A 2D range tree T and a range [ x:x’ ] x [ y:y’ ] Output: All points in T that lie in the range. 1. v split  FindSplitNode(T, x, x’) 2. if v split is a leaf 3. then Check if the point stored at v split must be reported 4. else (* Follow the path to x and call 1DRangeQuery on the subtrees right of the path *) 5. v  lc ( v split )

UNC Chapel Hill M. C. Lin 2DRangeQuery(T, [x:x’] x [y:y’]) 6. while v is not a leaf 7. do if x  x v 8. then 1DRangeQuery( T assoc ( rc ( v )), [ y:y’ ]) 9. v  lc ( v ) 10. else v  rc ( v ) 11. Check if the point stored at leaf v must be reported 12. Similarly, follow the path from rc ( v split ) to x’, call 1DRangeQuery with the range [ y:y’ ] on the associated structures of subtrees left of the path, and check if point stored at the leaf where the path ends must be reported.

UNC Chapel Hill M. C. Lin Range Tree Algorithm Analysis Let P be a set of n points in the plane. A range tree for P uses O(n log n) storage and can be constructed in O(n log n) time. By querying this range tree, one can report the points in P that lie in a rectangular query range in O(log 2 n + k) time, where k is the number of reported points.

UNC Chapel Hill M. C. Lin Higher-D Range Trees Let P be a set of n points in d -dimensional space, where d  2. A range tree for P uses O(nlog d-1 n) storage and it can be constructed in O(nlog d-1 n) time. One can report the points in P that lies in a rectangular query range in O(log d n + k) time, where k is the number of reported points.

UNC Chapel Hill M. C. Lin General Sets of Points Replace the coordinates, which are real numbers, by elements of the so-called composite-number space. The elements of this space are pair of reals, denoted by (a|b). We denote the order by: (a|b) < (a’|b’)  a < a’ or (a=a’ and b<b’) Many points are distince. But for the ones with same x - or y -coordinate, replace each point p := (p x, p y ) by p’ := ((p x | p y ), (p y | p x )) Replace R := [x:x’] x [y:y’] by R’ := [(x| -  ) : (x’| +  )] x [(y| -  ) : (y’| +  )]

UNC Chapel Hill M. C. Lin Basics of Fractional Cascading Both P(lc(v)) and P(rc(v)) are canonical subsets of P(v). Each canonical subset P(v) is stored in an associated structure. Instead of using a binary search tree, we now store it in an array A(v). The array is sorted on the y -coordinate of the points. Each entry in A(v) stores 2 pointers: a pointer into A(lc(v)) & a pointer into A(rc(v)). A(v)[i] stores a point p := (p x, p y ). We store a pointer from A(v)[i] to the entry of A(lc(v)) s.t. the y -coordinate of p’ stored there is the smallest one  p y. The pointer to A(rc(v)) is defined the same way: it points to the entry s.t. the y -coordinate of the point stored there is the smallest one that is  p y.  This modified version is called “layered range tree”.

UNC Chapel Hill M. C. Lin Fractional Cascading Algorithm Analysis Let P be a set of n points in d -dimensional space, with d  2. A layered range tree for P uses O(nlog d-1 n) storage and it can be constructed in O(nlog d-1 n) time. With this range tree, one can report the points in P that lies in a rectangular query range in O(log d-1 n + k) time, where k is the number of reported points.