2IL50 Data Structures Fall 2015 Lecture 9: Range Searching.


Augmenting data structures
Methodology for augmenting a data structure:
1. Choose an underlying data structure.
2. Determine additional information to maintain.
3. Verify that we can maintain the additional information for the existing data structure operations.
4. Develop new operations.
You don’t need to do these steps in strict order!
Red-black trees are very well suited to augmentation …

Augmenting red-black trees
Theorem: Augment a red-black tree with a field f, where f[x] depends only on information in x, left[x], and right[x] (including f[left[x]] and f[right[x]]). Then we can maintain the values of f in all nodes during insert and delete without affecting the O(log n) performance.
When we alter information in x, changes propagate only upward on the search path for x …
Examples:
- OS-tree, new operations: OS-Select and OS-Rank
- Interval-tree, new operation: Interval-Search
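The locality condition in the theorem can be illustrated with the subtree-size field of an OS-tree. The sketch below is only an illustration under simplifying assumptions: it uses a plain, unbalanced binary search tree instead of a red-black tree (so there are no rotations to handle), and the names `Node`, `insert`, and `os_select` are hypothetical. The point it shows is that `size` is recomputed from the two children only, along the search path.

```python
class Node:
    """BST node augmented with the size of its subtree (assumed field name)."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1


def size(v):
    """Size of a possibly empty subtree."""
    return v.size if v is not None else 0


def insert(v, key):
    """Insert key below v and repair `size` on the way back up the search path.

    Assumption: no rebalancing is done, so this is NOT a red-black tree.
    """
    if v is None:
        return Node(key)
    if key < v.key:
        v.left = insert(v.left, key)
    else:
        v.right = insert(v.right, key)
    # f[x] depends only on f[left[x]] and f[right[x]]
    v.size = 1 + size(v.left) + size(v.right)
    return v


def os_select(v, i):
    """Return the i-th smallest key (1-based) in the subtree rooted at v."""
    rank = size(v.left) + 1
    if i == rank:
        return v.key
    if i < rank:
        return os_select(v.left, i)
    return os_select(v.right, i - rank)
```

In a red-black tree the same recomputation would also have to be applied to the nodes touched by each rotation, which is why the update stays within O(log n).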

Range Searching

Application: Database queries
Example: Database for personnel administration (name, address, date of birth, salary, …)
Query: Report all employees born between 1950 and 1955 who earn between $3000 and $4000 per month.
More parameters? “Report all employees born between 1950 and 1955 who earn between $3000 and $4000 per month and have between two and four children.” ➨ more dimensions
Such a query is called a rectangular range query or orthogonal range query.

1-Dimensional range searching
P = {p_1, p_2, …, p_n}: set of points on the real line
Query: given a query interval [x : x’], report all p_i ∈ P with p_i ∈ [x : x’].
Solution: Use a balanced binary search tree T.
- leaves of T store the points p_i
- internal nodes store splitting values (node v stores value x_v)

1-Dimensional range searching
Query [x : x’] ➨ search with x and x’ ➨ end in two leaves μ and μ’
Report:
1. all leaves between μ and μ’
2. possibly the points stored at μ and μ’
[Figure: search paths for the query [18 : 77], ending in the leaves μ and μ’]

1-Dimensional range searching
How do we find all leaves between μ and μ’?
Solution: They are the leaves of the subtrees rooted at nodes v in between the two search paths whose parents are on the search paths.
➨ we need to find the node v_split where the search paths split

FindSplitNode(T, x, x’)
► Input. A tree T and two values x and x’ with x ≤ x’.
► Output. The node v where the paths to x and x’ split, or the leaf where both paths end.
v = root(T)
while v is not a leaf and (x’ ≤ x_v or x > x_v)
    do if x’ ≤ x_v
        then v = left(v)
        else v = right(v)
return v
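FindSplitNode can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the tree is a leaf-oriented balanced search tree built from sorted, distinct values, each internal node stores the largest value of its left subtree as x_v, and the names `Node`, `build`, and `find_split_node` are made up for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    value: float                     # x_v for internal nodes, the point for leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def build(points: List[float]) -> Node:
    """Build a balanced tree over sorted, distinct points (points at the leaves)."""
    if len(points) == 1:
        return Node(points[0])
    mid = (len(points) - 1) // 2     # the median goes to the left subtree
    return Node(points[mid], build(points[:mid + 1]), build(points[mid + 1:]))


def find_split_node(root: Node, x: float, x2: float) -> Node:
    """Return the node where the search paths to x and x2 split (requires x <= x2)."""
    v = root
    # Keep descending while both searches continue into the same child.
    while not v.is_leaf() and (x2 <= v.value or x > v.value):
        v = v.left if x2 <= v.value else v.right
    return v
```

For the points 3, 10, 19, 23, 30, 37, 49, 59, 62, 70, 80, 89, 100, 105 and the query [18 : 77], this construction puts 49 at the root, and the paths already split there.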

1-Dimensional range searching
1. Starting from v_split, follow the search path to x.
   - when the path goes left, report all leaves in the right subtree
   - check if µ ∈ [x : x’]
2. Starting from v_split, follow the search path to x’.
   - when the path goes right, report all leaves in the left subtree
   - check if µ’ ∈ [x : x’]

1-Dimensional range searching
1DRangeQuery(T, [x : x’])
► Input. A binary search tree T and a range [x : x’].
► Output. All points stored in T that lie in the range.
v_split = FindSplitNode(T, x, x’)
if v_split is a leaf
    then Check if the point stored at v_split must be reported.
    else (Follow the path to x and report the points in subtrees right of the path)
        v = left(v_split)
        while v is not a leaf
            do if x ≤ x_v
                then ReportSubtree(right(v))
                     v = left(v)
                else v = right(v)
        Check if the point stored at the leaf v must be reported.
        Similarly, follow the path to x’, report the points in subtrees left of the path, and check if the point stored at the leaf where the path ends must be reported.

1-Dimensional range searching
Correctness of 1DRangeQuery(T, [x : x’])? Need to show two things:
1. every reported point lies in the query range
2. every point in the query range is reported.

1-Dimensional range searching
Query time of 1DRangeQuery(T, [x : x’])? ReportSubtree takes O(1 + reported points) time ➨ total query time = O(log n + reported points)
Storage? O(n)
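The whole 1-dimensional query, including the symmetric walk towards x’ that the pseudocode only describes in words, might look like this in Python. It is a sketch, not a reference implementation: it assumes distinct points, x ≤ x’, and a leaf-oriented tree in which x_v is the largest value in the left subtree; all names (`Node`, `build`, `report_subtree`, `range_query_1d`) are illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    value: float                     # x_v for internal nodes, the point for leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None


def build(points: List[float]) -> Node:
    """Balanced tree over sorted, distinct points; points live at the leaves."""
    if len(points) == 1:
        return Node(points[0])
    mid = (len(points) - 1) // 2
    return Node(points[mid], build(points[:mid + 1]), build(points[mid + 1:]))


def report_subtree(v: Node, out: List[float]) -> None:
    """Append every point stored in the leaves below v."""
    if v.is_leaf():
        out.append(v.value)
    else:
        report_subtree(v.left, out)
        report_subtree(v.right, out)


def range_query_1d(root: Node, x: float, x2: float) -> List[float]:
    """Report all points in [x, x2]; the output order is not sorted."""
    out: List[float] = []
    v = root
    # FindSplitNode
    while not v.is_leaf() and (x2 <= v.value or x > v.value):
        v = v.left if x2 <= v.value else v.right
    if v.is_leaf():
        if x <= v.value <= x2:
            out.append(v.value)
        return out
    split = v
    # Path to x: whenever it goes left, the right subtree lies in the range.
    v = split.left
    while not v.is_leaf():
        if x <= v.value:
            report_subtree(v.right, out)
            v = v.left
        else:
            v = v.right
    if x <= v.value <= x2:
        out.append(v.value)
    # Path to x2: whenever it goes right, the left subtree lies in the range.
    v = split.right
    while not v.is_leaf():
        if x2 > v.value:
            report_subtree(v.left, out)
            v = v.right
        else:
            v = v.left
    if x <= v.value <= x2:
        out.append(v.value)
    return out
```

Only the two leaves where the paths end are tested against the range explicitly; every other reported point comes from a subtree that is known to lie inside it.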

2-Dimensional range searching
P = {p_1, p_2, …, p_n}: set of points in the plane
Query: given a query rectangle [x : x’] × [y : y’], report all p_i ∈ P with p_i ∈ [x : x’] × [y : y’], that is, p_x ∈ [x : x’] and p_y ∈ [y : y’]
➨ a 2-dimensional range query is composed of two 1-dimensional sub-queries
How can we generalize our 1-dimensional solution to 2 dimensions?
For now: no two points have the same x-coordinate, and no two points have the same y-coordinate.

Back to one dimension …


And now in two dimensions …
Split alternating on x- and y-coordinate ➨ 2-dimensional kd-tree
[Figure: points p_1, …, p_10 in the plane, split by the lines l_1, …, l_9, next to the corresponding kd-tree]

Kd-trees
BuildKdTree(P, depth)
► Input. A set of points P and the current depth.
► Output. The root of a kd-tree storing P.
if P contains only one point
    then return a leaf storing this point
    else if depth is even
            then split P into two subsets with a vertical line l through the median x-coordinate of the points in P. Let P_1 be the set of points to the left of or on l, and let P_2 be the set of points to the right of l.
            else split P into two subsets with a horizontal line l through the median y-coordinate of the points in P. Let P_1 be the set of points below or on l, and let P_2 be the set of points above l.
         v_left = BuildKdTree(P_1, depth+1)
         v_right = BuildKdTree(P_2, depth+1)
         Create a node v storing l, make v_left the left child of v, and make v_right the right child of v
         return v

Kd-trees
Running time of BuildKdTree(P, depth)?
T(n) = O(1) if n = 1
T(n) = O(n) + 2T(n/2) if n > 1
➨ T(n) = O(n log n)
Presort to avoid linear-time median finding …

Kd-trees
Storage of the kd-tree built by BuildKdTree(P, depth)? O(n)
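A direct Python transcription of BuildKdTree is sketched below. One deliberate simplification is an assumption of this sketch, not part of the lecture: the points are re-sorted at every level to find the median, which costs O(n log² n) overall instead of the O(n log n) obtained with presorting or linear-time median finding. The names `KdNode`, `build_kd_tree`, `leaves`, and `height` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class KdNode:
    point: Optional[Point] = None    # set only for leaves
    axis: int = 0                    # 0: vertical split line, 1: horizontal split line
    split: float = 0.0               # coordinate of the split line l
    left: Optional["KdNode"] = None
    right: Optional["KdNode"] = None


def build_kd_tree(points: List[Point], depth: int = 0) -> KdNode:
    """Build a kd-tree; assumes distinct x- and distinct y-coordinates."""
    if len(points) == 1:
        return KdNode(point=points[0])
    axis = depth % 2                               # even depth: split on x
    pts = sorted(points, key=lambda p: p[axis])    # simplification: sort per level
    mid = (len(pts) - 1) // 2                      # median goes to P_1 ("on l")
    return KdNode(
        axis=axis,
        split=pts[mid][axis],
        left=build_kd_tree(pts[:mid + 1], depth + 1),
        right=build_kd_tree(pts[mid + 1:], depth + 1),
    )


def leaves(v: KdNode) -> List[Point]:
    """All points stored below v."""
    if v.point is not None:
        return [v.point]
    return leaves(v.left) + leaves(v.right)


def height(v: KdNode) -> int:
    """Number of edges on the longest root-to-leaf path."""
    if v.point is not None:
        return 0
    return 1 + max(height(v.left), height(v.right))
```

Because every split halves the point set, a tree on 8 points built this way has height 3, and each point appears in exactly one leaf, which matches the O(n) storage bound.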

Querying a kd-tree
- Each node v corresponds to a region region(v).
- All points which are stored in the subtree rooted at v lie in region(v).
➨
1. if a region is contained in the query rectangle, report all points in the region.
2. if a region is disjoint from the query rectangle, report nothing.
3. if a region intersects the query rectangle, refine the search (test the children, or the point stored in the region if there are no children)

Querying a kd-tree
[Figure: points p_1, …, p_13 in the plane and the corresponding kd-tree]
Disclaimer: This tree cannot have been constructed by BuildKdTree …

Querying a kd-tree
SearchKdTree(v, R)
► Input. The root of (a subtree of) a kd-tree, and a range R.
► Output. All points at leaves below v that lie in the range.
if v is a leaf
    then report the point stored at v if it lies in R
    else if region(left(v)) is fully contained in R
            then ReportSubtree(left(v))
            else if region(left(v)) intersects R
                    then SearchKdTree(left(v), R)
         if region(right(v)) is fully contained in R
            then ReportSubtree(right(v))
            else if region(right(v)) intersects R
                    then SearchKdTree(right(v), R)
Query time?
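SearchKdTree can be sketched in Python by passing region(v) down the recursion instead of storing it in the nodes. The sketch rests on a few assumptions of its own: rectangles are tuples `(x_lo, x_hi, y_lo, y_hi)`, the region of a right child is treated as closed at the split line (a conservative choice that can only cause an extra recursive call, never a wrong report), and the build step re-sorts per level for brevity. All names are illustrative.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]   # (x_lo, x_hi, y_lo, y_hi)


@dataclass
class KdNode:
    point: Optional[Point] = None          # set only for leaves
    axis: int = 0                          # 0: split on x, 1: split on y
    split: float = 0.0
    left: Optional["KdNode"] = None
    right: Optional["KdNode"] = None


def build_kd_tree(points: List[Point], depth: int = 0) -> KdNode:
    if len(points) == 1:
        return KdNode(point=points[0])
    axis = depth % 2
    pts = sorted(points, key=lambda p: p[axis])
    mid = (len(pts) - 1) // 2
    return KdNode(None, axis, pts[mid][axis],
                  build_kd_tree(pts[:mid + 1], depth + 1),
                  build_kd_tree(pts[mid + 1:], depth + 1))


def contained(reg: Rect, R: Rect) -> bool:
    return R[0] <= reg[0] and reg[1] <= R[1] and R[2] <= reg[2] and reg[3] <= R[3]


def intersects(reg: Rect, R: Rect) -> bool:
    return reg[0] <= R[1] and R[0] <= reg[1] and reg[2] <= R[3] and R[2] <= reg[3]


def report_subtree(v: KdNode, out: List[Point]) -> None:
    if v.point is not None:
        out.append(v.point)
    else:
        report_subtree(v.left, out)
        report_subtree(v.right, out)


def search_kd_tree(v: KdNode, R: Rect,
                   region: Rect = (-math.inf, math.inf, -math.inf, math.inf),
                   out: Optional[List[Point]] = None) -> List[Point]:
    """Report all points below v that lie in R; `region` is region(v)."""
    if out is None:
        out = []
    if v.point is not None:
        px, py = v.point
        if R[0] <= px <= R[1] and R[2] <= py <= R[3]:
            out.append(v.point)
        return out
    lo, hi = 2 * v.axis, 2 * v.axis + 1
    left_region = region[:hi] + (v.split,) + region[hi + 1:]    # below / left of l
    right_region = region[:lo] + (v.split,) + region[lo + 1:]   # above / right of l
    for child, reg in ((v.left, left_region), (v.right, right_region)):
        if contained(reg, R):
            report_subtree(child, out)
        elif intersects(reg, R):
            search_kd_tree(child, R, reg, out)
    return out
```

Computing the regions on the fly keeps the storage at O(n) without an extra rectangle per node; storing region(v) explicitly would be an equally valid design.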

Querying a kd-tree: Analysis
- Time to traverse a subtree and report the points stored in its leaves is linear in the number of leaves ➨ ReportSubtree takes O(k) time in total, where k is the total number of reported points
- need to bound the number of nodes visited that are not in the traversed subtrees (grey nodes)
- the query range properly intersects the region of each such node
We are only interested in an upper bound … How many regions can a vertical line intersect?

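Querying a kd-tree: Analysis
Question: How many regions can a vertical line intersect?
Q(n): number of intersected regions
Answer 1: Q(n) = 1 + Q(n/2)
Answer 2:
Q(n) = O(1) if n = 1
Q(n) = 2 + 2Q(n/4) if n > 1
Answer 1 only holds at nodes split by a vertical line; at nodes split by a horizontal line, a vertical line intersects the regions of both children. Answer 2 therefore looks two levels down.
Master theorem ➨ Q(n) = O(√n)
[Figure: kd-tree subdivision of the points p_1, …, p_13]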
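The O(√n) bound can also be checked by unrolling the recurrence by hand. The derivation below is a sketch that assumes the recurrence Q(n) = 2 + 2Q(n/4) for n > 1 with Q(1) = O(1), and assumes for simplicity that n is a power of 4.

```latex
\begin{align*}
Q(n) &= 2 + 2\,Q(n/4) \\
     &= 2 + 4 + 4\,Q(n/16) \\
     &= \sum_{i=1}^{j} 2^{i} + 2^{j}\,Q\!\left(n/4^{j}\right)
        && \text{after } j \text{ unrolling steps} \\
     &= \left(2^{\log_4 n + 1} - 2\right) + 2^{\log_4 n}\,Q(1)
        && \text{with } j = \log_4 n \\
     &= 2\sqrt{n} - 2 + \sqrt{n}\cdot O(1) \;=\; O(\sqrt{n}),
\end{align*}
```

using 2^{log_4 n} = n^{log_4 2} = √n. The master theorem gives the same result: a = 2, b = 4, so n^{log_b a} = √n dominates the constant additive term.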

Kd-trees
Theorem: A kd-tree for a set of n points in the plane uses O(n) storage and can be built in O(n log n) time. A rectangular range query on the kd-tree takes O(√n + k) time, where k is the number of reported points.
If the number k of reported points is small, then the query time O(√n + k) is relatively high.
Can we do better? Trade storage for query time …

Back to 1 dimension … (again)
- A 1DRangeQuery with [x : x’] gives us all points whose x-coordinates lie in the x-range [x : x’] of the query rectangle [x : x’] × [y : y’].
- These points are stored in O(log n) subtrees.
Canonical subset of node v: the points stored in the leaves of the subtree rooted at v
Idea: store the canonical subsets in a binary search tree on the y-coordinate

Range trees
Range tree:
- The main tree is a balanced binary search tree T built on the x-coordinate of the points in P.
- For any internal or leaf node ν in T, the canonical subset P(ν) is stored in a balanced binary search tree T_assoc(ν) on the y-coordinate of the points. The node ν stores a pointer to the root of T_assoc(ν), which is called the associated structure of ν.

Range trees
Build2DRangeTree(P)
► Input. A set P of points in the plane.
► Output. The root of a 2-dimensional range tree.
Construct the associated structure: Build a binary search tree T_assoc on the set P_y of y-coordinates of the points in P. Store at the leaves of T_assoc not just the y-coordinates of the points in P_y, but the points themselves.
if P contains only one point
    then create a leaf v storing this point, and make T_assoc the associated structure of v
    else split P into two subsets; one subset P_left contains the points with x-coordinate less than or equal to x_mid, the median x-coordinate, and the other subset P_right contains the points with x-coordinate larger than x_mid
         v_left = Build2DRangeTree(P_left)
         v_right = Build2DRangeTree(P_right)
         create a node v storing x_mid, make v_left the left child of v, make v_right the right child of v, and make T_assoc the associated structure of v
return v

Range trees
Running time of Build2DRangeTree(P)? O(n log n)
Presort to build the binary search trees in linear time …
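The construction can be sketched in Python as below. Two simplifications are assumptions of this sketch rather than part of the lecture: each associated structure is a y-sorted list standing in for the balanced search tree T_assoc, and the list is re-sorted at every node, which gives O(n log² n) construction time instead of the presorted O(n log n). The names `RangeNode`, `build_2d_range_tree`, and `storage` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class RangeNode:
    x_mid: float                         # split value (the point's x for a leaf)
    assoc: List[Point]                   # canonical subset, sorted by y
    left: Optional["RangeNode"] = None
    right: Optional["RangeNode"] = None


def build_2d_range_tree(by_x: List[Point]) -> RangeNode:
    """Build a range tree from points sorted by x (distinct coordinates assumed)."""
    # Associated structure: stand-in for a balanced BST on the y-coordinates.
    assoc = sorted(by_x, key=lambda p: p[1])
    if len(by_x) == 1:
        return RangeNode(by_x[0][0], assoc)
    mid = (len(by_x) - 1) // 2           # P_left gets x <= x_mid
    return RangeNode(
        by_x[mid][0],
        assoc,
        build_2d_range_tree(by_x[:mid + 1]),
        build_2d_range_tree(by_x[mid + 1:]),
    )


def storage(v: Optional[RangeNode]) -> int:
    """Total number of points stored in all associated structures below v."""
    if v is None:
        return 0
    return len(v.assoc) + storage(v.left) + storage(v.right)
```

For 8 points the main tree has 4 levels and every level stores each point exactly once, so `storage` returns 8 · 4 = 32, a small instance of the O(n log n) storage bound.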

Range trees: Storage
Lemma: A range tree on a set of n points in the plane requires O(n log n) storage.
Proof:
- each point is stored only once per level
- storage for the associated structures is linear in the number of points
- there are O(log n) levels

Querying a range tree
2DRangeQuery(T, [x : x’] × [y : y’])
► Input. A 2-dimensional range tree T and a range [x : x’] × [y : y’].
► Output. All points in T that lie in the range.
v_split = FindSplitNode(T, x, x’)
if v_split is a leaf
    then check if the point stored at v_split must be reported
    else (Follow the path to x and call 1DRangeQuery on the subtrees right of the path)
        v = left(v_split)
        while v is not a leaf
            do if x ≤ x_v
                then 1DRangeQuery(T_assoc(right(v)), [y : y’])
                     v = left(v)
                else v = right(v)
        Check if the point stored at v must be reported
        Similarly, follow the path from right(v_split) to x’, call 1DRangeQuery with the range [y : y’] on the associated structures of subtrees left of the path, and check if the point stored at the leaf where the path ends must be reported.

Querying a range tree
Running time of 2DRangeQuery(T, [x : x’] × [y : y’])? O(log² n + k)
There are O(log n) calls to 1DRangeQuery, and each call takes O(log n) time plus the number of points it reports.
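A compact Python sketch of the query follows. As a stated simplification, each associated structure is a pair of y-sorted lists searched with `bisect`, which answers the 1-dimensional sub-query in O(log n + k) time like 1DRangeQuery on a balanced tree would; the build step re-sorts per node for brevity. The names `RangeNode`, `build`, `query_assoc`, and `range_query_2d` are illustrative, and distinct coordinates with x ≤ x’ and y ≤ y’ are assumed.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class RangeNode:
    x_mid: float
    pts: List[Point]                     # canonical subset sorted by y
    ys: List[float]                      # the y-coordinates of pts, for bisect
    left: Optional["RangeNode"] = None
    right: Optional["RangeNode"] = None


def build(by_x: List[Point]) -> RangeNode:
    """Range tree over points sorted by x; sorted lists stand in for T_assoc."""
    pts = sorted(by_x, key=lambda p: p[1])
    ys = [p[1] for p in pts]
    if len(by_x) == 1:
        return RangeNode(by_x[0][0], pts, ys)
    mid = (len(by_x) - 1) // 2
    return RangeNode(by_x[mid][0], pts, ys,
                     build(by_x[:mid + 1]), build(by_x[mid + 1:]))


def query_assoc(v: RangeNode, y: float, y2: float, out: List[Point]) -> None:
    """1-dimensional range query on the associated structure of v."""
    out.extend(v.pts[bisect_left(v.ys, y):bisect_right(v.ys, y2)])


def range_query_2d(root: RangeNode, x: float, x2: float,
                   y: float, y2: float) -> List[Point]:
    out: List[Point] = []

    def check_leaf(leaf: RangeNode) -> None:
        px, py = leaf.pts[0]
        if x <= px <= x2 and y <= py <= y2:
            out.append(leaf.pts[0])

    v = root                             # FindSplitNode on the main tree
    while v.left is not None and (x2 <= v.x_mid or x > v.x_mid):
        v = v.left if x2 <= v.x_mid else v.right
    if v.left is None:
        check_leaf(v)
        return out
    split = v
    v = split.left                       # path to x
    while v.left is not None:
        if x <= v.x_mid:
            query_assoc(v.right, y, y2, out)
            v = v.left
        else:
            v = v.right
    check_leaf(v)
    v = split.right                      # path to x2
    while v.left is not None:
        if x2 > v.x_mid:
            query_assoc(v.left, y, y2, out)
            v = v.right
        else:
            v = v.left
    check_leaf(v)
    return out
```

The structure mirrors the 1-dimensional query exactly; the only change is that ReportSubtree is replaced by a y-range query on the associated structure of the same subtree.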

Range trees
Theorem: A range tree for a set of n points in the plane uses O(n log n) storage and can be built in O(n log n) time. A rectangular range query on the range tree takes O(log² n + k) time, where k is the number of reported points.
Conclusion:
              kd-tree       Range tree
storage       O(n)          O(n log n)
query time    O(√n + k)     O(log² n + k)