Published by Catherine Marshall. Modified over 9 years ago.
1
Mehdi Mohammadi, March 2015
Western Michigan University, Department of Computer Science
CS 6310 - Advanced Data Structure
2
Orthogonal Range Trees
Higher-Dimensional Segment Trees
Other Systems of Building Blocks
Range-Counting and the Semigroup Model
KD-Trees and Related Structures
3
Orthogonal Range-Searching problem
◦ Input: a (d-dimensional) box and a set of points
◦ Output: all the points in the set that lie in that box
Applications
◦ Geometric applications
◦ Database queries, for example:
  Select Emp from T where 50K<Salary<75K AND age > 50 AND salesAmount > 500K AND 2011<salesYear<2015
  A 4-d orthogonal range query (one dimension per constrained attribute: Salary, age, salesAmount, salesYear)
◦ Preprocessing for queries
4
General situation
◦ Set of data points $p_1, \dots, p_n$, where $p_i = (p_{i1}, \dots, p_{id})$
◦ d-dimensional query interval $[a_1, b_1[ \times \dots \times [a_d, b_d[$
◦ Return all points $p_i$ contained in that interval: $a_1 \le p_{i1} < b_1, \dots, a_d \le p_{id} < b_d$
◦ Output-sensitive query time: $O(f_d(n) + k)$
Structure
◦ Build a balanced search tree for the first coordinates of the data points
◦ Each node has an associated interval and stores the points whose first coordinate falls into that interval
◦ For each node, recursively build a range search tree on the remaining d-1 coordinates of those points
5
Query:
◦ Find the O(log n) nodes whose associated intervals together make up $[a_1, b_1[$
◦ In each of those nodes, perform a (d-1)-dimensional range search for $[a_2, b_2[ \times \dots \times [a_d, b_d[$
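The construction and query described above can be sketched for d = 2 as follows. This is a minimal illustration, not the lecture's own code: the class name `RangeTree2D` and the dictionary-based node layout are assumptions made for the example, and the associated structure of each node is simply a list sorted by second coordinate that is searched with `bisect`. The sketch re-sorts at every node, so its building time is not tuned to match the bounds quoted on the slides.

```python
from bisect import bisect_left

class RangeTree2D:
    """Static 2-d range tree for half-open query boxes [a1, b1[ x [a2, b2[."""

    def __init__(self, points):
        # Sort once by first coordinate; every node covers a contiguous slice.
        self.root = self._build(sorted(points))

    def _build(self, pts):
        if not pts:
            return None
        by_y = sorted(pts, key=lambda p: p[1])
        node = {
            "lo": pts[0][0],    # smallest first coordinate below this node
            "hi": pts[-1][0],   # largest first coordinate below this node
            "by_y": by_y,       # associated structure: points sorted by 2nd coordinate
            "ys": [p[1] for p in by_y],
            "left": None,
            "right": None,
        }
        if len(pts) > 1:
            mid = len(pts) // 2
            node["left"] = self._build(pts[:mid])
            node["right"] = self._build(pts[mid:])
        return node

    def query(self, a1, b1, a2, b2):
        out = []
        self._query(self.root, a1, b1, a2, b2, out)
        return out

    def _query(self, node, a1, b1, a2, b2, out):
        # Node interval disjoint from [a1, b1[: nothing to report.
        if node is None or node["hi"] < a1 or node["lo"] >= b1:
            return
        # Node interval inside [a1, b1[: this is a canonical node, so do the
        # 1-d range search on the second coordinate and stop descending.
        if a1 <= node["lo"] and node["hi"] < b1:
            i = bisect_left(node["ys"], a2)
            j = bisect_left(node["ys"], b2)
            out.extend(node["by_y"][i:j])
            return
        self._query(node["left"], a1, b1, a2, b2, out)
        self._query(node["right"], a1, b1, a2, b2, out)
```

For example, `RangeTree2D(points).query(1, 8, 3, 8)` reports the points with first coordinate in [1, 8[ and second coordinate in [3, 8[.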
6
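Example 2-d
◦ (0,1), (1,5), (2,8), (3,3), (5,0), (6,4), (7,6), (8,7), (9,9)
[Figure: range tree on the first coordinate; each node carries a 1-d range tree on the second coordinate of its associated points, e.g. {(0,1)}, {(1,5)}, {(2,8)}, {(1,5), (2,8)}, {(0,1), (1,5), (2,8)}]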
7
Theorem: Orthogonal range trees are a static data structure supporting d-dimensional range queries in a set of n d-dimensional points
◦ Query time: output-sensitive, $O((\log n)^d + k)$ if the output consists of k points
◦ Building time: $O(n (\log n)^d)$
◦ Space requirement: $O(n (\log n)^{d-1})$
8
Fractional Cascading
◦ When we make a sequence of searches in different but related sets, we can reuse the result of the search in one set to speed up the search in the next set
Algorithm
◦ For each node, sort the points of its associated interval by second coordinate
◦ Link each point on this list to the same point in the list of the left and of the right lower neighbor; if the point is missing on that side, link to the point with the next smaller second coordinate, or to the first point on that list if no point there has a smaller second coordinate
9
Fractional Cascading (figure slide)
10
Fractional Cascading Search
◦ We have a search tree for the first coordinate
◦ We select the nodes that form the canonical interval decomposition of the query interval in the first coordinate
◦ Attached to each node is a structure for the search in the second coordinate; these structures are linked together for fractional cascading
◦ So we need to search only in the set associated with the first node, then reuse that information in all later searches
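A minimal 2-d sketch of this search, under stated assumptions: the names `_Node`, `LayeredRangeTree`, `lptr` and `rptr` are invented for the example, and the links use a slightly different convention from the slides. Instead of pointing to the entry with the next smaller second coordinate, each position in a node's list stores the index of the first entry in the child list whose second coordinate is not smaller. The effect is the same: binary search happens only in the root list, and every child position is obtained by following a stored link.

```python
from bisect import bisect_left

class _Node:
    """Node of the first-coordinate tree; pts is non-empty and sorted by x."""

    def __init__(self, pts):
        self.lo, self.hi = pts[0][0], pts[-1][0]
        self.left = self.right = None
        self.lptr = self.rptr = None
        if len(pts) == 1:
            self.by_y = list(pts)
            return
        mid = len(pts) // 2
        self.left, self.right = _Node(pts[:mid]), _Node(pts[mid:])
        L, R = self.left.by_y, self.right.by_y
        self.by_y, self.lptr, self.rptr = [], [], []
        i = j = 0
        # Merge the children's y-sorted lists; before placing each entry,
        # record how far the merge has advanced in either child list.
        while i < len(L) or j < len(R):
            self.lptr.append(i)
            self.rptr.append(j)
            if j == len(R) or (i < len(L) and L[i][1] <= R[j][1]):
                self.by_y.append(L[i])
                i += 1
            else:
                self.by_y.append(R[j])
                j += 1
        # Sentinel links for the position one past the end of the list.
        self.lptr.append(i)
        self.rptr.append(j)

class LayeredRangeTree:
    def __init__(self, points):
        pts = sorted(points)
        self.root = _Node(pts) if pts else None
        self.root_ys = [p[1] for p in self.root.by_y] if pts else []

    def query(self, a1, b1, a2, b2):
        """Report points with a1 <= x < b1 and a2 <= y < b2."""
        out = []
        if self.root is not None:
            # The only binary searches of the whole query.
            i = bisect_left(self.root_ys, a2)
            j = bisect_left(self.root_ys, b2)
            self._visit(self.root, a1, b1, i, j, out)
        return out

    def _visit(self, node, a1, b1, i, j, out):
        if i >= j or node.hi < a1 or node.lo >= b1:
            return
        if a1 <= node.lo and node.hi < b1:
            out.extend(node.by_y[i:j])   # canonical node: slice, no search
            return
        self._visit(node.left, a1, b1, node.lptr[i], node.lptr[j], out)
        self._visit(node.right, a1, b1, node.rptr[i], node.rptr[j], out)
```

Because each node of the first-coordinate tree is handled in constant time plus the size of its output slice, the logarithmic factor of the second-coordinate search is paid once instead of once per canonical node.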
11
Theorem: Orthogonal range trees with fractional cascading are a static data structure that supports d-dimensional orthogonal range queries in a set of n d-dimensional points (d > 1)
◦ Query time: $O((\log n)^{d-1} + k)$ if the output consists of k points
◦ Building time: $O(n (\log n)^{d-1})$
◦ Space requirement: $O(n (\log n)^{d-1})$
12
The inverse of the orthogonal range-searching problem
Input:
◦ A set of n ranges (d-dimensional intervals)
◦ A query point
Output:
◦ All ranges that contain that point
Solvable by a generalization of the segment tree
◦ It is defined recursively
13
Main structure:
◦ A balanced search tree whose keys are the first coordinates (the interval endpoints in the first dimension) of the d-dimensional intervals
◦ Each node of that tree contains a (d-1)-dimensional segment tree
◦ The (d-1)-dimensional segment tree associated with node p stores all intervals for which p is part of the canonical interval decomposition of their first-dimension interval
14
Query
◦ Follow the search path of the first coordinate of the query point
◦ In each node on that path, perform a (d-1)-dimensional query with the remaining coordinates in the structure associated with the node
Theorem: The d-dimensional segment tree is a static data structure that lists all d-dimensional intervals containing a given query point
◦ Build time: $O(n (\log n)^d)$
◦ Space: $O(n (\log n)^d)$
◦ Query time: $O((\log n)^d + k)$ if there are k such intervals
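The base case d = 1 of this recursion can be sketched as below. The class name `SegmentTree1D`, the implicit heap-style node numbering and the dictionary `store` are assumptions of the example. Each interval is stored at the nodes of its canonical interval decomposition, and a query walks one root-to-leaf path. In the d-dimensional structure, the plain list kept at each node would be replaced by a (d-1)-dimensional segment tree on the remaining coordinates.

```python
from bisect import bisect_right

class SegmentTree1D:
    """Static segment tree over half-open intervals [a, b[ with a < b."""

    def __init__(self, intervals):
        intervals = list(intervals)
        # Sorted endpoints cut the line into elementary slabs [xs[i], xs[i+1][.
        self.xs = sorted({e for a, b in intervals for e in (a, b)})
        self.m = max(1, len(self.xs) - 1)   # number of slabs
        self.store = {}                     # node id -> intervals stored there
        index = {x: i for i, x in enumerate(self.xs)}
        for a, b in intervals:
            self._insert(1, 0, self.m, index[a], index[b], (a, b))

    def _insert(self, node, nlo, nhi, lo, hi, iv):
        if hi <= nlo or nhi <= lo:          # node range disjoint from interval
            return
        if lo <= nlo and nhi <= hi:         # node range inside: canonical node
            self.store.setdefault(node, []).append(iv)
            return
        mid = (nlo + nhi) // 2
        self._insert(2 * node, nlo, mid, lo, hi, iv)
        self._insert(2 * node + 1, mid, nhi, lo, hi, iv)

    def stab(self, q):
        """List all stored intervals [a, b[ with a <= q < b."""
        s = bisect_right(self.xs, q) - 1    # index of the slab containing q
        if s < 0 or s >= len(self.xs) - 1:
            return []
        out = []
        node, nlo, nhi = 1, 0, self.m
        while True:
            out.extend(self.store.get(node, []))
            if nhi - nlo == 1:
                return out
            mid = (nlo + nhi) // 2
            if s < mid:
                node, nhi = 2 * node, mid
            else:
                node, nlo = 2 * node + 1, mid
```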
15
Improvement: S-tree using fractional cascading
Algorithm
◦ Input: rectangles $[a_i, b_i[ \times [c_i, d_i[$ for i = 1, …, n
1. Create a balanced search tree T1 for $\{a_1, b_1, a_2, b_2, \dots, a_n, b_n\}$
2. Attach an empty secondary balanced tree T2(v) to each node v of T1
3. For i = 1 to n
◦ 3.1 Start from the root of T1 and put it on a stack
◦ 3.2 Repeat as long as the stack is not empty:
   Take the current node v from the stack
   Insert $c_i$ and $d_i$ as keys into the tree T2(v)
16
   (3.2, continued) If intervalOf(v) is not contained in $[a_i, b_i[$, check v's left and right subtrees; if their intervals intersect $[a_i, b_i[$, put them on the stack
4. For each i = 1, …, n
◦ 4.1 For all nodes v that belong to the canonical interval decomposition of $[a_i, b_i[$ in T1, insert the rectangle $[a_i, b_i[ \times [c_i, d_i[$ into the segment tree T2(v)
17
5. For each node v of T1
◦ Create pointers from each leaf of T2(v) to the corresponding leaves of T2(v->left) and T2(v->right)
6. For each node v of T1
◦ For each node w of T2(v), create a pointer to the next node above w in T2(v) that has some rectangle associated with it
Theorem: The S-tree is a static data structure that keeps track of a set of n rectangles and, for a given query point, lists all rectangles containing that point
◦ Space: $O(n (\log n)^2)$
◦ Query time: $O(\log n + k)$ if there are k output rectangles
18
Canonical interval decomposition
◦ Decompose an interval into a union of a small number of building blocks
To answer a query interval
◦ Decompose the query interval into a union of building blocks
◦ Execute the query on those building blocks
19
A building-block query scheme requires
◦ Decomposing the query
◦ Reconstructing the answer from the answers of the building blocks
◦ A structure that answers the query for a fixed block
◦ A representation of each interval as a union of a small number of blocks
Choice of building blocks is a tradeoff
◦ Reducing each interval query to a small number of blocks needs many building blocks
◦ For each block we have to build a structure to answer queries
20
Bentley and Maurer (1980) proposal
◦ Use an r-level structure for the system of blocks
◦ Interpreted as writing numbers to the base $n^{1/r}$
◦ Top-level blocks: intervals $[a\,n^{1-1/r}, b\,n^{1-1/r}]$ with $0 \le a < b \le n^{1/r}$, giving $O(n^{2/r})$ blocks
◦ Level j (with the top level as j = 1): $O(n^{(j+1)/r})$ blocks
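One way to read the base-$n^{1/r}$ interpretation is sketched below for integer ranks. This is an illustrative reconstruction rather than Bentley and Maurer's own formulation: the function name `decompose` and its parameters `base` and `levels` are assumptions of the example, with `base` playing the role of $n^{1/r}$ and `levels` the role of r. Levels are counted here from the finest (0) to the coarsest (`levels - 1`). The interval is cut at the coarsest digit boundary it spans, the aligned middle part becomes one block, and the two remainders are handled on the next finer level.

```python
def decompose(lo, hi, base, levels):
    """Split [lo, hi[ with 0 <= lo <= hi <= base**levels into aligned blocks.

    A block on level j has the form [a * base**j, b * base**j[ and lies
    inside a single aligned cell of width base**(j + 1).
    """
    blocks = []

    def rec(lo, hi, j):
        if lo >= hi:
            return
        w = base ** j
        a = -(-lo // w) * w     # lo rounded up to a multiple of w
        b = (hi // w) * w       # hi rounded down to a multiple of w
        if a > b:
            # [lo, hi[ lies strictly inside one cell of width w: go finer.
            rec(lo, hi, j - 1)
            return
        if a < b:
            blocks.append((a, b))   # aligned middle part, one block on level j
        rec(lo, a, j - 1)           # left remainder, inside one cell of width w
        rec(b, hi, j - 1)           # right remainder, inside one cell of width w
    rec(lo, hi, levels - 1)
    return blocks
```

With `base = 10` and `levels = 4`, the interval [37, 4219[ splits into seven blocks, which matches the intuition of at most 2r - 1 blocks per query interval in this reading.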
21
Using r-level blocking we obtain a structure to perform d-dimensional orthogonal range searching
◦ Query time: $O(r^d \log n + k)$
◦ Preprocessing time: $O(r^d n^{1+(2d-2)/r} \log n)$
◦ Query time is output sensitive for large r and n
22
The range-counting problem asks just for the number of points in a range
◦ We do not need output-sensitive time complexity
Use an orthogonal range tree
◦ Instead of concatenating lists, just add up numbers
◦ Generalization: give weights to the points
In the 1-dimensional version, just ask for the number of keys in an interval
[Figure: 1-d search tree whose nodes are labeled with counts]
23
All operations in $O(n (\log n)^d)$ for a set of n points
Difference from range searching
◦ Range counting allows a dynamic structure: insertion, deletion and rebalancing
◦ Range searching has large associated trees at the nodes
◦ Smaller bounds for the operations are possible: $O((\log n)^d)$
In the semigroup version
◦ A commutative semigroup (S, +) is specified
◦ Each point is assigned a weight from S
◦ Return the semigroup sum of the weights of the keys in an interval
◦ The answer follows directly from the canonical interval decomposition
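A minimal 1-d sketch of the semigroup version, assuming a static key set: the class name `SemigroupRangeTree` and its layout are invented for the example. Every node stores the semigroup sum of the weights below it, and a query combines the sums of the canonical nodes instead of concatenating lists. Since a semigroup need not have a neutral element, the sketch returns `None` when no key lies in the interval.

```python
from bisect import bisect_left

class SemigroupRangeTree:
    def __init__(self, items, op):
        """items: (key, weight) pairs; op: commutative, associative operation."""
        items = sorted(items, key=lambda kw: kw[0])
        self.keys = [k for k, _ in items]
        self.op = op
        self.n = len(items)
        self.sums = {}                      # node id -> semigroup sum below it
        if self.n:
            self._build(1, 0, self.n, [w for _, w in items])

    def _build(self, node, lo, hi, w):
        if hi - lo == 1:
            self.sums[node] = w[lo]
        else:
            mid = (lo + hi) // 2
            self.sums[node] = self.op(self._build(2 * node, lo, mid, w),
                                      self._build(2 * node + 1, mid, hi, w))
        return self.sums[node]

    def query(self, a, b):
        """Semigroup sum of the weights of all keys in [a, b[, or None."""
        i, j = bisect_left(self.keys, a), bisect_left(self.keys, b)
        return self._query(1, 0, self.n, i, j) if i < j else None

    def _query(self, node, lo, hi, i, j):
        if i <= lo and hi <= j:             # canonical node: use stored sum
            return self.sums[node]
        mid = (lo + hi) // 2
        if j <= mid:
            return self._query(2 * node, lo, mid, i, j)
        if i >= mid:
            return self._query(2 * node + 1, mid, hi, i, j)
        return self.op(self._query(2 * node, lo, mid, i, j),
                       self._query(2 * node + 1, mid, hi, i, j))
```

Plain range counting is the special case where every weight is 1 and `op` is integer addition; using `max` as `op` gives a range-maximum query on the same structure.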
24
Another structure to support orthogonal range searching
◦ Easy to understand and implement
◦ Unsatisfactory performance
◦ 2-dimensional query time: KD-Tree $O(n^{1/2} + k)$, orthogonal range tree $O((\log n)^2 + k)$
◦ d-dimensional query time: KD-Tree $O(n^{1-1/d} + k)$, orthogonal range tree $O((\log n)^d + k)$
25
In each node, make a comparison to enter the left or right subtree
◦ At different levels, compare against different coordinates
◦ At the root compare against x, at the second level against y, and so on
26
Building KD-Tree (figure slide)
27
Building KD-Tree (figure slide, continued)
28
KD-Tree range query
◦ Starting at the root, descend into each node whose node interval intersects the query region
◦ Stop a branch when the intersection is empty
Time complexity can be as large as $\Omega(\sqrt{n})$ in two dimensions
◦ Even in a completely balanced tree with distinct keys
◦ This bound cannot be improved
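The alternating comparisons and the pruned descent can be sketched as follows. The function names `build_kd` and `range_query` and the dictionary node layout are assumptions of this example, which stores one point in every node and splits at the median. It re-sorts at each level for brevity, so its construction is slower than the building time quoted for KD-Trees.

```python
def build_kd(points, depth=0, d=2):
    """Build a KD-tree; level `depth` compares against coordinate depth % d."""
    if not points:
        return None
    axis = depth % d
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {
        "point": pts[mid],
        "axis": axis,
        "left": build_kd(pts[:mid], depth + 1, d),       # coordinates <= split
        "right": build_kd(pts[mid + 1:], depth + 1, d),  # coordinates >= split
    }

def range_query(node, lo, hi, out=None):
    """Report all points p with lo[i] <= p[i] < hi[i] for every coordinate i."""
    if out is None:
        out = []
    if node is None:
        return out
    p, axis = node["point"], node["axis"]
    if all(l <= c < h for l, c, h in zip(lo, p, hi)):
        out.append(p)
    # Descend only into sides that can still intersect the query box.
    if lo[axis] <= p[axis]:
        range_query(node["left"], lo, hi, out)
    if p[axis] < hi[axis]:
        range_query(node["right"], lo, hi, out)
    return out
```

A call such as `range_query(build_kd(points), (1, 3), (8, 8))` reports the points in the box [1, 8[ x [3, 8[.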
29
Theorem: KD-Trees are a static data structure that supports d-dimensional orthogonal range queries in a set of n d-dimensional points
◦ Output-sensitive query time: $O(n^{1-1/d} + k)$ if the output consists of k points
◦ Can be built in $O(n \log n)$ time
◦ Needs $O(n)$ space
30
Thank you for your attention