Download presentation
Presentation is loading. Please wait.
1
Spatial Indexing SAMs
2
Spatial Access Methods PAMs Grid File kd-tree based (LSD-, hB- trees) Z-ordering + B+-tree R-tree Variations: R*-tree, Hilbert R-tree
3
R-tree A B C D E F G H I J P1 P2 P3 P4 P1P2P3P4 FGDEHIJABC Multi-way external memory structure, indexes MBRs Dynamic structure
4
R-tree: properties Main points: every parent node completely covers its ‘children’ a child MBR may be covered by more than one parent - it is stored under ONLY ONE of them. (ie., no need for dup. elim.) a point query may follow multiple branches.
5
R-tree The original R-tree tries to minimize the area of each enclosing rectangle in the index nodes. Is there any other property that can be optimized? R*-tree Yes!
6
R*-tree Optimization Criteria: (O1) Area covered by an index MBR (O2) Overlap between directory MBRs (O3) Margin of a directory rectangle (O4)Storage utilization Sometimes it is impossible to optimize all the above criteria at the same time!
7
R*-tree ChooseSubtree: If next node is a leaf node, choose the node using the following criteria: Least overlap enlargement Least area enlargement Smaller area Else Least area enlargement Smaller area
8
R*-tree SplitNode Choose the axis to split Choose the two groups along the chosen axis ChooseSplitAxis Along each axis, sort rectangles and break them into two groups (M-2m+2 possible ways where one group contains at least m rectangles). Compute the sum S of all margin-values of each pair of groups. Choose the one that minimizes S ChooseSplitIndex Along the chosen axis, choose the grouping that gives the minimum overlap-value
9
R*-tree Forced Reinsert: defer splits, by forced-reinsert, i.e.: instead of splitting, temporarily delete some entries, shrink overflowing MBR, and re- insert those entries Which ones to re-insert? How many? A: 30%
10
R-tree: variations What about static datasets? (no ins/del) Hilbert What about other bounding shapes?
11
R-trees - variations what about static datasets (no ins/del/upd)? Q: Best way to pack points?
12
R-trees - variations what about static datasets (no ins/del/upd)? Q: Best way to pack points? A1: plane-sweep great for queries on ‘x’; terrible for ‘y’
13
R-trees - variations what about static datasets (no ins/del/upd)? Q: Best way to pack points? A1: plane-sweep great for queries on ‘x’; bad for ‘y’
14
R-trees - variations what about static datasets (no ins/del/upd)? Q: Best way to pack points? A1: plane-sweep great for queries on ‘x’; terrible for ‘y’ Q: how to improve?
15
R-trees - variations A: plane-sweep on HILBERT curve!
16
R-trees - variations A: plane-sweep on HILBERT curve! In fact, it can be made dynamic (how?), as well as to handle regions (how?)
17
R-trees - variations Dynamic (‘Hilbert R- tree): each point has an ‘h’- value (hilbert value) insertions: like a B-tree on the h-value but also store MBR, for searches
18
Hilbert R-tree Data structure of a node? LHV x-low, ylow x-high, y-high ptr h-value >= LHV & MBRs: inside parent MBR
19
R-trees - variations Data structure of a node? LHV x-low, ylow x-high, y-high ptr h-value >= LHV & MBRs: inside parent MBR ~B-tree
20
R-trees - variations Data structure of a node? LHV x-low, ylow x-high, y-high ptr h-value >= LHV & MBRs: inside parent MBR ~ R-tree
21
R-trees - variations What if we have regions, instead of points? I.e., how to impose a linear ordering (‘h- value’) on rectangles?
22
R-trees - variations What if we have regions, instead of points? I.e., how to impose a linear ordering (‘h-value’) on rectangles? A1: h-value of center A2: h-value of 4-d point (center, x-radius, y-radius) A3:...
23
R-trees - variations What if we have regions, instead of points? I.e., how to impose a linear ordering (‘h-value’) on rectangles? A1: h-value of center A2: h-value of 4-d point (center, x-radius, y-radius) A3:...
24
R-trees - variations with h-values, we can have deferred splits, 2-to-3 splits (3-to-4, etc) Instead of splitting a full node, find the siblings (using the h-values) and redistribute the rectangles among the nodes. Split only when all siblings are full. experimentally: faster than R*-trees (reference: [Kamel Faloutsos vldb 94])
25
R-trees - variations what about other bounding shapes? (and why?) A1: arbitrary-orientation lines (cell-tree, [Guenther] A2: P-trees (polygon trees) (MB polygon: 0, 90, 45, 135 degree lines)
26
R-trees - variations A3: L-shapes; holes (hB-tree) A4: TV-trees [Lin+, VLDB-Journal 1994] A5: SR-trees [Katayama+, SIGMOD97] (used in Informedia)
27
R-trees - conclusions Popular method; like multi-d B-trees guaranteed utilization good search times (for low-dim. at least) Informix ships DataBlade with R-trees
28
Spatial Queries Given a collection of geometric objects (points, lines, polygons,...) organize them on disk, to answer point queries range queries k-nn queries spatial joins (‘all pairs’ queries)
29
Spatial Queries Given a collection of geometric objects (points, lines, polygons,...) organize them on disk, to answer point queries range queries k-nn queries spatial joins (‘all pairs’ queries)
30
Spatial Queries Given a collection of geometric objects (points, lines, polygons,...) organize them on disk, to answer point queries range queries k-nn queries spatial joins (‘all pairs’ queries)
31
Spatial Queries Given a collection of geometric objects (points, lines, polygons,...) organize them on disk, to answer point queries range queries k-nn queries spatial joins (‘all pairs’ queries)
32
Spatial Queries Given a collection of geometric objects (points, lines, polygons,...) organize them on disk, to answer point queries range queries k-nn queries spatial joins (‘all pairs’ queries)
33
R-trees - Range search pseudocode: check the root for each branch, if its MBR intersects the query rectangle apply range-search (or print out, if this is a leaf)
34
R-trees - NN search A B C D E F G H I J P1 P2 P3 P4 q
35
R-trees - NN search Q: How? (find near neighbor; refine...) A B C D E F G H I J P1 P2 P3 P4 q
36
R-trees - NN search A1: depth-first search; then, range query A B C D E F G H I J P1 P2 P3 P4 q
37
R-trees - NN search A1: depth-first search; then, range query A B C D E F G H I J P1 P2 P3 P4 q
38
R-trees - NN search A1: depth-first search; then, range query A B C D E F G H I J P1 P2 P3 P4 q
39
R-trees - NN search A2: [Roussopoulos+, sigmod95]: priority queue, with promising MBRs, and their best and worst-case distance main idea: Every face of any MBR contains at least one point of an actual spatial object!
40
R-trees - NN search A B C D E F G H I J P1 P2 P3 P4 q consider only P2 and P4, for illustration
41
R-trees - NN search D E H J P2 P4 q worst of P2 best of P4 => P4 is useless for 1-nn
42
R-trees - NN search D E P2 q worst of P2 what is really the worst of, say, P2?
43
R-trees - NN search P2 q what is really the worst of, say, P2? A: the smallest of the two red segments!
44
R-trees - NN search variations: [Hjaltason & Samet] incremental nn: build a priority queue scan enough of the tree, to make sure you have the k nn to find the (k+1)-th, check the queue, and scan some more of the tree ‘optimal’ (but, may need too much memory)
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.