
1 Spatial Indexing SAMs

2 Spatial Indexing: Point Access Methods (PAMs) can index only points. What about regions? Options: Z-ordering and quadtrees; the transformation technique combined with a PAM; or new methods, Spatial Access Methods (SAMs): the R-tree and its variations.

3 Problem: Given a collection of geometric objects (points, lines, polygons, ...), organize them on disk to answer spatial queries (range, nearest-neighbor, etc.).

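4 Transformation Technique: Map a d-dimensional MBR into a point in a space of 2d dimensions (twice the original dimensionality), e.g. for d = 2: [(x_min, x_max), (y_min, y_max)] => (x_min, x_max, y_min, y_max). Use a PAM to index the resulting 2d-dimensional points. Given a range query, map the query into the same 2d-dimensional space and use the PAM to answer it.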
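A minimal sketch of the transformation technique for d = 2, assuming axis-aligned rectangles. The names `MBR`, `mbr_to_point`, `query_to_region` and `in_region` are illustrative, and a linear scan stands in for the PAM. An MBR intersects the query exactly when its 4-d point falls inside the half-open box built from the query.

```python
from typing import NamedTuple


class MBR(NamedTuple):
    x_min: float
    x_max: float
    y_min: float
    y_max: float


def mbr_to_point(r: MBR):
    """Map a 2-d MBR to a point in 4-d space."""
    return (r.x_min, r.x_max, r.y_min, r.y_max)


def query_to_region(q: MBR):
    """Per-coordinate (low, high) bounds of the 4-d box that contains
    the image of every MBR intersecting the query rectangle q."""
    inf = float("inf")
    return (
        (-inf, q.x_max),  # x_min of the MBR must not lie right of the query
        (q.x_min, inf),   # x_max of the MBR must not lie left of the query
        (-inf, q.y_max),  # same two conditions on the y axis
        (q.y_min, inf),
    )


def in_region(point, region) -> bool:
    return all(lo <= c <= hi for c, (lo, hi) in zip(point, region))


# Made-up rectangles; a real system would hand the 4-d points to a PAM.
rects = {"A": MBR(0, 2, 0, 2), "B": MBR(5, 6, 5, 6), "C": MBR(1, 4, 1, 3)}
query = MBR(3, 5, 2, 5)
region = query_to_region(query)
hits = sorted(n for n, r in rects.items() if in_region(mbr_to_point(r), region))
print(hits)
```

The query box is unbounded on one side in every coordinate, which is one reason this mapping can be awkward for a PAM tuned to small, bounded range queries.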

5 R-tree [Guttman 84]: Main idea: allow parents to overlap! => guaranteed 50% utilization => easier insertion/split algorithms. (Deals only with Minimum Bounding Rectangles, MBRs.)

6 R-tree: A multi-way external memory tree. Index nodes and data (leaf) nodes. All leaf nodes appear on the same level. Every node contains between m and M entries. The root node has at least 2 entries (children), unless it is itself a leaf.

7 Example: e.g., with fanout 4: group nearby rectangles into parent MBRs; each group -> one disk page. [Figure: data rectangles A-J]

8 Example (F=4): [Figure: data rectangles A-J grouped into parent MBRs P1-P4, with the corresponding tree of leaf pages]

9 Example (F=4, m=2, M=4): [Figure: data rectangles A-J under parent MBRs P1-P4, which are grouped in turn under the higher-level MBRs P5 and P6]

10 R-trees - format of nodes: {(MBR; obj_ptr)} for leaf nodes. Each entry stores x-low, x-high; y-low, y-high; ...; obj ptr. [Figure: root node P1 P2 P3 P4 above the leaf node A B C]

11 R-trees - format of nodes: {(MBR; node_ptr)} for non-leaf nodes. Each entry stores x-low, x-high; y-low, y-high; ...; node ptr. [Figure: root node P1 P2 P3 P4 above the leaf node A B C]
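The two node formats above can be sketched with one entry type whose pointer is either an object pointer (leaf) or a node pointer (non-leaf). This is only an in-memory illustration: the class names and the `cover` helper are assumptions, the coordinates are made up, and the on-disk page layout is ignored.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class MBR:
    x_low: float
    x_high: float
    y_low: float
    y_high: float


@dataclass
class Entry:
    mbr: MBR
    ptr: Any  # obj_ptr in a leaf node, node_ptr in a non-leaf node


@dataclass
class Node:
    is_leaf: bool
    entries: List[Entry] = field(default_factory=list)

    def cover(self) -> MBR:
        """Smallest rectangle enclosing every entry of this node."""
        return MBR(
            min(e.mbr.x_low for e in self.entries),
            max(e.mbr.x_high for e in self.entries),
            min(e.mbr.y_low for e in self.entries),
            max(e.mbr.y_high for e in self.entries),
        )


# Leaf P1 holds data rectangles A, B, C (coordinates are made up).
p1 = Node(is_leaf=True, entries=[
    Entry(MBR(0, 2, 0, 1), "A"),
    Entry(MBR(1, 3, 2, 4), "B"),
    Entry(MBR(4, 5, 1, 2), "C"),
])
# The parent stores P1's covering MBR together with a pointer to P1.
root = Node(is_leaf=False, entries=[Entry(p1.cover(), p1)])
print(root.entries[0].mbr)
```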

12 R-trees: Search. [Figure: data rectangles A-J under parent MBRs P1-P4 and P5-P6, with the corresponding tree]

13 R-trees: Search. [Figure: the same tree and rectangles A-J, P1-P4, P5-P6]

14 R-trees: Search. Main points: every parent node completely covers its ‘children’; a child MBR may be covered by more than one parent, but it is stored under ONLY ONE of them (i.e., no need for duplicate elimination); a point query may follow multiple branches; everything works for any(?) dimensionality.
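A hedged sketch of the search just described, assuming rectangles are stored as `(x_low, x_high, y_low, y_high)` tuples and nodes as a small illustrative `Node` class (neither comes from the slides). The example uses two overlapping parent MBRs so that a point query has to follow both branches.

```python
from dataclasses import dataclass, field


def intersects(a, b):
    """True if rectangles a and b (x_low, x_high, y_low, y_high) overlap."""
    return a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]


@dataclass
class Node:
    is_leaf: bool
    # Each entry is (mbr, ptr): ptr is an object id in a leaf, a child Node otherwise.
    entries: list = field(default_factory=list)


def search(node, query, out=None):
    """Collect the ids of all data rectangles that intersect the query."""
    out = [] if out is None else out
    for mbr, ptr in node.entries:
        if intersects(mbr, query):
            if node.is_leaf:
                out.append(ptr)
            else:
                search(ptr, query, out)  # descend into every qualifying child
    return out


leaf1 = Node(True, [((0, 2, 0, 2), "A"), ((1, 3, 1, 3), "B")])
leaf2 = Node(True, [((2, 5, 2, 5), "C"), ((6, 7, 6, 7), "D")])
root = Node(False, [((0, 3, 0, 3), leaf1), ((2, 7, 2, 7), leaf2)])

# A point query is a degenerate rectangle; it lies inside both parent MBRs.
print(search(root, (2.5, 2.5, 2.5, 2.5)))
```

Each data rectangle is stored under exactly one parent, so the result needs no duplicate elimination even though two branches were visited.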

15 R-trees: Insertion. Insert X. [Figure: data rectangles A-J under parent MBRs P1-P4, with the new rectangle X]

16 R-trees: Insertion. Insert Y. [Figure: data rectangles A-J under parent MBRs P1-P4, with the new rectangle Y]

17 R-trees: Insertion. Extend the parent MBR. [Figure: a parent MBR enlarged to include Y]

18 R-trees: Insertion. How to find the next node to insert the new object? Using ChooseLeaf: find the entry that needs the least enlargement to include Y; resolve ties by choosing the entry with the smallest area. Other methods (later).
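One step of ChooseLeaf can be sketched as below, assuming rectangles are `(x_low, x_high, y_low, y_high)` tuples; the helper names and the example coordinates are illustrative. A full ChooseLeaf would repeat this choice at every level until it reaches a leaf.

```python
def area(r):
    return (r[1] - r[0]) * (r[3] - r[2])


def union(a, b):
    """Smallest rectangle enclosing both a and b."""
    return (min(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), max(a[3], b[3]))


def enlargement(mbr, new):
    """Extra area needed for mbr to also cover new."""
    return area(union(mbr, new)) - area(mbr)


def choose_entry(entries, new):
    """Pick the (mbr, name) entry with the least enlargement, ties by smallest area."""
    return min(entries, key=lambda e: (enlargement(e[0], new), area(e[0])))


entries = [((0, 4, 0, 4), "P1"), ((5, 7, 5, 7), "P2"), ((0, 10, 0, 10), "P3")]

# Y already lies inside both P1 and P3 (enlargement 0), so the smaller area wins.
print(choose_entry(entries, (1, 2, 1, 2))[1])
# With only P1 and P2 available, P2 needs far less enlargement.
print(choose_entry(entries[:2], (6, 8, 6, 8))[1])
```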

19 R-trees: Insertion. If the node is full, then split it. Ex.: insert W. [Figure: data rectangles A-K under parent MBRs P1-P4, with the new rectangle W]

20 R-trees: Insertion. If the node is full, then split it. Ex.: insert W. [Figure: the overflowing node is split into P1 and a new node P5, and the split propagates upward to the new nodes Q1 and Q2]

21 R-trees: Split. Split node P1: partition the MBRs (A, B, C, W, K) into two groups. (A1: plane sweep, until 50% of rectangles.) A2: ‘linear’ split. A3: quadratic split. A4: exponential split: 2^(M-1) choices.

22 R-trees: Split. Pick two rectangles as ‘seeds’; assign each rectangle ‘R’ to the ‘closest’ ‘seed’. [Figure: seed1, seed2, and a rectangle R]

23 R-trees: Split. Pick two rectangles as ‘seeds’; assign each rectangle ‘R’ to the ‘closest’ ‘seed’, where ‘closest’ means the smallest increase in area. [Figure: seed1, seed2, and a rectangle R]

24 R-trees: Split. How to pick seeds. Linear: in each dimension, find the rectangle with the highest low side and the one with the lowest high side, normalize the separations, and choose the pair with the greatest normalized separation. Quadratic: for each pair E1 and E2, calculate the rectangle J = MBR(E1, E2) and d = area(J) - area(E1) - area(E2); choose the pair with the largest d.
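A simplified sketch of the quadratic seed choice followed by the "assign to the closest seed" step, assuming rectangles are `(x_low, x_high, y_low, y_high)` tuples. It is not Guttman's full algorithm: it visits the remaining rectangles in input order and does not enforce the minimum fill m per group. All function names are illustrative.

```python
from itertools import combinations


def area(r):
    return (r[1] - r[0]) * (r[3] - r[2])


def union(a, b):
    return (min(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), max(a[3], b[3]))


def pick_seeds_quadratic(rects):
    """Return the index pair (i, j) that maximizes d = area(J) - area(E1) - area(E2)."""
    def waste(pair):
        i, j = pair
        return area(union(rects[i], rects[j])) - area(rects[i]) - area(rects[j])
    return max(combinations(range(len(rects)), 2), key=waste)


def split(rects):
    """Partition rectangle indices into two groups around the two seeds."""
    s1, s2 = pick_seeds_quadratic(rects)
    groups = [[s1], [s2]]
    mbrs = [rects[s1], rects[s2]]
    for i, r in enumerate(rects):
        if i in (s1, s2):
            continue
        # 'Closest' seed: the group whose MBR grows least when r is added.
        growth = [area(union(m, r)) - area(m) for m in mbrs]
        g = 0 if growth[0] <= growth[1] else 1
        groups[g].append(i)
        mbrs[g] = union(mbrs[g], r)
    return groups


rects = [(0, 1, 0, 1), (1, 2, 0, 1), (0, 1, 1, 2), (8, 9, 8, 9), (9, 10, 9, 10)]
print(pick_seeds_quadratic(rects))
print(split(rects))
```

The two seeds are the rectangles that would waste the most area if stored together, so the remaining rectangles end up clustered around two far-apart corners.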

25 R-trees: Insertion. Use ChooseLeaf to find the leaf node in which to insert an entry E. If the leaf node is full, then split it; otherwise insert E there. Propagate the split upwards, if necessary. Adjust the parent nodes.

26 R-trees: Deletion. Find the leaf node that contains the entry E. Remove E from this node. If underflow: eliminate the node by removing the node's entries and the parent entry, then reinsert the orphaned (other) entries into the tree using Insert. Other method (later).

27 R-trees: Variations. R+-tree: do not allow overlapping, so split the objects (similar to z-values). R*-tree: change the insertion and deletion algorithms (minimize not only area but also perimeter; forced re-insertion). Hilbert R-tree: use the Hilbert values to insert objects into the tree.

28 Spatial Access Methods. PAMs: Grid File; kd-tree based (LSD-, hB-trees); Z-ordering + B+-tree. R-tree. Variations: R*-tree, Hilbert R-tree.

29 R-tree: Multi-way external memory structure; indexes MBRs. Dynamic structure. [Figure: data rectangles A-J under parent MBRs P1-P4, with the corresponding tree]

30 R-tree: The original R-tree tries to minimize the area of each enclosing rectangle in the index nodes. Is there any other property that can be optimized? R*-tree: Yes!

31 R*-tree: Optimization criteria: (O1) area covered by an index MBR; (O2) overlap between directory MBRs; (O3) margin of a directory rectangle; (O4) storage utilization. Sometimes it is impossible to optimize all of the above criteria at the same time!

32 R*-tree: ChooseSubtree: If the next node is a leaf node, choose the node using the following criteria, in order (each later criterion breaks ties in the earlier one): least overlap enlargement; least area enlargement; smaller area. Else: least area enlargement; smaller area.
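A rough sketch of the ChooseSubtree criteria, assuming rectangles are `(x_low, x_high, y_low, y_high)` tuples and that "overlap enlargement" means the increase in total overlap area with the sibling MBRs. The function names and the example coordinates are assumptions for illustration.

```python
def area(r):
    return (r[1] - r[0]) * (r[3] - r[2])


def union(a, b):
    return (min(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), max(a[3], b[3]))


def overlap(a, b):
    """Area of the intersection of a and b (0 if they do not overlap)."""
    w = min(a[1], b[1]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[2], b[2])
    return w * h if w > 0 and h > 0 else 0


def overlap_enlargement(mbrs, k, new):
    """Increase in overlap with the siblings if mbrs[k] grows to cover new."""
    grown = union(mbrs[k], new)
    others = [m for i, m in enumerate(mbrs) if i != k]
    return sum(overlap(grown, m) for m in others) - sum(overlap(mbrs[k], m) for m in others)


def choose_subtree(mbrs, new, children_are_leaves):
    """Return the index of the entry to descend into."""
    def key(k):
        base = (area(union(mbrs[k], new)) - area(mbrs[k]), area(mbrs[k]))
        if children_are_leaves:
            return (overlap_enlargement(mbrs, k, new),) + base
        return base
    return min(range(len(mbrs)), key=key)


mbrs = [(0, 4, 0, 4), (5, 15, 3, 4)]
new = (5, 6, 1, 2)
print(choose_subtree(mbrs, new, children_are_leaves=True))
print(choose_subtree(mbrs, new, children_are_leaves=False))
```

In this example the first MBR needs less extra area, but growing it would create overlap with the second one, so the two rules pick different entries.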

33 R*-tree: SplitNode: (1) choose the axis to split; (2) choose the two groups along the chosen axis. ChooseSplitAxis: along each axis, sort the rectangles and break them into two groups (M-2m+2 possible ways, where each group contains at least m rectangles); compute the sum S of the margin-values (perimeters) over all these pairs of groups; choose the axis that minimizes S. ChooseSplitIndex: along the chosen axis, choose the grouping that gives the minimum overlap-value.
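The two split steps can be sketched as follows, assuming `(x_low, x_high, y_low, y_high)` tuples and an overflowing node of M+1 rectangles. Two details are not on the slide and are assumed here: each axis is sorted twice (by lower and by upper value), and ties in overlap are broken by the smaller total area. All names are illustrative.

```python
def area(r):
    return (r[1] - r[0]) * (r[3] - r[2])


def margin(r):
    """Perimeter of the rectangle."""
    return 2 * ((r[1] - r[0]) + (r[3] - r[2]))


def mbr_of(rects):
    return (min(r[0] for r in rects), max(r[1] for r in rects),
            min(r[2] for r in rects), max(r[3] for r in rects))


def overlap(a, b):
    w = min(a[1], b[1]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[2], b[2])
    return w * h if w > 0 and h > 0 else 0


def distributions(rects, axis, m):
    """Yield every (group1, group2) with at least m rectangles per group,
    for the sort by lower value and the sort by upper value on this axis."""
    lo, hi = 2 * axis, 2 * axis + 1
    for key in (lambda r: (r[lo], r[hi]), lambda r: (r[hi], r[lo])):
        s = sorted(rects, key=key)
        for k in range(m, len(s) - m + 1):  # M-2m+2 cut points for M+1 rectangles
            yield s[:k], s[k:]


def rstar_split(rects, m):
    def margin_sum(axis):  # ChooseSplitAxis: S for this axis
        return sum(margin(mbr_of(g1)) + margin(mbr_of(g2))
                   for g1, g2 in distributions(rects, axis, m))
    axis = min((0, 1), key=margin_sum)

    def cost(groups):  # ChooseSplitIndex: least overlap, then least area
        b1, b2 = mbr_of(groups[0]), mbr_of(groups[1])
        return (overlap(b1, b2), area(b1) + area(b2))
    return axis, min(distributions(rects, axis, m), key=cost)


rects = [(0, 1, 0, 1), (1, 2, 3, 4), (2, 3, 1, 2), (7, 8, 4, 5), (8, 9, 2, 3)]
axis, (g1, g2) = rstar_split(rects, m=2)
print(axis, g1, g2)
```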

34 R*-tree: Forced Reinsert: defer splits by forced reinsertion, i.e., instead of splitting, temporarily delete some entries, shrink the overflowing MBR, and re-insert those entries. Which ones to re-insert? How many? A: 30%.
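A small sketch of the selection step of forced reinsert. The slide gives only the fraction (30%); the rule used here for which entries to remove, those whose centers lie farthest from the center of the node's MBR, is an assumption borrowed from the usual R*-tree description. `pick_reinsert` and its `percent` parameter are illustrative names.

```python
import math


def mbr_of(rects):
    return (min(r[0] for r in rects), max(r[1] for r in rects),
            min(r[2] for r in rects), max(r[3] for r in rects))


def center(r):
    return ((r[0] + r[1]) / 2, (r[2] + r[3]) / 2)


def pick_reinsert(rects, percent=30):
    """Split an overflowing node's rectangles into (to_reinsert, to_keep)."""
    cx, cy = center(mbr_of(rects))

    def dist(r):
        x, y = center(r)
        return math.hypot(x - cx, y - cy)

    p = max(1, len(rects) * percent // 100)  # how many entries to remove
    ranked = sorted(rects, key=dist, reverse=True)  # farthest first
    return ranked[:p], ranked[p:]


# Made-up overflowing node: one outlier plus a tight cluster.
rects = [(0, 1, 0, 1), (5, 6, 5, 6), (6, 7, 5, 6), (5, 6, 6, 7), (6, 8, 6, 7)]
removed, kept = pick_reinsert(rects)
print(removed)
print(mbr_of(rects), "->", mbr_of(kept))
```

Removing the outlier lets the node's MBR shrink considerably; the removed entries would then go through the ordinary insertion procedure again.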

35 R-tree: variations. What about static datasets (no ins/del)? Hilbert. What about other bounding shapes?

36 R-trees - variations what about static datasets (no ins/del/upd)? Q: Best way to pack points?

37 R-trees - variations what about static datasets (no ins/del/upd)? Q: Best way to pack points? A1: plane-sweep great for queries on ‘x’; terrible for ‘y’

38 R-trees - variations what about static datasets (no ins/del/upd)? Q: Best way to pack points? A1: plane-sweep great for queries on ‘x’; bad for ‘y’

39 R-trees - variations what about static datasets (no ins/del/upd)? Q: Best way to pack points? A1: plane-sweep great for queries on ‘x’; terrible for ‘y’ Q: how to improve?

40 R-trees - variations A: plane-sweep on HILBERT curve!

41 R-trees - variations A: plane-sweep on HILBERT curve! In fact, it can be made dynamic (how?), and can also be made to handle regions (how?)
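The static packing idea can be sketched as: compute each point's position along the Hilbert curve, sort by it, and cut the sorted list into pages. The sketch assumes points with integer coordinates on a 2^order x 2^order grid; `hilbert_value` follows the widely used iterative bit-manipulation formulation, and `hilbert_pack` and `capacity` are illustrative names.

```python
def hilbert_value(order, x, y):
    """Position of grid cell (x, y), 0 <= x, y < 2**order, along the Hilbert curve."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def hilbert_pack(points, order, capacity):
    """Sort points by Hilbert value and cut them into leaf pages of `capacity`.
    Returns a list of (mbr, page) with mbr = (x_low, x_high, y_low, y_high)."""
    ordered = sorted(points, key=lambda p: hilbert_value(order, p[0], p[1]))
    pages = [ordered[i:i + capacity] for i in range(0, len(ordered), capacity)]
    return [((min(x for x, _ in pg), max(x for x, _ in pg),
              min(y for _, y in pg), max(y for _, y in pg)), pg) for pg in pages]


# All 16 cells of a 4x4 grid, 4 points per page.
points = [(x, y) for x in range(4) for y in range(4)]
leaves = hilbert_pack(points, order=2, capacity=4)
print([mbr for mbr, _ in leaves])
```

Here every page covers one compact 2x2 quadrant, whereas a plane sweep on x would produce four tall 1x4 strips.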

42 R-trees - variations: Dynamic (‘Hilbert R-tree’): each point has an ‘h’-value (Hilbert value); insertions: like a B-tree on the h-value, but also store the MBR, for searches.

43 Hilbert R-tree: Data structure of a node? Each entry: LHV (largest Hilbert value); x-low, y-low; x-high, y-high; ptr. For the subtree under ptr: h-value <= LHV, and MBRs: inside parent MBR.

44 R-trees - variations: Data structure of a node? Each entry: LHV (largest Hilbert value); x-low, y-low; x-high, y-high; ptr. For the subtree under ptr: h-value <= LHV (~ B-tree), and MBRs: inside parent MBR.

45 R-trees - variations: Data structure of a node? Each entry: LHV (largest Hilbert value); x-low, y-low; x-high, y-high; ptr. For the subtree under ptr: h-value <= LHV, and MBRs: inside parent MBR (~ R-tree).

