R-Trees Extension of B+-trees.  Collection of d-dimensional rectangles.  A point in d-dimensions is a trivial rectangle.

1 R-Trees Extension of B+-trees.  Collection of d-dimensional rectangles.  A point in d-dimensions is a trivial rectangle.

2 Non-rectangular Data Non-rectangular data may be represented by minimum bounding rectangles (MBRs).
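As a minimal sketch of the MBR idea, the snippet below computes the bounding rectangle of a 2-D shape given by its vertices. The `(xmin, ymin, xmax, ymax)` tuple layout and the name `mbr_of_points` are assumptions made for illustration; they do not come from the slides.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def mbr_of_points(points: Iterable[Point]) -> Rect:
    """Return the smallest axis-aligned rectangle containing every point."""
    pts = list(points)
    if not pts:
        raise ValueError("need at least one point")
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (min(xs), min(ys), max(xs), max(ys))


# A triangle is represented in the tree by its bounding rectangle.
triangle = [(1.0, 1.0), (4.0, 2.0), (2.0, 5.0)]
print(mbr_of_points(triangle))      # (1.0, 1.0, 4.0, 5.0)

# A single point yields a degenerate (zero-area) rectangle.
print(mbr_of_points([(3.0, 3.0)]))  # (3.0, 3.0, 3.0, 3.0)
```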

3 Operations Insert; delete; find all rectangles that intersect a query rectangle. Good for large rectangle collections stored on disk.

4 R-Trees—Structure Data nodes (leaves) contain rectangles. Index nodes (non-leaves) contain MBRs for the data in their subtrees. The MBR of the rectangles or MBRs in a non-root node is stored in its parent node.

5 R-Trees—Structure R-tree of order M.  Each node other than the root has between m and M rectangles/MBRs, where m <= ceil(M/2). Typically m = ceil(M/2); assume this henceforth.  Root has between 2 and M rectangles/MBRs.  Each index node has as many MBRs as children.  All data nodes are at the same level.
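The occupancy rules above can be written down as a small check. This is only a sketch with m fixed at ceil(M/2), as the slide assumes; the function names `entry_bounds` and `is_valid_count` are hypothetical.

```python
import math
from typing import Tuple


def entry_bounds(M: int, is_root: bool = False) -> Tuple[int, int]:
    """Allowed (min, max) number of rectangles/MBRs in a node of an order-M tree."""
    m = math.ceil(M / 2)  # assumed minimum fill, m = ceil(M/2)
    return (2, M) if is_root else (m, M)


def is_valid_count(count: int, M: int, is_root: bool = False) -> bool:
    """True if a node holding `count` entries respects the bounds."""
    lo, hi = entry_bounds(M, is_root)
    return lo <= count <= hi


print(entry_bounds(4))                # (2, 4)
print(entry_bounds(8))                # (4, 8)
print(entry_bounds(8, is_root=True))  # (2, 8)
```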

6 Example R-tree of order 4.  Each node may have up to 4 rectangles/MBRs.

7 Example Possible partitioning of our example data into 12 leaves.

8 Example Possible R-tree of order 4 with 12 leaves. Leaves are data nodes that contain 4 input rectangles each; a-p are MBRs. [Figure: tree diagram with MBRs labeled a-p]

9 Example Possible corresponding grouping. [Figure: MBRs a, b, c, d, m, with the tree diagram]

10 Example Possible corresponding grouping. [Figure: MBRs a, b, c, d, m, e, f, n, with the tree diagram]

11 Example Possible corresponding grouping. [Figure: MBRs a, b, c, d, m, e, f, n, g, h, i, o, p, with the tree diagram]

12 Query Report all rectangles that intersect a given rectangle.

13 Query Start at root and find all MBRs that overlap query. Search corresponding subtrees recursively.
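The recursive search described above might look like the sketch below. The `Node` class, the tuple layout `(xmin, ymin, xmax, ymax)`, and the choice to count touching rectangles as intersecting are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def intersects(a: Rect, b: Rect) -> bool:
    """True if the rectangles overlap; touching edges count as overlap here."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class Node:
    is_leaf: bool
    rects: List[Rect] = field(default_factory=list)        # data rectangles (leaf) or child MBRs (index node)
    children: List["Node"] = field(default_factory=list)   # parallel to rects; empty for leaves


def search(node: Node, query: Rect) -> List[Rect]:
    """Report every data rectangle under `node` that intersects `query`."""
    hits: List[Rect] = []
    for i, r in enumerate(node.rects):
        if not intersects(r, query):
            continue  # prune: nothing below this MBR can intersect the query
        if node.is_leaf:
            hits.append(r)
        else:
            hits.extend(search(node.children[i], query))
    return hits


leaf1 = Node(True, [(0, 0, 1, 1), (2, 2, 3, 3)])
leaf2 = Node(True, [(10, 10, 11, 11), (12, 12, 13, 13)])
root = Node(False, [(0, 0, 3, 3), (10, 10, 13, 13)], [leaf1, leaf2])

print(search(root, (0.5, 0.5, 2.5, 2.5)))  # [(0, 0, 1, 1), (2, 2, 3, 3)]
print(search(root, (5, 5, 6, 6)))          # []
```

Because MBRs of sibling subtrees may overlap, a query can descend into more than one subtree at each level.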

14 Query [Figure: query example on root MBRs m, n, o, p, with the tree diagram]

15 Search m. [Figure: root MBRs m, n, o, p and the MBRs a, b, c, d under m, with the tree diagram]

16 Insert Similar to insertion into a B+-tree, but the new rectangle may be inserted into any leaf; a leaf splits when its capacity is exceeded.  Which leaf to insert into?  How to split a node?

17 Insert—Leaf Selection Follow a path from root to leaf. At each node, move into the subtree whose MBR area increases least when the new rectangle is added. [Figure: candidate MBRs m, n, o, p]

18 Insert—Leaf Selection Insert into m.

19 Insert—Leaf Selection Insert into n.

20 Insert—Leaf Selection Insert into o.

21 Insert—Leaf Selection Insert into p.
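The leaf-selection rule (pick the subtree whose MBR area grows least) can be sketched for a single node as follows. The tuple layout and function names are assumptions, and breaking ties by the smaller MBR area is a common convention that the slides do not state.

```python
from typing import Sequence, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a: Rect, b: Rect) -> Rect:
    """MBR of two rectangles."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def enlargement(mbr: Rect, r: Rect) -> float:
    """Extra area the MBR needs in order to also cover r."""
    return area(union(mbr, r)) - area(mbr)


def choose_subtree(child_mbrs: Sequence[Rect], new_rect: Rect) -> int:
    """Index of the child whose MBR area increases least.

    Ties go to the child with the smaller MBR (an assumed tie-break).
    """
    return min(
        range(len(child_mbrs)),
        key=lambda i: (enlargement(child_mbrs[i], new_rect), area(child_mbrs[i])),
    )


# Four child MBRs standing in for m, n, o, p (coordinates are made up).
mbrs = [(0, 0, 4, 4), (5, 0, 9, 4), (0, 5, 4, 9), (5, 5, 9, 9)]
print(choose_subtree(mbrs, (6, 1, 7, 2)))  # 1: already inside the second MBR
print(choose_subtree(mbrs, (3, 6, 5, 7)))  # 2: third MBR grows by 4, fourth by 8
```

Repeating this choice at every index node on the way down yields the leaf that receives the new rectangle.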

22 Insert—Split A Node Split set of M+1 rectangles/MBRs into 2 sets A and B.  A and B each have at least m rectangles/MBRs.  Sum of areas of MBRs of A and B is minimum. M = 8, m = 4

25 Insert—Split A Node Exhaustive search for best A and B.  Compute area(MBR(A)) + area(MBR(B)) for each possible A.  Note: for each A, the B is unique.  Select partition that minimizes this sum. When |A| = m = ceil(M/2), number of choices for A is (M+1)! / (m! (M+1-m)!). Impractical for large M.
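A brute-force version of this search is short enough to sketch. The code below is an illustration under assumed names and an assumed `(xmin, ymin, xmax, ymax)` layout; unlike the count on the slide, it tries every legal size of A (from m up to M+1-m), not only |A| = m.

```python
from itertools import combinations
from math import comb
from typing import List, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


def mbr(rects: Sequence[Rect]) -> Rect:
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def exhaustive_split(rects: Sequence[Rect], m: int) -> Tuple[float, List[Rect], List[Rect]]:
    """Try every partition with at least m rectangles per side; keep the cheapest."""
    n = len(rects)
    best = None
    for size in range(m, n - m + 1):
        for idx in combinations(range(n), size):
            chosen = set(idx)
            A = [rects[i] for i in idx]
            B = [rects[i] for i in range(n) if i not in chosen]  # B is fixed once A is chosen
            cost = area(mbr(A)) + area(mbr(B))
            if best is None or cost < best[0]:
                best = (cost, A, B)
    return best


# M = 4, m = 2: five rectangles forming two obvious clusters.
rects = [(0, 0, 1, 1), (1, 0, 2, 1), (0, 1, 1, 2), (10, 10, 11, 11), (11, 10, 12, 11)]
cost, A, B = exhaustive_split(rects, m=2)
print(cost)        # 6
print(comb(9, 4))  # 126 choices for A when M = 8 and |A| = m = 4
```

The count grows quickly with M, which is why the slides turn to heuristic splits.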

26 Insert—Split A Node Grow A and B using a clustering strategy.  Start with a seed rectangle a for A and b for B.  Grow A and B one rectangle at a time.  Stop when the M+1 rectangles have been partitioned into A and B.

27 Insert—Split A Node Quadratic Method—seed selection.  Let S be the set of M+1 rectangles to be partitioned.  Find a and b in  S that maximize area(MBR(a,b)) – area(a) – area(b) M = 8, m = 4
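Seed selection for the quadratic method can be sketched as a search over all pairs, which is where the quadratic cost comes from. The function names and tuple layout below are assumptions for illustration.

```python
from itertools import combinations
from typing import Sequence, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a: Rect, b: Rect) -> Rect:
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def waste(a: Rect, b: Rect) -> float:
    """area(MBR(a,b)) - area(a) - area(b): dead space if a and b shared a node."""
    return area(union(a, b)) - area(a) - area(b)


def pick_seeds_quadratic(rects: Sequence[Rect]) -> Tuple[int, int]:
    """Indices of the pair that would waste the most area if kept together."""
    return max(combinations(range(len(rects)), 2),
               key=lambda ij: waste(rects[ij[0]], rects[ij[1]]))


rects = [(0, 0, 1, 1), (1, 1, 2, 2), (9, 9, 10, 10), (0, 9, 1, 10)]
print(pick_seeds_quadratic(rects))  # (0, 2): opposite corners, waste = 100 - 1 - 1 = 98
```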

29 Insert—Split A Node Quadratic Method—assign remaining rectangles/MBRs.  Find an unassigned rectangle c that maximizes |(area(MBR(A,c)) – area(MBR(A))) – (area(MBR(B,c)) – area(MBR(B)))| M = 8, m = 4

31 Insert—Split A Node Quadratic Method—assign remaining rectangles/MBRs.  Assign c to partition whose area increases least. M = 8, m = 4

32 Insert—Split A Node Quadratic Method—assign remaining rectangles/MBRs.  Continue assigning in this way until one partition needs every remaining rectangle to reach m rectangles; then assign all remaining rectangles to that partition. M = 8, m = 4
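Putting the quadratic steps together, a complete split could be sketched as below. Names and the tuple layout are assumptions, and ties in the "increases least" step go to A here, which is a simplification the slides do not specify.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a: Rect, b: Rect) -> Rect:
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def enlargement(mbr: Rect, r: Rect) -> float:
    return area(union(mbr, r)) - area(mbr)


def quadratic_split(rects: Sequence[Rect], m: int) -> Tuple[List[Rect], List[Rect]]:
    """Split len(rects) >= 2*m rectangles into A and B, each with at least m."""
    # Seeds: the pair that would waste the most area if kept together.
    i, j = max(combinations(range(len(rects)), 2),
               key=lambda p: area(union(rects[p[0]], rects[p[1]]))
               - area(rects[p[0]]) - area(rects[p[1]]))
    A, B = [i], [j]
    mbr_a, mbr_b = rects[i], rects[j]
    remaining = [k for k in range(len(rects)) if k not in (i, j)]
    while remaining:
        # Forced step: one side needs everything that is left to reach m.
        if len(A) + len(remaining) == m:
            A.extend(remaining)
            break
        if len(B) + len(remaining) == m:
            B.extend(remaining)
            break
        # Next rectangle: the one with the strongest preference for one side.
        c = max(remaining, key=lambda k: abs(enlargement(mbr_a, rects[k])
                                             - enlargement(mbr_b, rects[k])))
        remaining.remove(c)
        if enlargement(mbr_a, rects[c]) <= enlargement(mbr_b, rects[c]):
            A.append(c)
            mbr_a = union(mbr_a, rects[c])
        else:
            B.append(c)
            mbr_b = union(mbr_b, rects[c])
    return [rects[k] for k in A], [rects[k] for k in B]


rects = [(0, 0, 1, 1), (1, 0, 2, 1), (0, 1, 1, 2), (10, 10, 11, 11), (11, 10, 12, 11)]
A, B = quadratic_split(rects, m=2)
print(A)  # [(0, 0, 1, 1), (1, 0, 2, 1), (0, 1, 1, 2)]
print(B)  # [(11, 10, 12, 11), (10, 10, 11, 11)]
```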

33 Insert—Split A Node Linear Method—seed selection.  Choose a and b to have maximum normalized separation. M = 8, m = 4

34 Insert—Split A Node Linear Method—seed selection.  Choose a and b to have maximum normalized separation. M = 8, m = 4 Separation in x-dimension

35 Insert—Split A Node Linear Method—seed selection.  Choose a and b to have maximum normalized separation. M = 8, m = 4 Rectangles with max x-separation

36 Insert—Split A Node Linear Method—seed selection.  Choose a and b to have maximum normalized separation. M = 8, m = 4 Divide by x-width to normalize

37 Insert—Split A Node Linear Method—seed selection.  Choose a and b to have maximum normalized separation. M = 8, m = 4 Separation in y-dimension

38 Insert—Split A Node Linear Method—seed selection.  Choose a and b to have maximum normalized separation. M = 8, m = 4 Rectangles with max y-separation

39 Insert—Split A Node Linear Method—seed selection.  Choose a and b to have maximum normalized separation. M = 8, m = 4 Divide by y-width to normalize
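One way to read the normalized-separation rule is sketched below. It assumes the usual formulation: along each dimension, take the rectangle with the highest low side and the one with the lowest high side, measure the gap between them, and divide by the extent of the whole set in that dimension. The slides only show this pictorially, so the details, the names, and the fallback pair are assumptions.

```python
from typing import Sequence, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def pick_seeds_linear(rects: Sequence[Rect]) -> Tuple[int, int]:
    """Indices of the two rectangles with the greatest normalized separation."""
    best = None
    for lo, hi in ((0, 2), (1, 3)):  # (low, high) coordinate positions for x, then y
        idx = range(len(rects))
        highest_low = max(idx, key=lambda k: rects[k][lo])
        lowest_high = min(idx, key=lambda k: rects[k][hi])
        width = max(r[hi] for r in rects) - min(r[lo] for r in rects)
        if highest_low == lowest_high or width == 0:
            continue  # this dimension cannot separate two distinct rectangles
        separation = (rects[highest_low][lo] - rects[lowest_high][hi]) / width
        if best is None or separation > best[0]:
            best = (separation, lowest_high, highest_low)
    if best is None:
        return (0, 1)  # assumed fallback when no dimension separates anything
    return best[1], best[2]


rects = [(0, 0, 1, 1), (8, 0, 10, 1), (4, 0, 5, 6), (4, 9, 5, 10)]
# x: gap 8 - 1 = 7 over width 10 -> 0.7;  y: gap 9 - 1 = 8 over width 10 -> 0.8
print(pick_seeds_linear(rects))  # (0, 3)
```

Each dimension is scanned once, so seed selection is linear in the number of rectangles.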

40 Insert—Split A Node Linear Method—assign remainder.  Assign remaining rectangles in random order.  Rectangle is assigned to partition whose MBR area increases least.  Stop when all remaining rectangles must be assigned to one of the partitions so that the partition has its minimum required m rectangles. M = 8, m = 4

41 Delete If the leaf doesn't become deficient, simply readjust MBRs on the path from the root. If the leaf becomes deficient, borrow from the nearest sibling (if possible) and readjust MBRs; otherwise combine with the sibling, as in a B+-tree. Could instead do a more global reorganization to get a better R-tree.

42 Variants R*-tree  Leaf selection and node overflows in insertion handled differently. Hilbert R-tree

43 Related Structures R+-tree  Index nodes have non-overlapping rectangles.  A data object may be represented in several data nodes.  No upper bound on size of a data node.  No bounds (lower/upper) on degree of an index node.

44 Related Structures Cell tree  Combines BSP and R+-tree concepts.  Index nodes have non-overlapping convex polyhedrons.  No lower/upper bound on size of a data node.  Lower bound (but not upper) on degree of an index node.

