CPSC-608 Database Systems Fall 2008 Instructor: Jianer Chen Office: HRBB 309B Phone: 845-4259 1 Notes #7.

1 CPSC-608 Database Systems, Fall 2008. Instructor: Jianer Chen. Office: HRBB 309B. Phone: 845-4259. Email: chen@cs.tamu.edu. Notes #7

2 Terms: sequential file (data file); index file (key-pointer pairs); search key; primary index; secondary index; dense index; sparse index; multi-level index

3 B+Trees: support fast search; support range search; support dynamic changes; could be either dense or sparse

4 B+Tree example (n=3). Root: [100]. Non-leaf nodes: [30] and [120, 150, 180]. Leaves, left to right: [3, 5, 11], [30, 35], [100, 101, 110], [120, 130], [150, 156, 179], [180, 200].

5 Sample non-leaf node with keys 57, 81, 95 and four pointers: to keys k < 57; to keys 57 ≤ k < 81; to keys 81 ≤ k < 95; to keys k ≥ 95

6 Sample leaf node with keys 57, 81, 95, reached from a non-leaf node. Its pointers lead to the record with key 57, the record with key 81, and the record with key 95; the last pointer leads to the next leaf in sequence.
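The sequence pointer is what makes range search cheap: once the first relevant leaf is found, the scan simply walks the leaf chain. The sketch below illustrates this under stated assumptions: the `Leaf` class, the `range_scan` function, and the `rec…` record labels are hypothetical names, and locating the starting leaf (a normal descent from the root) is left out.

```python
from bisect import bisect_left

class Leaf:
    """Hypothetical leaf node: sorted keys, one record per key, one sequence pointer."""
    def __init__(self, keys):
        self.keys = keys
        self.records = [f"rec{k}" for k in keys]  # stand-ins for data-file records
        self.next = None                          # sequence pointer to the next leaf

def range_scan(start, lo, hi):
    """Yield (key, record) for lo <= key <= hi, starting at the leaf where lo belongs."""
    leaf, i = start, bisect_left(start.keys, lo)
    while leaf is not None:
        while i < len(leaf.keys):
            if leaf.keys[i] > hi:
                return                            # passed the upper bound: stop
            yield leaf.keys[i], leaf.records[i]
            i += 1
        leaf, i = leaf.next, 0                    # follow the sequence pointer

# Leaf level of the n=3 example tree, chained left to right.
leaves = [Leaf(ks) for ks in ([3, 5, 11], [30, 35], [100, 101, 110],
                              [120, 130], [150, 156, 179], [180, 200])]
for a, b in zip(leaves, leaves[1:]):
    a.next = b

found = [k for k, _ in range_scan(leaves[1], 35, 120)]
```

Only the leaves are touched after the initial descent, which is why a B+tree supports range queries without revisiting non-leaf nodes.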

7 Size of nodes (fixed): n+1 pointers, n keys

8 Don’t want nodes to be too empty. Use at least: non-leaf: ⌈(n+1)/2⌉ pointers; leaf: ⌊(n+1)/2⌋ pointers to data

9 Examples for n=3. Non-leaf: full node [120, 150, 180], minimum node [30]. Leaf: full node [3, 5, 11], minimum node [30, 35].

10 B+tree rules (tree of order n): (1) All leaves are at the same lowest level (balanced tree). (2) Pointers in leaves point to records, except for the “sequence pointer”.

11 (3) Number of pointers/keys for a B+tree of order n:

                      Max ptrs   Max keys   Min ptrs→data    Min keys
Non-leaf (non-root)   n+1        n          ⌈(n+1)/2⌉        ⌈(n+1)/2⌉ − 1
Leaf (non-root)       n+1        n          ⌊(n+1)/2⌋        ⌊(n+1)/2⌋
Root                  n+1        n          2 (could be 1)   1
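The occupancy limits can be checked mechanically for any order n. The following is a small sketch, not part of the original notes: the function name `occupancy_bounds` and the dictionary layout are illustrative, and the leaf's minimum key count is assumed to equal its minimum number of data pointers.

```python
def occupancy_bounds(n):
    """Return max/min pointer and key counts per node type for a B+tree of order n."""
    ceil_half = (n + 2) // 2    # ceil((n+1)/2) in integer arithmetic
    floor_half = (n + 1) // 2   # floor((n+1)/2)
    return {
        "non-leaf": {"max_ptrs": n + 1, "max_keys": n,
                     "min_ptrs": ceil_half, "min_keys": ceil_half - 1},
        # Assumption: a leaf holds one key per data pointer.
        "leaf":     {"max_ptrs": n + 1, "max_keys": n,
                     "min_ptrs": floor_half, "min_keys": floor_half},
        # The root is exempt from the half-full rule.
        "root":     {"max_ptrs": n + 1, "max_keys": n,
                     "min_ptrs": 2, "min_keys": 1},
    }

bounds3 = occupancy_bounds(3)
bounds4 = occupancy_bounds(4)
```

For n=3 this reproduces the minimum nodes of the earlier example: a non-leaf node with one key and a leaf with two keys.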

12 Search in a B+tree: start from the root; search in a leaf block; may not have to go to the data file

13 Pseudo code for search in a B+tree

Search(ptr, k);  // search for a record with key value k in the subtree rooted at *ptr
  Case 1. *ptr is a leaf:
    IF (k = ki) for a key ki in *ptr THEN return(pi);
    ELSE return(Null);
  Case 2. *ptr is not a leaf:
    find a key ki in *ptr such that ki <= k < k(i+1);
      // take i = 0 if k is smaller than every key; the last key has no upper bound
    return(Search(pi, k));
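The search procedure translates almost line for line into runnable code. This is a minimal sketch under assumptions of our own: the `Node` class, the `make_leaf` helper, and the `rec…` strings standing in for record pointers are hypothetical, and the tree is the n=3 example from the earlier slide.

```python
from bisect import bisect_left, bisect_right

class Node:
    """Hypothetical in-memory B+tree node (not from the slides)."""
    def __init__(self, keys, ptrs, is_leaf):
        self.keys = keys        # sorted keys k1..km
        self.ptrs = ptrs        # leaf: one record per key; non-leaf: children p0..pm
        self.is_leaf = is_leaf

def search(node, k):
    """Return the record pointer for key k, or None if k is absent."""
    if node.is_leaf:                          # Case 1: look for k among the leaf keys
        i = bisect_left(node.keys, k)
        if i < len(node.keys) and node.keys[i] == k:
            return node.ptrs[i]
        return None
    i = bisect_right(node.keys, k)            # Case 2: i such that ki <= k < k(i+1)
    return search(node.ptrs[i], k)

def make_leaf(*keys):
    return Node(list(keys), [f"rec{k}" for k in keys], is_leaf=True)

left = Node([30], [make_leaf(3, 5, 11), make_leaf(30, 35)], is_leaf=False)
right = Node([120, 150, 180],
             [make_leaf(100, 101, 110), make_leaf(120, 130),
              make_leaf(150, 156, 179), make_leaf(180, 200)], is_leaf=False)
root = Node([100], [left, right], is_leaf=False)

hit = search(root, 156)    # follows root -> [120, 150, 180] -> leaf [150, 156, 179]
miss = search(root, 31)    # reaches leaf [30, 35] and finds no key 31
```

`bisect_right` returns the number of keys that are less than or equal to k, which is exactly the index i of the child pointer pi in the pseudocode.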

14 Insert into B+tree: (a) simple case: space available in leaf; (b) leaf overflow; (c) non-leaf overflow; (d) new root

15 Pseudo code for insertion in a B+tree

Insert(ptr, (k,p), (k',p'));
// p is a pointer to a data record with key value k; (k,p) is inserted into the subtree rooted at *ptr.
// p' is a pointer to a new brother of *ptr, if one is created, in which k' is the smallest key value.
// r is the split position, chosen so that both halves respect the minimum-occupancy rule.
  Case 1. *ptr is a leaf:
    IF there is room in *ptr THEN
      insert (k,p) into *ptr; return(k'=0, p'=Null);
    ELSE
      re-arrange the content of *ptr and (k,p) into sorted pairs ((k1,p1), ..., (k(n+1),p(n+1)));
      leave ((k1,p1), ..., (k(r-1),p(r-1))) in *ptr;
      create a new leaf q; put ((kr,pr), ..., (k(n+1),p(n+1))) in *q;
      link *q into the leaf sequence directly after *ptr;
      return(k'=kr, p'=q);
  Case 2. *ptr is not a leaf:
    find a key ki in *ptr such that ki <= k < k(i+1);
    Insert(pi, (k,p), (k",p"));
    IF (p" = Null) THEN return(k'=0, p'=Null);
    ELSE IF there is room in *ptr THEN
      insert (k",p") into *ptr; return(k'=0, p'=Null);
    ELSE
      re-arrange the content of *ptr and (k",p") into (p0, k1, p1, ..., k(n+1), p(n+1));
      leave (p0, k1, ..., k(r-1), p(r-1)) in *ptr;
      create a new non-leaf node q; put (pr, k(r+1), p(r+1), ..., k(n+1), p(n+1)) in *q;
      return(k'=kr, p'=q);
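A runnable sketch of the insertion procedure may help when tracing cases (a) through (d). It rests on several assumptions that are not in the notes: the `Node` class and function names are hypothetical, keys are unique, the split positions are chosen so that both halves meet the minimum-occupancy rule, and the caller creates the new root when the old root splits.

```python
from bisect import bisect_left, bisect_right

class Node:
    """Hypothetical in-memory B+tree node (not from the slides)."""
    def __init__(self, keys, ptrs, is_leaf):
        self.keys = keys        # sorted keys
        self.ptrs = ptrs        # leaf: one record per key; non-leaf: len(keys)+1 children
        self.is_leaf = is_leaf
        self.next = None        # sequence pointer (used by leaves only)

def _insert(node, k, p, n):
    """Insert (k, p) below node. Return (k', q) if node was split, else None."""
    if node.is_leaf:
        i = bisect_left(node.keys, k)
        node.keys.insert(i, k)
        node.ptrs.insert(i, p)
        if len(node.keys) <= n:               # (a) room in the leaf
            return None
        mid = (n + 2) // 2                    # (b) leaf overflow: split n+1 pairs
        q = Node(node.keys[mid:], node.ptrs[mid:], is_leaf=True)
        node.keys, node.ptrs = node.keys[:mid], node.ptrs[:mid]
        q.next, node.next = node.next, q      # keep the leaf chain intact
        return q.keys[0], q                   # smallest key of q is copied up
    i = bisect_right(node.keys, k)
    split = _insert(node.ptrs[i], k, p, n)
    if split is None:
        return None
    k2, q2 = split
    node.keys.insert(i, k2)
    node.ptrs.insert(i + 1, q2)
    if len(node.keys) <= n:                   # room in the non-leaf node
        return None
    mid = (n + 1) // 2                        # (c) non-leaf overflow
    up = node.keys[mid]                       # middle key moves up, it is not copied
    q = Node(node.keys[mid + 1:], node.ptrs[mid + 1:], is_leaf=False)
    node.keys, node.ptrs = node.keys[:mid], node.ptrs[:mid + 1]
    return up, q

def insert(root, k, p, n):
    """Insert (k, p) and return the (possibly new) root."""
    split = _insert(root, k, p, n)
    if split is None:
        return root
    k2, q = split
    return Node([k2], [root, q], is_leaf=False)   # (d) new root

def make_leaf(*keys):
    return Node(list(keys), [f"rec{k}" for k in keys], is_leaf=True)

# Case (d) from the slides: n=3, root [10, 20, 30], insert 45.
old_root = Node([10, 20, 30],
                [make_leaf(1, 2, 3), make_leaf(10, 12),
                 make_leaf(20, 25), make_leaf(30, 32, 40)], is_leaf=False)
new_root = insert(old_root, 45, "rec45", n=3)
```

Note the asymmetry between the two splits: a leaf split copies its separator key up, so the key still appears in a leaf, while a non-leaf split moves the middle key up and removes it from both halves.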

16 (a) Insert key = 32 (n=3). Before: parent keys 30, 100; leaves [3, 5, 11] and [30, 31].

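17 (a) Insert key = 32 (n=3). After: 32 goes into the free slot of leaf [30, 31], giving [30, 31, 32]; parent keys 30, 100 and leaf [3, 5, 11] are unchanged.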

18 (b) Insert key = 7 (n=3). Before: parent keys 30, 100; leaves [3, 5, 11] and [30, 31].

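19 (b) Insert key = 7 (n=3). After: leaf [3, 5, 11] overflows and splits into [3, 5] and [7, 11]; key 7 is added to the parent, whose keys become 7, 30, 100.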

20 (c) Insert key = 160 (n=3). Before: root key 100; non-leaf node [120, 150, 180]; leaves [150, 156, 179] and [180, 200].

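21 (c) Insert key = 160 (n=3). After: leaf [150, 156, 179] splits into [150, 156] and [160, 179]; the non-leaf node then overflows and splits into [120, 150] and [180], and key 160 moves up to the root.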

22 (d) New root, insert 45 (n=3). Before: root [10, 20, 30]; leaves [1, 2, 3], [10, 12], [20, 25], [30, 32, 40].

23 (d) New root, insert 45 (n=3). After: leaf [30, 32, 40] splits into [30, 32] and [40, 45]; the old root overflows and splits into [10, 20] and [40]; key 30 moves up into a new root.

24 Deletion from B+tree: (a) simple case (no example); (b) coalesce with neighbor (sibling); (c) re-distribute keys; (d) cases (b) or (c) at non-leaf

25 (b) Coalesce with sibling: delete 50 (n=4). Before: parent keys 10, 40, 100; leaves [10, 20, 30] and [40, 50].

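26 (b) Coalesce with sibling: delete 50 (n=4). After: the remaining key 40 moves into the left sibling, giving [10, 20, 30, 40]; the emptied leaf and its parent key 40 are removed.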

27 (c) Redistribute keys: delete 50 (n=4). Before: parent keys 10, 40, 100; leaves [10, 20, 30, 35] and [40, 50].

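28 (c) Redistribute keys: delete 50 (n=4). After: key 35 moves from the left sibling to the underfull leaf, giving [10, 20, 30] and [35, 40]; the parent key 40 is replaced by 35.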
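The choice between coalescing and redistributing at the leaf level comes down to whether the two siblings fit into one node. The sketch below covers only that decision for an underfull leaf and its left sibling; the function name `fix_leaf_underflow` and its return convention are assumptions made for illustration, and leaves are modeled as plain sorted key lists.

```python
def fix_leaf_underflow(left, right, n):
    """Repair an underfull right leaf using its left sibling (order-n B+tree).

    Returns (left_keys, right_keys, separator). When the leaves are coalesced,
    right_keys and separator are None and the parent must drop one key/pointer.
    """
    min_keys = (n + 1) // 2                  # minimum keys in a non-root leaf
    if len(left) + len(right) <= n:          # (b) everything fits in one leaf
        return left + right, None, None
    left, right = list(left), list(right)    # (c) borrow from the left sibling
    while len(right) < min_keys:
        right.insert(0, left.pop())          # move the largest key of the sibling over
    return left, right, right[0]             # new separator for the parent

coalesced = fix_leaf_underflow([10, 20, 30], [40], n=4)
redistributed = fix_leaf_underflow([10, 20, 30, 35], [40], n=4)
```

In the coalesce case the parent itself may become underfull, which is how case (d) arises one level up.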

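29 (d) Non-leaf coalesce: delete 37 (n=4). Before: root [25]; non-leaf nodes [10, 20] and [30, 40]; leaves [1, 3], [10, 14], [20, 22], [25, 26], [30, 37], [40, 45]. After: the leaf that held 37 is underfull and is coalesced with a sibling, which leaves non-leaf node [30, 40] underfull in turn; it is coalesced with [10, 20], pulling key 25 down from the old root, and the merged node becomes the new root.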

30 B+tree deletions in practice: often, coalescing is not implemented; too hard and not worth it!

31 Interesting problem: for a B+tree, how large should n be? (n is the number of keys per node.)

32 Sample assumptions:
(1) Time to read a node from disk is $(S + Tn)$ ms (S, T: constants, e.g. S=70, T=0.5).
(2) Once the block is in memory, use binary search: $(a + b \log_2 n)$ ms (a, b: constants, $a \ll S$).
(3) Assume the B+tree is full, i.e., the number of nodes to examine is $\log_n N$, where N = number of records.
(4) Total search time: $f(n) = (\log_n N + 1)(S + Tn) + (a + b \log_2 n)$

33 Can get: f(n) = time to find a record. [Figure: plot of f(n) against n, with the minimum at n_opt.]

34 Find n_opt by solving f'(n) = 0. Answer: n_opt = “few hundred”
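Instead of solving f'(n) = 0 by hand, the minimum can be located numerically. This sketch evaluates the total search time f(n) from the sample assumptions over a range of integer n. S=70 and T=0.5 are the example constants from the notes; the values of a, b, and N, the candidate range, and the function names `search_time` and `best_n` are assumptions chosen only for illustration.

```python
import math

def search_time(n, N, S=70.0, T=0.5, a=0.1, b=0.01):
    """f(n) = (log_n N + 1)(S + T n) + (a + b log2 n), in ms."""
    levels = math.log(N) / math.log(n)        # log_n N: nodes examined in a full tree
    return (levels + 1) * (S + T * n) + (a + b * math.log2(n))

def best_n(N, candidates=range(2, 5001), **constants):
    """Return the integer n among the candidates that minimizes f(n)."""
    return min(candidates, key=lambda n: search_time(n, N, **constants))

n_opt = best_n(N=10**6)                       # example constants from the notes
n_opt_slow_seek = best_n(N=10**6, S=700.0)    # hypothetical disk with a larger fixed cost S
```

The location of the minimum depends strongly on the ratio S/T: a larger fixed cost per read relative to the per-key transfer cost favors larger nodes, so the optimum moves to a larger n.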

35 Variation on B+tree: B-tree (no +). Idea: avoid duplicate keys; have record pointers in non-leaf nodes

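36 B-tree non-leaf node layout: K1 P1 K2 P2 K3 P3, where each Pi points to the record with key Ki. The node's four child pointers lead to keys below K1, keys between K1 and K2, keys between K2 and K3, and keys above K3.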

37 B-tree example (n=2). Root: [65, 125]. Non-leaf nodes: [25, 45], [85, 105], [145, 165]. Leaves, left to right: [10, 20], [30, 40], [50, 60], [70, 80], [90, 100], [110, 120], [130, 140], [150, 160], [170, 180].

38 B-tree example (n=2), same tree as on the previous slide: sequence pointers are not useful now!

39 Tradeoffs. Pro: B-trees have faster lookup than B+trees. Con: in a B-tree, non-leaf and leaf nodes have different sizes. Con: in a B-tree, insertion and deletion are more complicated. Conclusion: B+trees preferred!
