Presentation is loading. Please wait.

Presentation is loading. Please wait.

File Organizations March 2007R McFadyen ACS - 39021 In SQL Server 2000 Tree terms root, internal, leaf, subtree parent, child, sibling balanced, unbalanced.

Similar presentations


Presentation on theme: "File Organizations March 2007R McFadyen ACS - 39021 In SQL Server 2000 Tree terms root, internal, leaf, subtree parent, child, sibling balanced, unbalanced."— Presentation transcript:

1 File Organizations March 2007R McFadyen ACS - 39021 In SQL Server 2000 Tree terms root, internal, leaf, subtree parent, child, sibling balanced, unbalanced b + -tree - split on overflow; merge on underflow - in practice it is usually 3 or 4 levels deep search, insert, delete algorithms

2 File Organizations March 2007R McFadyen ACS - 39022 Using Enterprise Manager to create an index

3 File Organizations March 2007R McFadyen ACS - 39023 Using Query Analyzer to create a create table / index Create an index on the employeeID column of the emp_pay table that enforces uniqueness. This index physically orders the data on disk because the CLUSTERED clause is specified. CREATE TABLE emp_pay ( employeeID int NOT NULL, base_pay money NOT NULL, commission decimal(2, 2) NOT NULL ) INSERT emp_pay VALUES (1, 500,.10) INSERT emp_pay VALUES (2, 1000,.05) … CREATE UNIQUE CLUSTERED INDEX employeeID_ind ON emp_pay (employeeID) GO

4 File Organizations March 2007R McFadyen ACS - 39024 Using Query Analyzer to create a create table / index Create an index on the orderID and employeeID columns of the order_emp table. CREATE TABLE order_emp ( orderID int IDENTITY(1000, 1), employeeID int NOT NULL, orderdate datetime NOT NULL DEFAULT GETDATE(), orderamount money NOT NULL ) INSERT order_emp (employeeID, orderdate, orderamount) VALUES (5, '4/12/98', 315.19) INSERT order_emp (employeeID, orderdate, orderamount) VALUES (5, '5/30/98', 1929.04) INSERT order_emp (employeeID, orderdate, orderamount) VALUES (1, '1/03/98', 2039.82) … CREATE INDEX emp_order_ind ON order_emp (orderID, employeeID)

5 File Organizations March 2007R McFadyen ACS - 39025 Using Enterprise Manager to create a create table / index PRIMARY KEY Is a constraint that enforces entity integrity for a given column or columns through a unique index. Only one PRIMARY KEY constraint can be created per table. UNIQUE Is a constraint that provides entity integrity for a given column or columns through a unique index. A table can have multiple UNIQUE constraints.

6 File Organizations March 2007R McFadyen ACS - 39026 Using Enterprise Manager to create table / index CLUSTERED Creates an object where the physical order of rows is the same as the indexed order of the rows, and the bottom (leaf) level of the clustered index contains the actual data rows. (note this is a variation from the text’s discussion) A table or view is allowed one clustered index at a time. (another variation … indexing a view) A view with a clustered index is called an indexed view. FILLFACTOR Specifies how full SQL Server should make each index page used to store the index data. User-specified fillfactor values can be from 1 through 100, with a default of 0. A lower fill factor creates the index with more space available for new index entries without having to allocate new space.

7 File Organizations March 2007R McFadyen ACS - 39027 Motivation (finding one record given its key) Scanning a file is time consuming B + -tree provides a short access path file of records page1 page2 page3 B + -tree

8 File Organizations March 2007R McFadyen ACS - 39028 Motivation A B + -tree is a tree, in which each node is a page. A B + -tree for a file is stored in a separate file. A file could have many B + -trees file of records page1 page2 page3 B + -tree

9 File Organizations March 2007R McFadyen ACS - 39029 b + -tree based on b-tree (Bayer, balanced, Boeing) dynamic Root Internal nodes Leaf nodes...

10 File Organizations March 2007R McFadyen ACS - 390210 Node structure for b + -tree of order p non-leaf node (internal node or a root) (q  p) K 1 < K 2 <... < K q-1 (i.e. it’s an ordered set) for any key value, X, in the subtree pointed to by P i K i-1 < X  K i for 1 < i < q X  K 1 for i = 1 K q-1 < X for i = q each internal node has at most p pointers each node except root must have at least  p/2  pointers the root, if it has some children, must have at least 2 pointers One more pointer than there are keys All but root must be at least half full

11 File Organizations March 2007R McFadyen ACS - 390211 Node structure for b + -tree of order p leaf node (terminal node) K 1 < K 2 <... < K q-1 Pr i points to a record with key value K i or Pr i points to a block containing a record with key value K i each leaf has at least  p/2  keys maximum of p keys all leaves are at the same level (balanced) P next points to the next leaf for key sequencing Pairs of key/pointer plus one next pointer Dense or non- dense index

12 File Organizations March 2007R McFadyen ACS - 390212 Example insert records with key values Diane, Cory, Ramon, Amy, Miranda, Ahmed, Marshall, Zena, Rhonda, Vincent, Hok into a b + -tree with p=3. internal node will have minimum 2 pointers and maximum 3 pointers - inserting a fourth will cause a split leaf can have at least 2 key/pointer pairs and a maximum of 3 key/pointer pairs - inserting a fourth will cause a split Typically p is very large … 100, 200, …

13 File Organizations March 2007R McFadyen ACS - 390213 insert Diane Diane Pointer to data Pointer to next leaf in ascending key sequence insert Cory Cory, Diane

14 File Organizations March 2007R McFadyen ACS - 390214 Example insert Ramon Cory, Diane, Ramon inserting Amy will cause the node to overflow: Amy, Cory, Diane, Ramon This leaf must split see next =>

15 File Organizations March 2007R McFadyen ACS - 390215 continuing with insertion of Amy - split the node and promote a key value upwards (this must be Cory because it’s the highest key value in the left subtree) Amy, Cory, Diane, Ramon Amy, CoryDiane, Ramon Cory Tree has grown one level, from the bottom up

16 File Organizations March 2007R McFadyen ACS - 390216 Splitting Nodes Any value being promoted upwards will come from the node that is splitting. When a leaf splits, a ‘copy’ of a key value is promoted. When an internal node splits, the middle key value ‘moves’ from a child to a parent node. There are three situations to be concerned with: a leaf splits, an internal node splits, a new root is generated.

17 File Organizations March 2007R McFadyen ACS - 390217 Leaf splitting When a leaf splits, a new leaf is allocated the original leaf is the left sibling, the new one is the right sibling key and pointer pairs of the overflowing node are redistributed: the left sibling will have smaller keys than the right sibling a 'copy' of the key value which is the largest of the keys in the left sibling is promoted to the parent Two situations arise: the parent exists or not. If the parent exists, then a copy of the key value and the pointer to the right sibling are promoted upwards. Otherwise, the b+-tree is just beginning to grow... 33 12 22 3344 48 5512 2244 48 5531 33 22 33 insert 31

18 File Organizations March 2007R McFadyen ACS - 390218 Internal node splitting If an internal node splits and it is not the root, insert the key and pointer and then determine the middle key a new 'right' sibling is allocated everything to its left stays in the left sibling everything to its right goes into the right sibling the middle key value along with the pointer to the new right sibling is promoted to the parent (the middle key value 'moves' to the parent to become the discriminator between this left and right sibling) 22 33 55 22 26 55 Insert 26 33

19 File Organizations March 2007R McFadyen ACS - 390219 Internal node splitting When a new root is formed, a key value and two pointers must be placed into it. 26 55 Insert 56 26 56 55

20 File Organizations March 2007R McFadyen ACS - 390220 A sample trace Diane, Cory, Ramon, Amy, Miranda, Marshall, Zena, Rhonda, Vincent, Simon, Mary into a b + -tree with p=3. Amy, CoryDiane, Ramon Cory Miranda

21 File Organizations March 2007R McFadyen ACS - 390221 Amy, Cory Cory Diane, Miranda, Ramon Marshall Amy, Cory Diane, MarshallMiranda, Ramon Cory Marshall Zena

22 File Organizations March 2007R McFadyen ACS - 390222 Amy, Cory Diane, MarshallMiranda, Ramon, Zena Cory Marshall Rhonda Amy, Cory Diane, Marshall Rhonda, Zena Cory Marshall Ramon Miranda, Ramon

23 File Organizations March 2007R McFadyen ACS - 390223 Amy, Cory Diane, Marshall Rhonda, Zena Marshall Miranda, Ramon Cory Ramon Vincent

24 File Organizations March 2007R McFadyen ACS - 390224 Amy, Cory Diane, Marshall Rhonda, Vincent,Zena Marshall Miranda, Ramon Cory Ramon Simon

25 File Organizations March 2007R McFadyen ACS - 390225 Marshall Miranda, Ramon Ramon Simon Rhonda, Simon Vincent, Zena Mary

26 File Organizations March 2007R McFadyen ACS - 390226 A sample b + -tree 5 37 8 67 91258 13 Records p = 3, p leaf = 2.

27 File Organizations March 2007R McFadyen ACS - 390227 Searching a b + -tree - search a record with key = 8: 5 37 8 67 91258 13

28 File Organizations March 2007R McFadyen ACS - 390228 Entry deletion - deletion sequence: 8, 12, 9, 7 5 37 9 67 12 59 13 Deleting 8 results in underflow and key redistribution

29 File Organizations March 2007R McFadyen ACS - 390229 Entry deletion - deletion sequence: 8, 12, 9, 7 5 37 67 59 13 12 is removed.

30 File Organizations March 2007R McFadyen ACS - 390230 Entry deletion - deletion sequence: 8, 12, 9, 7 5 36 657 13 9 is removed.

31 File Organizations March 2007R McFadyen ACS - 390231 Entry deletion - deletion sequence: 8, 12, 9, 7 5 36 65 13 Deleting 7 makes this pointer no use. Therefore, a merge at the level above the leaf level occurs.

32 File Organizations March 2007R McFadyen ACS - 390232 Entry deletion - deletion sequence: 8, 12, 9, 7 For this merge, 5 will be taken as a key value in A since any key value in B is less than or equal to 5 but any key value in C is larger than 5. 65 13 535 A B C 5 This pointer becomes useless. The corresponding node should also be removed.

33 File Organizations March 2007R McFadyen ACS - 390233 Entry deletion - deletion sequence: 8, 12, 9, 7 5 136 535

34 File Organizations March 2007R McFadyen ACS - 390234 b + -tree operations search - always the same search length - tree height retrieval - sequential access is facilitated - how? insert - may cause overflow - tree may grow delete - may cause underflow - tree may shrink What do you expect for storage utilization?


Download ppt "File Organizations March 2007R McFadyen ACS - 39021 In SQL Server 2000 Tree terms root, internal, leaf, subtree parent, child, sibling balanced, unbalanced."

Similar presentations


Ads by Google