Chapter 10: 2-3-4 Trees and External Storage © John Urrutia 2014, All Rights Reserved.


1 Chapter 10 2-3-4 Trees and External Storage

2 2-3-4 Trees
Binary tree
  Each parent node may have up to 2 children.
  Each node holds only 1 data item.
Multi-way tree (2-3-4)
  Each non-leaf node must have 2 to 4 children.
  The maximum number of children is called the order of the tree.
  Each node holds at least 1 data item and can hold up to 3.
2-3-4 trees are self-balancing, just like red-black trees.

3 2-3-4 Trees (the Rules)
Leaf nodes have no children.
All leaf nodes are always at the same level.
Every leaf node must hold at least 1 data item and may hold as many as 3.
[Figure: example 2-3-4 tree. Root: 50. Second level: 30 | 60 70 80. Leaves: 10 20 | 40 | 55 | 62 64 66 | 75 | 83 86]

4 2-3-4 Trees (the Rules)
Non-leaf nodes: the number of data items in the node dictates the number of children.
  1 data item: exactly 2 children
  2 data items: exactly 3 children
  3 data items: exactly 4 children
This relationship sets the structure of the tree.
Empty nodes are not allowed.

5 2-3-4 Trees (the Rules)
Nodes are named for their number of links:
  2 links: 2-node
  3 links: 3-node
  4 links: 4-node
Unlike binary trees, 2-3-4 trees do not have nodes with only 1 link.

6 2-3-4 Tree Organization
Data items are numbered 0, 1, 2 and are stored in ascending sequence.
Child links are numbered 0, 1, 2, 3.
  All data in the child at link 0 have values < data item 0.
  All data in the child at link 1 have values > data item 0 but < data item 1.
  All data in the child at link 2 have values > data item 1 but < data item 2.
  All data in the child at link 3 have values > data item 2.

7 2-3-4 Tree Organization
[Figure: a node holding data items 50, 75, 95 (items 0, 1, 2) whose links 0, 1, 2, 3 lead to the children 30 35 | 55 | 78 | 100 105]

8 2-3-4 Tree Organization
The same ordering holds at every node: the child at link i holds only values between data item i-1 and data item i, with no lower bound for link 0 and no upper bound for the last link.
Duplicate values are normally not permitted.

9 Keys & Children
[Figure: a node with keys A, B, C and four children holding, from left to right: keys < A | A < keys < B | B < keys < C | keys > C]

10 Searching 2-3-4 Trees
Search for the value (64) in the root.
If it is not there, select the link whose range contains 64. The root holds only 50, and 64 > 50, so follow link 1.
Search the child at link 1 and repeat as necessary until the value is found or a leaf node has been searched.
[Figure: example 2-3-4 tree. Root: 50. Second level: 30 | 60 70 80. Leaves: 10 20 | 40 | 55 | 62 64 66 | 75 | 83 86]
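The search steps above can be sketched in C# roughly as follows. This is a minimal illustration, not the code from the walk-through: `SearchNode`, `Tree234Search`, `NextChildIndex` and `Contains` are hypothetical names, and nodes are modeled as plain arrays rather than the Node class shown later.

```csharp
using System;

// Minimal stand-in for a 2-3-4 node: sorted keys plus child links.
// All names here are illustrative, not the slides' Node class.
public sealed class SearchNode
{
    public long[] Keys;           // 1 to 3 keys in ascending order
    public SearchNode[] Children; // null for a leaf, otherwise Keys.Length + 1 links

    public SearchNode(long[] keys, SearchNode[] children = null)
    {
        Keys = keys;
        Children = children;
    }

    public bool IsLeaf => Children == null;
}

public static class Tree234Search
{
    // Index of the link to follow: stop at the first key that is not
    // smaller than the target; if every key is smaller, take the last link.
    public static int NextChildIndex(SearchNode node, long target)
    {
        int i = 0;
        while (i < node.Keys.Length && target > node.Keys[i]) i++;
        return i;
    }

    // Returns true if target is stored in the tree rooted at root.
    public static bool Contains(SearchNode root, long target)
    {
        SearchNode current = root;
        while (current != null)
        {
            if (Array.IndexOf(current.Keys, target) >= 0) return true; // found in this node
            if (current.IsLeaf) return false;                          // nowhere left to look
            current = current.Children[NextChildIndex(current, target)];
        }
        return false;
    }
}
```

For a root holding only 50, `NextChildIndex(root, 64)` selects link 1, which matches the search for 64 described above.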

11 Inserting 2-3-4 Trees
Insertion always occurs in a leaf node.
Starting at the root, select the first link whose data item is > the insert value (or the last link if there is none).
Navigate to the node at that link.
  If the node is full, split it.
  If not, follow the link to the next level.
Repeat as necessary until the appropriate leaf is found.
  If the leaf is full, split it into two and insert the value.
  If not, insert the value.

12 Inserting 2-3-4 Trees
The simple process:
  Find the leaf node that should contain the new value.
  If the node isn't full, simply insert the value.
[Figure: inserting 18. Root: 28 55. Second level: 11 | 42 | 74. Leaves: 05 09 | 13 23 | 30 | 44 47 | 63 67 72 | 97. The leaf 13 23 becomes 13 18 23.]

13 Inserting 2-3-4 Trees
Splitting a full node (insert 25): create a new node.
[Figure: parent 10; full node 40 50 60 with children 39 | 41 | 52 | 63; new empty node]

14 Inserting 2-3-4 Trees
Splitting a full node (insert 25): move 50 to the parent.
[Figure: parent 10 50; node 40 _ 60 with children 39 | 41 | 52 | 63; new empty node]

15 Inserting 2-3-4 Trees
Splitting a full node (insert 25): move 60 to the new node, together with its children.
[Figure: parent 10 50; node 40 with children 39 | 41; new node 60 with children 52 | 63]

16 Inserting 2-3-4 Trees
Splitting the root node: create 2 new nodes, one each for the left and right children. The middle item becomes the new root.
[Figure: full root 10 50 90 with children 9 | 41 | 52 | 91]

17 Inserting 2-3-4 Trees
Splitting the root node: create 2 new nodes, one each for the left and right children. The middle item becomes the new root.
[Figure: full root 10 50 90 with children 9 | 41 | 52 | 91; new nodes 10 and 90]

18 Inserting 2-3-4 Trees
Splitting the root node: create 2 new nodes, one each for the left and right children. The middle item becomes the new root.
[Figure: new root 50 with children 10 and 90; node 10 has children 9 | 41; node 90 has children 52 | 91]
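The insertion and split steps on the preceding slides can be sketched as below. This is a compact illustration under stated assumptions, not the class from the code walk-through: `Tree234`, `SplitChild`, `ChildIndex`, `Height` and `InOrder` are hypothetical names, nodes use `List<T>` instead of fixed arrays, duplicates are not rejected, and a root split reuses the old root as the left child (so it allocates a new root and one sibling rather than two fresh children).

```csharp
using System;
using System.Collections.Generic;

// Compact 2-3-4 tree sketch with top-down splitting (illustrative names).
public sealed class Tree234
{
    private sealed class Node
    {
        public readonly List<long> Keys = new List<long>();     // up to 3 sorted keys
        public readonly List<Node> Children = new List<Node>(); // empty for a leaf
        public bool IsLeaf => Children.Count == 0;
        public bool IsFull => Keys.Count == 3;
    }

    private Node root = new Node();
    public int Height { get; private set; } = 1;

    public void Insert(long value)
    {
        if (root.IsFull)
        {
            // Root split: the middle key moves into a new root.
            // This is the only place where the tree grows taller.
            var newRoot = new Node();
            newRoot.Children.Add(root);
            SplitChild(newRoot, 0);
            root = newRoot;
            Height++;
        }
        Node current = root;
        while (!current.IsLeaf)
        {
            int i = ChildIndex(current, value);
            if (current.Children[i].IsFull)
            {
                SplitChild(current, i);         // current itself is never full here
                i = ChildIndex(current, value); // a key moved up, so pick the link again
            }
            current = current.Children[i];
        }
        current.Keys.Insert(ChildIndex(current, value), value); // leaf has room
    }

    // Number of keys in node that are smaller than value,
    // which is also the index of the link to follow.
    private static int ChildIndex(Node node, long value)
    {
        int i = 0;
        while (i < node.Keys.Count && value > node.Keys[i]) i++;
        return i;
    }

    // Splits the full child at parent.Children[i]: the middle key moves up
    // into the parent, the largest key and the two rightmost children move
    // into a new sibling placed immediately to the right.
    private static void SplitChild(Node parent, int i)
    {
        Node left = parent.Children[i];
        var right = new Node();
        long middle = left.Keys[1];
        right.Keys.Add(left.Keys[2]);
        if (!left.IsLeaf)
        {
            right.Children.Add(left.Children[2]);
            right.Children.Add(left.Children[3]);
            left.Children.RemoveRange(2, 2);
        }
        left.Keys.RemoveRange(1, 2);
        parent.Keys.Insert(i, middle);
        parent.Children.Insert(i + 1, right);
    }

    // Keys in ascending order, handy for checking the tree.
    public List<long> InOrder()
    {
        var result = new List<long>();
        Walk(root, result);
        return result;
    }

    private static void Walk(Node node, List<long> output)
    {
        for (int k = 0; k < node.Keys.Count; k++)
        {
            if (!node.IsLeaf) Walk(node.Children[k], output);
            output.Add(node.Keys[k]);
        }
        if (!node.IsLeaf) Walk(node.Children[node.Keys.Count], output);
    }
}
```

Inserting 10, 50 and 90 fills the root; a fourth insert then forces the root split shown on slides 16 to 18, leaving 50 alone in the new root.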

19 2-3-4 DataItem Class
    class DataItem
    {
        public long dData;                          // the key value held by this item

        public DataItem(long dd) { dData = dd; }

        // Writes the item as "/" followed by its value
        public void displayItem() { Console.Write("/" + dData); }
    }

20 2-3-4 Node Class: Data
    class Node
    {
        private const int ORDER = 4;                              // maximum number of children
        private int numItems;                                     // data items currently stored
        private Node parent;
        private Node[] childArray = new Node[ORDER];              // up to 4 child links
        private DataItem[] itemArray = new DataItem[ORDER - 1];   // up to 3 data items
        //--------------------------------------------

21 2-3-4 Node Class: Node Methods
    public void connectChild(int childNum, Node child)
    public Node disconnectChild(int childNum)
    public Node getChild(int childNum)
    public Node getParent()

22 2-3-4 Node Class: Data Methods
    public DataItem getItem(int index)
    public int insertItem(DataItem newItem)
    public DataItem removeItem()

23 2-3-4 Node Class: Utility Methods
    public Boolean isFull()
    public Boolean isLeaf()
    public int getNumItems()
    public int findItem(long key)
    public void displayNode()
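The slides list only the method signatures. One possible way to fill in the data methods is sketched below. The bodies are assumptions, not the author's code: the class is renamed `ItemNode` to mark it as a stand-in, `removeItem()` is assumed to remove the largest item (the signature takes no index), and the exceptions on a full or empty node are a choice made for this sketch.

```csharp
using System;

public class DataItem
{
    public long dData;
    public DataItem(long dd) { dData = dd; }
}

// Stand-in for the data-handling part of the slides' Node class.
public class ItemNode
{
    private const int ORDER = 4;
    private int numItems;
    private readonly DataItem[] itemArray = new DataItem[ORDER - 1];

    public bool isFull() { return numItems == ORDER - 1; }
    public int getNumItems() { return numItems; }
    public DataItem getItem(int index) { return itemArray[index]; }

    // Returns the index of key within this node, or -1 if it is absent.
    public int findItem(long key)
    {
        for (int j = 0; j < numItems; j++)
            if (itemArray[j].dData == key) return j;
        return -1;
    }

    // Inserts into a non-full node, shifting larger items one slot right
    // so the array stays in ascending order; returns the landing index.
    public int insertItem(DataItem newItem)
    {
        if (isFull()) throw new InvalidOperationException("node is full");
        int j = numItems - 1;
        while (j >= 0 && itemArray[j].dData > newItem.dData)
        {
            itemArray[j + 1] = itemArray[j];
            j--;
        }
        itemArray[j + 1] = newItem;
        numItems++;
        return j + 1;
    }

    // Assumption: removes and returns the largest item in the node.
    public DataItem removeItem()
    {
        if (numItems == 0) throw new InvalidOperationException("node is empty");
        DataItem largest = itemArray[numItems - 1];
        itemArray[numItems - 1] = null;
        numItems--;
        return largest;
    }
}
```

Keeping the items sorted on insert is what lets a search compare the key against items 0, 1, 2 in order and pick the matching link.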

24 2-3-4 Tree Class
    private Node root = new Node();
    public int find(long key)
    public void insert(long dValue)
    public void split(Node thisNode)
    public Node getNextChild(Node theNode, long Value)
    public void displayTree()
    private void recDisplayTree(Node thisNode, int level, int childNumber)

25 2-3-4 Tree Class
Code walk-through

26 2-3-4 Trees & Red-Black Trees
2-3-4 trees don't look like red-black trees, or do they?
Red-black trees were developed after 2-3-4 trees.
Because the two are isomorphic, we can transform a 2-3-4 tree into a red-black tree using these rules:
  Transform any 2-node in the 2-3-4 tree into a black node in the red-black tree.
  Transform any 3-node into a black parent node with a red child node.
  Transform any 4-node into a black parent with two red children.

27 2-3-4 Trees & Red-Black Trees
[Figure: 2-node. The node 41 becomes the single node 41.]

28 2-3-4 Trees & Red-Black Trees
[Figure: 3-node. The node 41 52 becomes either parent 41 with child 52 or parent 52 with child 41. Either is okay.]

29 2-3-4 Trees & Red-Black Trees
[Figure: 4-node. The node 41 52 63 becomes parent 52 with children 41 and 63.]
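The three transformations can be written down for a single node as in the sketch below. It is only an illustration: `RbNode`, `NodeToRedBlack.Convert` and the `leanLeft` flag are hypothetical names, the colors follow the rule that the parent is black and its new children are red, and the subtrees hanging below the 2-3-4 node are left out.

```csharp
using System;

public sealed class RbNode
{
    public long Key;
    public bool IsRed;
    public RbNode Left, Right;
    public RbNode(long key, bool isRed) { Key = key; IsRed = isRed; }
}

public static class NodeToRedBlack
{
    // keys: the 1 to 3 sorted keys of one 2-3-4 node.
    // leanLeft picks one of the two valid shapes for a 3-node.
    public static RbNode Convert(long[] keys, bool leanLeft = true)
    {
        if (keys.Length == 1)
        {
            // 2-node: a single black node
            return new RbNode(keys[0], false);
        }
        if (keys.Length == 2)
        {
            // 3-node: black parent with one red child, on either side
            if (leanLeft)
                return new RbNode(keys[1], false) { Left = new RbNode(keys[0], true) };
            return new RbNode(keys[0], false) { Right = new RbNode(keys[1], true) };
        }
        if (keys.Length == 3)
        {
            // 4-node: the middle key is the black parent of two red children
            return new RbNode(keys[1], false)
            {
                Left = new RbNode(keys[0], true),
                Right = new RbNode(keys[2], true)
            };
        }
        throw new ArgumentException("a 2-3-4 node holds 1 to 3 keys");
    }
}
```

Calling `Convert` on the keys 41, 52 with `leanLeft` set to true and then false yields the two shapes that slide 28 marks as equally acceptable.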

30 2-3-4 Trees & Red-Black Trees
Color flips are the same as a 4-node split.
Rotations correspond to switching a 3-node between its two possible orientations.
  A right rotation applies when the red child is on the left.
  A left rotation applies when the red child is on the right.
Efficiency: with some slight differences, the two trees are roughly the same.

31 2-3 Trees
Created by J. E. Hopcroft in 1970.
Similar to 2-3-4 trees, except that a node holds 1 or 2 data items and a non-leaf node has 2 or 3 children (a leaf has none).
The split process is similar but cannot happen on the way down to the insertion point.
After insertion, splits percolate up the tree to maintain balance.

32 External Storage
Processor speed is rated in clock speed (gigahertz) or in instructions per second (MIPS or FLOPS).
  2.67 gigahertz = 2,670,000,000 ticks per sec.
  Approx. 333,000,000 instructions per sec. (about 8 ticks per instruction)
The most expensive operation a system performs is I/O.
  Approx. 1,100,000 bytes per sec.
  Moving one byte takes roughly 300 times as long as an average instruction.

33 External Storage

34 Disk Organization
Data terms: Block, Buffer, Cylinder, Sector, Track, Partition
Operation terms: Seek, Read, Write, Transfer

35 External Storage: Data Terms
Block: the amount of data transferred in one I/O.
Buffer: RAM used to store one or more blocks of data, usually a multiple of the sector size (4, 8, 16, 32 KB).
Cluster: the set of blocks that matches the I/O buffer size and is read or written together.
Cylinder: the set of tracks simultaneously accessible by the read/write heads.
Sector: the physical area on a platter that holds one block.
Track: the circle scribed by the read/write head.
Partition: a logical division on a disk drive.

36 External Storage: Disk Organization, Data Terms
[Figure: a platter with a block/sector and a track labeled]

37 External Storage: Disk Organization, Data Terms
[Figure: a cylinder]

38 External Storage: Operation Terms
Seek: the physical movement of the read/write head to a particular cylinder on the platter.
Read: the process of retrieving data from the drive.
Write: the process of storing data on the drive.
Transfer: the movement of data to or from the drive.

39 External Storage: Disk Specifications
Manufacturer            Seagate Technology
Model                   ST9250410AS
Spindle speed           7200 rpm
Avg. latency            4.17 msec
I/O data transfer rate  3.0 Gbits/sec (max)
T2T seek time (read)    1.5 msec
Avg. seek (read)        11.0 msec
Avg. seek (write)       13.0 msec

40 External Storage: Disk Organization
Bytes/Sector     512
Sectors/Track    63
Size             232.88 GB (250,056,737,280 bytes)
Total Cylinders  30,401
Total Sectors    488,392,065
Total Tracks     7,752,255
Tracks/Cylinder  2
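Several of the figures in these two tables follow from one another, which the short sketch below checks. The helper names (`DiskMath`, `AvgLatencyMs`, `CapacityBytes`, `AvgRandomAccessMs`) are hypothetical, and the access-time estimate is a simplification that assumes seek time plus rotational latency dominate and ignores transfer time.

```csharp
using System;

public static class DiskMath
{
    // Average rotational latency in milliseconds: on average the head
    // waits half a revolution for the sector to arrive.
    public static double AvgLatencyMs(double rpm)
    {
        double msPerRevolution = 60000.0 / rpm;
        return msPerRevolution / 2.0;
    }

    // Drive capacity in bytes from the sector count and sector size.
    public static long CapacityBytes(long totalSectors, int bytesPerSector)
    {
        return totalSectors * bytesPerSector;
    }

    // Rough cost of one random block access: average seek plus average
    // rotational latency (transfer time is ignored in this sketch).
    public static double AvgRandomAccessMs(double avgSeekMs, double rpm)
    {
        return avgSeekMs + AvgLatencyMs(rpm);
    }
}
```

With the values above, `AvgLatencyMs(7200)` comes to about 4.17 msec and `CapacityBytes(488392065, 512)` to 250,056,737,280 bytes, both matching the tables. `AvgRandomAccessMs(11.0, 7200)` gives roughly 15 msec per random read under the stated simplification.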

41 External Storage: File System Organization
Sequential access
  Stream of bytes blocked together.
  Must be read in sequential order, beginning to end or vice versa.
  Can only add data to either end of the file.
  Can't delete records without copying the entire file.
Direct (random) access
  Data organized into record blocks based on a key value.
  Can be read sequentially or randomly by record.
  Can add or delete anywhere in the file, provided there is room.

42 B-Trees and I/O
We structure our B-tree so that the data in each node corresponds to the size of a disk cluster.
We use the key values to designate the cluster that contains the data.
This gives us access to any record in about log_m N steps, where N is the number of records in the dataset and m is the number of children per node.
Each level in the tree requires 1 I/O when searching for a prospective record.
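To make the logarithm concrete, the sketch below counts how many levels, and therefore how many I/Os per search, a tree needs for a given number of records. `BTreeMath.LevelsNeeded` is a hypothetical helper, and it assumes every node is completely full, so it gives a lower bound rather than an exact figure. The fan-out of 100 used in the note afterwards is an illustrative value, not one taken from the slides.

```csharp
using System;

public static class BTreeMath
{
    // Smallest number of levels h such that fanout^h >= records,
    // never less than 1 because at least one node must be read.
    public static int LevelsNeeded(long records, int fanout)
    {
        if (records < 1 || fanout < 2) throw new ArgumentOutOfRangeException();
        int levels = 0;
        long reach = 1; // records addressable with the levels counted so far
        while (reach < records)
        {
            levels++;
            if (reach > long.MaxValue / fanout) break; // next multiply would overflow
            reach *= fanout;
        }
        return Math.Max(1, levels);
    }
}
```

Under these assumptions `LevelsNeeded(1000000, 100)` returns 3, while a binary fan-out, `LevelsNeeded(1000000, 2)`, returns 20. That gap is why wide nodes sized to a disk cluster matter when each level costs one I/O.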

43 Summary
A multiway tree has more keys and children than a binary tree.
A 2-3-4 tree is a multiway tree with up to three keys and four children per node.
In a multiway tree, the keys in a node are arranged in ascending order.
In a 2-3-4 tree, all insertions are made in leaf nodes, and all leaf nodes are on the same level.

44 Summary
Three kinds of nodes are possible in a 2-3-4 tree:
  A 2-node has one key and two children.
  A 3-node has two keys and three children.
  A 4-node has three keys and four children.
There is no 1-node in a 2-3-4 tree.
In a search in a 2-3-4 tree, the keys are examined at each node. If the search key is not found, the next node will be:
  Child 0 if the search key is less than key 0.
  Child 1 if the search key is between key 0 and key 1.
  Child 2 if the search key is between key 1 and key 2.
  Child 3 if the search key is greater than key 2.

45 Summary
Insertion in a 2-3-4 tree requires that any full node be split on the way down the tree, during the search for the insertion point.
Splitting the root creates two new nodes.
Splitting any other node creates one new node.
The height of a 2-3-4 tree only increases when the root is split.

46 Summary
There is a one-to-one correspondence between a 2-3-4 tree and a red-black tree.
To transform a 2-3-4 tree into a red-black tree:
  Make each 2-node into a black node.
  Make each 3-node into a black parent with a red child.
  Make each 4-node into a black parent with two red children.

47 Summary
When a 3-node is transformed into a parent and child, either node can become the parent.
Splitting a node in a 2-3-4 tree is the same as performing a color flip in a red-black tree.
A rotation in a red-black tree corresponds to changing between the two possible orientations (slants) when transforming a 3-node.

48 Summary
The height of a 2-3-4 tree holding N items is at most log2(N+1) levels, and as little as log4(N+1) levels when every node is full.
Search times are proportional to the height.
The 2-3-4 tree wastes space because many nodes are not even half full.

49 Summary
A 2-3 tree is similar to a 2-3-4 tree, except that a node can have only one or two data items and a non-leaf node has two or three children.
Insertion in a 2-3 tree involves finding the appropriate leaf and then performing splits from the leaf upward, until a non-full node is found.

50 Summary
External storage means storing data outside of main memory, usually on a disk.
External storage is larger, cheaper (per byte), and slower than main memory.
Data in external storage is typically transferred to and from main memory a block at a time.
Data can be arranged in external storage in sequential key order. This gives fast search times but slow insertion (and deletion) times.

51 Summary
A B-tree is a multiway tree in which each node may have dozens or hundreds of keys and children.
There is always one more child than there are keys in a node.
For the best performance, a B-tree is typically organized so that a node holds one block of data.
If the search criteria involve many keys, a sequential search of all the records in a file may be the most practical approach.

