

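AVL Deletion: Case #1: Left-left due to right deletion (CSE332: Data Abstractions, Spring 2010)

[Figure: before, a (height h+3) has left child b (h+2) and right subtree Z (h); b has subtrees X (h+1) and Y (h). After the rotation, b (h+2) is the root, with left subtree X (h+1) and right child a (h+1), whose subtrees are Y (h) and Z (h).]

- Same single rotation as when an insert in the left-left grandchild caused an imbalance because X became taller
- But here the “height” at the top decreases (from h+3 to h+2), so more rebalancing farther up the tree might still be necessary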
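The single rotation for case 1 can be sketched in Java as follows. This is a minimal illustration, not the course's code: the `Node` class, the stored `height` field, and the convention that an empty subtree has height -1 are all assumptions made for the example.

```java
// Hypothetical node type for illustration; an empty subtree has height -1.
final class AvlSingleRotation {
    static final class Node {
        int key;
        int height;          // height of the subtree rooted at this node
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static int height(Node n) { return n == null ? -1 : n.height; }

    // Recompute a node's height from its children.
    static void update(Node n) {
        n.height = 1 + Math.max(height(n.left), height(n.right));
    }

    // Right rotation at a: a's left child b becomes the subtree root,
    // and b's old right subtree (Y) becomes a's left subtree.
    static Node rotateRight(Node a) {
        Node b = a.left;
        a.left = b.right;
        b.right = a;
        update(a);   // a now sits below b, so recompute it first
        update(b);
        return b;    // caller must re-attach b where a used to hang
    }
}
```

Because the returned subtree can be shorter than the one it replaces, a caller would keep checking balance at each ancestor on the way back up to the root.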

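AVL Deletion: Case #2: Left-right due to right deletion

[Figure: before, a (height h+3) has left child b (h+2) and right subtree Z; b has left subtree X and right child c (h+1), whose subtrees are U and V. After the double rotation, c (h+2) is the root, with left child b (h+1), holding X and U, and right child a (h+1), holding V and Z.]

- Same double rotation as when an insert in the left-right grandchild caused an imbalance because c became taller
- But here the “height” at the top decreases (from h+3 to h+2), so more rebalancing farther up the tree might still be necessary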

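AVL Deletion: Case #3: Case 1 revisited

[Figure: as in case 1, but b's subtrees X and Y both have height h+1. After the single rotation, b (h+3) is the root, with left subtree X (h+1) and right child a (h+2), whose subtrees are Y (h+1) and Z (h).]

- What if both children have the same height (h+1)?
- Do the same as case 1: a single rotation
- Why can’t we do the double rotation from case 2?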
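One way to tie the three cases together is a helper that runs at a node after a deletion in its right subtree and picks the rotation by comparing the heights of the left child's subtrees. The sketch below is an assumption-laden illustration (hypothetical `Node` class, stored heights, empty subtree height -1), not code from the lecture.

```java
// Hypothetical sketch: rebalancing a left-heavy node after a right-side deletion.
final class AvlDeleteRebalance {
    static final class Node {
        int key;
        int height;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static int height(Node n) { return n == null ? -1 : n.height; }

    static void update(Node n) {
        n.height = 1 + Math.max(height(n.left), height(n.right));
    }

    static Node rotateRight(Node a) {
        Node b = a.left;
        a.left = b.right;
        b.right = a;
        update(a);
        update(b);
        return b;
    }

    static Node rotateLeft(Node b) {
        Node c = b.right;
        b.right = c.left;
        c.left = b;
        update(b);
        update(c);
        return c;
    }

    // Returns the new root of the subtree that was rooted at a.
    static Node rebalanceLeftHeavy(Node a) {
        update(a);
        if (height(a.left) - height(a.right) <= 1) {
            return a;                      // still balanced here
        }
        Node b = a.left;
        if (height(b.left) < height(b.right)) {
            a.left = rotateLeft(b);        // case 2: left-right, first half of the double rotation
        }
        // Cases 1 and 3 (left subtree of b at least as tall) need only this rotation.
        return rotateRight(a);
    }
}
```

In case 3 the returned subtree has the same height as before the deletion, so no further rebalancing is needed above it; in cases 1 and 2 the caller would continue checking ancestors.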

B-Trees

- Smaller keys on left, larger on right
- All data in leaves
- Need to decide:
  - # of items per leaf
  - # of items per internal node

B-Tree Operations

Insertion
1. Traverse from the root to the proper leaf. Insert the data in its leaf in sorted order
2. If the leaf now has L+1 items, overflow!
   - Split the leaf into two leaves:
     - Original leaf with ⌈(L+1)/2⌉ items
     - New leaf with ⌊(L+1)/2⌋ items
   - Attach the new child to the parent
     - Adding new key to parent in sorted order
3. If an internal node has M+1 children, overflow!
   - Split the node into two nodes:
     - Original node with ⌈(M+1)/2⌉ children
     - New node with ⌊(M+1)/2⌋ children
   - Attach the new child to the parent
     - Adding new key to parent in sorted order

Splitting at a node (step 3) could make the parent overflow too
- So repeat step 3 up the tree until a node doesn’t overflow
- If the root overflows, make a new root with two children
  - This is the only case that increases the tree height

Deletion
1. Remove the data from its leaf
2. If the leaf now has ⌈L/2⌉ - 1 items, underflow!
   - If a neighbor has > ⌈L/2⌉ items, adopt and update parent
   - Else merge node with neighbor
     - Guaranteed to have a legal number of items
     - Parent now has one fewer child
3. If step (2) caused the parent to have ⌈M/2⌉ - 1 children, underflow!
   - If a neighbor has > ⌈M/2⌉ children, adopt and update parent
   - Else merge node with neighbor
     - Guaranteed to have a legal number of children
     - Parent now has one fewer child, may need to continue up the tree

If we merge all the way up through the root, that’s fine unless the root went from 2 children to 1
- In that case, delete the root and make its child the root
- This is the only case that decreases tree height

Somewhat complex; we won’t go into details…
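Insertion step 2 (leaf overflow and split) can be sketched as below. This is only an illustration: a leaf is modeled as a sorted `List<Integer>`, the capacity `L = 4` is an arbitrary small value, and the sketch assumes the original leaf keeps the larger half, ⌈(L+1)/2⌉ items. Updating the parent's keys is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a B-tree leaf insert with overflow handling.
final class LeafSplitSketch {
    static final int L = 4;   // assumed leaf capacity, kept small for the example

    // Inserts item into the sorted leaf. Returns null if the leaf still fits,
    // otherwise returns the new right-hand leaf produced by the split.
    static List<Integer> insert(List<Integer> leaf, int item) {
        int pos = Collections.binarySearch(leaf, item);
        if (pos < 0) {
            pos = -pos - 1;               // convert "not found" into an insertion point
        }
        leaf.add(pos, item);
        if (leaf.size() <= L) {
            return null;                  // no overflow
        }
        int keep = (L + 2) / 2;           // ceil((L+1)/2) items stay in the original leaf
        List<Integer> right = new ArrayList<>(leaf.subList(keep, leaf.size()));
        leaf.subList(keep, leaf.size()).clear();
        return right;                     // floor((L+1)/2) items move to the new leaf
    }
}
```

With `L = 4`, inserting 25 into a leaf holding 10, 20, 30, 40 leaves 10, 20, 25 in the original leaf and returns a new leaf holding 30, 40; the caller would then attach the new leaf to the parent, which may overflow in turn.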

Aside: Limitations of B-Trees in Java

- The whole point of B-Trees is to minimize disk accesses
- It is worth knowing enough about “how Java works” to understand why B-Trees in Java aren’t what we want
  - Assuming our goal is an efficient number of disk accesses
  - Java has many advantages, but it wasn’t designed for this
- The problem is extra levels of indirection…

One approach

Say we have int keys and some data E

    interface Keyed<E> {
        int key(E e);
    }

    class BTreeNode<E extends Keyed<E>> {
        static final int M = 128;
        int[] keys = new int[M-1];
        BTreeNode<E>[] children = new BTreeNode[M];
        int numChildren = 0;
        // …
    }

    class BTreeLeaf<E> {
        static final int L = 32;
        E[] data = (E[]) new Object[L];
        int numItems = 0;
        // …
    }

The problem: how Java stores stuff in memory

What that looks like

[Figure: memory layout. A BTreeNode is 3 separate objects with “header words”: the node itself, its key array, and its pointer array (both arrays larger than drawn). A BTreeLeaf points to an array of length L (larger than drawn) whose data objects are not in contiguous memory.]

All the red references indicate unnecessary indirection

The moral

- The whole idea behind B trees was to keep related data in contiguous memory
- But that’s “the best you can do” in Java
  - Java’s advantage is generic, reusable code
- C# may have better support for “flattening objects into arrays”
  - C and C++ definitely do
- Levels of indirection matter!

Picking a hash function

- If keys aren’t ints, the client must convert them to an int
  - Trade-off: speed versus distinct keys hashing to distinct ints
- Very important example: Strings
  - Key space: K = s_0 s_1 s_2 … s_{m-1}
  - where the s_i are chars: s_i ∈ [0,51] or s_i ∈ [0,255] or s_i ∈ [0, …]
- Some choices: which avoid collisions best?
  1. h(K) = s_0 % TableSize
  2. h(K) = (s_0 + s_1 + … + s_{m-1}) % TableSize
  3. h(K) = (…) % TableSize
- What causes collisions for each?
  1. Anything with the same first letter
  2. Any rearrangement of letters
  3. Hmm… not so clear
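The first two kinds of collision can be reproduced with a short Java sketch. The first function hashes on the first character only. The second is an assumed reading of choice 2 as the sum of all characters, which is consistent with the "any rearrangement of letters" collision but is not confirmed by the transcript. Class and method names are made up for the example.

```java
// Hypothetical sketch of two weak string hash functions.
final class SimpleStringHashes {
    // Choice 1: only the first character matters.
    static int firstChar(String key, int tableSize) {
        return key.isEmpty() ? 0 : key.charAt(0) % tableSize;
    }

    // Assumed reading of choice 2: add up every character, then reduce.
    static int charSum(String key, int tableSize) {
        int sum = 0;
        for (int i = 0; i < key.length(); i++) {
            sum += key.charAt(i);
        }
        // floorMod keeps the result non-negative even if sum overflowed.
        return Math.floorMod(sum, tableSize);
    }
}
```

For instance, `firstChar` sends "apple" and "avocado" to the same slot, and `charSum` cannot tell "listen" from "silent", because addition ignores character order.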

Java-esque String Hash

- Java characters are in Unicode format; 16 bits each (2^16 possible values)
- The hash of a string s of length n: h(s) = s[0]*31^(n-1) + s[1]*31^(n-2) + … + s[n-1]
- So this would require n-2 + n-3 + n-4 + … multiplications to compute the powers of 31, right?
- Can compute efficiently via a trick called Horner’s Rule:
  - Idea: avoid expensive computation of 31^k
  - Say n=4
  - h = ((s[0]*31 + s[1])*31 + s[2])*31 + s[3]
- Under what circumstances could this hash function prove poor?
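Horner's Rule turns the polynomial into a single loop with one multiplication per character. The sketch below is an illustration with a made-up class name; it relies on Java's `int` arithmetic wrapping on overflow, and it should agree with the documented behavior of `String.hashCode()`.

```java
// Hypothetical sketch: base-31 polynomial string hash via Horner's Rule.
final class HornerHash {
    static int hash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            // Multiply the running value by 31, then add the next character.
            // Overflow wraps around silently, as with any Java int arithmetic.
            h = 31 * h + s.charAt(i);
        }
        return h;
    }
}
```

For a 4-character string the loop computes exactly ((s[0]*31 + s[1])*31 + s[2])*31 + s[3], using 4 multiplications (the first one by zero) instead of recomputing each power of 31.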

Hash functions

A few rules of thumb / tricks:
1. Use all 32 bits (careful, that includes negative numbers)
2. When smashing two hashes into one hash, use bitwise-xor
   - Problem with bitwise AND? Produces too many 0 bits
   - Problem with bitwise OR? Produces too many 1 bits
3. Rely on the expertise of others; consult books and other resources
4. If keys are known ahead of time, choose a perfect hash
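Rules 1 and 2 can be illustrated with a small Java sketch. The class and method names are hypothetical, and the two sample hash values are arbitrary bit patterns chosen so that each has exactly half of its 32 bits set.

```java
// Hypothetical sketch: combining hashes and mapping a hash to a table index.
final class HashCombineSketch {
    // Rule 2: xor keeps roughly the same mix of 0 and 1 bits as its inputs.
    static int combine(int h1, int h2) {
        return h1 ^ h2;
    }

    // Rule 1: a 32-bit hash may be negative, so reduce it with floorMod
    // rather than %, which could yield a negative index.
    static int index(int hash, int tableSize) {
        return Math.floorMod(hash, tableSize);
    }

    public static void main(String[] args) {
        int a = 0x0F0F0F0F;   // 16 one-bits
        int b = 0x00FF00FF;   // 16 one-bits
        System.out.println("and: " + Integer.bitCount(a & b) + " one-bits");
        System.out.println("or:  " + Integer.bitCount(a | b) + " one-bits");
        System.out.println("xor: " + Integer.bitCount(combine(a, b)) + " one-bits");
        System.out.println("index of -7 in a table of 5: " + index(-7, 5));
    }
}
```

For these two inputs, AND leaves 8 one-bits, OR leaves 24, and xor leaves 16, which matches the rule of thumb that AND drifts toward zeros and OR toward ones.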