MA/CSSE 473 Day 30 B Trees Dynamic Programming Binomial Coefficients Warshall's algorithm No in-class quiz today Student questions?

Slides:



Advertisements
Similar presentations
Comp 122, Spring 2004 Binary Search Trees. btrees - 2 Comp 122, Spring 2004 Binary Trees  Recursive definition 1.An empty tree is a binary tree 2.A node.
Advertisements

RAIK 283: Data Structures & Algorithms
Dynamic Programming Dynamic Programming is a general algorithm design technique for solving problems defined by recurrences with overlapping subproblems.
1 Lecture 8: Data structures for databases II Jose M. Peña
Indexes. Primary Indexes Dense Indexes Pointer to every record of a sequential file, (ordered by search key). Can make sense because records may be much.
Indexes. Primary Indexes Dense Indexes Pointer to every record of a sequential file, (ordered by search key). Can make sense because records may be much.
COMP 451/651 Indexes Chapter 1.
CSE332: Data Abstractions Lecture 9: B Trees Dan Grossman Spring 2010.
Dynamic Programming Reading Material: Chapter 7..
Chapter 8 Dynamic Programming Copyright © 2007 Pearson Addison-Wesley. All rights reserved.
Design and Analysis of Algorithms - Chapter 81 Dynamic Programming Dynamic Programming is a general algorithm design technique Dynamic Programming is a.
B-Trees Disk Storage What is a multiway tree? What is a B-tree?
Chapter 8 Dynamic Programming Copyright © 2007 Pearson Addison-Wesley. All rights reserved.
B + -Trees (Part 1) Lecture 20 COMP171 Fall 2006.
B + -Trees (Part 1). Motivation AVL tree with N nodes is an excellent data structure for searching, indexing, etc. –The Big-Oh analysis shows most operations.
B + -Trees (Part 1) COMP171. Slide 2 Main and secondary memories  Secondary storage device is much, much slower than the main RAM  Pages and blocks.
CSE 326: Data Structures B-Trees Ben Lerner Summer 2007.
B-Trees. CSM B-Trees 2 Motivation for B-Trees So far we have assumed that we can store an entire data structure in main memory What if we have so.
Design and Analysis of Algorithms - Chapter 81 Dynamic Programming Dynamic Programming is a general algorithm design techniqueDynamic Programming is a.
Dynamic Programming A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 8 ©2012 Pearson Education, Inc. Upper Saddle River,
B + -Trees COMP171 Fall AVL Trees / Slide 2 Dictionary for Secondary storage * The AVL tree is an excellent dictionary structure when the entire.
CS4432: Database Systems II
1 B-Trees Section AVL (Adelson-Velskii and Landis) Trees AVL tree is binary search tree with balance condition –To ensure depth of the tree is.
Indexing structures for files D ƯƠ NG ANH KHOA-QLU13082.
MA/CSSE 473 Day 28 Hashing review B-tree overview Dynamic Programming.
Indexing. Goals: Store large files Support multiple search keys Support efficient insert, delete, and range queries.
IntroductionIntroduction  Definition of B-trees  Properties  Specialization  Examples  2-3 trees  Insertion of B-tree  Remove items from B-tree.
B-Tree. B-Trees a specialized multi-way tree designed especially for use on disk In a B-tree each node may contain a large number of keys. The number.
More Trees Multiway Trees and 2-4 Trees. Motivation of Multi-way Trees Main memory vs. disk ◦ Assumptions so far: ◦ We have assumed that we can store.
B+ Trees COMP
A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 8 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved.
1 B Trees - Motivation Recall our discussion on AVL-trees –The maximum height of an AVL-tree with n-nodes is log 2 (n) since the branching factor (degree,
Chapter 19: Binary Trees. Objectives In this chapter, you will: – Learn about binary trees – Explore various binary tree traversal algorithms – Organize.
CSCE350 Algorithms and Data Structure Lecture 17 Jianjun Hu Department of Computer Science and Engineering University of South Carolina
Dynamic Programming Dynamic Programming Dynamic Programming is a general algorithm design technique for solving problems defined by or formulated.
B-Trees. CSM B-Trees 2 Motivation for B-Trees So far we have assumed that we can store an entire data structure in main memory What if we have so.
1 B-Trees & (a,b)-Trees CS 6310: Advanced Data Structures Western Michigan University Presented by: Lawrence Kalisz.
CSC401: Analysis of Algorithms CSC401 – Analysis of Algorithms Chapter Dynamic Programming Objectives: Present the Dynamic Programming paradigm.
B + -Trees. Motivation An AVL tree with N nodes is an excellent data structure for searching, indexing, etc. The Big-Oh analysis shows that most operations.
B-Trees. Motivation for B-Trees So far we have assumed that we can store an entire data structure in main memory What if we have so much data that it.
Starting at Binary Trees
MA/CSSE 473 Day 28 Dynamic Programming Binomial Coefficients Warshall's algorithm Student questions?
1 Tree Indexing (1) Linear index is poor for insertion/deletion. Tree index can efficiently support all desired operations: –Insert/delete –Multiple search.
Arboles B External Search The algorithms we have seen so far are good when all data are stored in primary storage device (RAM). Its access is fast(er)
IKI 10100: Data Structures & Algorithms Ruli Manurung (acknowledgments to Denny & Ade Azurat) 1 Fasilkom UI Ruli Manurung (Fasilkom UI)IKI10100: Lecture17.
CompSci 100E 39.1 Memory Model  For this course: Assume Uniform Access Time  All elements in an array accessible with same time cost  Reality is somewhat.
Lecture 11COMPSCI.220.FS.T Balancing an AVLTree Two mirror-symmetric pairs of cases to rebalance the tree if after the insertion of a new key to.
Indexes. Primary Indexes Dense Indexes Pointer to every record of a sequential file, (ordered by search key). Can make sense because records may be much.
+ David Kauchak cs312 Review. + Midterm Will be posted online this afternoon You will have 2 hours to take it watch your time! if you get stuck on a problem,
Week 10 - Friday.  What did we talk about last time?  Graph representations  Adjacency matrix  Adjacency lists  Depth first search.
 B-tree is a specialized multiway tree designed especially for use on disk  B-Tree consists of a root node, branch nodes and leaf nodes containing the.
CIS 068 Welcome to CIS 068 ! Lesson 12: Data Structures 3 Trees.
B-TREE. Motivation for B-Trees So far we have assumed that we can store an entire data structure in main memory What if we have so much data that it won’t.
8/3/2007CMSC 341 BTrees1 CMSC 341 B- Trees D. Frey with apologies to Tom Anastasio.
1 Trees. 2 Trees Trees. Binary Trees Tree Traversal.
MA/CSSE 473 Day 27 Dynamic Programming Binomial Coefficients
MA/CSSE 473 Day 26 Student questions Boyer-Moore B Trees.
Multiway Search Trees Data may not fit into main memory
Source Code for Data Structures and Algorithm Analysis in C (Second Edition) – by Weiss
B+-Trees.
Dynamic Programming Dynamic Programming is a general algorithm design technique for solving problems defined by recurrences with overlapping subproblems.
Dynamic Programming Dynamic Programming is a general algorithm design technique for solving problems defined by recurrences with overlapping subproblems.
Chapter 8 Dynamic Programming
B+-Trees.
Hashing Exercises.
Tree data structure.
Wednesday, April 18, 2018 Announcements… For Today…
Tree data structure.
Chapter 8 Dynamic Programming
Data Structures Using C++ 2E
Presentation transcript:

MA/CSSE 473 Day 30 B Trees Dynamic Programming Binomial Coefficients Warshall's algorithm No in-class quiz today Student questions?

B-trees We will do a quick overview. For the whole scoop on B-trees (Actually B+ trees), take CSSE 433, Advanced Databases. Nodes can contain multiple keys and pointers to other to subtrees B Tree Animation applet: –

B-tree nodes Each node can represent a block of disk storage; pointers are disk addresses This way, when we look up a node (requiring a disk access), we can get a lot more information than if we used a binary tree In an n-node of a B-tree, there are n pointers to subtrees, and thus n-1 keys For all keys in T i, K i ≤ T i < K i+1 K i is the smallest key that appears in T i

B-tree nodes (tree of order m) All nodes have at most m-1 keys All keys and associated data are stored in special leaf nodes (that thus need no child pointers) The other (parent) nodes are index nodes All index nodes except the root have between  m/2  and m children root has between 2 and m children All leaves are at the same level The space-time tradeoff is because of duplicating some keys at multiple levels of the tree Especially useful for data that is too big to fit in memory. Why? Example on next slide

Example B-tree(order 4)

Search for an item Within each parent or leaf node, the keys are sorted, so we can use binary search (log m), which is a constant with respect to n, the number of items in the table Thus the search time is proportional to the height of the tree Max height is approximately log  m/2  n Exercise for you: Read and understand the straightforward analysis on pages Insert and delete are also proportional to height of the tree

Preview: Dynamic programming Used for problems with overlapping subproblems Typically, we save (memoize) solutions to the subproblems, to avoid recomputing them.

Dynamic Programming Example Binomial Coefficients: C(n, k) is the coefficient of x k in the expansion of (1+x) n C(n,0) = C(n, n) = 1. If 0 < k < n, C(n, k) = C(n-1, k) + C(n-1, k-1) Can show by induction that the "usual" factorial formula for C(n, k) follows from this recursive definition. –A good practice problem for you If we don't cache values as we compute them, this can take a lot of time, because of duplicate (overlapping) computation.

Computing a binomial coefficient Binomial coefficients are coefficients of the binomial formula: (a + b) n = C(n,0)a n b C(n,k)a n-k b k C(n,n)a 0 b n Recurrence: C(n,k) = C(n-1,k) + C(n-1,k-1) for n > k > 0 C(n,0) = 1, C(n,n) = 1 for n  0 Value of C(n,k) can be computed by filling in a table: k-1 k n-1 C(n-1,k-1) C(n-1,k) n C(n,k)

Computing C(n, k): Time efficiency: Θ(nk) Space efficiency: Θ(nk) If we are computing C(n, k) for many different n and k values, we could cache the table between calls.

Transitive closure of a directed graph We ask this question for a given directed graph G: for each of vertices, (A,B), is there a path from A to B in G? Start with the boolean adjacency matrix A for the n-node graph G. A[i][j] is 1 if and only if G has a directed edge from node i to node j. The transitive closure of G is the boolean matrix T such that T[i][j] is 1 iff there is a nontrivial directed path from node i to node j in G. If we use boolean adjacency matrices, what does M 2 represent? M 3 ? In boolean matrix multiplication, + stands for or, and * stands for and

Transitive closure via multiplication Again, using + for or, we get T = M + M 2 + M 3 + … Can we limit it to a finite operation? We can stop at M n-1. –How do we know this? Number of numeric multiplications for solving the whole problem?

Warshall's algorithm Similar to binomial coefficients algorithm Assumes that the vertices have been numbered 1, 2, …, n Define the boolean matrix R (k) as follows: –R (k) [i][j] is 1 iff there is a path in the directed graph v i =w 0  w 1  …  w s =v j, where s >=1, and for all t = 1, …, s-1, w t is v m for some m ≤ k i.e, none of the intermediate vertices are numbered higher than k Note that the transitive closure T is R (n)

R (k) example R (k) [i][j] is 1 iff there is a path in the directed graph v i =w 0  w 1  …  w s =v j, where –s >1, and –for all t = 2, …, s-1, w t is v m for some m ≤ k Example: assuming that the node numbering is in alphabetical order, calculate R (0), R (1), and R (2)

Quickly Calculating R (k) Back to the matrix multiplication approach: –How much time did it take to compute A k [i][j], once we have A k-1 ? Can we do better when calculating R (k) [i][j] from R (k-1) ? How can R (k) [i][j] be 1? –either R (k-1) [i][j] is 1, or –there is a path from i to k that uses no vertices higher than k- 1, and a similar path from k to j. Thus R (k) [i][j] is R (k-1) [i][j] or ( R (k-1) [i][k] and R (k-1) [k][j] ) Note that this can be calculated in constant time Time for calculating R (k) from R (k-1) ? Total time for Warshall's algorithm? Code and example on next slides