15-853: Algorithms in the Real World Locality I: Cache-aware algorithms – Introduction – Sorting – List ranking – B-trees – Buffer trees.

RAM Model
Standard theoretical model for analyzing algorithms:
– Infinite memory size
– Uniform access cost
– Evaluate an algorithm by the number of instructions executed

Real Machine
Example: Nehalem
– CPU: ~0.4 ns / instruction, 8 registers
– L1 cache: size 64 KB, line size 64 B, access time 1.5 ns
– L2 cache: size 256 KB, line size 64 B, access time 5 ns
– L3 cache: size 24 MB, line size 64 B, access time 20 ns
– Memory: access time 100 ns
– Disc: access time ~4 ms = 4×10^6 ns
The cost of transferring data is important ⇒ design algorithms with locality in mind.

I/O Model
Abstracts a single level of the memory hierarchy:
– Fast memory (cache) of size M
– Accessing fast memory is free, but moving data from slow memory is expensive
– Memory is grouped into size-B blocks of contiguous data
Cost: the number of block transfers (or I/Os) from slow memory to fast memory.

Notation Clarification
M: the number of objects that fit in memory
B: the number of objects that fit in a block
So for word-size (8-byte) objects and a memory size of 1 Mbyte, M ≈ 128,000.

Why a 2-Level Hierarchy?
– It's simpler than considering the multilevel hierarchy
– A single level may dominate the runtime of the application, so designing an algorithm for that level may be sufficient
– Considering a single level exposes the algorithmic difficulties; generalizing to a multilevel hierarchy is often straightforward
– We'll see cache-oblivious algorithms later as a way of designing for multilevel hierarchies

What Improvement Do We Get?
Examples:
– Adding all the elements in a size-N array (scanning)
– Sorting a size-N array
– Searching a size-N data set in a good data structure

Problem     RAM Algorithm      I/O Algorithm
Scanning    Θ(N)               Θ(N/B)
Sorting     Θ(N log_2 N)       Θ((N/B) log_{M/B}(N/B))
Searching   Θ(log_2 N)         Θ(log_B N)
Permuting   Θ(N)               Θ(min(N, sort(N)))

For 8-byte words on the example Intel processor:
– B ≈ 8 in L2 cache, B ≈ 1000 on disc
– log_2 B ≈ 3 in L2 cache, log_2 B ≈ 10 on disc
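To make the table concrete, here is a small back-of-the-envelope calculator (my own illustration, not from the slides; the function name and the example parameters are arbitrary) that evaluates the four bounds while ignoring constant factors:

```python
import math

def io_bounds(N, M, B):
    """Evaluate the RAM-model vs. I/O-model bounds from the table above,
    ignoring constant factors."""
    n = N / B                          # number of blocks occupied by the input
    sort_io = n * math.log(n, M / B)   # (N/B) log_{M/B}(N/B)
    return {
        "scanning":  (N,                n),
        "sorting":   (N * math.log2(N), sort_io),
        "searching": (math.log2(N),     math.log(N, B)),
        "permuting": (N,                min(N, sort_io)),
    }

# Example: 8-byte words, 1 GiB of data, a 24 MiB cache, 64-byte cache lines.
for problem, (ram, io) in io_bounds(N=2**27, M=3 * 2**20, B=8).items():
    print(f"{problem:10s} RAM ~{ram:.3g}   I/O ~{io:.3g}")
```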

Sorting
Standard MergeSort algorithm:
– Split the array in half
– MergeSort each subarray
– Merge the sorted subarrays
Number of computations is O(N log N) on an N-element array.
How does the standard algorithm behave in the I/O model?

Merging
A size-N array occupies at most N/B + 1 blocks.
Each block is loaded once during the merge, assuming memory size M ≥ 3B.
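As a minimal sketch (not from the slides; merge_count_io and its parameters are my own names), here is a two-way merge that also counts how many size-B blocks it reads; each block of each run is transferred exactly once:

```python
import math

def merge_count_io(run_a, run_b, B):
    """Merge two sorted runs and count the size-B block loads this needs.

    Each run is consumed sequentially, so every one of its ceil(len/B)
    blocks is transferred exactly once, provided the fast memory holds one
    block per input run plus one output block (M >= 3B)."""
    loads = math.ceil(len(run_a) / B) + math.ceil(len(run_b) / B)
    out, i, j = [], 0, 0
    while i < len(run_a) and j < len(run_b):
        if run_a[i] <= run_b[j]:
            out.append(run_a[i]); i += 1
        else:
            out.append(run_b[j]); j += 1
    out.extend(run_a[i:])
    out.extend(run_b[j:])
    return out, loads

merged, loads = merge_count_io(list(range(0, 20, 2)), list(range(1, 20, 2)), B=4)
print(len(merged), "elements merged using", loads, "block loads")
```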

MergeSort Analysis
Sorting in memory is free, so the base case is S(M) = Θ(M/B) to load a size-M array.
S(N) = 2 S(N/2) + Θ(N/B) ⇒ S(N) = Θ((N/B)(log_2(N/M) + 1))
(Recursion tree: subproblem sizes N, N/2, …, 2M, M over log_2(N/M) levels.)

I/O Efficient MergeSort
Instead of doing a 2-way merge, do a Θ(M/B)-way merge.
IOMergeSort:
– Split the array into Θ(M/B) subarrays
– IOMergeSort each subarray
– Perform a Θ(M/B)-way merge to combine the subarrays

k-Way Merge
Assuming M/B ≥ k + 1, one block from each array fits in memory.
Therefore, only load each block once.
Total cost is thus Θ(N/B).
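A minimal k-way merge sketch (my own code; Python's standard heapq.merge does essentially the same thing). It keeps one candidate per run in a small heap, which mirrors keeping one block per run in fast memory:

```python
import heapq

def k_way_merge(runs):
    """Merge k sorted runs with a k-entry min-heap.

    In the I/O model this corresponds to keeping one block of each run (plus
    one output block) in fast memory, which works whenever M/B >= k + 1, so
    every input block is loaded exactly once and the merge costs O(N/B)."""
    heap = [(run[0], idx, 0) for idx, run in enumerate(runs) if run]
    heapq.heapify(heap)
    out = []
    while heap:
        value, idx, pos = heapq.heappop(heap)
        out.append(value)
        if pos + 1 < len(runs[idx]):
            heapq.heappush(heap, (runs[idx][pos + 1], idx, pos + 1))
    return out

print(k_way_merge([[1, 4, 7], [2, 5, 8], [3, 6, 9]]))
```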

IOMergeSort Analysis
Sorting in memory is free, so the base case is S(M) = Θ(M/B) to load a size-M array.
S(N) = (M/B) S(N/(M/B)) + Θ(N/B) ⇒ S(N) = Θ((N/B)(log_{M/B}(N/M) + 1))
(Recursion tree: subproblem sizes N, N/(M/B), …, M²/B, M over log_{M/B}(N/M) levels.)
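Putting the pieces together, here is a sketch of the Θ(M/B)-way recursion (it reuses k_way_merge from the sketch above; io_merge_sort and its parameters are my own names, and the in-memory sort simply uses Python's built-in sorted):

```python
def io_merge_sort(a, M, B):
    """Theta(M/B)-way MergeSort sketch: arrays of at most M elements are
    sorted "in memory" (free in the I/O model); larger arrays are cut into
    M/B pieces, sorted recursively, and combined with k_way_merge above."""
    if len(a) <= M:
        return sorted(a)               # fits in fast memory
    k = max(2, M // B)                 # fan-out of the merge
    chunk = (len(a) + k - 1) // k      # ceil(len(a) / k)
    runs = [io_merge_sort(a[i:i + chunk], M, B) for i in range(0, len(a), chunk)]
    return k_way_merge(runs)

print(io_merge_sort(list(range(100, 0, -1)), M=16, B=4))
```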

MergeSort Comparison
Traditional MergeSort costs Θ((N/B) log_2(N/M)) I/Os on a size-N array.
IOMergeSort is I/O efficient, costing only Θ((N/B) log_{M/B}(N/M)).
The new algorithm saves a factor of Θ(log_2(M/B)) I/Os. How significant is this savings?
Consider L3 cache to main memory on Nehalem:
– M = 1 million, B = 8, N = 1 billion
– roughly (1 billion / 8) × 10 block transfers versus (1 billion / 8) × 1
Not hard to calculate exact constants.

List Ranking
Given a linked list, calculate the rank of (i.e., the number of elements before) each element.
The trivial algorithm takes O(N) computation steps.

List ranking in the I/O model
Assume the list is stored in ~N/B blocks, but following the links may jump around in memory a lot.
Example: M/B = 3, B = 2, least-recently-used eviction.
In general, each pointer can result in a new block transfer, for O(N) I/Os.

Why list ranking?
– Recovers locality in the list (we can sort based on the ranking)
– Generalizes to trees via Euler tours
– Useful for various forest/tree algorithms like least common ancestors and tree contraction
– Also used in graph algorithms like minimum spanning tree and connected components

List ranking outline
1. Produce an independent set of Θ(N) nodes (if a node is in the set, its successor is not)
2. “Bridge out” the independent set and solve the weighted problem recursively

List ranking outline
3. Merge in the bridged-out nodes

List ranking: 1) independent set
Each node flips a coin {0,1}.
A node is in the independent set if it chooses 1 and its predecessor chooses 0.
Each node enters the independent set with probability 1/4, so the expected set size is Θ(N).
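A minimal sketch of the coin-flipping rule (my own illustration; it walks the list pointer by pointer, which is fine in memory but is exactly what the sort-based identification on the next slide avoids):

```python
import random

def coin_flip_independent_set(succ, start):
    """Pick an independent set of list nodes by coin flipping (a sketch).

    succ maps every node to its successor (None at the end of the list).
    A node joins the set when it flips 1 and its predecessor flips 0, so no
    two adjacent nodes are chosen, and every non-start node joins with
    probability 1/4; the expected set size is therefore Theta(N)."""
    coin = {v: random.randint(0, 1) for v in succ}
    chosen = set()
    node = start
    while succ[node] is not None:
        nxt = succ[node]
        if coin[node] == 0 and coin[nxt] == 1:
            chosen.add(nxt)
        node = nxt
    return chosen
```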

List ranking: 1) independent set
Identifying independent-set nodes efficiently: sort (a copy of) the nodes by successor address, so that each node's coin flip can be compared with its predecessor's in a single scan.
After the sort, this requires O(scan(N)) = O(N/B) block transfers.

List ranking: 2) bridging out
Sort by successor address twice.

List ranking: 2) bridging out
If the middle node is in the independent set, “splice” it out; this gives a list of new pointers (and combined weights: x and y become x+y).
Sort back to original order and scan to integrate the pointer updates.

List ranking: 2) bridging out
Scans and sorts to compress and remove the independent-set nodes (homework).

List ranking: 3) merge in
Uses sorts and scans.

List ranking analysis
1. Produce an independent set of Θ(N) nodes (keep retrying until the random set is good enough)
2. “Bridge out” and solve recursively
3. Merge in the bridged-out nodes
All steps use a constant number of sorts and scans, so the expected cost is O(sort(N)) = O((N/B) log_{M/B}(N/B)) I/Os at this level of recursion.
This gives the recurrence R(N) = R(N/c) + O(sort(N)) = O(sort(N)).
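The following in-memory sketch shows the recursive structure (the function and parameter names and the plain-dict list representation are my own; a genuinely I/O-efficient version would replace each step with the sorts and scans described above, but the logic is the same):

```python
import random

def list_rank(succ, start, val=None, base_size=64):
    """Return rank[v] = number of elements strictly before v in the list.

    succ maps each node to its successor (None for the last node); val
    carries the node weights used by the recursion (all 1 at the top level)."""
    succ = dict(succ)                       # local copy: pointers get rewritten
    val = dict(val) if val is not None else {v: 1 for v in succ}
    pred = {s: v for v, s in succ.items() if s is not None}

    # Base case: a short list is ranked by walking it directly.
    if len(succ) <= base_size:
        rank, r, node = {}, 0, start
        while node is not None:
            rank[node] = r
            r += val[node]
            node = succ[node]
        return rank

    # 1) Independent set by coin flipping; retry until it is large enough.
    while True:
        coin = {v: random.randint(0, 1) for v in succ}
        indep = {v for v in succ
                 if v != start and coin[v] == 1 and coin[pred[v]] == 0}
        if len(indep) >= len(succ) // 8:
            break

    # 2) Bridge out: splice each chosen node, pushing its weight onto its
    #    predecessor so the ranks of all later nodes are preserved.
    saved = {}
    for v in indep:
        p = pred[v]
        saved[v] = (p, val[p])              # remember the weight before the splice
        val[p] += val[v]
        succ[p] = succ[v]
        del succ[v]

    rank = list_rank(succ, start, val, base_size)   # solve the contracted list

    # 3) Merge in: each bridged-out node sits right after its old predecessor.
    for v, (p, old_val_p) in saved.items():
        rank[v] = rank[p] + old_val_p
    return rank

# Tiny usage example on a shuffled 10-node list.
nodes = list(range(10))
random.shuffle(nodes)
succ = {nodes[i]: nodes[i + 1] for i in range(9)}
succ[nodes[9]] = None
print(list_rank(succ, nodes[0], base_size=4))
```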

B-Trees
A B-tree is a type of search tree ((a,b)-tree) designed for good memory performance.
It is a common approach for storing searchable, ordered data, e.g., in databases and filesystems.
Operations:
– Updates: Insert / Delete
– Queries:
  – Search: is the element there?
  – Successor/Predecessor: find the nearest key
  – Range query: return all objects with keys within a range
  – …

B-tree/(2,3)-tree
– Objects stored in leaves
– Leaves all have the same depth
– Root has at most B children
– Other internal nodes have between B/2 and B children

B-tree search
Compare the search key against the partitioning keys and move to the proper child; repeat until a leaf is reached.
Cost is O(height × node size).
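A minimal search sketch (my own code, not the lecture's; it stores keys in the leaves and routing keys in internal nodes, and counts one block transfer per node visited):

```python
from bisect import bisect_right

class BTreeNode:
    """Simplified B-tree node: leaves hold the stored keys; an internal node
    holds routing keys and children, where keys[i] separates children[i]
    (all keys < keys[i]) from children[i + 1] (all keys >= keys[i])."""
    def __init__(self, keys, children=None):
        self.keys = keys              # sorted list of keys
        self.children = children      # None for a leaf

def btree_search(node, key, block_transfers=0):
    """Descend from the root to a leaf; each node visited costs one block
    transfer, so the total is O(height) = O(log_B N) transfers."""
    block_transfers += 1
    if node.children is None:                 # leaf: compare inside fast memory
        return key in node.keys, block_transfers
    i = bisect_right(node.keys, key)          # index of the child covering key
    return btree_search(node.children[i], key, block_transfers)

# Tiny example tree of height 2.
leaves = [BTreeNode([2, 5]), BTreeNode([8, 12]), BTreeNode([20, 31])]
root = BTreeNode([8, 20], leaves)
print(btree_search(root, 12))     # -> (True, 2)
```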

B-tree insert
Search for where the key should go. If there's room, put it in.

B-tree insert
Search for where the key should go. If there's no room, split the node.
Splits may propagate up the tree.

B-tree inserts
Splits divide the objects / child pointers as evenly as possible.
If the root splits, add a new parent (this is when the height of the tree increases).
Deletes are a bit more complicated, but similar: if a node drops below B/2 children, it is merged or rebalanced with some neighbors.
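A matching insert sketch (again my own simplification, building on BTreeNode, btree_search, and bisect_right from the search sketch above; it assumes distinct keys, caps nodes at B entries, splits them as evenly as possible, and grows the tree at the root; deletes are omitted):

```python
from bisect import bisect_right, insort

def btree_insert(root, key, B=4):
    """Insert `key` into the tree rooted at `root` and return the (possibly
    new) root.  Nodes hold at most B entries; an overflowing node is split
    as evenly as possible and hands a separator key up to its parent.  If
    the root itself splits, a new root is created and the height grows."""
    split = _insert(root, key, B)
    if split is None:
        return root
    sep, right = split
    return BTreeNode([sep], [root, right])       # new parent above the old root

def _insert(node, key, B):
    """Recursive helper: returns None, or (separator, new right sibling)."""
    if node.children is None:                    # leaf
        insort(node.keys, key)
        if len(node.keys) <= B:
            return None
        mid = len(node.keys) // 2                # split the keys evenly
        right = BTreeNode(node.keys[mid:])
        node.keys = node.keys[:mid]
        return right.keys[0], right              # separator = smallest key on the right
    i = bisect_right(node.keys, key)
    split = _insert(node.children[i], key, B)
    if split is None:
        return None
    sep, right_child = split
    node.keys.insert(i, sep)                     # absorb the child's split
    node.children.insert(i + 1, right_child)
    if len(node.children) <= B:
        return None
    mid = len(node.keys) // 2                    # internal split: push the middle key up
    right = BTreeNode(node.keys[mid + 1:], node.children[mid + 1:])
    sep_up = node.keys[mid]
    node.keys, node.children = node.keys[:mid], node.children[:mid + 1]
    return sep_up, right

tree = BTreeNode([])                             # start from a single empty leaf
for k in [8, 3, 15, 1, 9, 20, 5, 12]:
    tree = btree_insert(tree, k, B=4)
print(btree_search(tree, 9))                     # -> (True, 2)
```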

B-tree analysis: search
– All nodes (except the root) have at least Ω(B) children ⇒ the height of the tree is O(log_B N)
– Each node fits in a block
– Total search cost is O(log_B N) block transfers

B-tree analysis: insert (and delete)
Every split of a leaf results in an insert into a height-1 node; in general, a height-h split causes a height-(h+1) insert.
There must be Ω(B) inserts into a node before it splits again.
An insert therefore pays for ∑_h (1/B)^h = O(1/B) splits, each costing O(1) block transfers.
Searching and updating the keys along the root-to-leaf path dominates, for O(log_B N) block transfers.

Sorting with a search tree?
Consider the following RAM sort algorithm:
1. Build a balanced search tree
2. Repeatedly delete the minimum element from the tree
Runtime is O(N log N).
Does this same algorithm work in the I/O model?
Just using a B-tree is O(N log_B N), which is much worse than O((N/B) log_{M/B}(N/B)).

Buffer tree
Somewhat like a B-tree:
– when nodes gain too many children, they split evenly, using a similar split method
– all leaves are at the same depth
Unlike a B-tree:
– queries are not answered online (they are reported in batches)
– internal nodes have Θ(M/B) children
– nodes have buffers of size Θ(M)

Buffer-tree insert
Start at the root. Add the item to the end of the root's buffer.
If the buffer is not full, done.
Otherwise, partition the buffer and send the elements down to the children.
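A heavily simplified sketch of this lazy insert-and-flush mechanism (my own code; the tree shape and routing keys are fixed up front, and node splitting is not shown):

```python
from bisect import bisect_right

class BufferNode:
    """Simplified buffer-tree node: internal nodes have ~M/B children and a
    size-Theta(M) buffer of pending items; leaves just accumulate items."""
    def __init__(self, separators=None, children=None):
        self.separators = separators or []   # routing keys between the children
        self.children = children or []       # empty list => leaf
        self.buffer = []                     # items inserted but not yet pushed down

def buffer_insert(node, item, M):
    """Lazy insert: append to the root's buffer; flush only on overflow."""
    node.buffer.append(item)
    if len(node.buffer) >= M and node.children:
        flush(node, M)

def flush(node, M):
    """Partition the whole buffer among the children in a single pass, then
    recursively flush any child whose buffer overflows in turn."""
    for item in node.buffer:
        child = node.children[bisect_right(node.separators, item)]
        child.buffer.append(item)
    node.buffer = []
    for child in node.children:
        if len(child.buffer) >= M and child.children:
            flush(child, M)

# Two-level example: a root with four leaf children.
leaves = [BufferNode() for _ in range(4)]
root = BufferNode(separators=[25, 50, 75], children=leaves)
for x in range(100):
    buffer_insert(root, x, M=16)
print([len(leaf.buffer) for leaf in leaves], len(root.buffer))
```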

Buffer-tree insert
Analysis ideas:
– Inserting k elements into a buffer costs O(1 + k/B)
– On overflow, partitioning costs O(M/B) to load the entire buffer; then M/B “scans” are performed, costing O(#arrays + total size/B) = O(M/B + M/B)
– Flushing a buffer downwards therefore costs O(M/B) while moving Ω(M) elements, for a cost of O(1/B) each per level
– An element may be moved down at each height, costing a total of O((1/B) · height) = O((1/B) log_{M/B}(N/B)) per element

I/O Priority Queue
Supports Insert and Extract-Min (no Decrease-Key here).
– Keep a buffer of the Θ(M) smallest elements in memory
– Use a buffer tree for the remaining elements
– While the smallest-element buffer is too full, insert 1 (maximum) element into the buffer tree
– If the smallest-element buffer is empty, flush the leftmost path in the buffer tree and delete the leftmost leaves
Total cost is O((N/B) log_{M/B}(N/B)) for N operations. Yields an optimal sort.
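A small sketch of the two-part priority queue (my own code; a plain binary heap stands in for the buffer tree, and an insert larger than the external minimum goes straight to it so that the "buffer holds the smallest keys" invariant stays simple in this toy version):

```python
import heapq
import random
from bisect import insort

class IOPriorityQueue:
    """Two-part priority queue sketch: the smallest ~M keys live in a small
    in-memory buffer; everything else lives in an external structure (a
    plain heap here, standing in for the buffer tree)."""
    def __init__(self, mem_capacity):
        self.cap = mem_capacity      # plays the role of Theta(M)
        self.small = []              # sorted buffer of the smallest keys
        self.external = []           # stand-in for the buffer tree

    def insert(self, key):
        if self.external and key >= self.external[0]:
            heapq.heappush(self.external, key)     # clearly not among the smallest
            return
        insort(self.small, key)
        if len(self.small) > self.cap:             # buffer too full:
            heapq.heappush(self.external, self.small.pop())   # evict its maximum

    def extract_min(self):
        if not self.small:                         # refill with the smallest external keys
            while self.external and len(self.small) < self.cap:
                self.small.append(heapq.heappop(self.external))
        return self.small.pop(0)                   # assumes the queue is non-empty

# Inserting N keys and extracting them all yields a sorted sequence.
pq = IOPriorityQueue(mem_capacity=8)
data = random.sample(range(1000), 100)
for x in data:
    pq.insert(x)
out = [pq.extract_min() for _ in data]
assert out == sorted(data)
print(out[:10])
```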

Buffer-tree variations
To support deletions, updates, and other queries, insert records in the tree for each operation, associated with timestamps. As records with the same key collide, merge them as appropriate.
Examples of applications:
– DAG shortest paths
– Circuit evaluation
– Computational geometry applications