Fully Persistent B-Trees 23 rd Annual ACM-SIAM Symposium on Discrete Algorithms, Kyoto, Japan, January 18, 2012 Gerth Stølting Brodal Konstantinos Tsakalidis.

Slides:



Advertisements
Similar presentations
Turning Amortized to Worst-Case Data Structures: Techniques and Results Tsichlas Kostas.
Advertisements

Planar point location -- example
External Memory Hashing. Model of Computation Data stored on disk(s) Minimum transfer unit: a page = b bytes or B records (or block) N records -> N/B.
I/O-Algorithms Lars Arge Fall 2014 September 25, 2014.
B+-trees. Model of Computation Data stored on disk(s) Minimum transfer unit: a page = b bytes or B records (or block) N records -> N/B = n pages I/O complexity:
Konstantinos Tsakalidis 1 Dynamic Data Structures: Orthogonal Range Queries and Update Efficiency Konstantinos Tsakalidis PhD Defense 23 September 2011.
1 Algorithmic Aspects of Searching in the Past Christine Kupich Institut für Informatik, Universität Freiburg Lecture 1: Persistent Data Structures Advanced.
External Memory Geometric Data Structures
EECS 311: Chapter 4 Notes Chris Riesbeck EECS Northwestern.
A Linear Time Algorithm for the k Maximum Sums Problem By Gerth S. Brodal and Allan G. Jørgensen.
Multiversion Access Methods - Temporal Indexing. Basics A data structure is called : Ephemeral: updates create a new version and the old version cannot.
I/O-Algorithms Lars Arge University of Aarhus February 21, 2005.
I/O-Algorithms Lars Arge Aarhus University February 27, 2007.
Dynamic Planar Range Maxima Queries (presented at ICALP 2011) Gerth Stølting Brodal Aarhus University Kostas Tsakalidis University of Primorska, October.
I/O-Algorithms Lars Arge Spring 2011 March 8, 2011.
Update 1 Persistent Data Structures (Version Control) v0v0 v1v1 v2v2 v3v3 v4v4 v5v5 v6v6 Ephemeral query v0v0 v1v1 v2v2 v3v3 v4v4 v5v5 v6v6 Partial persistence.
Chapter 15 B External Methods – B-Trees. © 2004 Pearson Addison-Wesley. All rights reserved 15 B-2 B-Trees To organize the index file as an external search.
1 B trees Nodes have more than 2 children Each internal node has between k and 2k children and between k-1 and 2k-1 keys A leaf has between k-1 and 2k-1.
Purely Functional Worst Case Constant Time Catenable Sorted Lists Gerth Stølting Brodal University of Aarhus Joint work with Christos Makris Kostas Tsichlas.
I/O-Algorithms Lars Arge Aarhus University February 13, 2007.
I/O-Algorithms Lars Arge Spring 2009 February 2, 2009.
1 Balanced Search Trees  several varieties  AVL trees  trees  Red-Black trees  B-Trees (used for searching secondary memory)  nodes are added.
I/O-Algorithms Lars Arge Aarhus University February 16, 2006.
I/O-Algorithms Lars Arge Aarhus University February 7, 2005.
I/O-Algorithms Lars Arge University of Aarhus February 13, 2005.
1 Binary Search Trees Implementing Balancing Operations –AVL Trees –Red/Black Trees Reading:
I/O-Algorithms Lars Arge University of Aarhus March 1, 2005.
I/O-Algorithms Lars Arge Spring 2009 March 3, 2009.
I/O-Algorithms Lars Arge Aarhus University February 6, 2007.
I/O-Algorithms Lars Arge Aarhus University March 5, 2008.
I/O-Algorithms Lars Arge Aarhus University February 9, 2006.
I/O-Algorithms Lars Arge Aarhus University March 9, 2006.
I/O-Algorithms Lars Arge Aarhus University February 14, 2008.
Comparison Based Dictionaries: Fault Tolerance versus I/O Efficiency Gerth Stølting Brodal Allan Grønlund Jørgensen Thomas Mølhave University of Aarhus.
I/O-Algorithms Lars Arge University of Aarhus March 7, 2005.
1 Database indices Database Systems manage very large amounts of data. –Examples: student database for NWU Social Security database To facilitate queries,
1 Persistent data structures. 2 Ephemeral: A modification destroys the version which we modify. Persistent: Modifications are nondestructive. Each modification.
1 Indexing Structures for Files. 2 Basic Concepts  Indexing mechanisms used to speed up access to desired data without having to scan entire.
1 Database Tuning Rasmus Pagh and S. Srinivasa Rao IT University of Copenhagen Spring 2007 February 8, 2007 Tree Indexes Lecture based on [RG, Chapter.
B-Trees Large degree B-trees used to represent very large dictionaries that reside on disk. Smaller degree B-trees used for internal-memory dictionaries.
Indexing. Goals: Store large files Support multiple search keys Support efficient insert, delete, and range queries.
Strict Fibonacci Heaps Gerth Stølting Brodal Aarhus University George Lagogiannis Robert Endre Tarjan Agricultural University of Athens Princeton University.
External Memory Algorithms for Geometric Problems Piotr Indyk (slides partially by Lars Arge and Jeff Vitter)
B-trees and kd-trees Piotr Indyk (slides partially by Lars Arge from Duke U)
Time O(log d) Exponential-search(13) Finger Search Searching in a sorted array time O(log n) Binary-search(13)
Bin Yao Spring 2014 (Slides were made available by Feifei Li) Advanced Topics in Data Management.
Trevor Brown – University of Toronto B-slack trees: Space efficient B-trees.
Binary Tree Representation Of Trees Problems with trees. 2- and 3-nodes waste space. Overhead of moving pairs and pointers when changing among.
Balanced Search Trees Fundamental Data Structures and Algorithms Margaret Reid-Miller 3 February 2005.
Starting at Binary Trees
Algorithms and Data Structures Gerth Stølting Brodal Computer Science Day, Department of Computer Science, Aarhus University, May 25, 2012.
Lecture 2: External Memory Indexing Structures CS6931 Database Seminar.
B-Trees ( Rizwan Rehman) Large degree B-trees used to represent very large dictionaries that reside on disk. Smaller degree B-trees used for internal-memory.
3.1. Binary Search Trees   . Ordered Dictionaries Keys are assumed to come from a total order. Old operations: insert, delete, find, …
Review for Exam 2 Topics covered (since exam 1): –Splay Tree –K-D Trees –RB Tree –Priority Queue and Binary Heap –B-Tree For each of these data structures.
External Memory Geometric Data Structures Lars Arge Duke University June 27, 2002 Summer School on Massive Datasets.
B-Trees Katherine Gurdziel 252a-ba. Outline What are b-trees? How does the algorithm work? –Insertion –Deletion Complexity What are b-trees used for?
Internal Memory Pointer MachineRandom Access MachineStatic Setting Data resides in records (nodes) that can be accessed via pointers (links). The priority.
More Trees. Outline Tree B-Tree 2-3 Tree Tree Red-Black Tree.
X1x1 x2x2 top-k y 3-sided x1x1 x2x2 External Memory Three-Sided Range Reporting and Top-k Queries with Sublogarithmic Updates Gerth Stølting Brodal Aarhus.
Michal Balas1 I/O-efficient Point Location using Persistent B-Trees Lars Arge, Andrew Danner, and Sha-Mayn Teh Department of Computer Science, Duke University.
arxiv.org/abs/ y 3-sided x1 x2 x1 x2 top-k
List Order Maintenance
Persistent Data Structures (Version Control)
Searching in Trees Gerth Stølting Brodal Aarhus University
Advanced Topics in Data Management
STACS arxiv.org/abs/ y 3-sided x1 x2 x1 x2 top-k
Strict Fibonacci Heaps
8th Workshop on Massive Data Algorithms, August 23, 2016
Presentation transcript:

Fully Persistent B-Trees 23 rd Annual ACM-SIAM Symposium on Discrete Algorithms, Kyoto, Japan, January 18, 2012 Gerth Stølting Brodal Konstantinos Tsakalidis Aarhus University, Denmark Spyros Sioutas Ionian University, Corfu, Greece Kostas Tsichlas Aristotle University of Thessaloniki, Greece

Binary Tree

Binary Search Tree

5 Insert(5)

Balanced Binary Search Tree Height O(log n) 1

Report(2,7) Height O(log n) 1 Time O(log n + t)

5 Red-Black Tree Bayer 1972

recolor restructure Red-Black Tree - rebalancing O(log n) O(1)

Driscoll, Sarnak, Sleator, Tarjan 1986 Red-Black Tree EphemeralPersistent query only Partially Fully Expensive to record updates performing many modifications Version tree Contributions of Driscoll et al. 1.Red-black trees with O(1) modifications per update (displacement paths) 2.General technique to make pointer based data structures fully persistent

RAM modelIO model Binary Search Trees Fully Persistent Balanced Binary Search Trees Incremental B-Trees Fully Persistent B-Trees Bayer 1972 Red-Black Trees Bayer 1972 Red-Black Trees w. Displacement Paths Driscoll et al Full Persistence: Node splitting Driscoll et al Maintaining Order in a List Dietz, Sleator 1987 CPU Complexity = # instructions Memory CPU Complexity = # block transfers Disk Memory B IO Efficient Node Splitting Fully Persistent Search Trees

Incremental B-tree with range searches in O(log B n + t/B) IOs, updates in O(log B n) IOs with O(1) key and pointer modifications, O(n/B) space Results Fully persistent B-tree with updates in O(log B n + log 2 B) IOs amortized, range searches in O(log B n + t/B) IOs, O(m/B) space Any emphemeral pointer based IO data structure with O(1) indegree can be made fully persistent with O(1) overhead per block access and O(log 2 B) IOs amortized overhead per field modification Persistent B-tree resultQueryUpdate Arge, Danner, Teh 2003log B n + t/Blog B n Lanka, Mays 1991(log B n + t/B) ∙ log B mlog B n ∙ log B m This talklog B n + t/Blog B n + log 2 B n = # elements in B-tree t= output size m= # versions / updates B = block size Full Partial

... w Incremental Search Tree Updates y x w Displacement path w  x  flip all colors on path + all red siblings (flag w with x) x y w Cascaded recoloring x x... Cascaded splitting B-Tree x degree [B,2B] height O(log B n) x w Overflow flag (path w  x) Counts twice w Splitting pair w Ready to overflow Driscoll et al Incremental B-tree Range searches visit O(log B n + t/B) nodes. Updates... 1.visit O(log B n) nodes 2.modify O(1) pointers and keys 3.creates O(1) new empty nodes

Fat node Queries = binary search among versions Node splitting (# ekstra fields ≥2∙indegree) 13 Full persistence : Node Splitting Technique Version tree (numbers = version ids) preorder traversal field:(1,x) (7,z) (5,x) (6,y) (2,x) increasing version field 1 : (1,a) (4,b) (3,a) (2,c) field 2 : (1,f) (7,g) (5,f) [0,[[0,[ [4,3[[4,3[ field 1 : (1,a) (4,b) field 2 : (1,f) (7,g) field 1 : (5,b) (3,a) (2,c) field 2 : (5,f) split version 5 [0,5[[0,5[ [5,[[5,[ [4,5[[4,5[ [5,3[ Version list (order maintenance data structure) Driscoll, Sarnak, Sleator, Tarjan 1986 Dietz, Sleator 1987 Updates (1,x) (6,y) (7,z)... in the IO model 1.O(B) fields & updates in a node 2.local version list LVL of all version ids in block 3.each inpointer spans O(1) verions in LVL

Thank You Main result Fully persistent B-tree with updates in O(log B n + log 2 B) IOs, searches in O(log B n + t/B) IOs, using space O(m/B) ?