Persistent Data Structures (Version Control)

Slides:



Advertisements
Similar presentations
Turning Amortized to Worst-Case Data Structures: Techniques and Results Tsichlas Kostas.
Advertisements

List Order Maintenance
Planar point location -- example
Priority Queues  MakeQueuecreate new empty queue  Insert(Q,k,p)insert key k with priority p  Delete(Q,k)delete key k (given a pointer)  DeleteMin(Q)delete.
Augmenting Data Structures Advanced Algorithms & Data Structures Lecture Theme 07 – Part I Prof. Dr. Th. Ottmann Summer Semester 2006.
I/O-Algorithms Lars Arge Fall 2014 September 25, 2014.
1 Persistent data structures. 2 Ephemeral: A modification destroys the version which we modify. Persistent: Modifications are nondestructive. Each modification.
1 Functional Data Structures Purely functional car cdr never modify only create new pairs only DAGs [C. Okasaki, Simple and efficient purely functional.
An Improved Succinct Dynamic k-Ary Tree Representation (work in progress) Diego Arroyuelo Department of Computer Science, Universidad de Chile.
Time O(log d) Exponential-search(13) Finger Search Searching in a sorted array time O(log n) Binary-search(13)
1 Algorithmic Aspects of Searching in the Past Christine Kupich Institut für Informatik, Universität Freiburg Lecture 1: Persistent Data Structures Advanced.
1 Self-Adjusting Data Structures. 2 Lists [D.D. Sleator, R.E. Tarjan, Amortized Efficiency of List Update Rules, Proc. 16 th Annual ACM Symposium on Theory.
Chapter 7 Data Structure Transformations Basheer Qolomany.
A Linear Time Algorithm for the k Maximum Sums Problem By Gerth S. Brodal and Allan G. Jørgensen.
Fractional Cascading CSE What is Fractional Cascading anyway? An efficient strategy for dealing with iterative searches that achieves optimal.
1 Lecture 8: Data structures for databases II Jose M. Peña
Unified Access Bound 1 [M. B ă doiu, R. Cole, E.D. Demaine, J. Iacono, A unified access bound on comparison- based dynamic dictionaries, Theoretical Computer.
Multiversion Access Methods - Temporal Indexing. Basics A data structure is called : Ephemeral: updates create a new version and the old version cannot.
WS Prof. Dr. Th. Ottmann Algorithmentheorie 16 – Persistenz und Vergesslichkeit.
I/O-Algorithms Lars Arge University of Aarhus February 21, 2005.
I/O-Algorithms Lars Arge Spring 2011 March 8, 2011.
Update Persistent Data Structures (Version Control) v0v0 v1v1 v2v2 v3v3 v4v4 v5v5 v6v6 Ephemeral query v0v0 v1v1 v2v2 v3v3 v4v4 v5v5 v6v6 Partial persistence.
Update 1 Persistent Data Structures (Version Control) v0v0 v1v1 v2v2 v3v3 v4v4 v5v5 v6v6 Ephemeral query v0v0 v1v1 v2v2 v3v3 v4v4 v5v5 v6v6 Partial persistence.
Chapter 15 B External Methods – B-Trees. © 2004 Pearson Addison-Wesley. All rights reserved 15 B-2 B-Trees To organize the index file as an external search.
I/O-Algorithms Lars Arge Aarhus University February 13, 2007.
I/O-Algorithms Lars Arge Aarhus University February 16, 2006.
I/O-Algorithms Lars Arge University of Aarhus February 13, 2005.
Persistent Data Structures Computational Geometry, WS 2007/08 Lecture 12 Prof. Dr. Thomas Ottmann Khaireel A. Mohamed Algorithmen & Datenstrukturen, Institut.
Temporal Indexing MVBT. Temporal Indexing Transaction time databases : update the last version, query all versions Queries: “Find all employees that worked.
Temporal Indexing MVBT. Temporal Indexing Transaction time databases : update the last version, query all versions Queries: “Find all employees that worked.
I/O-Algorithms Lars Arge Spring 2009 March 3, 2009.
1 Minimize average access time Items have weights: Item i has weight w i Let W =  w i be the total weight of the items Want the search to heavy items.
1 Hash-Based Indexes Chapter Introduction  Hash-based indexes are best for equality selections. Cannot support range searches.  Static and dynamic.
I/O-Algorithms Lars Arge Aarhus University March 5, 2008.
Fully Persistent B-Trees 23 rd Annual ACM-SIAM Symposium on Discrete Algorithms, Kyoto, Japan, January 18, 2012 Gerth Stølting Brodal Konstantinos Tsakalidis.
Point Location Computational Geometry, WS 2007/08 Lecture 5 Prof. Dr. Thomas Ottmann Algorithmen & Datenstrukturen, Institut für Informatik Fakultät für.
1 Persistent data structures. 2 Ephemeral: A modification destroys the version which we modify. Persistent: Modifications are nondestructive. Each modification.
E.G.M. Petrakissearching1 Searching  Find an element in a collection in the main memory or on the disk  collection: (K 1,I 1 ),(K 2,I 2 )…(K N,I N )
Fractional Cascading and Its Applications G. S. Lueker. A data structure for orthogonal range queries. In Proc. 19 th annu. IEEE Sympos. Found. Comput.
Orthogonal Range Searching I Range Trees. Range Searching S = set of geometric objects Q = query object Report/Count objects in S that intersect Q Query.
Database Management 8. course. Query types Equality query – Each field has to be equal to a constant Range query – Not all the fields have to be equal.
Time O(log d) Exponential-search(13) Finger Search Searching in a sorted array time O(log n) Binary-search(13)
Hashing and Hash-Based Index. Selection Queries Yes! Hashing  static hashing  dynamic hashing B+-tree is perfect, but.... to answer a selection query.
Lars Arge Presented by Or Ozery. I/O Model Previously defined: N = # of elements in input M = # of elements that fit into memory B = # of elements per.
Starting at Binary Trees
Open Problem: Dynamic Planar Nearest Neighbors CSCE 620 Problem 63 from the Open Problems Project
Λίστα Εργασιών Data Structures for Tree Manipulation D. Harel and R.E. Tarjan. Fast Algorithms for finding nearest common ancestors. SIAM J. Computing,
Discrete Mathematics Chapter 5 Trees.
Indexing Database Management Systems. Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files File Organization 2.
CMPS 3130/6130 Computational Geometry Spring 2015
Temporal Indexing MVBT. Temporal Indexing Transaction time databases : update the last version, query all versions Queries: “Find all employees that worked.
Michal Balas1 I/O-efficient Point Location using Persistent B-Trees Lars Arge, Andrew Danner, and Sha-Mayn Teh Department of Computer Science, Duke University.
Confluently Persistent Tries Eric Price, MIT joint work with Erik Demaine, MIT Stefan Langerman, Université Libre de Bruxelles.
Presenters : Virag Kothari,Vandana Ayyalasomayajula Date: 04/21/2010.
Self-Adjusting Data Structures
Binary Tree ADT: Properties
Multiway Search Trees Data may not fit into main memory
List Order Maintenance
CMPS 3130/6130 Computational Geometry Spring 2017
Temporal Indexing MVBT.
Temporal Indexing MVBT.
Searching in Trees Gerth Stølting Brodal Aarhus University
External Methods Chapter 15 (continued)
Priority Queues MakeQueue create new empty queue
Database Applications (15-415) DBMS Internals- Part III Lecture 15, March 11, 2018 Mohammad Hammoud.
A simpler implementation and analysis of Chazelle’s
STACS arxiv.org/abs/ y 3-sided x1 x2 x1 x2 top-k
B-Tree.
8th Workshop on Massive Data Algorithms, August 23, 2016
Hash-Based Indexes Chapter 11
Presentation transcript:

Persistent Data Structures (Version Control) Partial persistence Full persistence Confluently persistence Purely functional Ephemeral v0 version DAG version tree list v0 v0 v0 car cdr update never modify only create new pairs only DAGs v1 v1 v1 v2 v1 v2 v2 v2 v3 v6 v3 query only v3 v3 v4 v5 v5 update/merge/query all versions updates at leaves any version can be copied query all versions v4 v4 v5 v5 Retroactive v0 v1 v2 v3 v4 v6 v6 update & query all versions updates in the past propragate update & query query

Partial persistent search trees Planar Point Location T1 T2 T3 T4 T5 T6 T7 Partial persistent search trees update O(n∙log n) preprocessing, O(log n) query

Path Copying (trees) c c a f f d e g root pointer Core of functional data structures d e g

Partial Persistence Version ID = time = 0,1,2,... Fat node (any data structure) record all updates in node (version,value) pairs field updates O(1) field queries  predecessor wrt version id (search tree/vEB) Node copying (O(1) degree data structures) Persistent node = collection of nodes, each valid for an interval of versions, with  extra updates,  = max indegree pointers must have subinterval of the node pointing to; otherwise copy and insert pointers (cascading copying) NB: Needs to keep track of back-pointers field1: (0,x) (3,y) (7,z) field2: (0,a) (14,c) (16,b) Potential = #extra fields used (overflow by +1 occupied fields => copying => no extra field used + ingoing rec. updates) [0,8[ [8,13[ [13,[ field1: (0,x) (3,y) field2: (0,a) (7,c) field1: (8,z) (10,w) field2: (8,c) (9,d) field1: (13,w) (15,y) field2: (13,e) (14,c)

Full Persistence Fat node Node splitting (≥ 2 ekstra fields) 1 4 3 2 increasing version 1 4 7 5 3 6 2 preorder traversal Version list (order maintenance data structure) 7 5 6 Version tree (numbers = version ids) Fat node Updates (1,x) (6,y) (7,z) to a field Queries = binary search among versions Update (7,z): Insert 7 as leftmost child of 4; insert pairs for 7 and 5=succ(7) Node splitting (≥ 2 ekstra fields) field: (1,x) (7,z) (5,x) (6,y) (2,x) Node splitting: FULLNESS = #extra fields used behind the first  Potential = sum of fullness of nodes. Overflow 2+1 extra fields used. Splitting  extra fields in each (at least one will be used for version at the split) Totally loss reduces by +1. Insert one pointer in each of the  nodes pointing at the split version (5). [5,3[ [4,3[ [4,5[ [0,[ [0,5[ [5,[ field1: (1,a) (4,b) (3,a) (2,c) field2: (1,f) (7,g) (5,f) split version 5 field1: (1,a) (4,b) field2: (1,f) (7,g) field1: (5,b) (3,a) (2,c) field2: (5,f)

Persistence Techniques [N. Sarnak, R.E. Tarjan, Planar point location using persistent search trees, Communications of the ACM, 29(7), 669-679, 1986] Partial persistence, trees, O(1) access, amortized O(1) update [J.R. Driscoll, N. Sarnak, D.D. Sleator, R.E. Tarjan, Making Data Structures Persistent, Journal of Computer and System Sciences, 38(1), 86-124, 1989] Partial & full persistence, O(1) degree data structures, O(1) access, amortized O(1) update [P.F. Dietz, R. Raman, Persistence, Amortization and Randomization. Proceedings 2nd Annual ACM-SIAM Symposium on Discrete Algorithms, 78-88, 1991] [G.S. Brodal, Partially Persistent Data Structures of Bounded Degree with Constant Update Time, Nordic Journal of Computing, volume 3(3), pages 238-255, 1996] Partial persistence, O(1) degree data structures, O(1) access & updates update [P.F. Dietz, Fully Persistent Arrays. Proceedings 1st Workshop on Algorithms and Data Structures, LNCS 382, 67-74, 1989] Full persistence, RAM structures, O(loglog n) access, O(loglog n) amortized expected updates Open problem (RAM and Pointer Machine): O(1) overhead for Full Persistent Pointer-Based Data Stuctures with O(1) indegree

Comparison of Persistence Techniques Copy data structure for each version no query overhead, slow updates & wastes a lot of space Record updates & keep current version fast updates & queries to current version, space efficient, slow queries in the past Path copying applies to trees, no query overhead, space overhead = depth of update Fat node partial persistence: O(1) updates and space optimal, loglog n query overhead full persistence: O(loglog n) expected amortized updates and space optimal, loglog n query overhead Node copying/splitting fast updates & queries (amortized updates for full persistence) only works for pointer-based structures with O(1) degree

Fractional Cascading Basic Idea : 2 x BinSearch(x)  1 BinSearch(x) + O(1) L1 L2 Build bridges (and pointers to nearest original element) Searches to next list : Traverse nearest bridge Construction : Repeatedly create bridges until all gaps O(1) Generalizes to catalog graphs of degree O(1) L1 L2 catalog graph 2 5 8 11 21 25 27 31 33 42 5 8 12 14 17 19 20 35 37 41 O(1) nodes 18 2 5 8 11 21 25 27 31 33 42 12 14 17 19 20 35 37 41 bridges

Fractional Cascading Techniques [Bernard Chazelle, Leonidas J. Guibas, Fractional Cascading: I. A Data Structuring Technique, Algorithmica, 1(2): 133-162, 1986.] [Bernard Chazelle, Leonidas J. Guibas: Fractional Cascading: II. Applications. Algorithmica 1(2): 163-191 (1986)] Static fractional cascading, O(1) worst-case access [Kurt Mehlhorn, Stefan Näher: Dynamic Fractional Cascading. Algorithmica 5(2): 215-241 (1990)] Dynamic fractional cascading, O(loglog N) worst-case access, amortized insert and delete Insertion or deletion only, O(1) per worst-case access, amortized insert or delete