WS 2006-07, Prof. Dr. Th. Ottmann, Algorithmentheorie 16: Persistence and Obliviousness (Persistenz und Vergesslichkeit)

Presentation transcript:

Slide 2: Overview
- Motivation: oblivious and persistent structures
- Examples: arrays, search trees
- Making structures (partially) persistent: structure-copying, path-copying, and DSST methods
- Application: point location
- Oblivious structures: randomized and uniquely represented structures, c-level jump-lists

Slide 3: Motivation
A structure storing a set of keys is called oblivious if its generation history cannot be inferred from its current shape. A structure is called persistent if it supports access to multiple versions.
- Partially persistent: all versions can be accessed, but only the newest version can be modified.
- Fully persistent: all versions can be accessed and modified.
- Confluently persistent: two or more old versions can be combined into one new version.

Slide 4: Example: Arrays
A sorted array is a uniquely represented structure and hence oblivious.
- Access: O(log n) time by binary search.
- Update (insertion, deletion): Ω(n) time.
Caution: the storage structure may still depend on the generation history!

Slide 5: Example: Natural search trees
Only partially oblivious!
- The insertion history can sometimes be reconstructed.
- Deleted keys are not visible.
Access, insertion, and deletion of keys may take Ω(n) time.
[Figure: natural search trees built from the insertion sequences 1, 3, 5, 7 and 5, 1, 3, ...]

Slide 6: Simple methods for making structures persistent
- Structure-copying method: make a copy of the data structure each time it is changed. This yields full persistence at the price of Ω(n) time and space per update to a structure of size n.
- Log-file method: store a log of all updates. To access version i, start with the initial structure and carry out the first i updates to generate version i. This costs Ω(i) time per access and O(1) space and time per update.
- Hybrid method: store the complete sequence of updates and, additionally, every k-th version for a suitably chosen k.
Result: any choice of k causes a blowup in either storage space or access time. Are there better methods?
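The hybrid method can be sketched for a plain set of keys as follows. This is a minimal illustration, not code from the lecture: the class name `CheckpointedSet`, the operation names `"insert"` and `"delete"`, and the use of Python sets are all assumptions made for the example.

```python
class CheckpointedSet:
    """Set of keys whose old versions are rebuilt from an update log plus checkpoints."""

    def __init__(self, k):
        self.k = k                           # checkpoint interval (assumed parameter)
        self.log = []                        # log[j] is the update that creates version j + 1
        self.checkpoints = {0: frozenset()}  # full copies of versions 0, k, 2k, ...
        self.current = set()                 # newest version

    def update(self, op, key):
        """Apply ("insert" | "delete", key) to the newest version; return its version number."""
        if op == "insert":
            self.current.add(key)
        else:
            self.current.discard(key)
        self.log.append((op, key))
        version = len(self.log)
        if version % self.k == 0:
            # Every k-th version is stored in full: a copy of the whole structure.
            self.checkpoints[version] = frozenset(self.current)
        return version

    def version(self, i):
        """Rebuild version i from the nearest checkpoint by replaying at most k - 1 updates."""
        base = (i // self.k) * self.k
        keys = set(self.checkpoints[base])
        for op, key in self.log[base:i]:
            if op == "insert":
                keys.add(key)
            else:
                keys.discard(key)
        return keys
```

A small k means many full copies (space blowup); a large k means long replays (access-time blowup), which is the trade-off stated on the slide.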

Slide 7: Making data structures persistent
Several constructions for making particular data structures persistent had been devised, but no general approach was taken until the seminal paper by Driscoll, Sarnak, Sleator and Tarjan. They propose methods to make linked data structures partially as well as fully persistent. Let us first look at how to make structures consisting of linked nodes (trees, directed graphs, ...) partially persistent.

Slide 8: Fat node method - partial persistence
- Record all changes made to node fields in the nodes themselves.
- Each fat node contains the same fields as an ephemeral node, plus a version stamp.
- Add the modification history to every node: each field in a node contains a list of version-value pairs.

Slide 9: Fat node method - partial persistence
Modifications:
- Ephemeral update step i creates a new node: create a new fat node with version stamp i and the original field values.
- Ephemeral update step i changes a field: store the new field value together with the version stamp i.
Each node thus knows what its value was at any previous point in time.
Access to field f in version i: choose the value with the maximum version stamp no greater than i.

Slide 10: Fat node method - analysis
- Access: O(log m) slowdown per node, where m is the number of modifications (binary search on the modification history).
- Update: O(1) time and space per update step (the modification is appended, with its version stamp, to the end of the modification history).
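A single field of a fat node can be sketched as below. The class name `FatField` and its methods are hypothetical; the sketch assumes partial persistence, so version stamps arrive in non-decreasing order and the history stays sorted.

```python
import bisect


class FatField:
    """One field of a fat node: parallel lists of version stamps and values."""

    def __init__(self, version, value):
        self.versions = [version]  # sorted version stamps
        self.values = [value]      # values[j] is valid from versions[j] on

    def write(self, version, value):
        # Only the newest version is modified, so stamps never decrease.
        assert version >= self.versions[-1]
        if version == self.versions[-1]:
            self.values[-1] = value
        else:
            # O(1) append at the end of the modification history.
            self.versions.append(version)
            self.values.append(value)

    def read(self, version):
        # Binary search for the maximum version stamp no greater than `version`.
        pos = bisect.bisect_right(self.versions, version) - 1
        if pos < 0:
            raise KeyError("field did not exist in this version")
        return self.values[pos]
```

A fat node would hold one such object per ephemeral field (for a search tree: key, left pointer, right pointer).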

Slide 11: Fat node method - Example
A partially persistent search tree. Insertions: 5, 3, 13, 15, 1, 9, 7, 11, 10, followed by the deletion of one item.

Slide 12: Path-copying method - partial persistence
- Make a copy of a node before changing it to point to the new child. Cascade the change back until the root is reached. This costs O(height of tree) per update operation.
- Every modification creates a new root.
- Maintain an array of roots indexed by timestamps.

Slide 13: Path-copying method
[Figure: version 0.]

Slide 14: Path-copying method
[Figure: version 1, after Insert(2).]

Slide 15: Path-copying method
[Figure: version 1, after Insert(2); version 2, after Insert(4).]

Slide 16: Path-copying method
Restructuring costs O(log n) per update operation if the tree is kept balanced.
[Figure: version 1, after Insert(2); version 2, after Insert(4).]
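Path copying for insertions can be sketched as follows. This is an illustrative sketch only: the tree is an unbalanced binary search tree (so the cost per update is O(height), not O(log n)), and the names `Node`, `insert`, `contains`, and `roots` as well as the inserted keys are assumptions.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """Immutable tree node; an update never changes an existing node."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key):
    """Return the root of a new version; nodes off the search path are shared."""
    if root is None:
        return Node(key)
    if key < root.key:
        return root._replace(left=insert(root.left, key))    # copy this node
    if key > root.key:
        return root._replace(right=insert(root.right, key))  # copy this node
    return root  # key already present: nothing to copy


def contains(root, key):
    """Ordinary search, started at the root of the desired version."""
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False


# roots[t] is the root of version t; version 0 is the empty tree in this sketch.
roots = [None]
for key in [5, 1, 7, 3]:
    roots.append(insert(roots[-1], key))
```

Searching an old version is just `contains(roots[t], key)`; each update adds one entry to `roots` and copies only the nodes on the search path.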

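Slide 17: Node-copying method: DSST
DSST method: extend each node by a time-stamped modification box. Modification boxes are initially empty and are filled bottom-up.
[Figure: a node with key k, pointers lp and rp, and a modification box holding "t: rp". All versions before time t use the original pointer; all versions after time t use the pointer in the modification box.]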

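Slide 18: DSST method
[Figure: version 0.]

Slide 19: DSST method
[Figure: version 0, with pointer label lp.]

Slide 20: DSST method
[Figure: version 1, after Insert(2); version 2, after Insert(4); pointer label lp.]

Slide 21: DSST method
The amortized costs (time and space) per update operation are O(1).
[Figure: version 1, after Insert(2); version 2, after Insert(4); pointer labels lp and rp.]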

Slide 22: Node-copying method - partial persistence
Modification:
- If the modification box is empty, fill it.
- Otherwise, make a copy of the node using only the latest values (the value in the modification box plus the value we want to insert); the copy's modification box is empty.
- Cascade this change to the node's parent.
- If the node is a root, add the new root to a sorted array of roots.
Access time gets an O(1) slowdown per node, plus an additive O(log m) cost for finding the correct root.
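The modification rules above can be sketched for insertions into an unbalanced binary search tree. This is a simplified illustration under stated assumptions, not the full DSST construction: each node has exactly one modification box, deletions and rebalancing are left out, and the names `PNode` and `PersistentBST` are hypothetical.

```python
import bisect


class PNode:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.ptr = {"left": left, "right": right}  # original pointer fields
        self.mod = None                            # modification box: (version, field, value)

    def get(self, field, version):
        """Pointer value as seen by `version`: O(1) extra work per node."""
        if self.mod is not None and self.mod[1] == field and self.mod[0] <= version:
            return self.mod[2]
        return self.ptr[field]


class PersistentBST:
    def __init__(self):
        self.root_versions = [0]  # sorted version stamps at which the root changed
        self.roots = [None]       # roots[j] is the root from root_versions[j] on
        self.now = 0              # newest version

    def root(self, version):
        j = bisect.bisect_right(self.root_versions, version) - 1  # O(log m)
        return self.roots[j]

    def contains(self, key, version):
        node = self.root(version)
        while node is not None:
            if key == node.key:
                return True
            node = node.get("left" if key < node.key else "right", version)
        return False

    def insert(self, key):
        self.now += 1
        path, node = [], self.root(self.now)
        while node is not None:
            if key == node.key:
                return self.now  # key already present: new version equals the old one
            field = "left" if key < node.key else "right"
            path.append((node, field))
            node = node.get(field, self.now)
        child = PNode(key)
        while path:
            node, field = path.pop()
            if node.mod is None:
                node.mod = (self.now, field, child)  # box empty: fill it and stop
                return self.now
            # Box full: copy the node with its latest values and an empty box,
            # then cascade the pointer change to the parent.
            copy = PNode(node.key, node.get("left", self.now), node.get("right", self.now))
            copy.ptr[field] = child
            child = copy
        self.root_versions.append(self.now)  # the cascade reached the root
        self.roots.append(child)
        return self.now
```

Old versions keep reaching the old nodes through pointers that were valid at their time, while the newest version reaches the copies.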

Slide 23: Node-copying method - Example
A partially persistent search tree. Insertions: 5, 3, 13, 15, 1, 9, 7, 11, 10, followed by the deletion of one item.

Slide 24: Node-copying method - partial persistence
The amortized costs (time and space) per modification are O(1). Proof: by the potential technique.

Slide 25: Potential technique
- The potential is a function of the entire data structure.
- Definition (potential function): a measure of a data structure whose change after an operation corresponds to the time cost of the operation.
- The potential has to be zero initially and non-negative for all versions.
- The amortized cost of an operation is the actual cost plus the change in potential.
- Different potential functions lead to different amortized bounds.

Slide 26: Node-copying method - partial persistence
Definitions:
- Live nodes: the nodes that form the latest version (reachable from the root of the most recent version); all other nodes are dead.
- Full live nodes: live nodes whose modification boxes are full.

Slide 27: Node-copying method - potential paradigm
- The potential function f(T): the number of full live nodes in T (initially zero).
- The amortized cost of an operation is the actual cost plus the change in potential Δf.
- Each modification involves some number k of copies, each with O(1) space and time cost, and at most one change to a modification box with O(1) time cost.
- Change in potential after update operation i: Δf ≤ 1 - k, since each copy replaces a full live node by a live node with an empty modification box, and at most one modification box is newly filled.
- Space: O(k + Δf); time: O(k + 1 + Δf).
Hence, a modification takes O(1) amortized space and O(1) amortized time.
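The amortized bounds can be written out as a short worked calculation. The constant a (an upper bound on the cost of one copy or one box change) and the scaled potential Φ = a·f are assumptions introduced only for this sketch; k is the number of copies made by the modification.

```latex
\begin{align*}
\Delta f &\le 1 - k
  && \text{each copy turns one full live node into a dead node,} \\
  &&& \text{and at most one modification box is newly filled} \\
\text{amortized space} &\le a\,k + a\,\Delta f \le a\,k + a\,(1 - k) = a = O(1) \\
\text{amortized time}  &\le a\,(k + 1) + a\,\Delta f \le a\,(k + 1) + a\,(1 - k) = 2a = O(1)
\end{align*}
```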

Slide 28: Application: Planar point location
Suppose that the Euclidean plane is subdivided into polygons by n line segments that intersect only at their endpoints. Given such a polygonal subdivision and an on-line sequence of query points in the plane, the planar point location problem is to determine, for each query point, the polygon containing it. An algorithm is measured by three parameters:
1) The preprocessing time.
2) The space required for the data structure.
3) The time per query.

Slide 29: Planar point location -- example
[Figure.]

Slide 30: Planar point location -- example
[Figure.]

Slide 31: Solving planar point location (cont.)
- Partition the plane into vertical slabs by drawing a vertical line through each endpoint. Within each slab the lines are totally ordered.
- Allocate one search tree per slab that contains the lines at its leaves; with each line, associate the polygon above it.
- Allocate another search tree on the x-coordinates of the vertical lines.

Slide 32: Solving planar point location (cont.)
To answer a query, first find the appropriate slab, then search within the slab to find the polygon.
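The slab method can be sketched with sorted lists standing in for the search trees. This is a minimal illustration under several assumptions: segments are given as `((x1, y1), (x2, y2))` with x1 < x2 (no vertical segments), the query point lies to the left of the last vertical line, and the function returns the segment directly below the query point, which identifies the polygon above that segment. The names `y_at`, `build_slabs`, and `segment_below` are hypothetical.

```python
import bisect


def y_at(seg, x):
    """y-coordinate of segment seg at abscissa x (seg must span x)."""
    (x1, y1), (x2, y2) = seg
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)


def build_slabs(segments):
    """Return the sorted slab boundaries and, per slab, its segments in bottom-to-top order."""
    xs = sorted({p[0] for seg in segments for p in seg})
    slabs = []
    for left, right in zip(xs, xs[1:]):
        mid = (left + right) / 2
        crossing = [s for s in segments if s[0][0] <= left and right <= s[1][0]]
        crossing.sort(key=lambda s: y_at(s, mid))  # total order inside the slab
        slabs.append(crossing)
    return xs, slabs


def segment_below(xs, slabs, q):
    """Return the segment directly below query point q, or None if there is none."""
    qx, qy = q
    j = bisect.bisect_right(xs, qx) - 1  # first search: find the slab
    if j < 0 or j >= len(slabs):
        return None
    slab = slabs[j]
    lo, hi = 0, len(slab)                # second search: inside the slab
    while lo < hi:
        mid = (lo + hi) // 2
        if y_at(slab[mid], qx) <= qy:
            lo = mid + 1
        else:
            hi = mid
    return slab[lo - 1] if lo > 0 else None
```

Both searches are binary searches, so a query takes O(log n) time; the per-slab lists are what makes the space quadratic in the worst case.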

Slide 33: Planar point location -- example
[Figure.]

Slide 34: Planar point location -- analysis
Query time is O(log n). How about the space? Ω(n^2) in the worst case, and so could be the preprocessing time.

Slide 35: Planar point location -- bad example
The total number of lines is O(n), and the number of lines in each slab is O(n).

Slide 36: Planar point location & persistence
So how do we improve the space bound? Key observation: the lists of lines in adjacent slabs are very similar. Create the search tree for the first slab. Then obtain the next one by deleting the lines that end at the corresponding vertex and adding the lines that start at that vertex. How many insertions and deletions are there altogether? 2n.

Slide 37: Planar point location & persistence (cont.)
Updates should be persistent, since we need all search trees at the end. Partial persistence is enough. We already have the path-copying method, so let's use it. What do we get? O(n log n) space and O(n log n) preprocessing time. We can improve the space bound to O(n) by using the DSST method.

Slide 38: Methods for making structures oblivious
Unique representation of the structure:
- Set/size uniqueness: for each set of n keys there is exactly one structure that can store such a set.
- Order uniqueness: the storage is order-unique, i.e. the nodes of the structure are ordered, and the keys are stored in ascending order in nodes with ascending numbers.
Randomise the structure: ensure that the probability that a given structure occurs as the representation of a set M of keys is independent of how M was generated.
Observation: the assignment of addresses to pointers must itself be randomised!

Slide 39: Example of a randomised structure
Z-stratified search tree: on each stratum, randomly choose the distribution of trees from Z. Insertion? Deletion?
[Figure: strata of a Z-stratified search tree.]

Slide 40: Uniquely represented structures
(a) Generation history determines structure.
(b) Set-uniqueness: the set determines the structure.
[Figure: the insertion sequences 1, 3, 5, 7 and 5, 1, 3, 7, and the key set 1, 3, 5, ...]

Slide 41: Uniquely represented structures
(c) Size-uniqueness: the size determines the structure.
[Figure: the key sets 1, 3, 5, 7 and 2, 4, 5, 8 share a common structure.]
Order-uniqueness: a fixed ordering of the nodes determines where the keys are to be stored.

Slide 42: Set- and order-unique structures
Lower bounds?
Assumptions: a dictionary of size n is represented by a graph of n nodes.
- The node degree is finite (fixed).
- The order of the nodes is fixed.
- The i-th node stores the i-th largest key.
Operations allowed to change a graph:
- creation or removal of a node
- pointer change
- exchange of keys
Theorem: for each set- and order-unique representation of a dictionary with n keys, at least one of the operations access, insertion, or deletion must require time Ω(n^(1/3)).

Slide 43: Uniquely represented dictionaries
Problem: find set-unique or size-unique representations of the ADT "dictionary".
Known solutions:
(1) Set-unique, order-unique: Aragon/Seidel, FOCS 1989: Randomized Search Trees. Each key s ∈ X is stored as the pair (s, h(s)), where the priority h(s) is given by a universal hash function h. Updates work as for priority search trees. Search, insert, and delete can be carried out in O(log n) expected time.
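The idea that hashed priorities make the tree shape depend only on the key set can be sketched as follows. This is an illustration under assumptions: SHA-256 serves as a fixed stand-in for the hash function h (it is not a universal family), trees are nested tuples `(left, key, right)`, deletion is omitted, and the names `priority`, `insert`, and `build` are hypothetical.

```python
import hashlib


def priority(key):
    """Stand-in for h(s): a fixed hash of the key (assumption, not a universal hash family)."""
    return hashlib.sha256(repr(key).encode()).digest()


def insert(t, key):
    """Insert key into treap t; search-tree order on keys, max-heap order on priorities."""
    if t is None:
        return (None, key, None)
    left, k, right = t
    if key == k:
        return t
    if key < k:
        left = insert(left, key)
        if priority(left[1]) > priority(k):   # heap order violated: rotate right
            ll, lk, lr = left
            return (ll, lk, (lr, k, right))
        return (left, k, right)
    right = insert(right, key)
    if priority(right[1]) > priority(k):      # heap order violated: rotate left
        rl, rk, rr = right
        return ((left, k, rl), rk, rr)
    return (left, k, right)


def build(keys):
    """Insert the keys one by one in the given order."""
    t = None
    for key in keys:
        t = insert(t, key)
    return t


# Two generation histories of the same set yield the same tree.
same_shape = build([1, 3, 5, 7]) == build([5, 1, 3, 7])
```

Because the priorities are a function of the keys alone and are distinct, the tree is the unique one that satisfies both orders, whatever the insertion sequence was.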

Slide 44: The Jelly Fish
(2) L. Snyder, 1976: set-unique, order-unique. Upper bound: with the Jelly Fish, search, insert, and delete take time O(√n).
- Body: √n nodes.
- √n tentacles of length √n each.

Slide 45: Lower bound for tree-based structures
Set-unique, order-unique. Lower bound: for "tree-based" structures the following holds: update time · search time = Ω(n).
Let the tree have height h and L leaves. The number of nodes satisfies n ≤ h·L + 1, hence L ≥ (n - 1)/h. Consider deleting x_1 and inserting x_{n+1} > x_n: at least L - 1 keys must move from leaves to internal nodes. Therefore, the update requires time Ω(L).
[Figure: a tree of height h with L leaves storing the keys x_1, ..., x_n.]

Slide 46: Cons-structures
(3) Sundar/Tarjan, STOC 1990. Upper bound: (nearly) full binary search trees. The only operation permitted for updates is Cons. Search time is O(log n); insertion and deletion are possible in time O(√n).
[Figure: Cons combines a left subtree L, a key x, and a right subtree R into one tree with root x.]

Slide 47: Jump-lists
(Half-dynamic) 2-level jump-list of size n:
- Search: O(i) = O(√n) time.
- Insertion, deletion: O(√n) time.
[Figure: jump-list with marked positions 0, i, 2i, ..., ⌊(n-1)/i⌋·i, n, and the tail.]

Slide 48: Jump-lists: Dynamization
2-level jump-list of size n:
- Search: O(i) = O(√n) time.
- Insert, delete: O(√n) time.
Can be made fully dynamic.
[Figure: number line marking (i-1)^2, i^2, n, (i+1)^2, (i+2)^2.]
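Searching a 2-level jump-list can be sketched on a sorted Python list, where index arithmetic stands in for following pointers. This is an assumption-laden illustration: level-1 pointers are taken to have length i = ⌈√n⌉, updates are not modelled, and the function name `jump_search` is hypothetical.

```python
import math


def jump_search(keys, x):
    """Search the sorted list `keys` for x; return (found, number of pointers followed)."""
    n = len(keys)
    if n == 0:
        return False, 0
    i = math.isqrt(n - 1) + 1  # ceil(sqrt(n)): assumed length of a level-1 pointer
    pos, steps = 0, 0
    # Level 1: jump i nodes at a time as long as the target is not overshot.
    while pos + i < n and keys[pos + i] <= x:
        pos += i
        steps += 1
    # Level 0: walk single nodes inside the remaining block of at most i keys.
    while pos + 1 < n and keys[pos + 1] <= x:
        pos += 1
        steps += 1
    return keys[pos] == x, steps
```

At most about n/i level-1 pointers and fewer than i level-0 pointers are followed, so the search takes O(i) = O(√n) steps.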

Slide 49: 3-level jump-lists
Search(x): locate x by following
- level-2 pointers, identifying i^2 keys among which x may occur,
- level-1 pointers, identifying i keys among which x may occur,
- level-0 pointers, identifying x.
Time: O(i) = O(n^(1/3)).
[Figure: level 2 of the list with marked positions 0, i, 2i, ..., i^2, i^2+i, ..., 2·i^2.]

Slide 50: 3-level jump-lists
An update requires
- changing 2 pointers on level 0,
- changing i pointers on level 1,
- changing all i pointers on level 2.
Update time: O(i) = O(n^(1/3)).
[Figure: level 2 of the list with marked positions 0, i, 2i, ..., i^2, i^2+i, ..., 2·i^2.]

Slide 51: c-level jump-lists
Let ...
Lower levels:
- level 0: all pointers of length 1: ...
- level j: all pointers of length i^(j-1): ...
- level c/2: ...
Upper levels:
- level j: connect in a list all nodes 1, 1·i^(j-1)+1, 2·i^(j-1)+1, 3·i^(j-1)+1, ...
- level c: ...

Slide 52: c-level jump-lists
Theorem: for each c ≥ 3, the c-level jump-list is a size- and order-unique representation of dictionaries with the following characteristics:
- Space requirement: O(c·n)
- Access time: O(c·n^(1/c))
- Update time: ... if n is even, ... if n is odd