Data Structures on Event Graphs. Bernard Chazelle (Princeton University) and Wolfgang Mulzer (Institut für Informatik, FU Berlin)
Bernard Chazelle and Wolfgang Mulzer – Data Structures on Event Graphs. It's the data. Data can be huge, corrupted, low-entropy, expensive, … Rethink classical algorithms from a data-oriented perspective.
It's the data. We study a model that represents the temporal locality of the data.
A concrete problem: successor search. Given: an ordered universe U of n elements (pictured: x1, x2, …, x10). Goal: maintain a subset S of U supporting successor queries. Operations: Insert(xi), Delete(xi), Successor(xi). Also known as the Union-Split-Find problem.
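The three operations can be made concrete with a deliberately simple baseline. The sketch below keeps S in a sorted Python list, so it is not the van Emde Boas structure and does not achieve O(log log n) time. The class name `SortedSuccessorSet` is hypothetical, and the code assumes that Successor(x) means the smallest element of S that is at least x; the slides do not spell out whether x itself counts.

```python
import bisect


class SortedSuccessorSet:
    """Baseline successor structure: a sorted list with binary search.

    Search costs O(log n); updates cost O(n) because the list shifts.
    This only fixes the semantics of the operations.
    """

    def __init__(self):
        self._keys = []

    def insert(self, x):
        i = bisect.bisect_left(self._keys, x)
        # Inserting an element that is already present is a no-op (assumption).
        if i == len(self._keys) or self._keys[i] != x:
            self._keys.insert(i, x)

    def delete(self, x):
        i = bisect.bisect_left(self._keys, x)
        # Deleting an absent element is a no-op (assumption).
        if i < len(self._keys) and self._keys[i] == x:
            self._keys.pop(i)

    def successor(self, x):
        """Return the smallest element >= x, or None if there is none."""
        i = bisect.bisect_left(self._keys, x)
        return self._keys[i] if i < len(self._keys) else None


s = SortedSuccessorSet()
for key in (7, 9, 5):
    s.insert(key)
print(s.successor(6), s.successor(8))
```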
A concrete problem: successor search. Can be solved in O(log log n) time per operation on a pointer machine [van Emde Boas, Kaas, Zijlstra 77]. This is optimal [Mehlhorn, Näher, Alt 88], [Pătraşcu, Thorup 06].
Event graphs. Given: an ordered universe U of n elements and a labeled, connected, undirected graph G. Each node of G is labeled with an operation Ixi, Dxi, or Sxi (insert, delete, or successor). G is known in advance and can be preprocessed. An adversary walks on G to perform the operations. Similar to Markov chains. [Figure: example event graph with nodes labeled Ix0, Ix7, Dx9, Dx2, Sx7, Ix9, Sx2, Ix5.]
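A small sketch may help fix the model. It stores an event graph as two dictionaries and replays a walk, applying the operation of every visited node. The names `EventGraph`, `apply_op`, and `run_walk` are hypothetical, elements are plain integers, and the code assumes that the set starts empty, that redundant inserts and deletes are no-ops, and that a successor query returns the smallest element at least as large as its argument.

```python
from dataclasses import dataclass


@dataclass
class EventGraph:
    ops: dict  # node id -> (kind, element), kind is "I", "D" or "S"
    adj: dict  # node id -> list of neighbouring node ids (undirected)


def apply_op(op, current):
    """Return the set after performing op on the frozenset current."""
    kind, x = op
    if kind == "I":
        return current | {x}
    if kind == "D":
        return current - {x}
    return current  # a successor query does not change the set


def run_walk(graph, walk):
    """Replay a walk on the graph; return the final set and query answers."""
    current = frozenset()
    answers = []
    previous = None
    for node in walk:
        if previous is not None and node not in graph.adj[previous]:
            raise ValueError(f"{previous} -> {node} is not an edge of G")
        kind, x = graph.ops[node]
        current = apply_op((kind, x), current)
        if kind == "S":
            answers.append(min((y for y in current if y >= x), default=None))
        previous = node
    return current, answers


# Hypothetical example: a path with four nodes.
path = EventGraph(
    ops={0: ("I", 5), 1: ("I", 9), 2: ("S", 7), 3: ("D", 9)},
    adj={0: [1], 1: [0, 2], 2: [1, 3], 3: [2]},
)
print(run_walk(path, [0, 1, 2, 3, 2]))
```

In this example the same query node is visited twice and gives different answers, because the deletion at node 3 happens in between.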
Decorated graphs. The walk of the adversary induces a walk on a much bigger graph. Decorated graph dec(G): a directed graph with vertex set V(G) × Pow(U). A vertex represents the current node of G together with the current set S. Example vertices along a walk: (Dx2, ∅), (Sx2, ∅), (Ix9, {x9}), (Ix5, {x5, x9}).
Decorated graphs. If dec(G) is available, we can perform all operations in constant time. But: the size of dec(G) is exponential.
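The blow-up is easy to observe on a toy instance. The sketch below builds the part of dec(G) that is reachable from one start state by breadth-first search. The function names and the test graph are hypothetical: the graph is a clique that contains one insert node and one delete node for each of k elements, chosen only because every consistent subset becomes reachable. The code assumes that the state (v, S) records the set after the operation of v has been applied, and that the walk starts with the empty set.

```python
from collections import deque


def apply_op(op, current):
    """Return the set after performing op on the frozenset current."""
    kind, x = op
    if kind == "I":
        return current | {x}
    if kind == "D":
        return current - {x}
    return current


def decorated_graph(ops, adj, start):
    """Reachable part of dec(G) as a dict: state -> list of next states."""
    first = (start, apply_op(ops[start], frozenset()))
    edges = {first: []}
    queue = deque([first])
    while queue:
        v, s = queue.popleft()
        for w in adj[v]:
            nxt = (w, apply_op(ops[w], s))
            edges[(v, s)].append(nxt)
            if nxt not in edges:
                edges[nxt] = []
                queue.append(nxt)
    return edges


def insert_delete_clique(k):
    """Hypothetical event graph: nodes I1..Ik and D1..Dk, all adjacent."""
    ops = {}
    for i in range(1, k + 1):
        ops[("I", i)] = ("I", i)
        ops[("D", i)] = ("D", i)
    adj = {v: [w for w in ops if w != v] for v in ops}
    return ops, adj


for k in range(1, 6):
    ops, adj = insert_delete_clique(k)
    states = decorated_graph(ops, adj, ("I", 1))
    print(k, len(ops), len(states))
```

On this family the event graph has 2k nodes while the reachable part of dec(G) has k · 2^k states, so storing dec(G) explicitly is only an option for very small instances.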
Decorated graphs. Questions:
- What can we say about the structure of dec(G)?
- What can we deduce about dec(G), given G?
- In which cases can dec(G) be compressed efficiently?
The structure of decorated graphs. dec(G) contains a unique strongly connected component that has no exit and is reachable from every other node. This component is called the unique sink. [Figure: strongly connected components C1, C2, C3, C4.]
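For a tiny instance the unique sink can be found by brute force, which gives a way to check the statement by hand. The sketch below builds all of dec(G) over V(G) × Pow(U) and keeps the states that can be reached back from everything they reach. It takes exponential time and is only an illustration; it is not the linear-time decision procedure of the theorem. All function names and the three-node example are hypothetical.

```python
from itertools import combinations


def powerset(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def apply_op(op, current):
    kind, x = op
    if kind == "I":
        return current | {x}
    if kind == "D":
        return current - {x}
    return current


def full_decorated_graph(ops, adj, universe):
    """All of dec(G): every pair (node, subset) with its outgoing edges."""
    return {(v, s): [(w, apply_op(ops[w], s)) for w in adj[v]]
            for v in ops for s in powerset(universe)}


def reachable(edges, source):
    """States reachable from source in zero or more steps."""
    seen = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        for w in edges[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def sink_states(edges):
    """States u such that every state reachable from u can reach u again."""
    reach = {u: reachable(edges, u) for u in edges}
    return {u for u in edges if all(u in reach[w] for w in reach[u])}


# Hypothetical example: the path I1 - I2 - D1 over the universe {1, 2}.
ops = {0: ("I", 1), 1: ("I", 2), 2: ("D", 1)}
adj = {0: [1], 1: [0, 2], 2: [1]}
edges = full_decorated_graph(ops, adj, {1, 2})
sink = sink_states(edges)
print(len(edges), len(sink))
```

In this example dec(G) has 12 states and the sink has 4; the other 8 states are transient and all of them lead into the sink.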
The structure of decorated graphs. Theorem: Given a node v ∈ V(G) and a set S ⊆ U, we can decide in time O(|V(G)| + |E(G)|) whether (v, S) lies in the unique sink. Proof idea: We show that for every node in the unique sink there exists a unique certificate in G (a certifying walk). A modified graph search in G can be used to find a certifying walk for (v, S), if it exists.
Can the decorated graph be compressed? Consider the case that G is a path. Theorem: If G is a path, the successor problem can be solved in O(1) time per operation with O(n^{1+ε}) space on a word RAM, where n = |V|. [Figure: path with nodes labeled Ix0, Ix7, Dx2, Sx7, Ix9, Sx2, Ix5, Dx9.]
Can the decorated graph be compressed? Proof: Maintain S in a doubly linked list. Each node in G has a pointer to its predecessor or successor in S. Use this pointer to answer the queries. Only those pointers that will be relevant next need to be maintained; lookup tables are used to update them.
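The linked-list half of the argument can be sketched on its own: once a pointer to the predecessor of x in S is at hand, inserting x, deleting x, and answering a successor query for x each take constant time. The sketch below shows only that part. It does not show how the pointers are kept up to date as the walk moves along the path, which is what the lookup tables are for. The names `Cell` and `LinkedSet` are hypothetical, and the code assumes that the successor of x is the smallest element at least as large as x.

```python
class Cell:
    """One element of S in the doubly linked list."""
    __slots__ = ("key", "prev", "next")

    def __init__(self, key):
        self.key = key
        self.prev = None
        self.next = None


class LinkedSet:
    """S as a sorted doubly linked list with two sentinel cells."""

    def __init__(self):
        self.head = Cell(float("-inf"))
        self.tail = Cell(float("inf"))
        self.head.next = self.tail
        self.tail.prev = self.head

    def insert_after(self, pred, key):
        """Insert key; pred must be the cell with the largest key < key."""
        if pred.next.key == key:  # already present, nothing to do
            return pred.next
        cell = Cell(key)
        cell.prev, cell.next = pred, pred.next
        pred.next.prev = cell
        pred.next = cell
        return cell

    def delete_after(self, pred, key):
        """Delete key if present; pred is the cell with the largest key < key."""
        cell = pred.next
        if cell.key == key:
            pred.next = cell.next
            cell.next.prev = pred

    def successor_after(self, pred):
        """Smallest key following pred, or None if pred is the last element."""
        nxt = pred.next
        return None if nxt is self.tail else nxt.key

    def keys(self):
        out, cell = [], self.head.next
        while cell is not self.tail:
            out.append(cell.key)
            cell = cell.next
        return out


s = LinkedSet()
c5 = s.insert_after(s.head, 5)
c9 = s.insert_after(c5, 9)
c7 = s.insert_after(c5, 7)
print(s.keys(), s.successor_after(c5))
```

Every method touches a constant number of cells, so the cost per operation is dominated by finding `pred`, which is exactly the pointer that each node of G is meant to hold.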
Example. [Figure: a path with nodes labeled Dx3, Dx1, Dx2, Sx5, Sx8, Ix7, Dx9, Ix2, together with the elements x1, x3, x5, x7, x10, ….]
Reducing the space requirement. A naïve implementation uses two lookup tables per node to update the pointers → O(n^2) space usage. This can be improved to O(n^{1+ε}) space. Approach: use spatial decomposition and bootstrapping to compress the lookup tables (cf. [Crochemore et al., 2008]).
What about randomization? We assumed an adversary. But what if the walk on the path is random? Theorem: If the requests are generated by a random walk on a path, the successor problem can be solved in O(1) expected time per operation with O(n) space on a word RAM, where n = |V|.
What about randomization? Proof (sketch): Subdivide the path into segments of √n nodes. The random walk requires Ω(n) steps to leave a segment. Build the quadratic data structure once the walk enters the next segment. Use overlapping segments and deamortization techniques to make it work.
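The waiting-time claim rests on a standard fact about the simple random walk: a walk that starts i steps from one end of a segment of length L needs i(L − i) expected steps to leave it, so a walk that starts well inside a segment stays for a number of steps that is quadratic in the segment length. The sketch below checks this by solving the recurrence h(i) = 1 + (h(i−1) + h(i+1))/2 with h(0) = h(L) = 0 exactly. The function name is hypothetical, and the walk is assumed to move left or right with probability 1/2 each.

```python
from fractions import Fraction


def expected_exit_times(length):
    """Expected steps to leave the interior of 0..length, per start point.

    Solves -h[i-1] + 2*h[i] - h[i+1] = 2 for 0 < i < length with
    h[0] = h[length] = 0, using exact tridiagonal elimination.
    """
    m = length - 1  # number of interior points
    if m <= 0:
        return [Fraction(0)] * (length + 1)
    diag = [Fraction(2)] * m
    rhs = [Fraction(2)] * m
    for i in range(1, m):
        factor = Fraction(-1) / diag[i - 1]  # sub-diagonal entry is -1
        diag[i] = diag[i] + factor           # super-diagonal entry is -1
        rhs[i] = rhs[i] - factor * rhs[i - 1]
    h = [Fraction(0)] * m
    h[m - 1] = rhs[m - 1] / diag[m - 1]
    for i in range(m - 2, -1, -1):
        h[i] = (rhs[i] + h[i + 1]) / diag[i]
    return [Fraction(0)] + h + [Fraction(0)]


for k in (2, 4, 8, 16):
    middle = expected_exit_times(2 * k)[k]
    print(2 * k, middle)  # the middle value equals k * k
```

For a segment of about √n nodes the value in the middle is about n/4, which matches the Ω(n) bound in the proof sketch and leaves enough time to build the quadratic-size structure for the next segment.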
What about more complicated graphs? What if G is a tree, a grid, or something more complicated? The path approach does not work any more. We conjecture that in this case the O(log log n) bound from van Emde Boas trees is optimal (but we do not know). [Figure: example event graphs with nodes labeled by Ix, Dx, and Sx operations.]
Conclusion and open problems. A new way to model request sequences to a data structure. Can be applied to any data structuring problem. More algorithmic questions on decorated graphs, e.g., can we estimate the size of the unique sink efficiently? Can we prove lower bounds for the successor problem on general event graphs?
Thank you!