Distributed Data Structures: A Survey Cyril Gavoille (LaBRI, University of Bordeaux)



Contents
1. Efficient data structures
2. Distributed data structures
3. Informative labeling schemes
4. Conclusion

1. Efficient data structures (Tarjan-like)
Example 1: A static tree T with n vertices.
Question: what is the nearest common ancestor nca(x,y) of two vertices x,y?
Note: the queries (x,y) are not known in advance (on-line queries on a static tree).

[Harel-Tarjan ’84] Each tree with n vertices has a data structure of O(n) space (computable in linear time) such that nca queries can be answered in constant time.
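
The Harel-Tarjan structure itself is intricate. As a hedged illustration of constant-time nca queries on a static tree, here is a minimal sketch of a simpler, swapped-in technique: an Euler tour of the tree plus a sparse table. It uses O(n log n) space rather than the O(n) of the theorem, and the names build_nca, parent and nca are assumptions made for this example.

```python
def build_nca(parent):
    """parent[v] is the parent of vertex v, or None for the root.

    Returns a function nca(x, y). This is an Euler-tour + sparse-table
    sketch (O(n log n) space), not the Harel-Tarjan O(n) structure.
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    root = None
    for v, p in enumerate(parent):
        if p is None:
            root = v
        else:
            children[p].append(v)

    # Euler tour: record a vertex every time the DFS walk stands on it.
    depth, first = [0] * n, [0] * n
    tour = [root]
    stack = [(root, iter(children[root]))]
    while stack:
        v, it = stack[-1]
        c = next(it, None)
        if c is None:
            stack.pop()
            if stack:
                tour.append(stack[-1][0])   # walk back up to the parent
        else:
            depth[c] = depth[v] + 1
            first[c] = len(tour)
            tour.append(c)
            stack.append((c, iter(children[c])))

    # table[j][i] = shallowest vertex among tour[i : i + 2**j].
    table = [tour]
    j = 1
    while (1 << j) <= len(tour):
        prev, half = table[-1], 1 << (j - 1)
        table.append([min(prev[i], prev[i + half], key=depth.__getitem__)
                      for i in range(len(tour) - (1 << j) + 1)])
        j += 1

    def nca(x, y):
        lo, hi = sorted((first[x], first[y]))
        j = (hi - lo + 1).bit_length() - 1
        # Two overlapping power-of-two windows cover tour[lo : hi + 1].
        return min(table[j][lo], table[j][hi - (1 << j) + 1],
                   key=depth.__getitem__)

    return nca
```

All the work happens in build_nca; each later call to nca does two table lookups, whatever the pair (x,y) is, which matches the on-line query setting of the slide.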

Example 2: A weighted graph G with n vertices, and a parameter k ≥ 1.
Question: a k-approximation δ(x,y) of dist(x,y) in G for two vertices x,y, that is, dist(x,y) ≤ δ(x,y) ≤ k·dist(x,y)?

[Thorup-Zwick – J.ACM ’05] Each undirected weighted graph G with n vertices and m edges, and each integer k ≥ 1, has a data structure of O(k·n^(1+1/k)) space (computable in O(km·n^(1/k)) expected time) such that (2k-1)-approximate distance queries can be answered in O(k) time.
Essentially optimal, related to an Erdős conjecture.
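
To make the statement concrete, here is a minimal sketch of the k = 2 case (3-approximate distances), assuming a connected undirected graph given as an adjacency dictionary. The preprocessing below is deliberately naive (one Dijkstra run per vertex) and does not achieve the construction time of the theorem; build_oracle, query, pivot and bunch are names chosen for this example.

```python
import heapq
import random


def dijkstra(adj, src):
    """Exact distances from src; adj maps u to a list of (v, weight)."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def build_oracle(adj, seed=0):
    """k = 2 sketch: sample each vertex with probability n^(-1/2)."""
    rng = random.Random(seed)
    nodes = list(adj)
    p = len(nodes) ** -0.5
    # Fall back to one arbitrary vertex if the random sample is empty.
    sample = {v for v in nodes if rng.random() < p} or {nodes[0]}
    pivot, bunch = {}, {}
    for v in nodes:
        d = dijkstra(adj, v)                      # naive preprocessing
        pivot[v] = min(sample, key=d.__getitem__)  # nearest sampled vertex
        r = d[pivot[v]]
        # Bunch of v: all sampled vertices, plus every vertex strictly
        # closer to v than its pivot.
        bunch[v] = {w: d[w] for w in nodes if w in sample or d[w] < r}
    return pivot, bunch


def query(oracle, u, v):
    """Returns delta with dist(u,v) <= delta <= 3 * dist(u,v)."""
    pivot, bunch = oracle
    if v in bunch[u]:
        return bunch[u][v]                 # exact distance is stored
    w = pivot[u]                           # w is sampled, so w is in bunch[v]
    return bunch[u][w] + bunch[v][w]
```

The stretch bound follows from the triangle inequality: if v is not in the bunch of u, then dist(u, pivot) ≤ dist(u,v), so the detour through the pivot costs at most 3·dist(u,v). Note that a query reads only the bunches of u and v.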

2. Distributed data structures
Typical question: answer a query Q using only the local knowledge of a vertex x of the network (or of its vicinity), without any access to a global data structure.

Example 1: Distributed Hash Tables (DHT)
Query at x: who has any mpeg file named ‘‘Sta*Wa*’’?
Answer: go to w and ask it. x does not know, but w certainly knows … at least a pointer.
[Figure: a set of peers forming a logical network]
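
A minimal sketch of the "go to w and ask it" idea, assuming SHA-1 identifiers placed on a ring in the style of consistent hashing. It handles exact file names only (not wildcard patterns such as the one on the slide), and it keeps the whole ring in one object instead of routing between peers. Ring, ident, publish and lookup are hypothetical names.

```python
import hashlib
from bisect import bisect_left


def ident(name: str) -> int:
    """160-bit identifier of a peer name or of a key (assumption: SHA-1)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class Ring:
    def __init__(self, peers):
        self.ids = sorted((ident(p), p) for p in peers)
        self.store = {p: {} for p in peers}   # local table of each peer

    def responsible(self, key):
        """Peer w in charge of key: first peer id at or after ident(key)."""
        i = bisect_left(self.ids, (ident(key), ""))
        return self.ids[i % len(self.ids)][1]  # wrap around the ring

    def publish(self, key, holder):
        """Store at w a pointer saying that holder has the file named key."""
        self.store[self.responsible(key)].setdefault(key, set()).add(holder)

    def lookup(self, key):
        """Any peer can compute w from the key alone, then ask w."""
        w = self.responsible(key)
        return w, self.store[w].get(key, set())
```

The point of the design is that x needs no table about files: hashing the key tells it which peer w to ask, and w holds at least a pointer to the peers that own the file.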

Example 2: Routing in a physical network
Query at x: what is the next hop to go to y?

Example 3: in a dynamic setting
A growing rooted tree.
Query at x: the number of descendants of x (or a constant approximation of it).
It is possible to maintain a 2-approximation of the number of descendants with O(log² n) amortized messages of O(log log n) bits each, where n is the number of inserted vertices. [Afek,Awerbuch,Plotkin,Saks – J.ACM ’96]

Goals are:
► The same as for global data structures:
   - Low preprocessing time
   - Small size data structure
   - Fast query time
   - Efficient updates
+ Smaller and balanced local data structures
+ Low communication cost (trade-offs), for answers requiring multiple hops

3. Informative Labeling Schemes
For the talk:
► A static network/graph
► Queries: involve only vertices
► Answers: do not require any communication (direct data structures)

Question: dist(x,y) in a graph G?
Answering dist(x,y) consists only of inspecting the local data structures of x and of y.
Main goal: minimize the maximum size of a local data structure.
Wish: |DS(x,G)| ≪ |DS(G)|, ideally |DS(x,G)| ≈ (1/n)·|DS(G)|.
[Figure: the data structure for graph G, with the local parts of x and y]

[Thorup-Zwick – J.ACM ’05] … Moreover, each vertex w can be given a label L(w) of Õ(n^(1/k) log D) bits (D = weighted diameter of G) such that a (2k-1)-approximation of dist(x,y) can be computed from L(x) and L(y) only.
[Figure: the n^(1+1/k)-size structure split over the n vertices into parts of size n^(1/k), shown for w, y, x. Overlap: Õ(log D)]

Informative labeling schemes (more formally) [Peleg ’00]
Let P be a graph property defined on pairs of vertices (can be extended to any tuple), and let F be a graph family.
A P-labeling scheme for F is a pair ⟨L,f⟩ such that: ∀G ∈ F, ∀u,v ∈ G:
► (labeling) L(u,G) is a binary string
► (decoder) f(L(u,G),L(v,G)) = P(u,v,G)

Some P-labeling schemes
► Adjacency
► Distance (exact or approximate)
► First edge on a (near) shortest path (compact routing, label-based routing)
► Ancestry, parent, nca, sibling relation in trees
► Edge connectivity, flow
► General predicate P described in monadic second order logic [Courcelle]
► Proof labeling systems [Korman,Kutten,Peleg]

Ancestry in rooted trees
Motivation: [Abiteboul,Kaplan,Milo ’01] The … structure of a huge XML database is a rooted tree. Some queries are ancestry relations in this tree. Use a compact index for fast queries in an XML search engine.
Here the constants do matter: saving 1 byte on each entry of the index table is important.
Here n is very large, ~ Ex: Is descendant of ?

Folklore? [Santoro, Khatib ’85]
DFS labeling: each vertex gets an interval, and the ancestry test is [a,b] ⊆ [c,d]?
⇒ 2 log n bit labels
[Figure: a DFS-numbered tree with intervals such as L(x) = [2,18], [13,18], [22,27]]
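
A minimal sketch of the interval labeling, assuming the tree is given as a dictionary of child lists. The label of v is the pair (DFS number of v, largest DFS number in the subtree of v), so it holds two integers of at most n each, and ancestry becomes interval containment. The names interval_labels and is_ancestor are chosen for this example.

```python
def interval_labels(children, root):
    """children maps a vertex to its list of children (leaves may be absent).

    Returns {v: (a, b)} where a is the preorder number of v and b is the
    largest preorder number among the descendants of v.
    """
    label, start, counter = {}, {}, 0
    stack = [(root, False)]
    while stack:
        v, done = stack.pop()
        if done:
            # All descendants of v are numbered: close its interval.
            label[v] = (start[v], counter)
            continue
        counter += 1
        start[v] = counter
        stack.append((v, True))
        for c in reversed(children.get(v, [])):
            stack.append((c, False))
    return label


def is_ancestor(lu, lv):
    """True iff the vertex labeled lu is an ancestor of (or equal to) the
    vertex labeled lv. Only the two labels are inspected."""
    return lu[0] <= lv[0] and lv[1] <= lu[1]
```

The decoder never touches the tree, which is exactly what a labeling scheme requires: the answer is a function of the two labels alone.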

[Alstrup,Rauhe – SODA ’02]
Upper bound: log n + O(√log n) bits
Lower bound: log n + Ω(log log n) bits

Adjacency Labeling / Implicit Representation
P(x,y,G) = 1 iff xy ∈ E(G)
[Kannan,Naor,Rudich – STOC ’92] O(log n) bit labels for:
► trees (and forests)
► bounded arboricity graphs (planar, …)
► bounded treewidth graphs
In particular:
► 2 log n bits for trees
► 4 log n bits for planar
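
The 2 log n bound for trees can be sketched as follows, assuming vertices are numbered 0..n-1 and the tree is rooted arbitrarily: the label of a vertex is its own number followed by the number of its parent. The function names are hypothetical, and the root stores its own number in the parent slot (an encoding choice made for this example).

```python
def tree_adjacency_labels(parent):
    """parent maps each vertex id to its parent id, or None for the root.

    Each label is a pair of ids in range(n), i.e. 2 * ceil(log2 n) bits.
    The root repeats its own id in the parent slot.
    """
    return {v: (v, v if p is None else p) for v, p in parent.items()}


def adjacent(lx, ly):
    """Decoder: x and y are adjacent iff one is the parent of the other."""
    (x, px), (y, py) = lx, ly
    return x != y and (px == y or py == x)
```

The same idea extends to bounded arboricity graphs by storing one parent per forest of the edge decomposition, which is where the 4 log n bound for planar graphs (arboricity at most 3) comes from.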

Actually, the problem is equivalent to an old combinatorial problem: [Babai,Chung,Erdős,Graham,Spencer ’82]
Small Universal Induced Graph
U is a universal graph for the family F if every graph of F is isomorphic to an induced subgraph of U.

Universal graph U (fixed for F)
Graph G of F
|L(x,G)| = ⌈log₂ |V(U)|⌉
[Figure: a graph G of F embedded as an induced subgraph of U]

Best known results/Open questions
► Bounded degree graphs: log n [Alon,Asodi – FOCS ’02]
► Trees: log n + O(log* n) [Alstrup,Rauhe – FOCS ’02]
   ⇒ Planar: 3 log n + O(log* n)
where log* n = min{ i ≥ 0 | log^(i) n ≤ 1 }
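
To give a feeling for how small the additive log* n term is, here is a short sketch that counts how many times the base-2 logarithm must be iterated before the value drops to at most 1. The function name log_star is chosen for this example.

```python
from math import log2


def log_star(n):
    """Smallest i >= 0 such that applying log2 i times to n gives <= 1."""
    i = 0
    while n > 1:
        n = log2(n)
        i += 1
    return i
```

For instance log_star(65536) is 4, and the value only reaches 5 at n = 2^65536, so for any realistic n the term behaves like a small constant.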

Lower bounds?: log n + Ω(1) for planar
No hereditary family with n!·2^O(n) labeled graphs (trees, planar, bounded genus, bounded treewidth, …) is known to require labels of log n + ω(1) bits.
log n + O(1) bits for this family?

Distance
P(x,y,G) = dist(x,y) in G
Motivation: [Peleg ’99] If a short label (say of polylogarithmic size) can be added to the address of the destination, then routing to any destination can be done without routing tables and with a “limited” number of messages.
[Figure: vertices x and y at distance dist(x,y); message header = hop-count]

A selection of results
► Θ(n) bits for general graphs
   - 1.56n bits, but with O(n) time decoder! [Winkler ’83 (Squashed Cube Conjecture)]
   - 11n bits and O(log log n) time decoder [Gavoille,Peleg,Pérennès,Raz ’01]
► Θ(log² n) bits for trees and bounded treewidth graphs, … [Peleg ’99, GPPR ’01]
► Θ(log n) bits and O(1) time decoder for interval, permutation graphs, … [ESA ’03]
   ⇒ O(n) space, O(1) time data structure, even for m = Ω(n²)
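
For trees, an O(log² n)-bit exact distance labeling can be sketched with a centroid decomposition: every vertex stores its distance to each of the O(log n) centroids whose component contains it, and two labels are decoded by minimizing over the centroids they share. This is a hedged illustration of the separator idea, not necessarily the construction of the cited papers, and the names centroid_labels and tree_distance are assumptions.

```python
def centroid_labels(adj):
    """adj maps v to {neighbor: weight} for a tree.

    Returns {v: {centroid: dist(v, centroid)}}, at most
    floor(log2 n) + 1 entries per vertex.
    """
    labels = {v: {} for v in adj}
    removed = set()

    def find_centroid(start):
        # BFS order of the current component, with parent pointers.
        order, parent = [start], {start: None}
        for v in order:
            for u in adj[v]:
                if u not in removed and u not in parent:
                    parent[u] = v
                    order.append(u)
        size = {v: 1 for v in order}
        for v in reversed(order):            # children before parents
            if parent[v] is not None:
                size[parent[v]] += size[v]
        total = len(order)
        for v in order:
            pieces = [total - size[v]]
            pieces += [size[u] for u in adj[v] if parent.get(u) == v]
            if 2 * max(pieces) <= total:     # removing v halves the tree
                return v

    work = [next(iter(adj))]
    while work:
        c = find_centroid(work.pop())
        dist, order = {c: 0}, [c]
        for v in order:                      # distances to c in its component
            for u, w in adj[v].items():
                if u not in removed and u not in dist:
                    dist[u] = dist[v] + w
                    order.append(u)
        for v, d in dist.items():
            labels[v][c] = d
        removed.add(c)
        work.extend(u for u in adj[c] if u not in removed)
    return labels


def tree_distance(lx, ly):
    """Decoder: the first centroid separating x and y lies on their path."""
    return min(lx[c] + ly[c] for c in lx.keys() & ly.keys())
```

With O(log n) entries of O(log n) bits each (for unweighted trees), the label size matches the Θ(log² n) bound quoted above.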

Results (cont’d)
► Θ(log n·log log n) bits and (1+o(1))-approximation for trees and bounded treewidth graphs [GKKPP – ESA ’01]
► More recently: doubling dimension-α graphs
   Every radius-2r ball can be covered by ≤ 2^α radius-r balls
   - Euclidean graphs have α = O(1)
   - Include bounded growth graphs
   - Robust notion

Distance labeling for doubling dimension graphs
O(ε^(-O(α)) log n·log log n) bits and (1+ε)-approximation for doubling dimension-α graphs
[Gupta,Krauthgamer,Lee – FOCS ’03] [Talwar – STOC ’04] [Mendel,Har-Peled – SoCG ’05] [Slivkins – PODC ’05]

Distance labeling for planar
► O(log² n) bits for 3-approximation [Gupta,Kumar,Rastogi – SICOMP ’05]
► O(ε⁻¹ log² n) bits for (1+ε)-approximation [Thorup – J.ACM ’04]
► Ω(n^(1/3)) ≤ ? ≤ Õ(√n) for exact distance

Lower bounds for planar [Gavoille,Peleg,Pérennès,Raz – SODA ’01]
#vertices ~ k³, #critical edges ~ k², #labels = 2k
⇒ |label| > k²/(2k) ~ n^(1/3)

Proof Labeling Systems [Korman,Kutten,Peleg – PODC ’05]
► A graph G with a state S_u at each vertex u: (G,S)
► A global property P (MST, 3-coloring, …)
► A marker algorithm applied on (G,S) that returns a label L(u) for each vertex u
► A binary decoder (checker) for u applied on N(u): f_u = f(S_u, L(u), L(v_1), …, L(v_k)) ∈ {0,1}
G has property P ⇒ f_u = 1 ∀u
G does not have property P ⇒ ∃w, f_w = 0 whatever the labels are
[Figure: a vertex u with neighbors v_1, v_2, v_3 and states S_1, …, S_5]
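
As a hedged illustration of a marker and a checker, consider a property simpler than MST: "the parent pointers stored in the states form a spanning tree of G". The sketch assumes G is connected, vertices know their own unique identifier, and the checker sees neighbor labels keyed by neighbor identifier. Labels are pairs (root id, depth), and the names marker, check and verify are chosen for this example.

```python
def marker(graph, parent):
    """Labels for a legal instance: (root id, depth of v in the tree).

    graph maps v to its neighbors; parent[v] is the state S_v, i.e. the
    parent of v or None for the root. Assumes the pointers do form a tree.
    """
    root = next(v for v in graph if parent[v] is None)

    def depth(v):
        d = 0
        while parent[v] is not None:
            v, d = parent[v], d + 1
        return d

    return {v: (root, depth(v)) for v in graph}


def check(u, state, label, neighbor_labels):
    """Local checker f_u: sees only S_u, L(u) and the labels in N(u)."""
    root, depth = label
    if any(r != root for r, _ in neighbor_labels.values()):
        return 0                      # neighbors disagree on the root
    if state is None:                 # u claims to be the root
        return int(u == root and depth == 0)
    if state not in neighbor_labels:  # the parent must be a neighbor
        return 0
    return int(neighbor_labels[state][1] == depth - 1)


def verify(graph, parent, labels):
    """The instance is accepted iff every vertex outputs 1."""
    return all(
        check(u, parent[u], labels[u], {v: labels[v] for v in graph[u]})
        for u in graph
    )
```

Depths strictly decrease along parent pointers, so no labeling can make a cycle of pointers pass every check, and agreement on a single root id rules out two roots. This is the "whatever the labels are" half of the definition.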

What knowledge is needed for local verification of global properties?

Conclusion
► Labeling schemes for distributed computing are a rich concept.
► Much remains to be done, especially on lower bounds.