Locality Aware Network Solutions Dahlia Malkhi The Hebrew University of Jerusalem
2 A Brief Overview of Distributed Computing The 90’s: –Internet activity: Web browsing –Paradigm: Client-server –Techniques: cluster computing, Paxos, group communication
3 A Brief Overview of Distributed Computing The 00’s: –Internet activity: File sharing –Paradigm: P2P, grid, web-services –Techniques: overlay networks, content distribution networks, resource location
4 Application: IPv6 Routing over IPv4 [van Renesse 02] Distributed Hash Tables (DHT) (figure: nodes labeled with IPv6-style addresses)
5 Application: Content Delivery / Finding Nearest Copies of Data
6 Application: Hyperencryption [Maurer 92, Ding & Rabin 02] (figure labels: random bits, Alice, Bob, key, adversary bits)
7 Application: A Hyperencryption P2P Network [Rabin 03] Distributed Hash Table (DHT)
8 Application: A Distributed Google?
9 Scalable Network Solutions Overlay networks provide added functionality at the application level –Search, routing, location services Network theory provides the foundations –Possibilities, impossibilities, lower/upper bounds Practical solutions require flexible deployment
10 Distributed Data Structures (DDS) Peers jointly implement a data structure, e.g., hash table Route queries based on data-name (key)
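A minimal sketch of key-based responsibility, assuming a Chord-like successor rule on a hashed identifier ring. The rule, the helper names `ident` and `responsible_peer`, and the 32-bit identifier space are illustrative assumptions and are not taken from the slides.

```python
import hashlib
from bisect import bisect_left


def ident(name: str, bits: int = 32) -> int:
    """Hash a peer name or data key to a point in a 2**bits identifier space."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % (1 << bits)


def responsible_peer(key: str, peers: list[str]) -> str:
    """Return the peer whose identifier is the first at or after the key's
    identifier, wrapping around the ring (assumed successor rule)."""
    ring = sorted((ident(p), p) for p in peers)
    ids = [i for i, _ in ring]
    pos = bisect_left(ids, ident(key)) % len(ring)
    return ring[pos][1]
```

Under this assumption a lookup for a key becomes a routing problem: deliver the query to the peer that `responsible_peer` selects, without any single node knowing the full peer list.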
11 DDS Problem Reduced to Routing (figure label: Responsible for ?)
12 Why classic routing network designs don’t help Static # of nodes a priori known Node labels designated by network designer
13 DDS Reduced to Routing The problem: Overlay routing network –Variants: labeled routing, name-independent routing, finding nearest copies Dynamic emulation
14 Distributed Hash Tables [Malkhi, Naor, Ratajczak, PODC 2002]
Scheme                             Degree   Route
Chord, Tapestry, Pastry [2001]     log n    log n
CAN [2001]                         d        d*n^(1/d)
Viceroy [2002]                     5        log n
Koorde, D2B, DH, generic [2003]    2        log n
[Abraham, Awerbuch, Azar, Bartal, Malkhi, Pavlov, IPDPS 2003]
15 Tree View of Dynamic Graphs Leaves of the tree represent current nodes Inner nodes in the tree represent nodes that were split Example: merge of 000, 001 into 00
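The tree view can be sketched with binary prefix labels: the set of leaf labels is the current set of nodes, a split replaces a leaf by its two children, and a merge does the reverse. The function names and the prefix-matching `owner` rule below are illustrative assumptions.

```python
def split(leaves: set[str], label: str) -> set[str]:
    """A joining node splits leaf `label` into children label+'0' and label+'1'."""
    if label not in leaves:
        raise ValueError(f"{label!r} is not a current leaf")
    return (leaves - {label}) | {label + "0", label + "1"}


def merge(leaves: set[str], parent: str) -> set[str]:
    """A departure merges sibling leaves parent+'0' and parent+'1' into `parent`."""
    children = {parent + "0", parent + "1"}
    if not children <= leaves:
        raise ValueError(f"both children of {parent!r} must be leaves")
    return (leaves - children) | {parent}


def owner(leaves: set[str], key_bits: str) -> str:
    """Return the unique leaf whose label is a prefix of the key's bit string."""
    for label in leaves:
        if key_bits.startswith(label):
            return label
    raise ValueError("key is shorter than every matching leaf label")
```

For example, starting from leaves `{"000", "001", "01", "1"}`, `merge(leaves, "00")` yields `{"00", "01", "1"}`, which is the merge of 000 and 001 into 00 from the slide.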
16 Locality awareness source target
17 Locality awareness source target
18 Locality Awareness in Overlay Networks Model the network as a weighted undirected graph –$c(x, y)$: cost of shortest path from $x$ to $y$ –$c()$ is a metric An overlay network is a sub-graph Let $x = x_0, x_1, \ldots, x_t = y$ be a route in the overlay network Stretch: Ratio between overlay route cost and shortest path cost: $\left( c(x_0, x_1) + c(x_1, x_2) + \cdots + c(x_{t-1}, x_t) \right) / c(x, y)$
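The stretch definition translates directly into a few lines of code. This sketch assumes the cost function is supplied by the caller; the Euclidean example in the usage note is only an illustration.

```python
import math
from typing import Callable, Sequence


def stretch(route: Sequence, cost: Callable[[object, object], float]) -> float:
    """Ratio of the overlay route cost to the shortest-path cost c(x, y),
    where x is the first node on the route and y is the last."""
    if len(route) < 2:
        raise ValueError("a route needs at least a source and a target")
    overlay_cost = sum(cost(a, b) for a, b in zip(route, route[1:]))
    return overlay_cost / cost(route[0], route[-1])


# Illustrative metric: Euclidean distance between points in the plane.
euclid = math.dist
```

With `euclid` as the metric, the route `[(0, 0), (3, 4), (6, 0)]` costs 5 + 5 = 10 against a direct cost of 6, so its stretch is 10/6. A direct route always has stretch 1.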
19 Overlay Networks in Growth-Bounded Metrics Previous work: –[Plaxton, Rajaraman, Richa 1997], Tapestry (Berkeley), Pastry (MS UK) –Expected (large) constant stretch –Logarithmic node degree LAND [Abraham, Malkhi, Dobzinski, SODA 2004]: –Guaranteed stretch (1+ε) –Expected logarithmic node degree, constant depends on growth-bound –Simple, intuitive construction and proofs
20 Overlay Networks in Geometric Spaces Modeling the Internet as a geometric space is practical –Ubiquitous GPS devices –Successful embeddings in virtual coordinate-space Problem 1: Locate nodes Problem 2: Route to known coordinates
21 Location Services and Routing in Geometric Spaces LLS: First fully-locality aware location service [Abraham Dolev Malkhi 2004] –bounded stretch lookup –bounded stretch update First constant-degree routing scheme (to known coordinates) [Abraham Malkhi, PODC 2004] –constant node degree, logarithmic hops, 1+ε stretch
22 Routing in Arbitrary Graphs: Lower and Upper Bounds Name-independent routing: node names are independent of the routing scheme [Awerbuch, Bar-Noy, Linial, Peleg 1989] Lower bounds [Gavoille, Gengler 2001]: –Stretch < 3 requires Ω(n) routing information –Stretch < 5 requires Ω(√n) routing information Upper bound [Abraham, Gavoille, Malkhi, Nisan, Thorup, SPAA 2004]: –Stretch-3 routing with O(√n) routing information –Stretch 3 is indeed attainable! General upper bound [Abraham, Gavoille, Malkhi, DISC 2004]: –Stretch-k routing with memory $O(k^2 \sqrt[k]{n})$
23 EVERGROW: The Vision Ever-growing global scale-free networks, their provisioning, repair and unique functions: ultimate RAID, ultimate GNUTELLA, ultimate GOOGLE, ultimate AKAMAI Infrastructure and new methods and systems devoted to measurement, mock-up and analysis of present and future network traffic, topology and logical structure, to bridge the gap in theory, protocols and understanding to what the Internet can be in An EC project. Coordinators: SICS (Sweden) and HUJI (Jerusalem)
24 Locality-Aware, Robust Overlay for Information Lookup and Content Delivery Degree O(√n) Locality awareness: –Formally stretch 3 –For far-apart nodes, lower stretch Mostly two-hop –Whenever full connectivity exists Flexibility –Estimate √n roughly –Cache information on many vicinity nodes –Store information about any known node of same color Fault tolerance: –Multiple route choices –Quick repair –Maintain QoS in face of churn
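A simplified sketch of a color-and-vicinity overlay in the spirit of this slide, assuming full connectivity and points in the plane with Euclidean cost. The hash-based coloring, the table layout and all function names are illustrative assumptions, not the published scheme. Each node stores its nearest neighbors (its vicinity) plus every node of its own color, and a lookup takes at most two hops.

```python
import hashlib
import math


def color_of(name: str, num_colors: int) -> int:
    """Assign a pseudo-random color by hashing the node name (assumed rule)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_colors


def build_tables(points: dict, vicinity_size: int, num_colors: int) -> dict:
    """For each node: (vicinity sorted by increasing distance, same-color set)."""
    tables = {}
    for u, pu in points.items():
        others = [v for v in points if v != u]
        others.sort(key=lambda v: math.dist(pu, points[v]))
        vicinity = others[:vicinity_size]
        same_color = {v for v in others
                      if color_of(v, num_colors) == color_of(u, num_colors)}
        tables[u] = (vicinity, same_color)
    return tables


def route(src: str, dst: str, tables: dict, num_colors: int):
    """Return a path of at most two hops, or None if the vicinity of `src`
    holds no node with the color of `dst`."""
    if src == dst:
        return [src]
    vicinity, same_color = tables[src]
    if dst in vicinity or dst in same_color:
        return [src, dst]              # direct hop: src already knows dst
    for relay in vicinity:             # nearest relay of the right color
        if color_of(relay, num_colors) == color_of(dst, num_colors):
            return [src, relay, dst]   # relay stores every node of its color
    return None


def route_cost(path: list, points: dict) -> float:
    """Total Euclidean cost of a path."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(path, path[1:]))
```

With roughly √n colors and a vicinity of about √n log n nodes, each table holds on the order of √n entries up to logarithmic factors, and every color is likely to appear in every vicinity. The sketch returns `None` in the rare case where it does not, instead of guessing a fallback.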
26 Large Scale Content Delivery Initially split the content Then cross-exchange data pieces Solutions build on top of overlay routing networks
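The split-then-cross-exchange idea can be sketched as a small in-memory simulation. This is an illustration under simple assumptions (equal-size pieces, one piece per client, no failures), with hypothetical function names, and not a description of any particular system's implementation.

```python
def split_content(data: bytes, num_clients: int) -> list[bytes]:
    """Cut the content into num_clients consecutive pieces of near-equal size."""
    size = -(-len(data) // num_clients)  # ceiling division
    return [data[i * size:(i + 1) * size] for i in range(num_clients)]


def distribute(data: bytes, num_clients: int) -> list[bytes]:
    """Simulate both phases and return the content each client reassembles."""
    pieces = split_content(data, num_clients)

    # Phase 1: the source sends piece i to client i only.
    held = [{i: pieces[i]} for i in range(num_clients)]

    # Phase 2: every client forwards its own piece to all other clients.
    for i in range(num_clients):
        for j in range(num_clients):
            if i != j:
                held[j][i] = held[i][i]

    # Each client now holds every piece and can rebuild the content.
    return [b"".join(h[k] for k in range(num_clients)) for h in held]
```

In this sketch the source uploads the content once in total, and the remaining transfers are spread across the clients' links.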
27 FastReplica [Cherkasova, Lee, HP Labs] Phase 1 Source Clients
28 FastReplica [Cherkasova, Lee, HP Labs] Phase 2 Source Clients
29 Locality motivation – Tree example
30 Julia algorithm motivation: “Divide and conquer” First phase
31 Julia algorithm motivation: “Divide and conquer” Second phase
32 Network nodes
33 Nodes’ random identifiers
34 Coloring and Vicinities
35 Coloring and Vicinities
36 Stretch 3 (figure labels: d, ≤ d, ≤ 2d)
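One way to read the bound is the following triangle-inequality argument. It is a sketch under an assumption the slide does not spell out: the source $s$ relays through a node $w$ in its vicinity that knows the target $t$, while $t$ itself lies outside the vicinity, so $w$ is no farther from $s$ than $t$ is. The symbols $s$, $w$, $t$ are introduced here for illustration, with $d = c(s, t)$.

```latex
\begin{align*}
c(s, w) &\le c(s, t) = d
  && \text{($w$ is in the vicinity of $s$, $t$ is not)} \\
c(w, t) &\le c(w, s) + c(s, t) \le 2d
  && \text{(triangle inequality)} \\
c(s, w) + c(w, t) &\le d + 2d = 3d
  && \text{(stretch at most 3)}
\end{align*}
```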
37 The Full Routing Scheme (figure labels: a, b, c, d)