Peer-to-Peer Networks 1 Christian Scheideler Institut für Informatik Technische Universität München
Motivation Every distributed system must be based on a network interconnecting its sites Network: of physical or logical nature
Physical Network Supercomputers, multicore systems,…
Logical Network Internet
Overlay Network Internet
Overlay Network Basic question: how to organize sites in a scalable and robust overlay network?
Overview Graph Theory Supervised and Peer-to-Peer Overlay Networks Continuous-Discrete Approach Maintaining a robust Cycle Skip Graphs Locality-aware Overlay Networks Networks for non-uniform Peers
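Graph theory Graph $G=(V,E)$: $V$: set of nodes / vertices. $E \subseteq \{(v,w) \mid v,w \in V\}$: set of edges / arcs. An edge $(v,w)$ means: $v$ knows $w$, i.e., $v$ can send info to $w$. [Figure: example graph on nodes A, B, C, D with a valid path marked]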
Graph theory $\delta(v,w)$: distance (length of shortest path) of $w$ to $v$ in $G$. $D=\max_{v,w} \delta(v,w)$: diameter of $G$. [Figure: example graph on nodes A, B, C, D, annotated $D=4$]
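Graph theory $\Gamma(U)$: set of neighbors of node set $U$. $\alpha(U)=|\Gamma(U)| / |U|$. $\alpha(G) = \min_{U,\,|U| \le |V|/2} \alpha(U)$: expansion of $G$. [Figure: example on nodes A, B, C, D with $|U|=2$ and $|\Gamma(U)|=1$]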
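The definition of expansion can be checked by brute force on tiny graphs. The following is a minimal sketch, assuming that $\Gamma(U)$ counts only neighbors outside $U$ and that subsets of size up to $|V|/2$ are allowed; the function name `expansion` is made up for this illustration. It is exponential in the number of nodes and only meant for toy examples.

```python
from fractions import Fraction
from itertools import combinations


def expansion(nodes, edges):
    """Brute-force alpha(G): minimum of |Gamma(U)|/|U| over all U with 0 < |U| <= |V|/2."""
    adj = {v: set() for v in nodes}
    for v, w in edges:                  # undirected: {v,w} stands for (v,w) and (w,v)
        adj[v].add(w)
        adj[w].add(v)
    best = None
    for size in range(1, len(nodes) // 2 + 1):
        for subset in combinations(nodes, size):
            u = set(subset)
            # Assumption: Gamma(U) contains only neighbors that lie outside U.
            gamma = set().union(*(adj[v] for v in u)) - u
            ratio = Fraction(len(gamma), len(u))
            if best is None or ratio < best:
                best = ratio
    return best


line_nodes = list(range(6))
line_edges = [(i, i + 1) for i in range(5)]
print(expansion(line_nodes, line_edges))
```

For a line of 6 nodes the minimum is attained by one half of the line, which has a single outside neighbor.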
Graph theory Network $G=(V,E,c)$: $V$: set of nodes, $E$: set of edges, $c: E \to \mathbb{R}_+$: edge capacities. [Figure: example network on nodes A, B, C, D with an edge labeled 2]
Graph Theory Unless mentioned otherwise: all edges have capacity 1, and $\{v,w\}$ represents $\{(v,w), (w,v)\}$. [Figure: example graph on nodes A, B, C, D]
Network topologies Ideally, complete network. Problem: does not scale well! ($\sim n^2$ edges)
Line Network Degree 2 (optimal), BUT diameter is bad ($n-1$ for $n$ nodes) and expansion is bad ($\alpha(\text{line}) = 2/n$). How to get a low diameter?
Binary Tree $n=2^{k+1}-1$ nodes (depth $k$), degree 3. Diameter is $2k \approx 2\log_2 n$, BUT expansion is still bad ($\alpha(\text{tree})=2/n$).
2-dimensional Grid $n = k^2$ nodes (side length $k$), maximum degree 4. Diameter is $2(k-1) < 2\sqrt{n}$, expansion is $\sim 2/\sqrt{n}$. Not too bad, but can we get better values?
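Hypercube Nodes: $(x_1,\ldots,x_d) \in \{0,1\}^d$. Edges: $\forall i$: $(x_1,\ldots,x_d) \to (x_1,\ldots,1-x_i,\ldots,x_d)$. [Figure: hypercubes for $d=1$, $d=2$, $d=3$] Degree $d$, diameter $d$, expansion $1/\sqrt{d}$. Routing: $(x_1,x_2,\ldots,x_d) \to (y_1,x_2,\ldots,x_d) \to (y_1,y_2,x_3,\ldots,x_d) \to \ldots \to (y_1,y_2,\ldots,y_d)$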
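The routing rule on this slide fixes the coordinates from left to right. A minimal sketch follows; the function name `hypercube_route` is hypothetical, and coordinates that already agree are skipped so that every hop uses a real hypercube edge.

```python
def hypercube_route(src, dst):
    """Bit-fixing path from src to dst; both are tuples of 0/1 of equal length d."""
    if len(src) != len(dst):
        raise ValueError("source and destination must have the same dimension")
    path = [tuple(src)]
    cur = list(src)
    for i in range(len(cur)):
        if cur[i] != dst[i]:
            cur[i] = dst[i]             # flip coordinate i: one hypercube edge
            path.append(tuple(cur))
    return path


print(hypercube_route((0, 1, 1), (1, 1, 0)))
```

The number of hops equals the Hamming distance of the two labels, so it never exceeds the diameter $d$.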
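Butterfly Nodes: $(k,(x_d,\ldots,x_1)) \in \{0,\ldots,d\} \times \{0,1\}^d$. Edges: $(k-1,(x_d,\ldots,x_1)) \to (k,(x_d,\ldots,x_k,\ldots,x_1))$ and $(k,(x_d,\ldots,1-x_k,\ldots,x_1))$. Degree 4, diameter $2d$, expansion $\sim 1/d$. [Figure: butterfly on columns 00, 01, 10, 11] Routing: $(0,(x_1,x_2,\ldots,x_d)) \to (1,(y_1,x_2,\ldots,x_d)) \to (2,(y_1,y_2,x_3,\ldots,x_d)) \to \ldots \to (d,(y_1,y_2,\ldots,y_d))$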
Cube-Connected-Cycles Nodes: $(k,(x_1,\ldots,x_d)) \in \{0,\ldots,d-1\} \times \{0,1\}^d$. Edges: $(k,(x_1,\ldots,x_d)) \to (k-1,(x_1,\ldots,x_d))$, $(k+1,(x_1,\ldots,x_d))$, $(k,(x_1,\ldots,1-x_{k+1},\ldots,x_d))$
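De Bruijn Graph Nodes: $(x_1,\ldots,x_d) \in \{0,1\}^d$. Edges: $(x_1,\ldots,x_d) \to (0,x_1,\ldots,x_{d-1})$, $(1,x_1,\ldots,x_{d-1})$. [Figure: de Bruijn graphs on nodes 00, 01, 10, 11 and on nodes 000 to 111] Routing: $(x_1,\ldots,x_d) \to (y_d,x_1,\ldots,x_{d-1}) \to (y_{d-1},y_d,x_1,\ldots,x_{d-2}) \to \ldots$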
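The de Bruijn routing rule shifts the destination bits in from the left, last bit first. A minimal sketch, with a hypothetical function name `de_bruijn_route`; it always takes exactly $d$ hops and makes no attempt to shortcut when source and destination labels overlap.

```python
def de_bruijn_route(src, dst):
    """Path from src to dst in the d-dimensional de Bruijn graph (tuples of 0/1)."""
    if len(src) != len(dst):
        raise ValueError("source and destination must have the same dimension")
    cur = tuple(src)
    path = [cur]
    for bit in reversed(dst):           # insert y_d first, then y_{d-1}, ..., y_1
        cur = (bit,) + cur[:-1]         # edge (x_1..x_d) -> (bit, x_1..x_{d-1})
        path.append(cur)
    return path


print(de_bruijn_route((0, 0, 1), (1, 1, 0)))
```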
The Diameter Theorem: Every graph of maximum degree $d>2$ and size $n$ must have a diameter of at least $\frac{\log n}{\log(d-1)}-1$. [Figure: tree of all reachable nodes at distance $k$] Theorem: For every even $d>2$ there is a family of graphs of maximum degree $d$ and size $n$ with diameter $\frac{\log n}{\log d - 1}$.
The Expansion Theorem: For every graph $G$ the expansion $\alpha(G)$ is at most 1. Theorem: There are families of constant degree graphs with constant expansion. Example: Gabber-Galil Graph. Node set: $(x,y) \in \{0,\ldots,n-1\}^2$. Edges: $(x,y) \to (x,x+y)$, $(x,x+y+1)$, $(x+y,y)$, $(x+y+1,y)$ (mod $n$)
Overlay Network Basic question: how to organize sites in a scalable and robust overlay network? Robustness: can handle faults and malicious behavior. Scalability: works efficiently for a large number of sites.
Server-based approach Does not scale well! [Figure: sites connected through the Internet to a single server]
Alternatives Supervised overlay network: a supervisor assists in maintaining the network. Peer-to-peer overlay network: peers maintain the network themselves.
Problem: How to maintain an overlay network as peers join and leave?
Supervised Overlay Network Supervisor assigns peers to points in $[0,1)$ so that peers are evenly distributed. Neighboring peers connect to form a cycle. [Figure: cycle with points 0, 1/8, 1/4, 3/8, 1/2, 5/8, 3/4, 7/8]
Supervised Overlay Network Node v wants to join (n nodes in system): give it the (n+1)th position. Node w wants to leave: move the last node v to w's position. [Figure: cycle $[0,1)$ with nodes v and w]
Supervised Overlay Network v: node at nth position. Supervisor stores pred(v), v, succ(v), succ(succ(v)). [Figure: join and graceful leave operation on the cycle around v]
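The join and leave rules of the supervisor can be sketched as follows. The slides do not spell out which point the i-th position corresponds to; this sketch assumes a bit-reversal order (0, 1/2, 1/4, 3/4, 1/8, ...), which keeps the points evenly spread. The names `position` and `Supervisor` are hypothetical, and the list lookup in `leave` is a simplification rather than the constant-work bookkeeping a real supervisor would use.

```python
from fractions import Fraction


def position(i):
    """Point in [0,1) of the i-th position (i = 0, 1, 2, ...), assuming bit-reversal order."""
    pos, denom = Fraction(0), 2
    while i:
        if i & 1:
            pos += Fraction(1, denom)
        i >>= 1
        denom *= 2
    return pos


class Supervisor:
    def __init__(self):
        self.peers = []                 # peers[i] occupies position(i)

    def join(self, peer):
        """Give the new peer the (n+1)th position and return its point."""
        self.peers.append(peer)
        return position(len(self.peers) - 1)

    def leave(self, peer):
        """Move the peer at the last position into the slot of the leaving peer."""
        idx = self.peers.index(peer)
        last = self.peers.pop()
        if last != peer:
            self.peers[idx] = last

    def cycle(self):
        """Peers in the order in which they appear on the cycle [0,1)."""
        return [p for _, p in sorted((position(i), p) for i, p in enumerate(self.peers))]


s = Supervisor()
for name in "abcde":
    s.join(name)
print(s.cycle())
s.leave("c")
print(s.cycle())
```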
Pure Peer-to-Peer Network We also focus on $[0,1)$. Every peer is mapped to a random point in $[0,1)$. Peers form a cycle based on their points. Chord: cryptographic hash function. CAN: random number.
Continuous-Discrete Approach Problem: cycle is not a good routing topology (long paths)!
Continuous-discrete Approach $V$: set of peers, $U$: virtual space. Each $v \in V$ is mapped to a region $R(v) \subseteq U$. Family $F$ of functions $f: U \to U$. $\{v,w\}$ is an edge $\Leftrightarrow$ $[F(R(v)) \cap R(w)] \cup [F(R(w)) \cap R(v)] \neq \emptyset$
Continuous-discrete Approach Basic questions: How to map peers to regions? What family F to choose?
Continuous-discrete Approach Take a classical family of networks (Hypercube, de Bruijn graph,…) Convert it into continuous form by interpreting node labels as points in U, edges as a family of functions F Mapping peers to regions will then convert continuous form back into discrete graph.
Hypercube Classical hypercube: $V$: nodes with labels $(x_1,\ldots,x_d) \in \{0,1\}^d$. For all $i$: $(x_1,\ldots,x_d) \to (x_1,\ldots,1-x_i,\ldots,x_d)$. Continuous version of hypercube: interpret $(x_1,\ldots,x_d)$ as $z=\sum_i x_i/2^i$. $d \to \infty$: $U=[0,1)$. $F$: $f_i^+(x) = x+1/2^i$, $f_i^-(x) = x-1/2^i$ $\forall i>0$
De Bruijn Graph Classical de Bruijn graph: $V$: nodes with labels $(x_1,\ldots,x_d) \in \{0,1\}^d$. $E$: $(x_1,\ldots,x_d) \to (0,x_1,\ldots,x_{d-1})$, $(1,x_1,\ldots,x_{d-1})$. Continuous de Bruijn graph: interpret $(x_1,\ldots,x_d)$ as $z=\sum_i x_i/2^i$. $d \to \infty$: $U=[0,1)$. $F$: $f_0(x) = x/2$, $f_1(x) = (1+x)/2$
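To see how the continuous de Bruijn graph turns back into a discrete one, the sketch below assumes that each peer is a point in $[0,1)$ and owns the interval from its point to its successor on the cycle. It then applies the edge rule of the continuous-discrete approach with $F = \{f_0, f_1\}$. All function names are hypothetical, and self-loops are left out.

```python
from fractions import Fraction


def regions(points):
    """Region of each peer: [p, succ(p)) on the cycle, as a list of plain intervals."""
    pts = sorted(points)
    regs = {}
    for i, p in enumerate(pts):
        nxt = pts[(i + 1) % len(pts)]
        if p < nxt:
            regs[p] = [(p, nxt)]
        else:                           # the last peer wraps around through 1 back to 0
            regs[p] = [(p, 1)] + ([(0, nxt)] if nxt > 0 else [])
    return regs


def overlaps(a, b):
    """True if the half-open intervals a and b share a point."""
    return max(a[0], b[0]) < min(a[1], b[1])


# f_0 and f_1 are increasing, so the image of [a, b) is [f(a), f(b)).
F = [lambda x: x / 2, lambda x: (1 + x) / 2]


def de_bruijn_edges(points):
    """Undirected edges {v, w}: some f in F maps part of R(v) into R(w), or vice versa."""
    regs = regions(points)
    edges = set()
    for v, rv in regs.items():
        for f in F:
            for a, b in rv:
                image = (f(a), f(b))
                for w, rw in regs.items():
                    if w != v and any(overlaps(image, iv) for iv in rw):
                        edges.add(frozenset((v, w)))
    return edges


peers = [Fraction(i, 4) for i in range(4)]
for edge in sorted(tuple(sorted(e)) for e in de_bruijn_edges(peers)):
    print(edge)
```

With four evenly spaced peers the result is the undirected 2-dimensional de Bruijn graph on the labels 00, 01, 10, 11, read as the points 0, 1/4, 1/2, 3/4.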
Gabber-Galil Graph Classical Gabber-Galil graph: Node set: $(x,y) \in \{0,\ldots,n-1\}^2$. Edges: $(x,y) \to (x,x+y)$, $(x,x+y+1)$, $(x+y,y)$, $(x+y+1,y)$ (mod $n$). Continuous Gabber-Galil graph: $n \to \infty$: $U=[0,1)^2$. $F$: $f_1(x,y)=(x,x+y)$, $f_2(x,y)=(x+y,y)$
Supervised Overlay Network How to map peers to regions? Consider any space $U=[0,1)^d$. Hierarchical decomposition tree.
Supervised Overlay Network [Figure: decomposition of the space into subcubes labeled 000, 001, 01, 10, 11]
Supervised Overlay Network Fact: Volumes of subcubes assigned to nodes differ by factor of at most 2. Subcubes pairwise disjoint. Union of subcubes gives U. Combine this with family F of functions.
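One way to obtain subcubes whose volumes differ by at most a factor of 2 is to split the cells of one level from left to right before touching the next level. That split order is an assumption made for this sketch (it reproduces the labels 000, 001, 01, 10, 11 for five peers); the function name `subcube_labels` is hypothetical. A label of length $\ell$ stands for a subcube of volume $2^{-\ell}$.

```python
def subcube_labels(n):
    """Labels of the n subcubes when cells are split left to right, level by level."""
    if n < 1:
        raise ValueError("need at least one peer")
    k = n.bit_length() - 1              # 2**k <= n < 2**(k+1)
    extra = n - 2 ** k                  # number of level-k cells that are already split
    labels = []
    for i in range(2 ** k):
        base = format(i, f"0{k}b") if k else ""
        if i < extra:
            labels += [base + "0", base + "1"]
        else:
            labels.append(base)
    return labels


print(subcube_labels(5))
print(subcube_labels(6))
```

Going from five to six peers splits the cell 01 into 010 and 011, which matches a join where the new peer takes one half of an existing peer's subcube.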
Join Operation [Figure: new peer w joins at peer v; the subcubes are then 000, 001, 010, 011, 10, 11]
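Join Operation [Figure: regions R(v) and R(w) among the cells 000, 001, 10, 11 and their images under f and f'] $\{u,v\}$ is an edge $\Leftrightarrow$ $[F(R(u)) \cap R(v)] \cup [F(R(v)) \cap R(u)] \neq \emptyset$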
Join Operation w inherits connections from v. [Figure: decomposition before and after the join, with cells 000, 001, 010, 011, 10, 11]
Leave Operation v inherits connections from w. [Figure: decomposition before and after the leave, with cells 000, 001 merged into 00 next to 01, 10, 11]
Supervised Overlay Network For any supervised network based on the continuous-discrete approach with $[0,1)^d$: It is sufficient if the supervisor introduces a new peer to its cycle neighbors. From these, the new peer can get all F-connections. Join/leave can be performed with constant time and work for the supervisor. High robustness: sufficient to secure the base cycle!
Peer-to-Peer Overlay Network We focus on $U=[0,1)$. Every peer is mapped to a random point in $[0,1)$. v owns region $[v,\mathrm{succ}(v))$.
Join Operation New peer chooses a random position x, routes to the peer v owning that position, and inherits all relevant edges w.r.t. F from v.
Leave Operation Node that wants to leave transfers its connections to its predecessor.
Peer-to-Peer Overlay Network Scalability: with hypercube / de Bruijn, the network has logarithmic diameter, peers have (poly-)logarithmic degree, and join/leave need (poly-)logarithmic time/work (w.h.p.). Robustness: make sure the base ring is robust!
Maintaining a robust cycle Problem: a cycle is a very fragile structure!
Maintaining a robust cycle Solution: connect to $\Theta(\log n)$ nearest neighbors. Chernoff bounds: nodes are still connected under a constant fraction of random failures (with high probability). Nodes randomly distributed on cycle: a constant fraction of correlated failures reduces to the random failure case. [Figure: cycle with edges to the nearest neighbors]
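The effect of connecting to several nearest neighbors can be illustrated with a simple check. The sketch assumes nodes 0..n-1 sit on the cycle in this order and each node links to its k nearest successors; the function name `successor_cycle_survives` is hypothetical. It tests a sufficient condition for the survivors to stay connected: every surviving node still reaches another survivor among its k successors, so the survivors again form a cycle.

```python
def successor_cycle_survives(n, failed, k):
    """True if every surviving node has a surviving node among its k successors."""
    alive = [i not in failed for i in range(n)]
    if sum(alive) <= 1:
        return True                     # nothing left that could be disconnected
    for v in range(n):
        if alive[v] and not any(alive[(v + j) % n] for j in range(1, k + 1)):
            return False                # a run of k failed nodes follows v
    return True


print(successor_cycle_survives(8, {1, 2}, k=2))
print(successor_cycle_survives(8, {1, 3}, k=2))
```

With k = 2, two adjacent failures break the chain while two isolated failures do not; a larger k tolerates longer runs of failed nodes.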
Maintaining a robust cycle Problem: what if adversarial peers are part of the system? The system cannot distinguish between honest peers and adversarial peers!
Supervised cycle Nodes connect to $\Theta(\log n)$ nearest neighbors: hard for adversarial peers to isolate honest peers. [Figure: cycle with nodes v and w]
Peer-to-peer cycle Chord: uses a cryptographic hash function to map peers to points in $[0,1)$. This randomly distributes honest peers, but does not randomly distribute adversarial peers.
Peer-to-peer cycle CAN: map peers to random points in [0,1)
Peer-to-peer cycle Group spreading: map peers to random points in $[0,1)$ and limit the lifetime of peers. Too expensive!
Peer-to-peer cycle How can the system enforce an even distribution of honest and adversarial peers in the $[0,1)$ space?
Peer-to-peer cycle $n$ honest peers, $\epsilon n$ adversarial peers. Partition the $[0,1)$ space into regions of size $(c \log n)/n$ for some constant $c$. For any region $I \subseteq [0,1)$ of size $(c \log n)/n$: Balancing condition (scalability): $\Theta(\log n)$ peers in $I$. Majority condition (robustness): honest peers are in the majority.
How to satisfy conditions? Rule that works: k-cuckoo rule, for $\epsilon < 1-1/k$. [Figure: $n$ honest and $\epsilon n$ adversarial peers on $[0,1)$; a join evicts the peers of a $k/n$-region]
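A join under the k-cuckoo rule can be sketched as below. The slide only names the rule and shows the eviction of a $k/n$-region, so the details here are assumptions following the usual description of the rule: the new peer is placed at a uniformly random point, regions are aligned to multiples of $k/n$, and every evicted peer moves to a fresh uniformly random point. The function name `cuckoo_join` and the `rng` parameter are hypothetical.

```python
import random


def cuckoo_join(positions, new_peer, n, k, rng=random):
    """Insert new_peer into positions (dict peer -> point in [0,1)); return evicted peers."""
    x = rng.random()                    # random point for the joining peer
    width = k / n
    region = int(x / width)             # index of the k/n-region that contains x
    lo, hi = region * width, (region + 1) * width
    evicted = [p for p, pos in positions.items() if lo <= pos < hi]
    for p in evicted:
        positions[p] = rng.random()     # evicted peers are relocated at random
    positions[new_peer] = x
    return evicted


peers = {"a": 0.05, "b": 0.12, "c": 0.6}
moved = cuckoo_join(peers, "new", n=10, k=2, rng=random.Random(1))
print(moved, peers)
```

Because a rejoining adversarial peer cannot choose its point and displaces whoever sits in the region it lands in, it cannot accumulate adversarial peers in one region by repeated rejoins.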
Limitation of k-cuckoo rule Works only against sequences of join and leave requests issued by adversarial peers. It does not work for arbitrary sequences of join and leave requests. Example: adversary orders all peers in a region of size $O(\log n / n)$ to leave. Solution: also rearrange peers on leave operations.
k-Flip&Cuckoo Rule Join: as before (k-cuckoo rule). Leave: choose a random $k/n$-region among the neighboring $(c \log n)$ $k/n$-regions, empty it and flip it with a random $k/n$-region. [Figure: join and flip steps with $n$ honest and $\epsilon n$ adversarial peers]
Random Number Generation Critical component: robust distributed random number generator. Solution: very simple (no error-correcting codes), works for public channels even if a constant fraction is adversarial. Trick: generate groups of random numbers.
Maintaining a robust cycle So far, only proactive techniques (i.e., techniques that protect cycle) Proactive techniques expensive and have their limits (minority of adv. peers) Also reactive techniques needed (i.e., techniques that can recover cycle)
Recovering the cycle First approach: recover sorted list. [Figure: nodes with IDs such as 20, 5, 8, 12, 2 in unsorted order]
Recovering a sorted list Naïve approach: continuously collect info about neighbors of neighbors until all nodes are known, then transform the neighborhood into a sorted list. Not easy to check! Not scalable! [Figure: initial graph]
Recovering a sorted list Better approach: linearization. Every node does the following locally: [Figure: local linearization step on nodes 3, 5, 8, 12, 14, 16; concurrent steps cause a coordination problem]
Recovering a sorted list Naïve solution of coordination problems: Suppose that time is synchronized. In each round (2 time steps) each node v performs a right linearization step and then a left linearization step. [Figure: right and left linearization at node v]
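The synchronized right/left rounds can be simulated in a few lines. The slides show the local rule only as a figure, so this sketch assumes the following rule: on each side a node keeps its nearest neighbor and hands every farther neighbor over to the next nearer one, so that the neighbors on that side form a chain. The function names `linearize_step` and `linearize` are hypothetical.

```python
def linearize_step(edges, right):
    """One synchronous step on a set of edges (a, b) with a < b."""
    owned = {}
    for a, b in edges:
        # In a right step the smaller endpoint acts on the edge, in a left step the larger.
        owner, other = (a, b) if right else (b, a)
        owned.setdefault(owner, []).append(other)
    result = set()
    for v, others in owned.items():
        others.sort(reverse=not right)      # nearest neighbor on that side first
        chain = [v] + others
        for x, y in zip(chain, chain[1:]):  # v keeps the nearest one, the rest are chained
            result.add((min(x, y), max(x, y)))
    return result


def linearize(edges, max_rounds=10_000):
    """Alternate right and left steps until the edge set stops changing."""
    edges = {(min(a, b), max(a, b)) for a, b in edges if a != b}
    for _ in range(max_rounds):
        nxt = linearize_step(linearize_step(edges, right=True), right=False)
        if nxt == edges:
            return edges
        edges = nxt
    raise RuntimeError("no fixed point reached within max_rounds")


start = [(3, 16), (3, 12), (5, 16), (8, 14), (5, 8)]
print(sorted(linearize(start)))
```

Each replacement swaps an edge for a strictly shorter one and never disconnects the graph, so on a connected input the process ends in the sorted list.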
Recovering a sorted list Correctness of right/left linearization: Consider an arbitrary consecutive pair v,w and the range of a path from v to w. The range reduces by 1 in each round. The degree increases by 2 in each round. [Figure: range of path from v to w]
Recovering a sorted list More realistic approach: take asynchronous behavior into account. Peers operate in actions of the form <label>: <guard> $\to$ <commands>. v.NB: neighbor list of v. We assume: $w \in v.NB \Leftrightarrow v \in w.NB$ (edges act like shared variables; an edge $\{v,w\}$ is either absent or present, 0/1). No edges $\{v,v\}$.
Recovering a sorted list u.L, u.R: left / right neighborhood of u. Actions for node u (safe if executed sequentially in each node): grow right: $(v \in u.R) \wedge (w \in v.L) \wedge (w \notin u.NB) \to u.NB := u.NB \cup \{w\}$. trim right: $(v,w \in u.R) \wedge (w \in v.L) \to u.NB := u.NB \setminus \{v\}$ (wait until $w \in u.NB$ and $u \in w.NB$; preferred operation, to keep the degree low). grow left and trim left are similar.
Recovering a sorted cycle Establish wrap-around edge: v.wa: wrap-around edge of v. We assume: $v.wa = w \Leftrightarrow w.wa = v$. v sets v.wa to w: $v.NB := v.NB \cup \{v.wa\}$, $v.wa := w$. Problem: more cases for the initial state!
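Recovering a sorted cycle Additional actions for node u: wrap: $(u.L=\emptyset) \wedge (u.wa=\bot) \wedge (w \in u.R) \to u.wa := w$. extend: $(u.L=\emptyset) \wedge (u.wa \neq \bot) \wedge (w \in u.wa.R) \to u.wa := w$. unwrap: $(u.L \neq \emptyset) \wedge (u.wa \neq \bot) \wedge (u.wa>u) \to u.wa := \bot$. [Figure: the three actions on nodes u, v, w]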
Skip Graphs Problem: messages between local peers may be sent across world
Skip Graphs Better: give nodes hierarchically specified names (e.g., europe.germany.bavaria.munich.tum) and sort nodes according to names in the name space. Problem: high imbalance, so the continuous-discrete approach does not work!
Skip Graphs Each node v has an arbitrary unique name ID(v) and a random bit string s(v). $\mathrm{prefix}_i(s(v))$: first $i$ bits of $s(v)$. Skip graph rule: For every node $v$ and $i \in \mathbb{N}_0$: $v$ connects to the closest successor and predecessor $w$ (w.r.t. ID(v)) with $\mathrm{prefix}_i(s(w)) = \mathrm{prefix}_i(s(v))$
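The skip graph rule can be turned into a small neighbor computation. This is a sketch under two assumptions: bit strings are given as finite strings of '0' and '1', and nodes are kept in a sorted list without wrap-around, so the first and last node of a level have only one neighbor there. The function name `skip_graph_neighbors` is hypothetical.

```python
def skip_graph_neighbors(peers):
    """peers: dict ID -> bit string s(v). Returns dict ID -> set of neighbor IDs."""
    ids = sorted(peers)
    depth = max(len(s) for s in peers.values())
    nbrs = {v: set() for v in ids}
    for i in range(depth + 1):
        groups = {}
        for v in ids:                       # ids are sorted, so every group is sorted too
            groups.setdefault(peers[v][:i], []).append(v)
        for members in groups.values():
            # Consecutive members of a group are each other's closest successor/predecessor.
            for a, b in zip(members, members[1:]):
                nbrs[a].add(b)
                nbrs[b].add(a)
    return nbrs


demo = {"a": "00", "b": "10", "c": "01", "d": "11"}
for v, ns in skip_graph_neighbors(demo).items():
    print(v, sorted(ns))
```

Level 0 is the sorted list of all nodes; each further level splits every list by the next bit of $s(v)$, which is where the shortcut edges come from.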
Skip Graphs Nodes v with s(v)=0… Nodes v with s(v)=1…
Skip Graphs Hierarchical view: [Figure: lists per level, grouped by the prefixes 0, 1, 00, 01, 10, 11, 000, 001, ...] $\Theta(\log n)$ degree, $\Theta(\log n)$ diameter, $\Theta(1)$ expansion w.h.p.
Routing in Skip Graphs Asia Europe O(log n) hops w.h.p. Australia America Africa
The Hyperring Is randomization in skip graphs necessary? Hyperring: deterministic form of skip graph Approach similar to skip graphs: organize nodes in cycle according to real names. Cherry Banana Apple
Shortcuts: Intertwined Rings bridge
Join and Leave Inserting a node: bottom up
Join and Leave Deleting a node: bottom up
k-separated Hyperring In every level, bridges are k nodes apart. How large does k have to be to guarantee polylogarithmic expansion $\alpha$? Theorem: $\alpha = (1/n)^{\Omega(1/k)}$. So k has to be non-constant ($\Omega(\log n)$). Do areas with old insertions/deletions have to be revisited?
k-separated Hyperring Rule: choose $k=6(d+3)$, where d is the current degree of the node initiating the operation. Theorem: degree $O(\log n)$, expansion $\Omega(1/\log n)$, congestion for permutations $O(\log n)$ w.h.p., work for Join/Leave $O(\log n)$.
Locality-aware Overlay Networks Problem: in general, a distance metric cannot be embedded well into 1-dimensional space, so the applicability of skip graphs is limited. Use a different construction based on Plaxton, Rajaraman and Richa.
Locality-aware Overlay Networks For a node $v$ let $s(v)$ be its random bit string and $B_i(v)$ be the ball around $v$ of minimum radius so that $B_i(v)$ contains $c\,2^i \log n$ peers. [Figure: nested balls $B_1(v)$, $B_2(v)$, $B_3(v)$]
Locality-aware Overlay Networks Assumption: growth-bounded metric. $N(v,r)$: set of nodes $w$ with $d(v,w) < r$. There is a constant $\alpha>0$ so that $|N(v,(1+\alpha)r)| < 2|N(v,r)|$ for all $v$, $r$. [Figure: nested balls $B_1(v)$, $B_2(v)$, $B_3(v)$]
Locality-aware Overlay Networks Topology: for every node $v$ and $i \in \mathbb{N}$: $v$ connects to all nodes $w \in B_i(v)$ with $\mathrm{prefix}_{i-1}(s(v)) = \mathrm{prefix}_{i-1}(s(w))$. [Figure: nested balls $B_1(v)$, $B_2(v)$, $B_3(v)$ with $c\,2^i \log n$ peers in $B_i(v)$]
Locality-aware Overlay Networks Topology rule implies: degree of each node is $\Theta(\log^2 n)$ w.h.p. $v$ has nodes $w$ in $B_i(v)$ with $\mathrm{prefix}_i(s(w)) = \mathrm{prefix}_{i-1}(s(v)) \circ x$ for all $x \in \{0,1\}$ w.h.p. [Figure: nested balls $B_1(v)$, $B_2(v)$, $B_3(v)$ with $c\,2^i \log n$ peers in $B_i(v)$]
Locality-aware Routing Routing from $v$ to $w$: $s(v)=(x_1,x_2,x_3,\ldots)$, $s(w)=(y_1,y_2,y_3,\ldots)$. $v \to$ closest $u_1$ in $B_1(v)$ with $\mathrm{prefix}_1(s(u_1)) = y_1$. $u_1 \to$ closest $u_2$ in $B_2(u_1)$ with $\mathrm{prefix}_2(s(u_2)) = y_1 y_2$. Continue until we reach $u_{k-1}$ with $w$ in $B_k(u_{k-1})$.
Locality-aware Routing =1 B1(v) B2(v) u2 u1 v w B3(u1) B2(u1) B3(u2)
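Locality-aware Routing Let $r(B)$ be the radius of ball $B$. $d(u_1,v) < r(B_1(v))/\gamma$ w.h.p. ($\gamma = \Theta(\log_{1+\alpha} c)$). $r(B_2(u_1)) > (1+\alpha-1/\gamma)\, r(B_1(v))$. $d(u_2,u_1) < r(B_2(u_1))/\gamma$ w.h.p. $r(B_3(u_2)) > (1+\alpha-1/\gamma)\, r(B_2(u_1))$. … After $k$ hops ($r=r(B_1(v))$): $d(u_k,w) < d(v,w) + \sum_{i=0}^{k-1} (1+\alpha-1/\gamma)^i\, r/\gamma < d(v,w) + (\alpha\gamma-1)^{-1}\, r\, (1+\alpha-1/\gamma)^k$ and $r(B_{k+1}(u_k)) > (1+\alpha-1/\gamma)^k\, r$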
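Locality-aware Routing After $k$ hops ($r=r(B_1(v))$): $d(u_k,v) < \sum_{i=0}^{k-1} (1+\alpha-1/\gamma)^i\, r/\gamma < (\alpha\gamma-1)^{-1}\, r\, (1+\alpha-1/\gamma)^k$ and $r(B_{k+1}(u_k)) > (1+\alpha-1/\gamma)^k\, r$. Finally, $w \in B_{k+1}(u_k)$: $d(v,w) > r(B_k(u_{k-1})) - d(u_{k-1},v) > (1-1/(\alpha\gamma-1))\,(1+\alpha-1/\gamma)^{k-1}\, r$. $d(u_k,v) < d^* = (\alpha\gamma-1)^{-1}\, r\, (1+\alpha-1/\gamma)^k$ and total path length $< 2d^* + d(v,w)$. $d^* < (\epsilon/2)\, d(v,w)$ if $\alpha\gamma > 2(1+\alpha)/\epsilon + 2$. [Figure: $v$, $w$ and $u_k$]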
Networks for non-uniform peers Problem: peers have non-uniform bandwidth Cont-disc and skip graphs do not work!
Networks for non-uniform peers Ad-hoc solutions: cut large peers into many small peers multi-tier network Better approach: organize peers in a heap How to design scalable distributed heap?
Networks for non-uniform peers PAGODA heap network. dB(d): leveled de Bruijn graph of dimension d. [Figure: stacked blocks dB(1), dB(2) (3 levels), dB(3) (4 levels), dB(4) (5 levels), ..., with nodes v and w] Routing between v and w via nodes of two dB-levels up.
Join PAGODA heap network ($\sim \log^2 n$ levels; dB(d): leveled de Bruijn graph of dimension d). Move upwards until all parents have larger bandwidth.
Leave PAGODA heap network ($\sim \log^2 n$ levels; dB(d): leveled de Bruijn graph of dimension d). Set bandwidth to 0, send downwards until no further children, remove node.
Networks for non-uniform peers PAGODA heap network ($\sim \log^2 n$ levels; dB(d): leveled de Bruijn graph of dimension d). Problem: updating PAGODA may need $O(\log^2 n)$ time.
Networks for non-uniform peers SHELL network: oblivious heap Join operation: O(log n) time Leave operation: O(1) time
Conclusions Many interesting fronts to work on in context of scalable distributed systems: self-optimizing networks social networks proactive approaches reactive approaches (repairs under adversarial presence) new paradigms
Questions?