Distributed Slicing in Dynamic Systems A. Fernández, V. Gramoli, E. Jiménez, A-M. Kermarrec, M. Raynal
ICDCS 2007 June Fernandez, Gramoli, Jimenez, Kermarrec, Raynal

Context
Distributed system: an interconnected set of nodes.
Heterogeneous environment: resources are unevenly spread across nodes (bandwidth, processing power, storage space, uptime, …).
Large-scale dynamic environment: nodes leave and join at any time, the number of nodes is large, and no global information is available.
Problem
What does rich or poor mean in this context? A node holding the value 20 asks: am I rich or poor? It may be either, depending on the values held by the other nodes.
Applications
Remedy the Gnutella problem in peer-to-peer systems: Gnutella performance was limited by poor nodes, whereas Kazaa and Skype solicit rich nodes rather than poor ones.
Allocating resources/nodes to services: a streaming service needs the nodes with the highest bandwidth, a non-critical service can run on unstable nodes, a file-sharing service requires nodes with many files, …
Objectives
Classify nodes into categories, called slices, based on individual characteristics, called attributes. A slice corresponds to a portion of the system.
Typically, this answers the question: HOW RICH AM I COMPARED TO OTHERS?
Classifying the system nodes
Nodes are classified using their attribute values (a single attribute is assumed for simplicity).
Each attribute value a_i corresponds to a normalized index p_i between 0 and 1, and the interval from 0 to 1 is partitioned into slices s_i (#1 to #4 in the slide's figure).
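As a point of reference, the classification can be sketched with full knowledge of all attribute values, which is exactly what the distributed algorithms cannot assume. This is a minimal centralized sketch, not the authors' code: the function name `classify`, the choice of rank/n as normalized index, and the convention that slice 1 holds the lowest values are all assumptions made for illustration.

```python
def classify(attrs, num_slices):
    """Centralized reference: return (normalized indices, slice numbers).

    Assumptions (not from the slides): the normalized index of a node is its
    1-based rank divided by n, and slice k covers indices in ((k-1)/S, k/S].
    """
    n = len(attrs)
    # Node identifiers sorted by increasing attribute value.
    order = sorted(range(n), key=lambda i: attrs[i])
    indices = [0.0] * n
    slices = [0] * n
    for rank, i in enumerate(order, start=1):
        indices[i] = rank / n
        # Integer form of ceil(rank * S / n), avoiding floating-point rounding.
        slices[i] = (rank * num_slices + n - 1) // n
    return indices, slices


attrs = [20, 5, 70, 40, 90, 10, 55, 33]
indices, slices = classify(attrs, num_slices=4)
print(indices[0], slices)
```

With these eight values and four slices, the node holding 20 has the third-lowest value, so it lands in the second slice under the assumed convention.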
Model
Dynamic system: a system of n nodes in which nodes join and leave at any time, and may also crash.
Each node i knows:
  its attribute value a_i,
  its position estimate p_i',
  the slices (e.g. 10 equally sized slices, each containing 10% of the nodes),
  a communication view V_i: a constant number of neighbors j, with their position estimates p_j', their attributes a_j, and their ages.
Underlying Gossip-based Overlay
Each node i periodically exchanges information (views, values, …) with its neighbors, i.e. the nodes of its view, then computes its new view and its new state.
Dynamic overlay: failed nodes are naturally removed from views, and joining nodes are naturally added to views.
Scalable overlay: a limited amount of information is stored and exchanged.
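The slides do not say which overlay protocol is used, so the following is only a hypothetical illustration of how a bounded view can be refreshed after one exchange. It assumes an age-based rule (keep the freshest entries); the function `merge_views` and the dictionary layout are invented for this sketch.

```python
def merge_views(own, received, self_id, capacity):
    """Merge two views (dict: node id -> age) and keep the freshest entries.

    Hypothetical age-based rule: for a node present in both views keep the
    smaller age, never store ourselves, and truncate to `capacity` entries.
    """
    merged = dict(own)
    for node_id, age in received.items():
        if node_id == self_id:
            continue
        if node_id not in merged or age < merged[node_id]:
            merged[node_id] = age
    # Sort by (age, id) so that ties are broken deterministically.
    freshest = sorted(merged.items(), key=lambda kv: (kv[1], kv[0]))[:capacity]
    return dict(freshest)


view = merge_views({1: 3, 2: 0, 3: 5}, {2: 1, 4: 0, 9: 2}, self_id=9, capacity=3)
print(view)
```

Under such a rule, entries for failed nodes stop being refreshed, grow old, and are eventually pushed out, while the view size stays constant.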
Jelasity, Kermarrec 2006 (JK)
Each node i sets p_i' to a uniform random value.
Each node i looks for a misplaced neighbor j. A neighbor j is misplaced with respect to i iff (a_i - a_j)(p_i' - p_j') < 0.
Nodes i and j then exchange p_i' and p_j'.
Eventually, the sequence of position estimates matches the sequence of attribute values: for any pair of nodes i and j, a_i < a_j implies p_i' < p_j'.
Convergence to this ordering is exponentially fast (in the number of executions).
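The misplacement test and the swap can be sketched in a small simulation. This is an illustrative sketch under stated assumptions, not the JK implementation: views are drawn uniformly at random each round (standing in for the gossip overlay), a misplaced neighbor is picked at random, and the names `misplaced`, `jk_round` and `is_ordered` are invented here.

```python
import random


def misplaced(a_i, p_i, a_j, p_j):
    """True iff attribute order and position order disagree for i and j."""
    return (a_i - a_j) * (p_i - p_j) < 0


def jk_round(attrs, pos, view_size, rng):
    """One round: every node swaps positions with one misplaced neighbor."""
    n = len(attrs)
    for i in range(n):
        # Assumption: a fresh uniform random view replaces the real overlay.
        view = rng.sample([k for k in range(n) if k != i], view_size)
        candidates = [j for j in view
                      if misplaced(attrs[i], pos[i], attrs[j], pos[j])]
        if candidates:
            j = rng.choice(candidates)
            pos[i], pos[j] = pos[j], pos[i]


def is_ordered(attrs, pos):
    """True iff a_i < a_j implies p_i' < p_j' for all pairs."""
    by_attr = sorted(range(len(attrs)), key=lambda k: attrs[k])
    return all(pos[a] < pos[b] for a, b in zip(by_attr, by_attr[1:]))


rng = random.Random(7)
attrs = rng.sample(range(1000), 30)       # distinct attribute values
pos = [rng.random() for _ in attrs]       # initial random positions
rounds = 0
while not is_ordered(attrs, pos) and rounds < 2000:
    jk_round(attrs, pos, view_size=10, rng=rng)
    rounds += 1
print(is_ordered(attrs, pos))
```

Each swap exchanges the positions of a pair that is out of order, which strictly reduces the number of inverted pairs, so the multiset of random values never changes while the ordering improves.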
Algorithm 1: Ordering
Similar to JK, but uses a Local Disorder Measure (LDM).
Let lp_j' (resp. lp_j) be the normalized index of j's random value (resp. attribute value) among all nodes j of V_i ∪ {i}.
$\mathrm{LDM}(i) = \sum_{j \in V_i \cup \{i\}} (lp_j' - lp_j)^2$
Protocol:
Do periodically {
  Update view V_i using an underlying protocol.
  Choose the neighbor j that minimizes LDM(i).
  Exchange random values p_i' and p_j'.
  Update the random value p_i' with p_j' if necessary.
  Update slice assignment s_i' := s such that p_i' in s.
}
Difference with JK: the neighbor to exchange with is selected using the LDM.
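The LDM and the neighbor choice can be made concrete as follows. This is a sketch under one reading of the slide, namely that node i picks the neighbor whose swap would leave the lowest LDM over V_i ∪ {i}; the helper names are invented, and the lists passed in are assumed to describe V_i ∪ {i} with node i at index `me`.

```python
def local_ranks(values):
    """Normalized local index of each entry: 1-based rank divided by m."""
    m = len(values)
    order = sorted(range(m), key=lambda k: values[k])
    ranks = [0.0] * m
    for rank, k in enumerate(order, start=1):
        ranks[k] = rank / m
    return ranks


def ldm(attrs, pos):
    """Sum of squared gaps between random-value rank and attribute rank."""
    lp_random = local_ranks(pos)
    lp_attr = local_ranks(attrs)
    return sum((r - a) ** 2 for r, a in zip(lp_random, lp_attr))


def best_swap_partner(attrs, pos, me=0):
    """Neighbor whose swap with `me` gives the lowest LDM, or None.

    Assumption: a swap is only considered if it strictly lowers the LDM.
    """
    best, best_value = None, ldm(attrs, pos)
    for j in range(len(attrs)):
        if j == me:
            continue
        trial = list(pos)
        trial[me], trial[j] = trial[j], trial[me]
        value = ldm(attrs, trial)
        if value < best_value:
            best, best_value = j, value
    return best


attrs = [10, 20, 30, 40]        # node `me` is index 0, the rest is its view
pos = [0.9, 0.2, 0.5, 0.1]
print(ldm(attrs, pos), best_swap_partner(attrs, pos))
```

In this example the lowest attribute value holds the highest random value, and swapping with the node at index 3 removes all local disorder.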
Algorithm 1: Ordering (animated example; only position labels over 11, such as 9/11, 7/11, 2/11 and 4/11, survive from the figure)
Result: slight convergence speed-up
Parameters: n = 10^4, #slices = 10, |V| = 20.
Problem: Wrong Slice Assignment
If the random values are not perfectly uniformly distributed, some nodes might never find their slice (e.g. the 3 nodes of S2 in the slide's example).
(Figure: normalized indices from 0 to 1, partitioned into slices #1 to #4.)
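The effect can be reproduced with a few lines: even when the random values are perfectly ordered, the node of rank k holds the k-th smallest random value, which may fall outside the interval of its true slice. The sketch below is illustrative only; `misassigned` is an invented helper, slices are assumed to be equal-width intervals of [0, 1], and boundary cases are treated loosely.

```python
import random


def misassigned(values, num_slices):
    """Count nodes whose held random value lies outside their true slice.

    `values` are the random values in the system. After perfect ordering,
    the node of rank k (by attribute) holds the k-th smallest value.
    """
    n = len(values)
    wrong = 0
    for rank, value in enumerate(sorted(values), start=1):
        true_slice = (rank * num_slices + n - 1) // n      # ceil(rank*S/n)
        held_slice = min(int(value * num_slices), num_slices - 1) + 1
        if true_slice != held_slice:
            wrong += 1
    return wrong


print(misassigned([0.1, 0.4, 0.6, 0.9], 2))   # evenly spread values
print(misassigned([0.1, 0.2, 0.3, 0.9], 2))   # three values in the lower half

rng = random.Random(1)
sample = [rng.random() for _ in range(1000)]
print(misassigned(sample, 10))
```

The second call reports one node stuck in the wrong slice: three values fall in the lower half although only two nodes belong there, and no amount of swapping can change which values exist.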
Algorithm 2: Ranking
No random values!
Protocol:
Do periodically {
  Update/shuffle view V_i using an underlying protocol.
  l += number of neighbors with a lower attribute value.
  g += number of neighbors.
  Send a_i to a randomly chosen neighbor.
  Send a_i to the neighbor that is the closest to a slice boundary.
  Update slice assignment s_i' = s such that l/g in s.
}
Upon reception {
  Receive a_j from j.
  if (a_j < a_i) l += 1;
  g += 1;
  Update slice assignment s_i' = s such that l/g in s.
}
Note: same number of messages.
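A small simulation of the two handlers may help. It is a sketch under assumptions not stated on the slide: views are redrawn uniformly at random each round, slices are equal-width intervals of [0, 1], the estimate is 0 before any observation, and the class and function names (`RankingNode`, `run`) are invented for illustration.

```python
import random


class RankingNode:
    def __init__(self, attr, num_slices):
        self.attr = attr
        self.num_slices = num_slices
        self.lower = 0   # l: observed values lower than ours
        self.seen = 0    # g: all observed values

    def observe(self, other_attr):
        if other_attr < self.attr:
            self.lower += 1
        self.seen += 1

    def estimate(self):
        """Position estimate l/g (0.0 before any observation)."""
        return self.lower / self.seen if self.seen else 0.0

    def slice_index(self):
        """1-based slice whose interval contains l/g."""
        return min(int(self.estimate() * self.num_slices),
                   self.num_slices - 1) + 1

    def boundary_distance(self):
        """Distance from l/g to the closest internal slice boundary."""
        s = self.num_slices
        return min((abs(self.estimate() - k / s) for k in range(1, s)),
                   default=1.0)


def run(attrs, num_slices, view_size, rounds, rng):
    nodes = [RankingNode(a, num_slices) for a in attrs]
    n = len(nodes)
    for _ in range(rounds):
        for i, node in enumerate(nodes):
            # Assumption: a fresh uniform random view stands in for shuffling.
            view = rng.sample([k for k in range(n) if k != i], view_size)
            for j in view:
                node.observe(nodes[j].attr)
            # One message to a random neighbor, one to the neighbor whose
            # estimate is closest to a slice boundary.
            closest = min(view, key=lambda j: nodes[j].boundary_distance())
            for j in (rng.choice(view), closest):
                nodes[j].observe(node.attr)
    return nodes


nodes = run(list(range(40)), num_slices=4, view_size=5, rounds=30,
            rng=random.Random(1))
print(nodes[0].slice_index(), nodes[39].slice_index())
```

Because every new observation refines l/g, the estimate does not depend on an initial random draw, which is the point of the "No random values!" remark.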
Algorithm 2: Ranking (animated example; only fragments of fraction labels, /11, /2, /4 and /3, survive from the figure)
Result 1: Unlimited convergence
Ranking precision keeps improving.
Parameters: n = 10^4, #slices = 100, |V| = 20.
Result 2: Tolerating Dynamism
Churn is correlated with attribute values! For example, the attribute is the remaining battery lifetime or the available storage space.
Performance Analysis
Let d be the distance from p_i' to the closest slice boundary. For a confidence coefficient of 99.99%, the required number of attribute values drawn is
$m_i \ge z \, p_i' (1 - p_i') / d^2$, with $z < 16$ a constant.
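To get a feel for the bound, it can be evaluated numerically. The sketch below plugs in z = 16 as a conservative stand-in, since the slide only states z < 16; the function name `required_samples` is an assumption.

```python
import math


def required_samples(p, d, z=16.0):
    """Smallest integer m with m >= z * p * (1 - p) / d**2.

    p: position estimate p_i', d: distance to the closest slice boundary.
    z = 16 is an assumed upper bound on the constant from the slide.
    """
    if d <= 0:
        raise ValueError("d must be positive")
    return math.ceil(z * p * (1 - p) / d ** 2)


print(required_samples(0.5, 0.125))
print(required_samples(0.25, 0.25))
```

The quadratic dependence on 1/d shows why nodes close to a slice boundary need many more samples than nodes in the middle of a slice.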
Conclusion
Churn-tolerant algorithm: gossip-based mechanisms, re-approximation of slice membership.
Scalable algorithm: limited number of neighbors, the size of the system is unknown.
Applications of distributed slicing: resource allocation, supernode/ultrapeer election.
Future work: can we obtain the convergence speed of the first algorithm with the accuracy of the second algorithm?
References
M. Jelasity and A.-M. Kermarrec. Ordered Slicing of Very Large-Scale Overlay Networks. In Proc. of the 6th IEEE Conference on P2P Computing.
R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, 1995.
M. Blum, R. Floyd, V. Pratt, R. Rivest, and R. Tarjan. Time Bounds for Selection. Journal of Computer and System Sciences 7, 1972.
Result 3: Feasibility