Updates in Highly Unreliable, Replicated Peer-to-Peer Systems Anwitaman Datta, Manfred Hauswirth, Karl Aberer (EPFL) Presented by Zhiyuan “Troy” Zhan 1/18/2019

Outline
- Motivation
- System Model & Algorithm
- Analytical Model & Analysis
- Related Work
- Conclusion

Motivation
- Peer-to-peer systems are not just about file sharing.
- Data items can be added, deleted, and updated frequently:
  - Peer commerce
  - Shared calendars/address books
  - Trust management
  - Medical information sharing
- Replication is used to improve fault tolerance and response time.

Motivation – cont’d
- The target problem is how to disseminate updates to other peers:
  - Consistency guarantees
  - Scalability, underlying system assumptions, resource consumption
- Challenges:
  - Huge number of peers
  - Peers can go online/offline at any time
  - Global knowledge is often lacking

Motivation – cont’d
Contributions:
- Addresses the update dissemination problem under low online probabilities of peers (<30%) and without global knowledge.
- Presents a “fully decentralized, efficient and robust communication scheme” based on rumor spreading.
- Provides a generic analytical model of the combined push/pull technique.

Motivation - Problem Statement
Assumptions:
- Only a low percentage of peers is online, so it is impossible to achieve any kind of quorum.
- Transactional consistency is not required; eventual consistency is desirable in most applications.
- Update conflicts are very rare; the paper does not handle them.
- Probabilistic guarantees of successful search are sufficient.
- The total number of replicas is substantially lower than the total number of peers (e.g. 1,000 vs. 1,000,000).
- Consecutive updates can be distributed sparsely over time.
- Communication overhead is the main performance measure.

System Model
- A peer-to-peer overlay network.
- Each peer has only its own local knowledge, e.g. routing table and replica list.
- Peers can go offline at any time.
- A communication channel can be established between any two online peers; if it cannot be established, each peer assumes the other is offline.

Algorithm – Push Phase
Executed when disseminating updates.
At replica “p”, upon receiving message Push(U,V,Rf,t):
  IF Push(U,V,Rf,t) not processed THEN
    Select a random subset Rp of replicas with |Rp|=R*fr;
    With probability PF(t), send Push(U, V, Rf+Rp+{p}, t+1) to Rp-Rf;
    Set Push(U,V,Rf,t) as processed;
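
Algorithm – Push Phase, cont’d
- U: the actual update (data item).
- V: version vector. Contains global version identifiers (GUIDs, which can be computed locally); altered data items are treated as distinct, coexisting versions.
- R, Rf, Rp: sets of replicas (R: all replicas of the item; Rf: the list carried in the push message; Rp: the random subset chosen at p).
- PF(t): a function of t, the probability of forwarding the push in round t.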
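
The push step above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the function name `handle_push`, the `processed` set keyed by version identifier, the `known_replicas` set, and the injectable `rng` are all assumptions made for the example.

```python
import random


def handle_push(p, update, version, r_f, t, known_replicas, f_r, pf,
                processed, rng=random):
    """Handle Push(U, V, Rf, t) at replica p.

    Returns the (target, message) pairs p would send, where each message
    is the tuple (U, V, Rf + Rp + {p}, t + 1).
    """
    # Assumption: a push counts as "processed" once its version id was seen.
    if version in processed:
        return []
    processed.add(version)

    # Select a random subset Rp of the known replicas with |Rp| = R * fr.
    candidates = sorted(known_replicas - {p})
    k = min(len(candidates), round(len(known_replicas) * f_r))
    r_p = set(rng.sample(candidates, k))

    # Forward only with probability PF(t).
    if rng.random() >= pf(t):
        return []

    # Extend the partial list and send to Rp - Rf only.
    new_r_f = frozenset(r_f) | r_p | {p}
    return [(q, (update, version, new_r_f, t + 1))
            for q in sorted(r_p - set(r_f))]
```

Replicas already named in Rf are skipped, which is what reduces duplicate messages in later rounds.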

Algorithm – Pull Phase
Executed when a peer recovers from a failure, reconnects, receives no updates for a while, or receives a pull message but is unsure whether it is itself in sync:
- Contact online replicas;
- Inquire about missed updates based on version vectors;
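
A minimal sketch of the pull step, under a simplifying assumption that is not from the slides: each replica's state is modeled as a dictionary from version identifier to update payload, and "inquiring for missed updates" is a comparison of those identifiers. The name `pull_missed_updates` is hypothetical.

```python
def pull_missed_updates(local_store, online_replicas):
    """Copy into local_store every update an online replica has and we lack.

    local_store and each entry of online_replicas map a version
    identifier to its update payload. Returns the newly fetched entries.
    """
    fetched = {}
    for remote_store in online_replicas:
        for version_id, update in remote_store.items():
            # Only versions unknown locally count as missed updates.
            if version_id not in local_store and version_id not in fetched:
                fetched[version_id] = update
    local_store.update(fetched)
    return fetched
```

Because altered items are treated as coexisting versions, the sketch never overwrites an existing entry; it only adds versions the peer has not seen.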

General Assumptions
- Assume an update U is initiated for R online replicas.
- In general, the online population in push round t is:
  Ron(t)=Ron(t-1)*x+[R-Ron(t-1)]*y, with x=1-p, y=q
  - p: probability of an online peer going offline in one push round;
  - q: probability of an offline peer coming online in one push round.
- p and q are typically small and may vary between rounds.
- ASSUME p is constant and omit peers coming online: Ron(t)=Ron(t-1)*x
- ASSUME fr is constant.
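
The online-population recurrence can be iterated directly. The sketch below is illustrative only; the function name and argument names are assumptions, and p and q are held constant across rounds.

```python
def online_population(r_total, r_on0, p, q, rounds):
    """Iterate Ron(t) = Ron(t-1)*x + [R - Ron(t-1)]*y with x = 1-p, y = q.

    Returns [Ron(0), Ron(1), ..., Ron(rounds)].
    """
    x, y = 1.0 - p, q
    r_on = [float(r_on0)]
    for _ in range(rounds):
        prev = r_on[-1]
        r_on.append(prev * x + (r_total - prev) * y)
    return r_on
```

Setting `q=0` gives the simplified form Ron(t)=Ron(t-1)*x used in the rest of the analysis.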

Analysis of Push Phase – Round 0
- Total number of messages: msg(0)=R*fr
- New replicas that receive the update: newreplicas(0)=Ron(0)*fr
- Online replicas that do not receive the update: Ron(0)*(1-fr)
- Message length: ML(0)=U+R*fr*B, where U is the size of the update and B is the size of the metadata required to describe one replica; only U and Rf are considered.

Analysis of Push Phase – Round 1
- Number of messages in round 1: msg(1)=Ron(0)*fr*x*PF(1) * R*fr*(1-fr)
- Number of replicas newly reached by the update in round 1: newreplicas(1)=Ron(0)*x*(1-fr) * [1-(1-fr)^(Ron(0)*fr*x*PF(1))]
- Message length: ML(1)=U+R*B*(fr+fr*(1-fr))=U+R*B*(1-(1-fr)^2)

Analysis of Push Phase – Round t>=2
Define fd_aware(t) and faware(t):
- fd_aware(t): increment in the fraction of online replicas that are aware of the update after round t
- faware(t): total fraction of online replicas that are aware of the update at the beginning of round t
- faware(t)=faware(t-1)+fd_aware(t-1)

Analysis of Push Phase – Round t>=2, cont’d
- In the paper: newreplicas(t)=Ron(t-1)*(1-faware(t-1))*x * [1-(1-fr)^(Ron(t-1)*fd_aware(t-1)*x*PF(t))]
- I think it should be: newreplicas(t)=Ron(t-1)*(1-faware(t))*x * [1-(1-fr)^(Ron(t-1)*fd_aware(t-1)*x*PF(t))]
- Given fd_aware(t)=(1-faware(t))*[1-(1-fr)^(Ron(t-1)*fd_aware(t-1)*x*PF(t))], we have:
  faware(t)=faware(t-1)+fd_aware(t-1)=1-(1-faware(t-1))*(1-fr)^(Ron(t-2)*fd_aware(t-2)*x*PF(t-1))=…
- faware(t) rapidly grows to 1.

Analysis of Push Phase – Round t>=2, cont’d
- If the partial list is ignored: msg(t)=Ron(t-1)*fd_aware(t-1)*x*PF(t) * R*fr
- If the partial list is considered: msg(t)=Ron(t-1)*fd_aware(t-1)*x*PF(t) * R*fr*(1-fr)^t   (1)
- ML(t)=U+R*B*(1-(1-fr)^(t+1))   (2)
- Both (1) and (2) are proved in the paper by induction on t.
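
The round-by-round recurrences can be evaluated numerically. The sketch below is an illustration under stated assumptions, not the paper's code: it takes faware(0)=0 and fd_aware(0)=fr (round 0 reaches a fraction fr of the online replicas), uses the (1-faware(t)) variant of fd_aware(t), keeps p constant, and ignores peers coming online. The function and argument names are hypothetical.

```python
def push_model(r_total, r_on0, f_r, p_off, pf, rounds):
    """Iterate the push-phase recurrences for rounds 1..rounds.

    Returns a list of (t, faware(t), msg(t)) tuples, with msg(t)
    computed as in equation (1), i.e. with the partial list considered.
    """
    x = 1.0 - p_off
    r_on = float(r_on0)   # Ron(t-1), starting at Ron(0)
    f_aware = 0.0         # faware(0): nobody is aware before round 0
    fd_aware = f_r        # fd_aware(0): round 0 reaches a fraction fr
    rows = []
    for t in range(1, rounds + 1):
        f_aware += fd_aware                     # faware(t)
        pushers = r_on * fd_aware * x * pf(t)   # replicas pushing in round t
        msg = pushers * r_total * f_r * (1.0 - f_r) ** t
        rows.append((t, f_aware, msg))
        # fd_aware(t): unaware fraction times probability of being hit
        fd_aware = (1.0 - f_aware) * (1.0 - (1.0 - f_r) ** pushers)
        r_on *= x                               # Ron(t) = Ron(t-1) * x
    return rows


def message_length(u, b, r_total, f_r, t):
    """ML(t) = U + R*B*(1 - (1 - fr)^(t+1)), equation (2)."""
    return u + r_total * b * (1.0 - (1.0 - f_r) ** (t + 1))
```

For example, `push_model(1000, 100, 0.01, 0.0, lambda t: 1.0, 20)` tabulates the aware fraction and message count per round for R=1000, Ron(0)=100, fr=0.01, and PF(t)=1.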

Analysis of Pull Phase
- Case 1: a replica “p” comes online after a push phase is over. Trivial: assume the other online replicas have already received the update.
- Case 2: “p” comes online during the push phase. Suppose a fraction faware of the Ron online replicas is aware of the update; the probability of “p” getting the update in m attempts is: 1-[1-(Ron*faware/R)]^m   (3)
- Query: similar to pull, but may need majority logic, a version scheme, or a hybrid of the two to identify the latest updates.
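
Equation (3) is easy to evaluate for concrete numbers. In the sketch below, `pull_success_probability` implements (3) directly; `attempts_needed` is not from the slides but is obtained by solving (3) for m. Both names are hypothetical.

```python
import math


def pull_success_probability(r_on, f_aware, r_total, attempts):
    """Equation (3): 1 - [1 - (Ron * faware / R)]^m."""
    hit = r_on * f_aware / r_total   # chance one attempt finds an aware replica
    return 1.0 - (1.0 - hit) ** attempts


def attempts_needed(r_on, f_aware, r_total, target):
    """Smallest m for which equation (3) reaches the target probability."""
    hit = r_on * f_aware / r_total
    if not (0.0 < hit < 1.0 and 0.0 < target < 1.0):
        raise ValueError("hit probability and target must lie in (0, 1)")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - hit))
```

With R=1000, Ron=200, and faware=0.5, a single attempt succeeds with probability 0.1, so many attempts are needed while few replicas are online and aware.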

Analytical Results
- Varying initial online population Ron(0): 1%
- Varying initial online population Ron(0): >5%
- Parameter tuning
- Scalability

Discussions
- Comparison with Gnutella
- Parameter self-tuning (optimization)

Related Work
Replication and updates in databases:
- iAnywhere Solutions: server-based approach
- Bayou: assumes significantly fewer replicas, fewer updates, and short disconnections
- Some other approaches assume availability of resources and replicas in general

Related Work – cont’d
Group communication and lazy epidemic schemes:
- Similar work: bimodal multicast, epidemic updates
- None has done a “special case study of bimodal behavior and the utility of epidemic algorithms in a highly unreliable environment”

Conclusion
- This paper provides “an analytical model to demonstrate the significant reduction of message overhead” using combined push and pull techniques.
- The solution is totally decentralized; no global knowledge is needed.
- The paper is available on CiteSeer and will appear in ICDCS 2003.