Published by Abel Miles. Modified over 6 years ago.
1
Updates in Highly Unreliable, Replicated Peer-to-Peer Systems
Anwitaman Datta, Manfred Hauswirth, Karl Aberer (EPFL)
Presented by Zhiyuan “Troy” Zhan
1/18/2019
2
Outline
- Motivation
- System Model & Algorithm
- Analytical Model & Analysis
- Related Work
- Conclusion
3
Motivation
- Peer-to-peer systems are not just about file sharing.
- Data items can be added, deleted and updated frequently, for example in:
  - Peer commerce
  - Shared calendars/address books
  - Trust management
  - Medical information sharing
- Replication is used to improve fault tolerance and response time.
4
Motivation – cont’d
- The target problem is how to disseminate updates to other peers:
  - Consistency guarantee
  - Scalability, underlying system assumptions, resource consumption
- Challenges:
  - Huge number of peers
  - Peers can go online/offline at any time
  - Often a lack of global knowledge
5
Motivation – cont’d
Contributions:
- Address the update dissemination problem under low online probabilities of peers (<30%) and no global knowledge.
- Present a “fully decentralized, efficient and robust communication scheme” based on rumor spreading.
- Provide a generic analytical model of the combined push/pull technique.
6
Motivation - Problem Statement
Assumptions:
- Only a low percentage of peers is online, so it is impossible to achieve any kind of quorum.
- Transactional consistency is not required; eventual consistency is desirable in most applications.
- Update conflicts are very rare, and the paper does not handle them.
- Probabilistic guarantees of successful search are sufficient.
- The total number of replicas is substantially lower than the total number of peers (i.e. vs 1000,000).
- Consecutive updates can be distributed sparsely over time.
- Communication overhead is the major performance measure.
7
Outline
- Motivation
- System Model & Algorithm
- Analytical Model & Analysis
- Related Work
- Conclusion
8
System Model
- A peer-to-peer overlay network.
- Each peer has its own local knowledge, such as its routing table and replica list.
- Peers can go offline at any time.
- A communication channel can be established between any two online peers; if it cannot be established, the peers assume each other to be offline.
9
Algorithm – Push Phase
Executed when disseminating updates.
At replica “p”, upon receiving message Push(U, V, Rf, t):
  IF Push(U, V, Rf, t) has not been processed THEN
    Select a random subset Rp of replicas with |Rp| = R*fr;
    With probability PF(t), send Push(U, V, Rf+Rp+{p}, t+1) to Rp-Rf;
    Mark Push(U, V, Rf, t) as processed;
10
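Algorithm – Push Phase, cont’d
- U: the actual update data item.
- V: version vector. It contains global version identifiers (GUIDs, which can be computed locally); altered data items are treated as distinct, coexisting versions.
- R, Rf, Rp: replicas. As used in Push, R is the total number of replicas, Rp is the random subset selected by p (a fraction fr of the replicas), and Rf is the list carried in the message, which grows by Rp and p at every hop.
- PF(t): a function of t. In Push it is the probability with which a replica forwards the update in round t.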
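The push step can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the name `handle_push`, the `processed` set keyed by a version id (standing in for the pair U, V), and the decision to exclude p itself from the candidates for Rp are all choices made for this example.

```python
import random


def handle_push(p, version_id, rf, t, replicas, fr, pf, processed, rng=random):
    """Return the (target, forwarded_list, round) messages replica p would send.

    version_id stands in for (U, V); processed is p's local set of handled ids.
    """
    if version_id in processed:
        return []  # each update is processed at most once per replica
    processed.add(version_id)

    # Assumption: p does not pick itself; |Rp| = R * fr, capped by availability.
    candidates = sorted(set(replicas) - {p})
    k = min(int(len(replicas) * fr), len(candidates))
    rp = set(rng.sample(candidates, k))

    # Forward only with probability PF(t).
    if rng.random() >= pf(t):
        return []

    new_rf = set(rf) | rp | {p}          # Rf + Rp + {p}
    targets = sorted(rp - set(rf))       # Rp - Rf: skip replicas already listed
    return [(target, new_rf, t + 1) for target in targets]
```

Carrying the growing list Rf in the message is what lets later rounds skip replicas that were already contacted, at the cost of longer messages.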
11
Algorithm – Pull Phase
Executed when a peer recovers from failure, reconnects, receives no updates for a while, or receives a pull message but is not sure whether it is itself in sync.
- Contact online replicas;
- Inquire for missed updates based on version vectors;
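A very small Python sketch of the pull step is given below. It assumes, purely for illustration, that each replica stores its updates in a dict keyed by version identifier, so that "inquiring for missed updates" reduces to a key difference; the names `missing_updates` and `pull` are hypothetical, and the slides do not specify the actual reconciliation protocol.

```python
def missing_updates(local_ids, remote_store):
    """Updates held by the remote replica whose version ids are unknown locally."""
    return {vid: data for vid, data in remote_store.items() if vid not in local_ids}


def pull(local_store, online_replica_stores):
    """Contact each reachable replica and merge in whatever is missing locally."""
    for remote_store in online_replica_stores:
        local_store.update(missing_updates(set(local_store), remote_store))
    return local_store


# Usage: a replica that only knows v1 catches up from two online replicas.
store = pull({"v1": "a"}, [{"v1": "a", "v2": "b"}, {"v3": "c"}])
```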
12
Outline
- Motivation
- System Model & Algorithm
- Analytical Model & Analysis
- Related Work
- Conclusion
13
General Assumptions
- Assume an update U is initiated for R replicas, of which $R_{on}(t)$ are online in push round t.
- In general, the online population in push round t is
  $R_{on}(t) = R_{on}(t-1)\,x + [R - R_{on}(t-1)]\,y$, with $x = 1 - p$ and $y = q$
  - p: probability of an online peer going offline in one push round;
  - q: probability of an offline peer coming online in one push round.
- p and q are typically small and may vary between rounds.
- ASSUME p is constant and omit peers coming online: $R_{on}(t) = R_{on}(t-1)\,x$
- ASSUME $f_r$ is constant.
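The online-population recurrence can be evaluated numerically with a few lines of Python. This is only an illustrative sketch; the function name `online_population` and the sample values are assumptions, not figures from the paper. Setting q to 0 reproduces the simplified form Ron(t) = Ron(t-1)*x.

```python
def online_population(r_total, r_on0, p, q, rounds):
    """Iterate Ron(t) = Ron(t-1)*x + (R - Ron(t-1))*y with x = 1-p, y = q."""
    x, y = 1.0 - p, q
    series = [float(r_on0)]
    for _ in range(rounds):
        prev = series[-1]
        series.append(prev * x + (r_total - prev) * y)
    return series


# Usage: 1000 replicas, 100 online initially, 10% churn per round, nobody returns.
simplified = online_population(1000, 100, 0.1, 0.0, 2)
```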
14
Analysis of Push Phase – Round 0
- Total number of messages: $\mathrm{msg}(0) = R\,f_r$
- New replicas which receive the update: $\mathrm{newreplicas}(0) = R_{on}(0)\,f_r$
- Online replicas that do not receive the update: $R_{on}(0)\,(1 - f_r)$
- Message length, where U denotes the size of the update and B the size of the data (metadata) required to describe one replica: $ML(0) = U + R\,f_r\,B$. Only U and Rf are considered.
15
Analysis of Push Phase – Round 1
- Number of messages in round 1: $\mathrm{msg}(1) = R_{on}(0)\,f_r\,x\,PF(1) \cdot R\,f_r\,(1 - f_r)$
- Number of replicas newly reached by the update in round 1: $\mathrm{newreplicas}(1) = R_{on}(0)\,x\,(1 - f_r) \left[1 - (1 - f_r)^{R_{on}(0)\,f_r\,x\,PF(1)}\right]$
- Message length: $ML(1) = U + R\,B\,(f_r + f_r(1 - f_r)) = U + R\,B\,(1 - (1 - f_r)^2)$
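The last equality in the message-length formula is a short algebraic step, spelled out here for convenience (with $f_r$ written for fr):

```latex
\begin{align*}
f_r + f_r(1 - f_r) &= 2f_r - f_r^2 \\
                   &= 1 - (1 - 2f_r + f_r^2) \\
                   &= 1 - (1 - f_r)^2
\end{align*}
```

Read this way, the list carried in a round-1 message covers the fraction of replicas that is not missed by both of two independent selections of size $f_r$.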
16
Analysis of Push Phase – Round t>=2
Define $f_{d\_aware}(t)$ and $f_{aware}(t)$:
- $f_{d\_aware}(t)$: increment in the fraction of online replicas that are aware of the update after round t
- $f_{aware}(t)$: total fraction of online replicas that are aware of the update at the beginning of round t
- $f_{aware}(t) = f_{aware}(t-1) + f_{d\_aware}(t-1)$
17
Analysis of Push Phase – Round t>=2, cont’d
- In the paper: $\mathrm{newreplicas}(t) = R_{on}(t-1)\,(1 - f_{aware}(t-1))\,x \left[1 - (1 - f_r)^{R_{on}(t-1)\,f_{d\_aware}(t-1)\,x\,PF(t)}\right]$
- I think it should be: $\mathrm{newreplicas}(t) = R_{on}(t-1)\,(1 - f_{aware}(t))\,x \left[1 - (1 - f_r)^{R_{on}(t-1)\,f_{d\_aware}(t-1)\,x\,PF(t)}\right]$
- Given $f_{d\_aware}(t) = (1 - f_{aware}(t)) \left[1 - (1 - f_r)^{R_{on}(t-1)\,f_{d\_aware}(t-1)\,x\,PF(t)}\right]$, we have:
  $f_{aware}(t) = f_{aware}(t-1) + f_{d\_aware}(t-1) = 1 - (1 - f_{aware}(t-1))\,(1 - f_r)^{R_{on}(t-2)\,f_{d\_aware}(t-2)\,x\,PF(t-1)} = \dots$
- $f_{aware}(t)$ rapidly grows to 1.
18
Analysis of Push Phase – Round t>=2, cont’d
- If the partial list is ignored: $\mathrm{msg}(t) = R_{on}(t-1)\,f_{d\_aware}(t-1)\,x\,PF(t) \cdot R\,f_r$
- If the partial list is considered: $\mathrm{msg}(t) = R_{on}(t-1)\,f_{d\_aware}(t-1)\,x\,PF(t) \cdot R\,f_r\,(1 - f_r)^t$   (1)
- $ML(t) = U + R\,B\,(1 - (1 - f_r)^{t+1})$   (2)
- Both (1) and (2) are proved in the paper by induction on t.
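To get a feel for how quickly awareness spreads and how the message count accumulates, the round formulas above can be iterated numerically. The sketch below is an illustration only: the name `push_phase` and the sample parameters are assumptions, the starting values faware(0) = 0 and fd_aware(0) = fr are inferred from the round-0 formula rather than stated on the slides, peers coming online are omitted, and the (1 - faware(t)) form of the increment is used.

```python
def push_phase(r_total, r_on0, fr, p_offline, pf, rounds):
    """Iterate the awareness and message recurrences for rounds 1..rounds.

    Returns a list of (t, f_aware(t), cumulative messages up to round t).
    """
    x = 1.0 - p_offline
    r_on = float(r_on0)              # holds Ron(t-1) inside the loop
    f_aware, fd_aware = 0.0, fr      # assumed state after round 0
    total_msgs = r_total * fr        # msg(0) = R * fr
    history = []
    for t in range(1, rounds + 1):
        f_aware += fd_aware                          # f_aware(t)
        pushers = r_on * fd_aware * x * pf(t)        # replicas forwarding in round t
        total_msgs += pushers * r_total * fr * (1.0 - fr) ** t   # msg(t), partial list used
        fd_aware = (1.0 - f_aware) * (1.0 - (1.0 - fr) ** pushers)
        r_on *= x                                    # Ron(t) = Ron(t-1) * x
        history.append((t, f_aware, total_msgs))
    return history


# Usage: 1000 replicas, 100 online, fr = 5%, no churn, always forward.
trace = push_phase(1000, 100, 0.05, 0.0, lambda t: 1.0, 10)
```

Passing a decreasing `pf` instead of the constant one is a simple way to explore how damping the forwarding probability trades coverage against message overhead.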
19
Analysis of Pull Phase Case1: a replica “p” comes online after a push phase is over Trivial, assume other online replicas have got the update already. Case2: “p” comes online during the push phase, suppose faware fraction of the replicas Ron are aware of the updates, the probability of “p” getting the update in m attempts is: 1-[1-(Ron* faware /R)]m -(3) Query: similar to Pull, but may need majority logic, or version scheme, or hybrid of two, to identify the latest updates 1/18/2019
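Formula (3) is easy to evaluate, and it can be inverted to estimate how many pull attempts are needed for a target success probability. The Python sketch below is illustrative; the function names and the sample numbers are assumptions, not values from the paper.

```python
import math


def pull_success_probability(r_on, f_aware, r_total, attempts):
    """Formula (3): 1 - [1 - Ron * f_aware / R]^m."""
    hit = r_on * f_aware / r_total   # chance that one contacted replica has the update
    return 1.0 - (1.0 - hit) ** attempts


def attempts_needed(r_on, f_aware, r_total, target):
    """Smallest m for which formula (3) reaches the target probability."""
    hit = r_on * f_aware / r_total
    if not 0.0 < hit < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("hit probability and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - hit))


# Usage: 200 of 1000 replicas online, half of them aware, aim for 90% success.
m = attempts_needed(200, 0.5, 1000, 0.9)
```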
20
Analytical Results
Varying initial online population Ron(0), 1%
21
Analytical Results – cont’d
Varying initial online population Ron(0), >5%
22
Analytical Results – cont’d
23
Analytical Results – cont’d
24
Analytical Results – cont’d, Parameter tuning
25
Analytical Results – cont’d, scalability
26
Discussions:
- Comparison with Gnutella
- Parameter self-tuning (Optimization)
27
Outline
- Motivation
- System Model & Algorithm
- Analytical Model & Analysis
- Related Work
- Conclusion
28
Related Work
Replication and updates in DB:
- iAnywhere Solutions: server-based approach
- Bayou: assumes significantly fewer replicas and fewer updates, and that disconnections are short
- Some other approaches assume availability of resources and replicas in general
29
Related Work – cont’d
Group communication and lazy epidemic schemes:
- Similar work: bimodal multicast, epidemic updates
- None has done a “special case study of bimodal behavior and the utility of epidemic algorithms in a highly unreliable environment”
30
Outline
- Motivation
- System Model & Algorithm
- Analytical Model & Analysis
- Related Work
- Conclusion
31
Conclusion
- This paper provides “an analytical model to demonstrate the significant reduction of message overhead” using combined push and pull techniques.
- The solution is totally decentralized; no global knowledge is needed.
- The paper is available at CiteSeer. It will appear in ICDCS 2003.