Consistent Data Replication: Is it feasible in WANs?
Yi Lin, Bettina Kemme, Marta Patiño-Martínez, Ricardo Jiménez-Peris
Sep 2, 2005
Data Replication: What, Why, How?
[Figure: sites in Toronto, Montreal, and Ottawa connected over a WAN, shown without replication and with replication]
- Benefits: fault tolerance, performance
- Challenge: keep data consistent
Data Replication: Challenge
- Keep data consistent
[Figure: a write w(x) passes through replica control and is applied to every copy of x]
Motivations
- Most replication protocols have been shown to perform well in LANs.
- Little work has been done in WANs: GlobData [DMBS02], Tech Report [JHU02].
- Are these protocols also feasible in WANs? Protocols that work well in LANs may not work well in WANs.
- Why? What are the bottlenecks? Are there any solutions?
Intro to Group Communication Systems
- A GCS provides multicast primitives to all members of a group.
- Group maintenance (removal of failed members, etc.)
- Ordering
  - Unordered
  - Total order (messages are delivered at all members in the same order)
- Reliability
  - Different degrees of delivery guarantees in case of site failures
  - Analyzed in the paper
Data Replication: Using Group Communication Systems (Symmetric)
- Read-only requests: executed at the local site.
- Update requests: first multicast in total order, then executed according to the total order delivery.
- Number of messages per update: 1 total order.
[Figure: w(x) is multicast in total order and executed on each copy of x]
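The Symmetric approach above can be illustrated with a minimal in-process sketch. It is not the authors' implementation: `Replica` and `TotalOrderGroup` are hypothetical names, and the group class only stands in for a real GCS by delivering each message to all members in one global order. The sketch assumes updates are deterministic, so every site reaches the same state.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One site holding a full copy of the data (hypothetical helper)."""
    name: str
    data: dict = field(default_factory=dict)

    def read(self, key):
        # Read-only requests are served locally and need no messages.
        return self.data.get(key)

    def deliver(self, update):
        # Called by the group layer in total order.
        # The update function must be deterministic.
        key, fn = update
        self.data[key] = fn(self.data.get(key, 0))


class TotalOrderGroup:
    """Toy stand-in for a GCS: every member sees messages in the same order."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.total_order_msgs = 0

    def multicast(self, update):
        # One total order message per update, executed at every site.
        self.total_order_msgs += 1
        for replica in self.replicas:
            replica.deliver(update)


replicas = [Replica(n) for n in ("Toronto", "Montreal", "Ottawa")]
group = TotalOrderGroup(replicas)
group.multicast(("x", lambda v: v + 10))
group.multicast(("x", lambda v: v * 2))
print([r.read("x") for r in replicas])
```

Because both updates reach every replica in the same order, all copies of `x` end with the same value, at a cost of one total order message per update.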
Data Replication: Using Group Communication Systems (Primary Copy)
- Read-only requests: executed at the local site.
- Update requests:
  - The request is first totally ordered.
  - It is executed only at the primary site.
  - The primary multicasts the changes in an unordered message.
  - The other sites apply the changes.
- Number of messages per update: 1 total order + 1 unordered.
  - Local write (w(x) submitted at the primary): 1 total order within response time.
  - Remote write (w(x) submitted at another site): 1 total order + 1 unordered within response time.
[Figure: w(x) is sent in total order, executed at the primary, and the changes to x are sent unordered to the other sites]
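The Primary Copy steps above can be sketched as a small synchronous simulation. `PrimaryCopyGroup` and its fields are hypothetical names. Total order is simulated by call order, and the sketch does not model how backups sequence incoming change messages. The return value is the number of messages on the response-time path, following the local-write and remote-write cases on the slide.

```python
class PrimaryCopyGroup:
    """Toy model of Primary Copy replication (illustrative only)."""

    def __init__(self, site_names, primary):
        self.data = {name: {} for name in site_names}
        self.primary = primary
        self.total_order_msgs = 0
        self.unordered_msgs = 0

    def update(self, origin, key, fn):
        # Step 1: the request is multicast in total order
        # (simulated here by the order of calls).
        self.total_order_msgs += 1

        # Step 2: only the primary executes the request.
        store = self.data[self.primary]
        new_value = fn(store.get(key, 0))
        store[key] = new_value

        # Step 3: the primary multicasts the resulting changes in an
        # unordered message, and the other sites apply them.
        self.unordered_msgs += 1
        for name, other in self.data.items():
            if name != self.primary:
                other[key] = new_value

        # A write submitted at the primary can answer after the total order
        # message; a remote write also waits for the unordered change message.
        return 1 if origin == self.primary else 2


group = PrimaryCopyGroup(["Toronto", "Montreal", "Ottawa"], primary="Toronto")
local_cost = group.update("Toronto", "x", lambda v: v + 1)
remote_cost = group.update("Ottawa", "x", lambda v: v + 1)
print(local_cost, remote_cost, group.data)
```

Since only the primary executes the request, the update function does not need to be deterministic across sites; the other sites copy the resulting value.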
Data Replication: Using Group Communication Systems (Local Copy)
- Read-only requests: executed at the local site.
- Update requests:
  - The request is first totally ordered.
  - It is executed locally.
  - The local site multicasts the changes in an unordered message.
  - The other sites apply the changes.
- Number of messages per update: 1 total order + 1 unordered.
  - No concurrent conflicting request: 1 total order within response time.
  - Concurrent conflicting request: 1 total order + 1 unordered within response time.
[Figure: w(x) is sent in total order, executed at the local site, and the changes to x are sent unordered to the other sites]
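Number of messages: summary
Symmetric
- Total number of messages: 1 total order
- Within response time: 1 total order
Primary Copy
- Total number of messages: 1 total order + 1 unordered
- Within response time, local write: 1 total order
- Within response time, remote write: 1 total order + 1 unordered
Local Copy
- Total number of messages: 1 total order + 1 unordered
- Within response time, no concurrent conflicting request: 1 total order
- Within response time, concurrent conflicting request: 1 total order + 1 unordered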
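The message counts on the response-time path can be turned into a rough latency estimate. This is a sketch only: the function name, the case labels, and the latency figures in the usage line are hypothetical, and the model ignores execution time and queuing.

```python
# Messages on the response-time path as (total order, unordered),
# taken from the per-approach slides.
CRITICAL_PATH = {
    ("symmetric", "any"): (1, 0),
    ("primary_copy", "local_write"): (1, 0),
    ("primary_copy", "remote_write"): (1, 1),
    ("local_copy", "no_conflict"): (1, 0),
    ("local_copy", "conflict"): (1, 1),
}


def estimated_response_ms(approach, case, total_order_ms, unordered_ms):
    """Sum the message latencies that a client has to wait for."""
    n_total, n_unordered = CRITICAL_PATH[(approach, case)]
    return n_total * total_order_ms + n_unordered * unordered_ms


# Hypothetical WAN latencies: 120 ms per total order multicast,
# 40 ms per unordered multicast.
print(estimated_response_ms("primary_copy", "remote_write", 120.0, 40.0))
print(estimated_response_ms("local_copy", "no_conflict", 120.0, 40.0))
```

Under these assumed figures, the total order multicast is the larger term in every case.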
Experiment (I)
[Figures: LAN; WAN (5 sites, 100% update)]
Experiment (I): Response time analysis
Experiment (II): Scalability in WAN
[Figures: read-only requests; update requests (50% update, Symmetric)]
Different Total Order Algorithms
[Figure: four total order algorithms shown for sites A, B, and C: Sequencer, Token, Lamport, and Round Robin (ATOP)]
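As an illustration of the first algorithm named above, here is a minimal sketch of fixed-sequencer total order in its generic textbook form, which is not necessarily the variant measured in the experiments. `Sequencer` and `Receiver` are hypothetical names. The sequencer stamps each message with a global sequence number, and each receiver buffers messages until it can deliver them in sequence order.

```python
import itertools


class Sequencer:
    """Assigns a global sequence number to every message."""

    def __init__(self):
        self._next = itertools.count(1)

    def stamp(self, msg):
        return (next(self._next), msg)


class Receiver:
    """Delivers stamped messages strictly in sequence-number order."""

    def __init__(self):
        self.next_seq = 1
        self.pending = {}
        self.delivered = []

    def receive(self, stamped):
        seq, msg = stamped
        self.pending[seq] = msg
        # Deliver the longest contiguous run starting at next_seq.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


sequencer = Sequencer()
m1 = sequencer.stamp("w(x) from A")
m2 = sequencer.stamp("w(x) from B")

site_b, site_c = Receiver(), Receiver()
for stamped in (m1, m2):   # arrives in order at B
    site_b.receive(stamped)
for stamped in (m2, m1):   # arrives out of order at C
    site_c.receive(stamped)

print(site_b.delivered == site_c.delivered)
```

Site C holds back the second message until the first arrives, so both sites deliver the same sequence even though the network reordered the messages.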
Experiment (III): Different Total Order Algorithms
[Figures: 5 sites in WAN, without replication; 5 sites in WAN, with replication (100% update, Symmetric)]
Conclusions
- Consistent database replication is feasible in WANs.
- For deterministic applications, the Symmetric approach is preferable.
- For non-deterministic applications, Local Copy is preferable.
- In WANs, total order multicast is crucial to response time.
- Round Robin total order performs better than the other algorithms.
- We have some other interesting optimizations. Please refer to our paper.
References
[C-JDBC] E. Cecchet, J. Marguerite, and W. Zwaenepoel. C-JDBC: Flexible database clustering middleware. In USENIX Conference, 2004.
[Ganymed] C. Plattner and G. Alonso. Ganymed: Scalable replication for transactional web applications. In Middleware, 2004.
[GlobData] L. Rodrigues, H. Miranda, R. Almeida, J. Martins, and P. Vicente. Strong Replication in the GlobData Middleware. In Workshop on Dependable Middleware-Based Systems, 2002.
[Middle-R] R. Jiménez-Peris, M. Patiño-Martínez, B. Kemme, and G. Alonso. Improving Scalability of Fault Tolerant Database Clusters. In ICDCS, 2002.
[Conflict-Aware] C. Amza, A. L. Cox, and W. Zwaenepoel. Conflict-Aware Scheduling for Dynamic Content Applications. In USENIX Symp. on Internet Tech. and Sys., 2003.
[State Machine] F. Pedone, R. Guerraoui, and A. Schiper. The Database State Machine Approach. Distributed and Parallel Databases, 14:71-98, 2003.
[Spread] http://www.spread.org
[JGroups] http://www.jgroups.org