Mobile Computing
Panos Papadimitratos
Wireless Networks Lab, Department of Electrical Engineering, Cornell University
Problem Context
Mobile Computing Environment
- Limited Bandwidth
- High Latency
- Intermittent Connectivity
- Lower Reliability
- Low Physical Security
- Lower Processing Capability
- Higher Degree of Heterogeneity
Despite the adversity...
- Run Distributed Applications
- Provide Distributed Services
- Share Data
- Remain Consistent
- Remain Efficient
Why are things more difficult?
- Connectivity is NOT continuous
- Topological Changes
- Fewer Resources
- Consequently:
  - Lower Availability
  - Potential Inconsistencies
Two aspects
- “…Replicated, Highly Available, Weakly Consistent Storage System…”
- “Develop Mobile Applications … minimize the dependence upon continuous connectivity…”
Bayou
- Distributed Data Storage System
- Designed for a Mobile Computing Environment
- Non-Transparent, Weakly-Consistent Replication
- Application-Specific Mechanisms to Detect & Resolve Conflicts
- Low Usage of the Network
Previous Work
- Theory of Epidemics; Coda; Ficus; Notes, Oracle, MS Access
- Eventual Consistency; Disconnected Operation; Optimistic Replication; Consistency; Application-specific resolvers; Conflict resolution based on file type; Log unresolved conflicts, create error message
System Model
- Client/Server Architecture - Transactional System
- Data are replicated to a set of servers
- Applications run as clients
- Two Basic Operations: Read and Write
- Replication is Weakly Consistent
- Read-Any-Write-Any Model
[Figure: System Model. Each client is an Application using the Bayou API through a Client Stub and issues Read or Write operations to a server; each server keeps its Server State on a Storage System; servers exchange updates through Anti-Entropy.]
Conflict Detection & Resolution
- Application-Specific Notion of Conflict
  - Semantics
  - Granularity – example: Scheduling Application
- Resolution Policy
- Automated Mechanisms
  - Dependency Checks
  - Merge Procedures
Dependency Checks
- Application-Supplied Query & Expected Result
- Query is Run at the Server against its current data
- If the Check Fails, invoke the Merge Procedure
Merge Procedures
- High-level programs with application-specific knowledge
- Run by the Server
- Performed Atomically as part of Writes
- Attempt to Resolve the Conflict
- Produce a Revised Update to Apply
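The interplay of the two mechanisms above can be sketched in a few lines of Python. This is only an illustration under assumed names: `Write`, `apply_write`, `reserve`, and the dictionary standing in for a server's database are all hypothetical and are not Bayou's API. The room-booking scenario follows the scheduling example the slides mention.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for a server's database: slot name -> owner.
Database = Dict[str, str]
Update = Callable[[Database], None]


@dataclass
class Write:
    update: Update                                   # the intended change
    dep_query: Callable[[Database], object]          # dependency-check query
    dep_expected: object                             # result the client expects
    merge: Callable[[Database], Optional[Update]]    # merge procedure


def apply_write(db: Database, w: Write) -> str:
    """Run the dependency check at the server; if it fails, run the merge procedure."""
    if w.dep_query(db) == w.dep_expected:
        w.update(db)
        return "applied"
    revised = w.merge(db)        # conflict detected: ask for a revised update
    if revised is None:
        return "unresolved"
    revised(db)
    return "merged"


def reserve(slot: str, alternates: List[str], who: str) -> Write:
    """Build a booking write whose merge procedure tries alternate slots."""
    def merge(db: Database) -> Optional[Update]:
        for alt in alternates:
            if alt not in db:
                return lambda d, alt=alt: d.__setitem__(alt, who)
        return None              # no alternate is free: leave the conflict unresolved

    return Write(
        update=lambda d: d.__setitem__(slot, who),
        dep_query=lambda d: slot in d,   # "is the slot already taken?"
        dep_expected=False,              # the client expects it to be free
        merge=merge,
    )


db: Database = {}
r1 = apply_write(db, reserve("room1@10", ["room1@11"], "alice"))
r2 = apply_write(db, reserve("room1@10", ["room1@11"], "bob"))
r3 = apply_write(db, reserve("room1@10", ["room1@11"], "carol"))
print(r1, r2, r3, db)
```

The sketch is single-threaded, so the check, the merge, and the update trivially happen as one step; a real server would have to guarantee that atomicity itself.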
Handling Conflicts – An Example
Basic Anti-Entropy
- Goal: the reconciliation of replicas’ data
- Pair-wise manner
- One-way Operation
- Propagate Write Operations
- Accept-Order Constraint
- Prefix Property
- Version Vectors
Basic Anti-Entropy (continued)
- R sends S its Version Vector R.V
- S sends R all Writes unknown to R:
    for each w in S.Write_log:
        if R.V(w.server_ID) < w.accept_stamp:
            SendWrite(R, w)
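A minimal runnable sketch of this one-way exchange, assuming in-memory replicas. The `Replica` and `Write` classes and the `accept`/`receive` helpers are hypothetical, and accept stamps are simplified to a per-server counter.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Write:
    server_id: str       # server that first accepted the write
    accept_stamp: int    # simplified here to a per-server counter
    payload: str


@dataclass
class Replica:
    name: str
    version_vector: Dict[str, int] = field(default_factory=dict)
    write_log: List[Write] = field(default_factory=list)

    def accept(self, payload: str) -> Write:
        """Accept a new client write and stamp it with this server's counter."""
        stamp = self.version_vector.get(self.name, 0) + 1
        w = Write(self.name, stamp, payload)
        self.receive(w)
        return w

    def receive(self, w: Write) -> None:
        """Append a write to the log and advance the version vector entry."""
        self.write_log.append(w)
        known = self.version_vector.get(w.server_id, 0)
        self.version_vector[w.server_id] = max(known, w.accept_stamp)


def anti_entropy(sender: Replica, receiver: Replica) -> int:
    """One-way session: push every write the receiver has not seen; return the count."""
    rv = dict(receiver.version_vector)   # R.V as reported at session start
    sent = 0
    for w in sender.write_log:           # log order preserves accept order
        if rv.get(w.server_id, 0) < w.accept_stamp:
            receiver.receive(w)          # stands in for SendWrite(R, w)
            sent += 1
    return sent


s, r = Replica("S"), Replica("R")
s.accept("a")
s.accept("b")
r.accept("x")
n1 = anti_entropy(s, r)   # S pushes its two writes to R
n2 = anti_entropy(s, r)   # nothing new on the second run
n3 = anti_entropy(r, s)   # the reverse direction carries R's own write
print(n1, n2, n3, r.version_vector)
```

Because the sender walks its log in order, the receiver always ends up with a prefix of each server's writes, which is what lets a single version vector summarize what it holds.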
A More Reasonable Approach: Without an ever-growing Write Log
- Need a method for Truncating the Write Log
- Idea: An Update that has been received by all Replicas need not be logged any more
- Allow for independent, aggressive pruning by each Replica
- The notion of a Stable or Committed Write is pivotal in the pruning process
Write Stability
- Stable Write: a Write is stable iff it has been executed for the last time by a server
- Intuitively equivalent to Confirmation or Commitment
- Primary Commit Scheme
  - Designate a Replica as Primary
  - Primary determines the order (position) of a Write when it first receives it
  - Stable Order
- Any Non-Committed Write is called Tentative
Anti-Entropy (Revisited)
- R sends S its Version Vector R.V and its Highest Commit Sequence Number R.CSN
- First, S sends all Committed Writes unknown to R:
    if R.CSN < S.CSN:
        for each committed w in S.Write_log:
            if w.accept_stamp <= R.V(w.server_ID):
                SendCommitNotification(…)
            else:
                SendWrite(…)
- Second, S sends all Tentative Writes unknown to R:
    for each tentative w in S.Write_log:
        if R.V(w.server_ID) < w.accept_stamp:
            SendWrite(R, w)
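The two phases can be sketched as follows. The classes and method names are hypothetical, and messages are modelled as direct method calls on the receiver. The first phase treats a write as already held by R when its accept stamp is less than or equal to R's version-vector entry, since that entry is the highest stamp R has from that server.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Write:
    server_id: str
    accept_stamp: int
    csn: Optional[int] = None        # commit sequence number; None while tentative


@dataclass
class Replica:
    version_vector: Dict[str, int] = field(default_factory=dict)
    csn: int = 0                     # highest commit sequence number known
    write_log: List[Write] = field(default_factory=list)  # committed prefix, then tentative

    def _insert_committed(self, w: Write) -> None:
        # Committed writes arrive in CSN order: place after the last committed one.
        pos = sum(1 for x in self.write_log if x.csn is not None)
        self.write_log.insert(pos, w)
        self.csn = max(self.csn, w.csn)

    def receive_write(self, w: Write) -> None:
        copy = Write(w.server_id, w.accept_stamp, w.csn)
        known = self.version_vector.get(w.server_id, 0)
        self.version_vector[w.server_id] = max(known, w.accept_stamp)
        if copy.csn is None:
            self.write_log.append(copy)
        else:
            self._insert_committed(copy)

    def commit_notification(self, server_id: str, accept_stamp: int, csn: int) -> None:
        # The write is already here as tentative: move it into the committed prefix.
        w = next(x for x in self.write_log
                 if (x.server_id, x.accept_stamp) == (server_id, accept_stamp))
        self.write_log.remove(w)
        w.csn = csn
        self._insert_committed(w)


def anti_entropy(s: Replica, r: Replica) -> List[str]:
    rv, r_csn = dict(r.version_vector), r.csn   # R.V and R.CSN at session start
    msgs: List[str] = []
    if r_csn < s.csn:                           # phase 1: committed writes
        for w in s.write_log:
            if w.csn is None or w.csn <= r_csn:
                continue
            if w.accept_stamp <= rv.get(w.server_id, 0):
                r.commit_notification(w.server_id, w.accept_stamp, w.csn)
                msgs.append("commit")
            else:
                r.receive_write(w)
                msgs.append("write")
    for w in s.write_log:                       # phase 2: tentative writes
        if w.csn is None and rv.get(w.server_id, 0) < w.accept_stamp:
            r.receive_write(w)
            msgs.append("write")
    return msgs


s = Replica({"A": 2, "B": 1}, 2, [Write("A", 1, 1), Write("B", 1, 2), Write("A", 2)])
r = Replica({"A": 1}, 0, [Write("A", 1)])
msgs = anti_entropy(s, r)
print(msgs, r.csn, r.version_vector)
```

In this run R already holds the first write as tentative, so S sends only a commit notification for it; the other two writes travel in full.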
Write-Log Truncation
- Stable Order maintains the Prefix Property
- Replicas can truncate any stable prefix from their Write Logs
- Incremental Reconciliation may not be possible
  - Each Replica needs to remember which Write Operations it has omitted
  - Full-Database Transfer
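One way to picture the bookkeeping is a log that remembers the highest commit sequence number it has discarded. The class below is a hypothetical sketch (the name `TruncatingLog` and the `osn` field are illustrative choices); it returns `None` when the receiver is too far behind for incremental reconciliation, which is the case that calls for a full-database transfer.

```python
from typing import List, Optional, Tuple


class TruncatingLog:
    """Committed-write log that can drop a stable prefix."""

    def __init__(self) -> None:
        self.committed: List[Tuple[int, str]] = []   # (csn, payload), in CSN order
        self.osn = 0                                 # highest CSN omitted so far

    def highest_csn(self) -> int:
        return self.committed[-1][0] if self.committed else self.osn

    def commit(self, payload: str) -> int:
        csn = self.highest_csn() + 1
        self.committed.append((csn, payload))
        return csn

    def truncate(self, upto_csn: int) -> None:
        """Discard every committed write with CSN <= upto_csn."""
        upto = min(upto_csn, self.highest_csn())
        self.committed = [(c, p) for c, p in self.committed if c > upto]
        self.osn = max(self.osn, upto)

    def writes_for(self, receiver_csn: int) -> Optional[List[Tuple[int, str]]]:
        """Writes the receiver is missing, or None if a full transfer is needed."""
        if receiver_csn < self.osn:
            return None          # some writes the receiver lacks were truncated
        return [(c, p) for c, p in self.committed if c > receiver_csn]


log = TruncatingLog()
for p in "abcd":
    log.commit(p)
log.truncate(2)
up_to_date = log.writes_for(3)
just_in_time = log.writes_for(2)
too_far_behind = log.writes_for(1)
print(up_to_date, just_in_time, too_far_behind)
```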
‘Extended’ Anti-Entropy
- Session Guarantees
- Causal Order – Accept Stamp
- Reduce Client-Observed inconsistencies
- Eventual Consistency
- Define a Total Order using the Server ID and the Causal Order
- Propagate Updates in this Total Order
- Provide Guarantees on the ‘quality’ of the Replicas’ Data Content
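A total order of this kind can be illustrated with logical clocks: if each accept stamp is drawn from a Lamport-style clock, sorting by (accept stamp, server ID) respects causality and breaks ties deterministically. The sketch below assumes that clock scheme; `LamportClock`, `Write`, and `total_order` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Write:
    accept_stamp: int
    server_id: str
    payload: str


class LamportClock:
    """Logical clock: advances on local events and on observing remote stamps."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        self.time += 1
        return self.time

    def observe(self, stamp: int) -> None:
        self.time = max(self.time, stamp)


def total_order(writes: Iterable[Write]) -> List[Write]:
    # Accept stamp first (causal order), server ID as the tie-breaker.
    return sorted(writes, key=lambda w: (w.accept_stamp, w.server_id))


a, b = LamportClock(), LamportClock()
w1 = Write(a.tick(), "A", "x")        # accepted at A
b.observe(w1.accept_stamp)            # B learns of w1 before accepting w2
w2 = Write(b.tick(), "B", "y")        # causally after w1
w3 = Write(a.tick(), "A", "z")        # concurrent with w2, same stamp
ordered = [w.payload for w in total_order([w3, w2, w1])]
print(ordered)
```

Every replica that sorts the same set of writes this way arrives at the same sequence, regardless of the order in which the writes reached it.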
Other issues
- Light-Weight Server Creation
- Security
- Update through transportable storage media, e.g. floppy disks
- Link quality determines how frequently anti-entropy is performed
Experiments
- Measurements on a modified EXMH (e-mailer) that uses Bayou for storing messages
- Only Committed Writes are propagated
- Measure the execution time for an Anti-Entropy (100 Writes) over different network links
  - Network Transfer
  - Inserting Newly Received Writes
Experiments - II
Bayou - Summary
- Support for Arbitrary Communication Topologies
- Operation over Low-Bandwidth Networks
- Incremental Progress
- Eventual Consistency
- Efficient Storage Management
- Propagation through Transportable Media
- Light-weight Management of Dynamic Replica Sets
- Arbitrary Policy Choices
Rover Toolkit
- Set of Software Tools for Development of Mobile Applications
- Two approaches:
  - Mobile-Transparent Applications
  - Mobile-Aware Applications
Goals:
- Minimize Dependence on
  - Continuous Connectivity
  - Remotely Stored files
- Optimize Utilization of Bandwidth
- Dynamic Division of Work
Previous Work
- Cedar: Check-in Check-out Data Sharing
- Locus: Type-specific Conflict Resolution
- Coda: Optimistic Concurrency Control, Pre-Fetching
- Bayou: Tentative Data, Session Guarantees
Toolkit Design
- Client-Server architecture
- Mobile Communication Support
- Relocatable Dynamic Objects (RDO)
  - Reduce Client/Server communication
  - Update Shared Objects
  - Code Shipping
- Queued Remote Procedure Call (QRPC)
  - Non-Blocking Calls When Disconnected
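The queued-RPC idea can be sketched as a client that logs each call, returns immediately, and delivers the log once a connection exists. Everything here is hypothetical (`QueuedRPCClient`, the `transport` callable, the callback style) and is not Rover's actual interface.

```python
from collections import deque
from typing import Any, Callable, Deque, Tuple

Callback = Callable[[Any], None]


class QueuedRPCClient:
    """Logs calls while disconnected and replays them in order on reconnect."""

    def __init__(self, transport: Callable[[str, tuple], Any]) -> None:
        self._transport = transport      # performs the real remote call
        self._log: Deque[Tuple[str, tuple, Callback]] = deque()
        self.connected = False

    def call(self, method: str, args: tuple, on_result: Callback) -> None:
        """Non-blocking: record the request; the result arrives via the callback."""
        self._log.append((method, args, on_result))
        if self.connected:
            self.flush()

    def flush(self) -> int:
        """Deliver queued calls in FIFO order; return how many were delivered."""
        delivered = 0
        while self.connected and self._log:
            method, args, on_result = self._log[0]
            on_result(self._transport(method, args))
            self._log.popleft()          # drop the entry only after it succeeded
            delivered += 1
        return delivered


results: list = []
client = QueuedRPCClient(lambda method, args: (method, sum(args)))
client.call("export", (1, 2), results.append)   # queued: still disconnected
client.call("import", (3,), results.append)
queued_results = list(results)                  # nothing delivered yet
client.connected = True
n = client.flush()
print(n, results)
```

Removing a log entry only after its call returns means a failure in mid-flush leaves the request queued for the next attempt; a real toolkit would also keep the log on stable storage.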
Toolkit Design
- Application code & data are RDOs
- Rover-Applications Interface
- Primary Functions
  - Create Session
  - Import
  - Invoke
  - Export
- RDOs are cached
- RDOs are lazily fetched
[Figure: Toolkit Design. Labels on the Mobile Host side: Client-Side Application, Rover Library, Import RDO, RDO Cache, Network Scheduler, QRPC Log, Export Log. Labels on the Server side: Rover Library, Modify/Resolve Object, Conflict?, Resolved Log.]
Design Issues
- Communication Scheduling
- Computation Relocation
  - Separate application from data
  - Move computation/data between client and server
- Object Replication – Pre-fetching
- Consistency
  - Primary Copy, Tentative-Update
  - Optimistic Concurrency Control
  - Type-Specific Concurrency Control
[Figure: Architecture. The Mobile Host and the Server each run App 1, App 2, and App 3 on top of an Access Manager with an Operation Log, an Object Cache, and a Network Scheduler; the two sides are connected by the Network.]
Implementation Issues
- Rover starts as a minimal kernel
- Failure Recovery – Access Manager
- Log Size
- Batching of QRPCs
- Promises – Callback
- User Notification
- Application-Specific Conflict Resolution
Experiments
- Single Server, Multiple Clients
- Different Network Options
- TCP over wireless links
- Three setups:
  - Compressed or Batched QRPCs
  - Mobile-Transparent Application
  - Mobile-Aware Applications
Experiments - II
Experiments - III
Experiments - IV
Rover - Summary
- QRPC benefits:
  - RPCs are scheduled, batched, compressed
  - Increased Network Performance
- RDOs migrate functionality
  - Minimize Data Transfer
- Porting of Applications to Rover is relatively easy
- Measurements show significant improvement from both approaches
Topics for Discussion
- Are there ‘missing’ features?
- What if the semantics are not that ‘strong’?
- Or, if the uncertainty about data values is not accepted?
- Should Rover support some replication service?
- Do we really know what should be an ‘interesting’ mobile application?
Topics for Discussion - II
- In other words, are the assumptions made reasonable?
- How secure are these architectures?
- How about the ‘mobile’ data?
- Nomadic Computing: Can these schemes support Nomads?
- Other peer-to-peer models? E.g. Sensor Networks?