Information/File Access and Sharing
Coda: A Case Study

J. Kistler, M. Satyanarayanan. Disconnected Operation in the Coda File System. ACM Transactions on Computer Systems, February 1992.
L. Mummert, M. Ebling, M. Satyanarayanan. Exploiting Weak Connectivity for Mobile File Access. ACM Symp. on Operating Systems Principles, 1995.

Need for Mobile Access
- Continue having access to information
  - Address book, calendar, spreadsheets, documents, maps, programs, etc.
- Information/files on servers

Why not keep them on the client?
- Makes sense!
  - No network cost (latency, $$$)
  - Availability
- However, not always possible!
  - Forgot to bring along
  - Larger than what the client can hold
  - Active sharing with other users

Network issues to keep in mind
- Intermittence
- Low bandwidth
- High latency
- High expense
- Energy consuming
- Radio silence needed sometimes
Keep as much as possible in the client (cache) and manage this cache intelligently.

Problems with disconnected clients
- Updates are not visible to others
- Updates are at risk
- Update conflicts are more likely
- Cache misses impede progress
Note that even infinite cache space does not solve all these problems.

Coda Client Architecture
[Figure labels: Application, Venus, Coda MiniCache, Kernel, To Server]

Coda Design Principles
- Don’t punish strongly connected clients
- Don’t make life worse than when disconnected
- Do work in the background if you can
- When in doubt, seek user advice

Basic Idea
- Hoarding: hoard data whenever you can (when network connectivity is available)
- Emulating: serve requests from the hoard/cache when disconnected
- Reintegrating/Write disconnected: reconcile the hoard/cache with the servers and propagate updates when connectivity returns

Venus States
[State diagram. States: Hoarding, Emulating, Write Disconnected. Transition labels: Disconnection, Connection, Strong Connection, Weak Connection.]
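
The state names and event labels above can be read as a small state machine. The sketch below is one plausible wiring of the arrows, which the transcript does not preserve; the transition table is therefore an assumption, and the names `State`, `TRANSITIONS`, and `next_state` are hypothetical.

```python
from enum import Enum


class State(Enum):
    HOARDING = "hoarding"
    EMULATING = "emulating"
    WRITE_DISCONNECTED = "write disconnected"


# (current state, event) -> next state.
# ASSUMPTION: the arrows are reconstructed from the labels alone.
TRANSITIONS = {
    (State.HOARDING, "disconnection"): State.EMULATING,
    (State.HOARDING, "weak connection"): State.WRITE_DISCONNECTED,
    (State.EMULATING, "connection"): State.WRITE_DISCONNECTED,
    (State.WRITE_DISCONNECTED, "strong connection"): State.HOARDING,
    (State.WRITE_DISCONNECTED, "disconnection"): State.EMULATING,
}


def next_state(state, event):
    """Return the state after `event`; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```

Under this reading, a client that reconnects always passes through Write Disconnected before it returns to Hoarding, so logged updates get a chance to drain first.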

Caching vs. Hoarding
Caching
- Uncached data always available, but at higher cost
- Caching for future performance
- Filled on demand
- Stale data purged
- Writes to cache committed immediately
Hoarding
- Uncached data unavailable when disconnected
- Caching for future performance and availability (partial replication)
- Filled on demand or pre-fetched
- Stale data sometimes better than no data
- Writes to cache committed later, on reconnection: tracking changes, conflict detection, conflict resolution

Hoarding
- Hoard data in anticipation of disconnection
- At file(s) granularity
- Several complications:
  - Anticipation of file reference behavior
  - Disconnections and reconnections are unpredictable
  - True cost of a hoard miss is hard to evaluate when choosing between two alternatives
  - Account for activity at other clients

Hoarding in Coda
- Prioritize objects and evaluate these priorities at each “hoard walk” to decide what to keep
- Each client has a hoard database (HDB)
- A user may optionally indicate a hoard priority in the HDB
- The current priority of an object is a function of its hoard priority and recent usage
- Objects of lowest priority are chosen as victims for replacement
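
A minimal sketch of priority-based replacement, assuming a simple weighted sum of the user's hoard priority and a recency score. The slides only say that the current priority is "a function of" the two, so the formula, the weight `alpha`, and all names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CachedObject:
    name: str
    hoard_priority: int  # user-assigned in the HDB; 0 if the user gave none
    last_ref_tick: int   # logical time of the most recent reference
    size: int            # bytes occupied in the cache


def current_priority(obj, now, alpha=0.75):
    """Blend hoard priority with recent usage (ASSUMED formula and weight)."""
    recency = 1000 // (1 + now - obj.last_ref_tick)  # decays as the object ages
    return alpha * obj.hoard_priority + (1 - alpha) * recency


def choose_victims(cache, now, bytes_needed):
    """Pick the lowest-priority objects until enough space would be freed."""
    victims, freed = [], 0
    for obj in sorted(cache, key=lambda o: current_priority(o, now)):
        if freed >= bytes_needed:
            break
        victims.append(obj.name)
        freed += obj.size
    return victims
```

With such a blend, an object the user hoarded explicitly can outlive a recently used but unhoarded one, which is the point of combining the two signals.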

Hoard Walk
- Occurs every 10 minutes or on explicit user request
- 2 steps: status walk + data walk
- After the status walk, the client can display to the user which objects it will fetch in the data walk (the user can override this if needed)
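
The two-step walk can be sketched as follows. This is a simplified model, not Venus's actual logic: the dictionaries standing in for the HDB, the cache, and the server, and the `skip` parameter used for the user override, are all assumptions.

```python
def status_walk(hdb, cache, server_versions):
    """Step 1: find hoarded objects whose cached copy is missing or stale.

    `cache` maps name -> (version, data); only versions are compared here,
    so no file contents cross the network during this step.
    """
    to_fetch = []
    for name in hdb:
        cached = cache.get(name)
        if cached is None or cached[0] != server_versions[name]:
            to_fetch.append(name)
    return to_fetch


def data_walk(to_fetch, cache, server_versions, server_data, skip=()):
    """Step 2: fetch contents, except for objects the user chose to skip."""
    for name in to_fetch:
        if name in skip:
            continue
        cache[name] = (server_versions[name], server_data[name])
```

Splitting the walk this way lets the cheap status step run first, so the list of pending fetches can be shown to the user before any expensive data transfer starts.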

Other Hoarding Work
- You do not want to hoard all files belonging to a project at a time
- Calculate semantic distance between files:
  - Temporal/sequence-based distance
  - Lifetime semantic distance
- Perform “clustering” on these distances
- Hoard entire clusters, prioritized by recent use
G. Kuenning and G. Popek. Automated Hoarding for Mobile Computers. Proc. of the 16th ACM SOSP, October 1997.

Emulation
- Note that there is no network connectivity
- Read requests are serviced from the hoard
- On a miss, there is nothing to do except return an error code; the user can either move on or block until the request can be serviced
- Writes are logged so that they can be propagated to the server later
- Read misses can also be logged for better hoarding next time
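
A minimal sketch of the emulating behavior described above, assuming an in-memory dictionary as the hoard. The class and attribute names (`EmulatingClient`, `write_log`, `miss_log`, `CacheMiss`) are hypothetical.

```python
class CacheMiss(Exception):
    """Raised when a disconnected client is asked for an object it never hoarded."""


class EmulatingClient:
    def __init__(self, hoard):
        self.hoard = dict(hoard)  # name -> contents available while disconnected
        self.write_log = []       # updates to replay at the server later
        self.miss_log = []        # misses, kept as hints for future hoarding

    def read(self, name):
        if name not in self.hoard:
            self.miss_log.append(name)
            raise CacheMiss(name)  # nothing else can be done without a network
        return self.hoard[name]

    def write(self, name, data):
        self.hoard[name] = data                       # visible locally at once
        self.write_log.append(("store", name, data))  # propagated on reconnection
```

The caller decides what to do with `CacheMiss`: give up on the object, or wait and retry once connectivity returns.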

Coherence Issues
- Normally we would like serializability for accesses
- Accesses from different clients execute as if they were done one after another in some serial order

Serializability
Client 1: R W
Client 2: R W
Serializable: R R W W, R W R W, R R W W, …..
Non-Serializable: W R R W R W W R …..

Serializability
- Note that R-R operations are commutative and can be reordered without any violations
- However, R-W and W-W operations cannot be reordered, and this is what we need to worry about
- Strict serializability requires
  - Propagation of writes
  - Everyone sees the same order of writes!
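
The rule that only R-W and W-W pairs matter can be turned into the standard textbook test for conflict serializability: build a precedence graph between clients and check it for cycles. This technique is not taken from the slides; the operation encoding `(client, kind, obj)` and the function names are assumptions made for illustration.

```python
from itertools import combinations


def conflicts(a, b):
    """Two operations conflict if they come from different clients,
    touch the same object, and at least one of them is a write."""
    return a[0] != b[0] and a[2] == b[2] and "W" in (a[1], b[1])


def is_conflict_serializable(schedule):
    """`schedule` is a list of (client, kind, obj) in execution order."""
    # Edge c1 -> c2 means: some operation of c1 precedes a conflicting one of c2.
    edges = set()
    for i, j in combinations(range(len(schedule)), 2):
        if conflicts(schedule[i], schedule[j]):
            edges.add((schedule[i][0], schedule[j][0]))

    # The schedule is serializable iff the graph is acyclic: repeatedly
    # remove clients that have no incoming edge from a remaining client.
    clients = {op[0] for op in schedule}
    while clients:
        free = [c for c in clients
                if not any(dst == c and src in clients for src, dst in edges)]
        if not free:
            return False  # every remaining client is stuck in a cycle
        clients -= set(free)
    return True
```

Because R-R pairs never add an edge, schedules that differ only in the order of reads get the same verdict.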

Ordering of writes
Initially A = B = 0, and both are present in all three client caches.
Client 1: A = 1;
Client 2: While (A==0) ; B = 1;
Client 3: While (B==0) ; Print(A)
What should be printed out?

How do we propagate writes from other clients and from this client?
- Coda uses an optimistic policy that trades some inconsistency for performance
- R-W and W-W sharing is not that prevalent in all applications
- When conflicts can be identified, abort or leave it to the application

Coda Consistency Mechanisms
- The server registers that a client has cached an object
- It uses an invalidation message (callback break) to notify the client that the copy has changed
- The client discards the cached copy and refetches it either on demand or at the next hoard walk
- A disconnected client will not get such messages, so it has to validate its hoard upon reconnection
- If a read occurs in the meantime, it is as though it happened before the write by the other client! The other client is not stalled in the process
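
The callback mechanism can be sketched with two small classes. This is an in-process toy model under stated assumptions: there is no real network, `reconnect` compares contents as a stand-in for version checks, and all class and method names are hypothetical.

```python
class Server:
    def __init__(self):
        self.data = {}       # object name -> current contents
        self.callbacks = {}  # object name -> clients that hold a callback

    def fetch(self, client, name):
        # Registering the callback is the server's promise to notify on change.
        self.callbacks.setdefault(name, set()).add(client)
        return self.data[name]

    def store(self, writer, name, value):
        self.data[name] = value
        holders = self.callbacks.get(name, set())
        for client in list(holders):
            if client is writer:
                continue
            holders.discard(client)
            if client.connected:  # a disconnected client never hears the break
                client.callback_break(name)


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.connected = True

    def read(self, name):
        if name not in self.cache:
            if not self.connected:
                raise LookupError(name)
            self.cache[name] = self.server.fetch(self, name)
        return self.cache[name]

    def callback_break(self, name):
        self.cache.pop(name, None)  # discard now, refetch on the next read

    def reconnect(self):
        """Validate the whole cache, since callback breaks may have been missed."""
        self.connected = True
        for name in list(self.cache):
            if self.cache[name] != self.server.data[name]:
                del self.cache[name]
```

Note that `store` never waits for a disconnected client: the writer proceeds, and the stale reader is treated as if it had read before the write.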

Callback optimizations
- The server maintains a timestamp (version) at volume (sub-tree) granularity
- The client caches the volume version at the end of a hoard walk
- When connectivity is restored, the client presents these versions (in one chunk) for validation
- If a volume stamp is not valid, the client validates the individual objects in that volume
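
A minimal sketch of two-level validation, assuming plain integer version stamps and dictionaries keyed by volume and by `(volume, name)`. The function name and data layout are hypothetical.

```python
def validate_on_reconnect(client_volumes, client_objects,
                          server_volumes, server_objects):
    """Return the cached objects that must be discarded after reconnection.

    client_volumes / server_volumes: volume -> version stamp
    client_objects / server_objects: (volume, name) -> object version
    """
    stale = []
    for vol, stamp in client_volumes.items():
        if server_volumes[vol] == stamp:
            continue  # one comparison validates every object in the volume
        # Volume changed: fall back to checking its objects one by one.
        for (v, name), version in client_objects.items():
            if v == vol and server_objects[(v, name)] != version:
                stale.append((v, name))
    return stale
```

When most volumes are unchanged during a disconnection, the per-object loop rarely runs, which is where the savings come from.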

Reintegration/Write Disconnected
- Need to flush the log of writes to the server upon reconnection
- But this can induce a lot of network traffic
- Instead, an incremental mechanism, trickle reintegration, is used
- Potential to trade off consistency for performance

Trickle Reintegration
- When appending to the log, see if there is scope for compression, e.g. two writes to the same object, or a delete of a file for which a store exists in the log
- Age entries in the log until they cross a threshold; only then propagate them to the server
- The hope is that they can be compressed or removed further in the meantime
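
Log compression and aging can be sketched as below. The entry format `(timestamp, op, name)`, the `create` operation, and the class name are assumptions; the aging window is a parameter because the slides give no value.

```python
class ReintegrationLog:
    def __init__(self, aging_window):
        self.aging_window = aging_window  # how long an entry must sit in the log
        self.entries = []                 # (timestamp, op, name), oldest first

    def append(self, now, op, name):
        if op == "store":
            # A newer store makes earlier stores to the same object redundant.
            self.entries = [e for e in self.entries
                            if not (e[1] == "store" and e[2] == name)]
        elif op == "delete":
            had_create = any(e[1] == "create" and e[2] == name
                             for e in self.entries)
            # Nothing logged for a deleted object is worth sending any more.
            self.entries = [e for e in self.entries if e[2] != name]
            if had_create:
                return  # created and deleted locally: the server never sees it
        self.entries.append((now, op, name))

    def ready(self, now):
        """Remove and return the entries old enough to ship to the server."""
        old = [e for e in self.entries if now - e[0] >= self.aging_window]
        self.entries = [e for e in self.entries
                        if now - e[0] < self.aging_window]
        return old
```

A longer aging window gives compression more chances to cancel entries but delays the moment other clients see the updates, which is the consistency versus performance trade-off.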

Conflict resolution
- R-W conflicts (stale data reads) are presumed to be OK: no read logs!
- W-W conflicts may be identified at reintegration time
- Conflicts at a file: stores to the same file
- Conflicts at a directory:
  - Two names are the same
  - Updates and deletes to the same entry
  - Directory attributes are modified
- Pass on such conflicts to the user
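
W-W conflict detection at reintegration time can be sketched with a per-file version check. This assumes a single integer version per file, which is a simplification chosen for illustration and not necessarily how Coda identifies versions; the function name and log format are hypothetical.

```python
def reintegrate(log, server_versions):
    """Replay logged stores; flag those whose file changed at the server.

    log: list of (name, base_version), where base_version is the version
         the client had cached when it made the update.
    server_versions: name -> current version at the server (updated in place).
    """
    applied, conflicts = [], []
    for name, base_version in log:
        if server_versions.get(name, 0) != base_version:
            conflicts.append(name)  # someone else stored first: ask the user
        else:
            server_versions[name] = base_version + 1
            applied.append(name)
    return applied, conflicts
```

Reads are never checked here, matching the slide's point that R-W conflicts are tolerated and no read log is kept.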
