Coda / AFS Thomas Brown Albert Ng
What is Coda?
Issues with AFS: Any disconnection from the AFS network means no access to any files. If the read-write server crashes, nobody can update any files in the system.
Coda: An offshoot of the Andrew File System (AFS). Developed at Carnegie Mellon University by Professor Mahadev Satyanarayanan. First introduced in 1987.
Differences from AFS: Many read-write servers/replicas. Optimistic replication strategy. Focus on availability more than consistency. Allows disconnected operation. More aggressive caching and use of the cache.
Coda
General
Components of Coda. Components of the Coda client: Cache - local cache that temporarily stores the files; Sun’s Vnode Interface - allows interception of file access calls; MiniCache - handles Linux file access calls and passes them off to Venus or to the file system; Venus - manages the cache and the interactions between client and server.
Components of Coda. Components of the Coda network: System Control Machine (SCM) - the head server that manages administrative tasks for the network; Volume Storage Group (VSG) - the full set of servers that hold a given volume of data; Accessible Volume Storage Group (AVSG) - the subset of servers within the VSG that a client can currently communicate with.
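The relationship between the VSG and the AVSG can be pictured as a simple set filter. This is a minimal sketch, not Coda's code: the server names and the `is_reachable` predicate are hypothetical stand-ins for whatever connectivity probe a client performs.

```python
def accessible_vsg(vsg, is_reachable):
    """Return the AVSG: the members of the VSG this client can currently reach."""
    return {server for server in vsg if is_reachable(server)}

vsg = {"serverA", "serverB", "serverC"}
up = {"serverA", "serverB"}  # hypothetical: serverC is partitioned away
avsg = accessible_vsg(vsg, lambda s: s in up)
print(sorted(avsg))  # ['serverA', 'serverB']
```

When the filter returns an empty set, the client has no server to talk to and is effectively disconnected.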
How Coda Operates
States of Coda: Hoarding, Emulation, Reintegration.
Hoarding Stage: Normal operating state. Grabs as many files as possible from the Coda servers into the local cache. Accesses files from the cache as much as possible to improve performance. Sends any updates to files back to the Coda servers as they occur.
Cache / Hoard: Each cache contains a user-defined Hoard database (HDB), a list of all files of interest or importance to the user/system. A priority is attached to every file stored in the cache and within the HDB. For every file actively stored in the cache, this priority will decay over time. Figure: Example Hoard Database.
Hoard Walking: The cache compares the current priority of all files currently in the cache with that of all files in the HDB. It fetches the files with higher priority and replaces the files with the lowest priority. It stops once all cached objects have a higher priority than all uncached objects (cache equilibrium). Occurs every ten minutes.
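The equilibrium rule above can be sketched as a swap loop. This is an illustration under simplifying assumptions, not Venus's actual algorithm: every file is treated as the same size (so swaps are one-for-one), priorities are plain numbers in a hypothetical `priority` dictionary, and ties count as equilibrium.

```python
def hoard_walk(cached, uncached, priority):
    """Swap files until every cached file outranks every uncached file.

    cached and uncached are sets of file names; priority maps name -> number.
    Assumes equal-sized files, so each swap evicts one file and fetches one.
    """
    cached, uncached = set(cached), set(uncached)
    while cached and uncached:
        lowest = min(cached, key=priority.get)
        highest = max(uncached, key=priority.get)
        if priority[lowest] >= priority[highest]:
            break  # cache equilibrium reached
        cached.remove(lowest)
        uncached.add(lowest)
        uncached.remove(highest)
        cached.add(highest)
    return cached, uncached

priority = {"a": 10, "b": 2, "c": 7, "d": 5}
cached, uncached = hoard_walk({"a", "b"}, {"c", "d"}, priority)
print(sorted(cached))  # ['a', 'c']
```

Each swap strictly raises the total priority held in the cache, so the loop always terminates.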
Emulation Stage: The client is disconnected from the servers. It uses the local cache to emulate access to files on the Coda servers. If a file is not in the cache, Coda returns an error. Any updates are logged in a Client Modification Log (CML), which is stored in Recoverable Virtual Memory (RVM).
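A rough sketch of emulation-stage behavior follows. The class and exception names are hypothetical, the log is an ordinary in-memory list rather than RVM, and the record layout `("store", path, data)` is an assumption made only for illustration.

```python
class CacheMiss(Exception):
    """Hypothetical error for a file that is not in the local cache."""

class EmulatingClient:
    def __init__(self, cache):
        self.cache = dict(cache)  # path -> contents hoarded before disconnection
        self.cml = []             # stand-in for the Client Modification Log

    def read(self, path):
        # While disconnected, a miss cannot be serviced by any server.
        if path not in self.cache:
            raise CacheMiss(path)
        return self.cache[path]

    def write(self, path, data):
        # Apply the update locally and log it for later reintegration.
        self.cache[path] = data
        self.cml.append(("store", path, data))

client = EmulatingClient({"/coda/notes.txt": "draft"})
client.write("/coda/notes.txt", "final")
print(client.read("/coda/notes.txt"))  # final
print(len(client.cml))                 # 1
```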
Reintegration Stage: Initiated when communication with at least one server is restored. Sends the Client Modification Log to the servers to apply all changes. Once reintegration is complete, Coda returns to the hoarding state.
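The three stages form a small state machine. The sketch below encodes only the transitions described in these slides; the event names (`"disconnected"`, `"reconnected"`, `"reintegration_complete"`) are hypothetical labels, not Coda identifiers.

```python
from enum import Enum

class State(Enum):
    HOARDING = "hoarding"
    EMULATION = "emulation"
    REINTEGRATION = "reintegration"

# (current state, event) -> next state
TRANSITIONS = {
    (State.HOARDING, "disconnected"): State.EMULATION,
    (State.EMULATION, "reconnected"): State.REINTEGRATION,
    (State.REINTEGRATION, "reintegration_complete"): State.HOARDING,
}

def next_state(state, event):
    """Return the next state, or stay put if the event does not apply."""
    return TRANSITIONS.get((state, event), state)

state = State.HOARDING
for event in ["disconnected", "reconnected", "reintegration_complete"]:
    state = next_state(state, event)
print(state.value)  # hoarding
```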
How Reintegration Is Performed: The Client Modification Log is received by the server(s). The server locks all objects referred to by the log. Each operation in the log is validated and executed. Files/data are transferred from client to server as necessary. Locks are released and the server determines whether all operations were successful. If a conflict occurs, the changes are rolled back.
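The validate, execute, and roll back sequence can be sketched as a replay against a staged copy. This is a simplified model, not Coda's server logic: the validation rule (the version the client last saw must equal the server's current version) and the record layout `(path, base_version, new_data)` are assumptions, and locking is only noted in a comment.

```python
def reintegrate(server_files, cml):
    """Replay a modification log; commit everything or nothing.

    server_files: path -> (version, data)
    cml: list of (path, base_version, new_data) records
    Returns (resulting files, success flag).
    """
    # A real server would lock every object named in the log at this point.
    staged = dict(server_files)  # work on a copy so rollback is trivial
    for path, base_version, new_data in cml:
        version, _ = staged.get(path, (0, None))
        if version != base_version:
            return server_files, False  # conflict: discard staged changes
        staged[path] = (version + 1, new_data)
    return staged, True

files = {"a": (1, "old")}
print(reintegrate(files, [("a", 1, "new")]))  # ({'a': (2, 'new')}, True)
print(reintegrate(files, [("a", 1, "x"), ("b", 3, "y")]))  # ({'a': (1, 'old')}, False)
```

In the second call the first record is valid, but the second fails validation, so the staged change to `a` is discarded as well.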
Version Vectors: Each file/directory contains a vector that is used to keep track of updates/divergences. Each server/replica has its own version number within the vector. As files are updated on the replica, the version number is incremented. Vectors can be compared to see if they are identical, concurrent, or ordered.
Example (the file is updated on Servers A and B, but the update does not reach Server C):
            Before update   After update
Server A    [1, 1, 1]       [2, 2, 1]
Server B    [1, 1, 1]       [2, 2, 1]
Server C    [1, 1, 1]       [1, 1, 1]
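The three possible outcomes of a comparison can be sketched as follows. The function name and return labels are chosen for illustration; the two "dominates" results together make up the "ordered" case.

```python
def compare(v1, v2):
    """Classify two equal-length version vectors."""
    if v1 == v2:
        return "identical"
    if all(a >= b for a, b in zip(v1, v2)):
        return "first_dominates"   # ordered: v1 has seen every update v2 has
    if all(a <= b for a, b in zip(v1, v2)):
        return "second_dominates"  # ordered: v2 has seen every update v1 has
    return "concurrent"            # each side has an update the other lacks

print(compare([1, 1, 1], [1, 1, 1]))  # identical
print(compare([2, 2, 1], [1, 1, 1]))  # first_dominates
print(compare([2, 1, 1], [1, 2, 1]))  # concurrent
```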
Retrieving a File: The client requests the version vector for the desired file from all AVSG servers. The client checks all of the version vectors. If all version vectors match, it retrieves the file from one of the servers. If not, the client notifies the servers of a potential error and the servers attempt to resolve it automatically. “Check many, Read one, Write many”
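The "check many, read one" step might look like the sketch below. The `replicas` dictionary is a hypothetical in-memory stand-in for the AVSG's replies, and choosing the alphabetically first server is an arbitrary way to pick "one of the servers".

```python
def fetch(replicas):
    """Check every replica's vector, then read from a single server.

    replicas: server name -> (version_vector, data) for each AVSG member.
    Returns (data, True) if all vectors match, else (None, False) to signal
    that the servers should be asked to resolve the divergence.
    """
    vectors = {tuple(vv) for vv, _ in replicas.values()}
    if len(vectors) != 1:
        return None, False
    server = sorted(replicas)[0]  # any server will do once the vectors agree
    return replicas[server][1], True

agree = {"A": ([2, 2, 1], "v2"), "B": ([2, 2, 1], "v2")}
differ = {"A": ([2, 2, 1], "v2"), "C": ([1, 1, 1], "v1")}
print(fetch(agree))   # ('v2', True)
print(fetch(differ))  # (None, False)
```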
Storing a File: The client sends the updated file and the current version vector to all AVSG servers. Each server increments its own version number within the vector and sends the vector back to the client. The client merges all of the version numbers and produces an updated version vector for the file. If the client detects anything wrong, it can notify the servers of a potential conflict.
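The increment-and-merge step can be sketched like this. Merging by element-wise maximum is an assumption made for this example, and servers are identified by their position in the vector.

```python
def store(vector, server_indexes):
    """Return the merged vector after the given servers accept an update.

    vector: the file's current version vector.
    server_indexes: vector positions of the AVSG servers that received the store.
    """
    replies = []
    for i in server_indexes:
        reply = list(vector)
        reply[i] += 1  # each server bumps only its own slot
        replies.append(reply)
    # The client merges the replies slot by slot, keeping the largest value.
    return [max(slot) for slot in zip(vector, *replies)]

print(store([1, 1, 1], [0, 1]))  # [2, 2, 1]
```

With servers A and B reachable (positions 0 and 1) and server C not, the result has A's and B's slots advanced while C's slot stays at 1.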
Conflict Resolution
Conflicts: Multiple versions of the same file exist. A consequence of utilizing optimistic replication. Two types in Coda: Local / Global and Global / Global.
Local / Global Conflicts: Two concurrent versions of the file exist (one in the cache, the other on the server). Occurs when multiple clients modify the same file/directory while one of those clients is disconnected. When this occurs, Coda locks the file and then either utilizes an Application-Specific Resolver to attempt to fix the conflict automatically, or flags the file(s) for users to resolve the conflict manually.
Application-Specific Resolvers: A tool that automatically attempts to resolve file conflicts for an application. Must be written and set up for every single application. Example, a calendar application: File A schedules a cafe date at 3:00PM on Monday; File B schedules a presentation at 1:00PM on Friday. Because the two entries do not overlap, a calendar resolver could merge them into a single file that contains both.
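A toy resolver for the calendar example might look like the following. It is a sketch only: the data layout (a dictionary keyed by day and time) and the function name are hypothetical, and a real resolver would be written against the application's own file format.

```python
def resolve_calendars(local, server):
    """Merge two calendar replicas; return None if two events claim one slot."""
    merged = dict(server)
    for slot, event in local.items():
        if slot in merged and merged[slot] != event:
            return None  # genuine clash: leave it for the user to resolve
        merged[slot] = event
    return merged

file_a = {("Monday", "3:00PM"): "cafe date"}
file_b = {("Friday", "1:00PM"): "presentation"}
merged = resolve_calendars(file_a, file_b)
print(len(merged))  # 2
```

If both files had booked different events at 3:00PM on Monday, the function would return `None` and the conflict would fall back to manual resolution.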
Global / Global Conflicts: Different servers have different versions of a file. Occurs when disconnected or crashed servers rejoin the AVSG, or after the servers have been partitioned from one another. The servers use the version vectors of the files to automatically update the outdated files and directories. Any unresolvable errors (like concurrency conflicts) require manual intervention to fix.
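Server-side resolution by version vector can be sketched as: if one replica's vector dominates all the others, copy it everywhere; otherwise flag the file. This is a simplified illustration with hypothetical names, not Coda's resolution protocol.

```python
def dominates(v1, v2):
    """True if v1 has seen every update that v2 has."""
    return all(a >= b for a, b in zip(v1, v2))

def resolve_replicas(replicas):
    """replicas: server -> (version_vector, data).

    Returns (replicas after resolution, True) when one replica dominates,
    or (unchanged replicas, False) when the versions are concurrent.
    """
    for vv, data in replicas.values():
        if all(dominates(vv, other) for other, _ in replicas.values()):
            return {s: (list(vv), data) for s in replicas}, True
    return replicas, False

stale = {"A": ([2, 2, 1], "new"), "B": ([2, 2, 1], "new"), "C": ([1, 1, 1], "old")}
fixed, ok = resolve_replicas(stale)
print(fixed["C"], ok)  # ([2, 2, 1], 'new') True

split = {"A": ([2, 1, 1], "x"), "B": ([1, 2, 1], "y")}
print(resolve_replicas(split)[1])  # False
```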
Questions?