1
Coda File System: Disconnected Operation
By Wallis Chau, May 7, 2003
2
Contents
- Coda overview
- Brief features
- Disconnected operation
  - Hoarding
  - Emulation
  - Reintegration
- Conclusion
3
Introduction
Coda was developed at Carnegie Mellon University. It is a Unix-based file system descended from the Andrew File System (AFS). Its main features are high scalability, disconnected operation, server replication, uniform naming, and security.
4
What is Coda?
Coda is a distributed file system. It consists of a small number of servers, called Vice, and many workstations, called Virtue (the clients). Files are stored on the servers and accessed through the Virtue workstations.
5
How it works
A Virtue machine consists of a user-level process called Venus, a virtual file system layer, an RPC client stub, and a local file system interface. Any file access call from a user process goes to the virtual file system layer, which redirects it to the local file system interface or to the Venus process, depending on whether the target file is already cached. Venus uses an RPC system to communicate with the file server. When Venus gets a response from the server, it copies the whole file into the client cache. All file activity then takes place in the client cache, and changes are written back to the original copy on the server when the file is closed.
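The whole-file caching behaviour described above can be illustrated with a minimal sketch. The `Server` and `Venus` classes below are hypothetical in-memory stand-ins, not Coda's actual interfaces; they only show the fetch-on-open and write-back-on-close pattern.

```python
class Server:
    """Hypothetical in-memory stand-in for a Vice file server."""

    def __init__(self):
        self.files = {}    # path -> bytes
        self.fetches = 0   # counts whole-file fetches, for illustration

    def fetch(self, path):
        self.fetches += 1
        return self.files[path]

    def store(self, path, data):
        self.files[path] = data


class Venus:
    """Toy cache manager: whole-file fetch on open, write-back on close."""

    def __init__(self, server):
        self.server = server
        self.cache = {}     # path -> bytes (the client cache)
        self.dirty = set()  # paths modified since they were opened

    def open(self, path):
        if path not in self.cache:
            # Cache miss: copy the whole file from the server.
            self.cache[path] = self.server.fetch(path)

    def write(self, path, data):
        # All file activity happens on the cached copy.
        self.cache[path] = data
        self.dirty.add(path)

    def close(self, path):
        # Changes reach the server only when the file is closed.
        if path in self.dirty:
            self.server.store(path, self.cache[path])
            self.dirty.discard(path)


server = Server()
server.files["/coda/notes.txt"] = b"v1"
venus = Venus(server)

venus.open("/coda/notes.txt")
venus.write("/coda/notes.txt", b"v2")
before_close = server.files["/coda/notes.txt"]  # server copy not yet updated
venus.close("/coda/notes.txt")
after_close = server.files["/coda/notes.txt"]   # server copy updated on close
venus.open("/coda/notes.txt")                   # cache hit, no second fetch
```

In this sketch the second `open` is served from the cache, so the server sees only one fetch.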
6
Virtue (client) structure
[Diagram: user process, virtual file system layer, local file system interface, Venus process, RPC client stub, server]
7
Features: brief description
- Scalability: Coda is designed so that more users can be added with little extra effort.
- Security: Coda uses secure channels, system-level authentication, and access control.
- Naming: All clients share the same name space under a single mount point, i.e. a file has the same path on every client.
- Server replication: Files on the servers are replicated for high availability. If one server fails, the other servers can still provide service.
- Disconnected operation: When all servers are temporarily inaccessible, a client can keep working on files offline until the connection is re-established.
8
Disconnected operation
[Diagram: the three Venus states (hoarding, emulation, reintegration), annotated with the hoard database, hoard walk, and log]
9
Hoarding
The state of normal operation (server connected). Venus fetches useful data from the servers into the client cache in preparation for a possible disconnection. Which data is cached is decided by a prioritized algorithm that combines recent reference history with the hoard database (HDB). The HDB is a set of entries whose pathnames identify objects that the user has explicitly asked to have cached.
10
Hoarding (cont'd)
Coda uses file priorities to decide what to place in the cache, and aims to keep the cache in equilibrium. The cache is in equilibrium when all of these conditions are met:
- No uncached file has a higher priority than any cached file.
- The cache is full, or no uncached file has nonzero priority.
- Each cached file is a copy of the one in the client's accessible volume storage group (AVSG).
If the cache drifts out of equilibrium, a hoard walk is needed to recompute the cache contents and restore equilibrium.
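The first two equilibrium conditions and the effect of a hoard walk can be sketched as follows. This is an illustration under stated assumptions, not Coda's actual algorithm: the weighting `alpha`, the `Obj` fields, unit-size objects, and a cache capacity counted in objects are all hypothetical, and the third condition (agreement with the AVSG copy) is left out.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    path: str
    hoard_priority: int  # taken from the HDB; 0 if the path is not listed
    recency: int         # higher means more recently referenced
    cached: bool = False


def priority(obj, alpha=0.75):
    # Assumed weighting: blend the HDB priority with reference history.
    return alpha * obj.hoard_priority + (1 - alpha) * obj.recency


def in_equilibrium(objs, capacity):
    cached = [o for o in objs if o.cached]
    uncached = [o for o in objs if not o.cached]
    lowest_cached = min((priority(o) for o in cached), default=float("inf"))
    highest_uncached = max((priority(o) for o in uncached), default=0.0)
    # Condition 1: no uncached object outranks any cached object.
    cond1 = highest_uncached <= lowest_cached
    # Condition 2: cache is full, or nothing uncached has nonzero priority.
    cond2 = len(cached) >= capacity or highest_uncached == 0
    return cond1 and cond2


def hoard_walk(objs, capacity):
    # Re-rank everything and keep the top `capacity` objects with
    # nonzero priority in the cache.
    ranked = sorted(objs, key=priority, reverse=True)
    for rank, obj in enumerate(ranked):
        obj.cached = rank < capacity and priority(obj) > 0


objs = [
    Obj("a", hoard_priority=10, recency=0, cached=False),  # newly hoarded
    Obj("b", hoard_priority=0, recency=4, cached=True),
    Obj("c", hoard_priority=0, recency=1, cached=True),
]
before = in_equilibrium(objs, capacity=2)
hoard_walk(objs, capacity=2)
after = in_equilibrium(objs, capacity=2)
cached_paths = {o.path for o in objs if o.cached}
```

Here the newly hoarded object `a` outranks the cached object `c`, so the cache starts out of equilibrium, and the walk evicts `c` in favour of `a`.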
11
Emulation
The state entered when disconnection occurs. Venus performs the server's job using the cache. It records enough information in a log to replay the updates during reintegration. A log is kept for each volume, and each log entry contains the system call arguments and the state of the objects referenced by the call. Venus also keeps its cache and related data structures in non-volatile storage through recoverable virtual memory (RVM), so that they survive a client reboot. RVM holds Venus's metadata, including the replay logs, the HDB, cache directories, and status blocks.
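A per-volume replay log of the kind described above might look like the sketch below. The `LogRecord` fields and the `ReplayLog` class are hypothetical, and a JSON file written atomically stands in for RVM, which it only loosely approximates.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class LogRecord:
    op: str         # name of the logged system call, e.g. "store"
    args: list      # arguments of the call
    versions: dict  # path -> version of each referenced object, as seen by the client


class ReplayLog:
    """Toy per-volume replay log persisted to a file (stand-in for RVM)."""

    def __init__(self, path):
        self.path = path
        self.by_volume = {}  # volume name -> list of LogRecord
        if os.path.exists(path):
            # Recover the log after a (simulated) client reboot.
            with open(path) as f:
                raw = json.load(f)
            self.by_volume = {
                vol: [LogRecord(**rec) for rec in recs]
                for vol, recs in raw.items()
            }

    def append(self, volume, record):
        self.by_volume.setdefault(volume, []).append(record)
        self._flush()

    def _flush(self):
        # Write to a temporary file, then rename, so a crash mid-write
        # cannot leave a half-written log behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(
                {vol: [asdict(r) for r in recs]
                 for vol, recs in self.by_volume.items()},
                f,
            )
        os.replace(tmp, self.path)


log_path = os.path.join(tempfile.mkdtemp(), "replay.json")
log = ReplayLog(log_path)
log.append("vol1", LogRecord("store", ["/coda/a.txt"], {"/coda/a.txt": 3}))
log.append("vol1", LogRecord("mkdir", ["/coda/d"], {"/coda": 7}))

# Simulate a reboot: a fresh ReplayLog reads the records back from disk.
recovered = ReplayLog(log_path)
```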
12
Reintegration
The state entered when the server connection is restored. Venus propagates the changes made during emulation to the servers, one volume at a time. Venus ships the replay log to the servers in the AVSG, where it is executed. The replay algorithm has four phases:
- Phase 1: the log is parsed and the referenced objects are locked.
- Phase 2: each operation in the log is validated.
- Phase 3: data transfer.
- Phase 4: commit and release the locks.
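The four phases can be sketched as a single function. This is a simplified illustration, not Coda's implementation: the dictionary layouts, the `seen_version` field, and validation by plain version comparison are assumptions, and only `store` operations on existing files are handled.

```python
class ReintegrationError(Exception):
    """Raised when the log cannot be replayed."""


def reintegrate(server, locks, log, cache):
    """Replay `log` against `server`; all arguments are hypothetical toys.

    server: dict path -> {"version": int, "data": bytes}
    locks:  set of currently locked paths on the server
    log:    list of {"op": "store", "path": str, "seen_version": int}
    cache:  dict path -> bytes holding the client's new file contents
    """
    touched = {rec["path"] for rec in log}

    # Phase 1: parse the log and lock every referenced object.
    if touched & locks:
        raise ReintegrationError("referenced objects are already locked")
    locks |= touched
    try:
        # Phase 2: validate each operation. Here an operation is valid only
        # if the server copy is still the version the client saw.
        for rec in log:
            if server[rec["path"]]["version"] != rec["seen_version"]:
                raise ReintegrationError(f"conflict on {rec['path']}")

        # Phase 3: data transfer of the new file contents from the client.
        staged = {path: cache[path] for path in touched}

        # Phase 4: commit the updates.
        for path, data in staged.items():
            server[path] = {"version": server[path]["version"] + 1, "data": data}
    finally:
        # Locks are released whether the replay committed or aborted.
        locks -= touched


server = {"/coda/a.txt": {"version": 1, "data": b"old"}}
locks = set()
log = [{"op": "store", "path": "/coda/a.txt", "seen_version": 1}]
cache = {"/coda/a.txt": b"new"}

reintegrate(server, locks, log, cache)
first_result = dict(server["/coda/a.txt"])

# Replaying the same log again fails validation: the server copy has moved on.
try:
    reintegrate(server, locks, log, cache)
    conflict_detected = False
except ReintegrationError:
    conflict_detected = True
```

Validating before transferring any data means a failed replay leaves the server copy untouched in this sketch.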
13
Reintegration (cont'd)
At the end, Venus frees the replay logs and resets cache priorities. If there are write/write conflicts between the copies in the cache and on the server, they are handled by automatic resolution or manual correction, depending on what was changed.
14
Conclusion
Disconnected operation in the Coda file system offers a new approach to distributed systems. It lowers network traffic and improves failure transparency and fault tolerance, which makes the file system more reliable and efficient. I think it is a good choice for systems where high availability is needed.