G22.3250-001 Robert Grimm New York University Disconnected Operation in the Coda File System.

Disconnected Operation in the Coda File System

Altogether Now: The Three Questions
 What is the problem?
 What is new or different?
 What are the contributions and limitations?

Remember AFS?
 Files organized into volumes (partial subtrees)
 Identified by FIDs (instead of pathnames)
 Name resolution performed on clients
 Files cached in their entirety on clients
   Updates propagated on close of file
   Protected by callbacks from the server
 Volume cloning as the central manageability mechanism
   Provides a consistent copy-on-write snapshot
   Used for moving, read-only replication, and backup
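The copy-on-write snapshot behind volume cloning can be sketched as a layered file map: cloning shares the existing layers and gives the original a fresh writable layer, so later writes never disturb the snapshot. This is an illustrative toy (class and method names are invented), not AFS's on-disk representation.

```python
class Volume:
    """Toy copy-on-write volume. A clone freezes the current state by
    sharing the existing layers; subsequent writes to the original go
    into a new top layer that the clone never sees."""

    def __init__(self):
        self._layers = [{}]  # newest (writable) layer first

    def write(self, name, data):
        self._layers[0][name] = data

    def read(self, name):
        for layer in self._layers:  # newest layer wins
            if name in layer:
                return layer[name]
        raise KeyError(name)

    def clone(self):
        snap = Volume()
        snap._layers = list(self._layers)   # snapshot shares existing layers
        self._layers = [{}] + self._layers  # original gets a fresh writable layer
        return snap                         # shared layers are now effectively frozen


vol = Volume()
vol.write("paper.tex", "v1")
snap = vol.clone()              # cheap, consistent snapshot (e.g., for backup)
vol.write("paper.tex", "v2")    # snap still reads "v1"
```

Cloning is O(1) in the number of files, which is what makes it usable for routine operations like backup and read-only replication.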

Coda Design Overview
 Same target environment as AFS
   Trusted Unix servers, untrusted Unix clients
   Academic and research workloads
   Very limited concurrent, fine-grained write sharing
 Two mechanisms for availability
   Server replication across volume storage groups (VSGs)
     Similar to AFS: files are cached on clients, updates are propagated in parallel to the accessible VSG (AVSG), and consistency is ensured by callbacks
   Disconnected operation serviced from the local cache
     Cache misses result in failures that are reflected to the application/user

Design Rationale
 Scalability: place functionality on clients
   Callbacks, whole-file caching, name resolution on clients, …
 Portable workstations are (becoming) commonplace
   People already treat their laptops as portable caches
   They are good at predicting their future usage
 First- and second-class replication are both useful
   Servers have higher quality (space, security, maintenance)
     Hence performance and scalability
   But (disconnected) clients ensure availability

Design Rationale (cont.)
 Optimistic replica control enables/ensures availability
   Locks and leases are too limiting
     Especially when disconnection is “involuntary”
     Locks may reserve a resource for too long
     Leases may reserve a resource for not long enough
 But optimistic control may result in conflicts
   These need to be detected and automatically resolved
   Though Unix FS usage typically exhibits little write sharing

Client Structure
 Most functionality located in a user-level server (Venus)
   With a small in-kernel component
 What are the trade-offs here?

The Venus State Machine
 Hoarding prepares for possible disconnection
 Emulation services requests from the local cache
 Reintegration propagates changes back to the servers
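The three states and their transitions can be modeled as a small state machine. This is a minimal sketch (the event names are invented for illustration; the real Venus also handles partial connectivity and per-volume state):

```python
from enum import Enum, auto


class VenusState(Enum):
    HOARDING = auto()       # connected: keep the cache warm for a disconnection
    EMULATING = auto()      # disconnected: serve requests from the local cache
    REINTEGRATING = auto()  # reconnected: replay logged updates at the servers


class Venus:
    """Toy model of the Venus state transitions."""

    def __init__(self):
        self.state = VenusState.HOARDING

    def on_disconnect(self):
        if self.state == VenusState.HOARDING:
            self.state = VenusState.EMULATING

    def on_reconnect(self):
        if self.state == VenusState.EMULATING:
            self.state = VenusState.REINTEGRATING

    def on_reintegration_done(self):
        if self.state == VenusState.REINTEGRATING:
            self.state = VenusState.HOARDING


v = Venus()
v.on_disconnect()   # network lost: start emulating the server
v.on_reconnect()    # network back: replay the log
v.on_reintegration_done()  # back to hoarding
```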

Hoarding
 Balance the current working set against future needs
   The hoard database (HDB) provides a user-specified list of files
     Prioritized; may include directories and their descendants
 Cache organized hierarchically
   After all, name resolution is performed on the client (!)
 Hoard walk maintains equilibrium
   Goal: no uncached object has a higher priority than any cached one
   Operation
     Reevaluate name bindings to identify all children
     Compute priorities for cache and HDB; evict and fetch as needed
 What to do about broken callbacks?
   Purge files and symlinks immediately, but delay directory operations
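The equilibrium goal of the hoard walk can be sketched as a priority cutoff: keep the highest-priority objects up to cache capacity, fetching any that are not yet cached. This is a simplification under stated assumptions (object-count capacity rather than bytes, a single numeric priority rather than Coda's blend of hoard and reference priorities; `hoard_walk` and `fetch` are illustrative names):

```python
def hoard_walk(cache, hdb, capacity, fetch):
    """Restore the hoarding invariant: no uncached object has a higher
    priority than any cached one.

    cache, hdb: dicts mapping object name -> priority.
    capacity:   maximum number of cached objects.
    fetch(name): callback that retrieves an uncached object.
    Returns the new cache contents as a name -> priority dict.
    """
    # Candidates are everything we know about: hoard-listed or cached.
    candidates = dict(hdb)
    candidates.update(cache)  # cached copies keep their current priority

    # Keep the `capacity` highest-priority objects; everything else is evicted.
    keep = sorted(candidates, key=lambda n: candidates[n], reverse=True)[:capacity]

    new_cache = {}
    for name in keep:
        if name not in cache:
            fetch(name)  # high-priority object missing from the cache
        new_cache[name] = candidates[name]
    return new_cache
```

After the walk, every cached object has priority at least that of every uncached candidate, which is exactly the stated equilibrium.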

Emulation
 The local Venus pretends to be the server
 Manages the cache with the same algorithm as during hoarding
   Modified objects assume infinite priority (why?)
 Logs operations for future reintegration
   The log (and metadata, HDB) is accessed through RVM
     Flushed infrequently when hoarding, frequently when emulating
   Contains only close operations, not individual writes
   Removes previous store records on repeated close
   Should remove stores after unlink or truncate
   Should remove log records that invert each other
     E.g., mkdir vs. rmdir on the same directory
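The log-cancellation rules above can be sketched as an optimizing append: a new record may subsume or invert earlier records before it is added. A simplified sketch, assuming records are plain `(op, target)` tuples (the real log records carry much more state):

```python
def append_optimized(log, record):
    """Append a replay-log record, applying the cancellation rules:
    a repeated close drops the earlier store; unlink/truncate drop
    pending stores; rmdir cancels a logged mkdir of the same directory."""
    op, target = record

    if op == "store":
        # A repeated close overwrites the file: the earlier store is dead.
        log = [r for r in log if r != ("store", target)]

    if op in ("unlink", "truncate"):
        # The file's earlier contents can never be observed at the server.
        log = [r for r in log if r != ("store", target)]

    if op == "rmdir" and ("mkdir", target) in log:
        # mkdir followed by rmdir on the same directory inverts both:
        # drop the mkdir and do not log the rmdir at all.
        return [r for r in log if r != ("mkdir", target)]

    return log + [record]


log = []
for rec in [("store", "f"), ("store", "f"), ("mkdir", "d"), ("rmdir", "d")]:
    log = append_optimized(log, rec)
# Only a single store of "f" survives; the mkdir/rmdir pair cancels out.
```

Keeping the log small this way matters because RVM space is scarce during long disconnections and because reintegration time grows with log length.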

Emulation (cont.)
 What to do when disk space runs out?
   Currently: manually delete from the cache
     No more modifications once the log is full
   In the future:
     Compress the file cache and RVM contents
     Allow the user to selectively back out of updates
     Back up the cache and RVM on removable media

Reintegration
 Replay algorithm
   Obtain permanent FIDs for new objects
     Often unnecessary, as FIDs are preallocated
   Ship the log to all servers in the AVSG, which replay it
     Begin a transaction, lock all referenced objects
     Validate each operation and then execute it
       Check for conflicts, file integrity, protection, and disk space
       Perform data transfers (back-fetching)
     Commit the transaction, unlock all objects
     On error, abort the replay and create a replay file (a superset of the tar format)
 Conflict detection focuses on write/write conflicts
   Based on storeids, which uniquely identify updates
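The validate-then-execute loop and storeid-based conflict check can be sketched as follows. This is a minimal model under stated assumptions: server state is a dict from object to `(storeid, data)`, each log record carries the storeid the client last observed plus the new value it wants to commit, and the whole log commits atomically or not at all (function and field names are illustrative):

```python
def replay(server_state, log):
    """Minimal model of server-side replay.

    server_state: dict object -> (storeid, data).
    log: list of (obj, observed_storeid, new_storeid, data) records.
    Returns (new_state, committed). On any conflict the transaction
    aborts and the server state is returned unchanged.
    """
    staged = dict(server_state)  # tentative state inside the transaction
    for obj, observed, new_id, data in log:
        current = staged.get(obj, (None, None))[0]
        if current != observed:
            # Someone else updated the object while we were disconnected:
            # a write/write conflict, detected purely by comparing storeids.
            return server_state, False
        staged[obj] = (new_id, data)  # validate passed; execute the update
    return staged, True               # commit the whole transaction
```

Because every successful update stamps its object with a globally unique storeid, equality of storeids is sufficient evidence that no intervening writer touched the object, so no per-byte comparison is needed.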

Some Experimental Results
 Reintegration timings
   Andrew benchmark: 288 s execution, 43 s reintegration
     Makes extensive use of the file system (remember: load unit)
   Venus make: 3,271 s execution, 52 s reintegration
 Likelihood of conflicts
   Based on AFS traces
   Overall results
     99% of all modifications were by the previous writer
     Two users modifying the same object less than a day apart: 0.75%
     Most conflicts involved system files, as operators work in shifts
     Excluding system files: 99.5% previous writer, 0.4% two users

Cache Size
 Based on cumulative data for Coda, AFS, and the local FS
   Venus serves as its own simulator
 Max/avg/min of high-water marks for 5 workstations

What Do You Think?