Andrew File System (AFS)


Andrew File System (AFS)
Craig Shih & Todor Avramov
University of Washington, Bothell
CSS534: Parallel Programming
December 8, 2016

Outline
- Overview
- Consistency
- Architecture
- Synchronization and Caching Issues
- Scalability and Performance
- AFS vs. NFS
- References
- Q&A

Overview
- Introduced by researchers at Carnegie Mellon University in the 1980s
- Distributed file system whose main goal is scalability
- Client side: Venus; server side: Vice
- Basic idea of AFS: whole-file caching on the local disk of the client machine
- 2 versions (AFSv1 and AFSv2)
- 4 major implementations
  - Transarc (IBM) (deprecated)
  - OpenAFS
  - Arla
  - Linux kernel source code (early stages of development)

AFSv1 was originally called the ITC distributed file system. It had the basic design in place but did not scale as desired.

AFSv1 cached only file contents, not directories. When a file is opened, the full pathname is passed to the server (Vice), and the entire file is then cached on the client's local disk. When the client finishes (close is called), any changes are stored back on the server. The next open of the same file is more efficient: the client checks whether the file has changed and, if not, opens the cached copy on local disk.

Two main problems with version 1 limited scalability:
- Path-traversal costs were too high: the server had to traverse the whole pathname on each request, which used too much CPU.
- The client issued too many TestAuth protocol messages to check whether the local copy could be used, when more often than not it could.

AFS version 2 introduced the callback to reduce the number of client/server interactions. A callback is simply a promise from the server to the client that the server will inform the client when a file the client is caching has been modified. Version 2 also introduced the file identifier (FID), which is used instead of a pathname to specify which file a client is interested in. A FID consists of a volume ID, a file ID, and a "uniquifier".

Using /home/remzi/notes.txt as an example: the first fetch of notes.txt caches the directory home and establishes a callback for it, then does the same for remzi, and then for the file notes.txt itself. This way every component of the path is covered by a callback, and the server will notify the client if any of them changes. This reduces the need to make multiple hits on the server once the initial caching has been established.

AFS Version 2
- Introduced callbacks
  - Reduce the number of client/server interactions
  - A callback is a promise from the server to the client that the server will inform the client when a file the client has cached has been modified
- Introduced the notion of a File Identifier (FID)
  - Replaces whole pathnames as the way to specify a file to the server
  - A FID consists of a Volume Identifier, a File Identifier, and a "uniquifier"
  - The "uniquifier" allows identifiers to be reused after a file is deleted
- Ex. /home/craig/todor/hello.txt
  - Caches each directory (home, craig, todor) and establishes a callback for each
  - Then caches the file hello.txt and establishes a callback for that as well
  - The server then notifies the client if any changes occur
  - Reduces the need to make multiple hits on the server once initial caching has been established
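The FID and callback ideas on this slide can be sketched in a few lines of Python. This is only an illustration, not the AFS wire protocol: the names `FID`, `CallbackServer`, `fetch` and `store` are made up for the example, and the choice to let the writing client keep its own callback after a store is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FID:
    """Hypothetical file identifier: volume, file slot, and uniquifier."""
    volume_id: int
    file_id: int
    uniquifier: int  # changes when a (volume, file) slot is reused after a delete


class CallbackServer:
    """Toy server that remembers which clients hold a callback promise per FID."""

    def __init__(self):
        self.files = {}                    # FID -> file contents
        self.callbacks = defaultdict(set)  # FID -> names of clients holding a callback

    def fetch(self, client, fid):
        # Returning the file also registers a callback promise for this client.
        self.callbacks[fid].add(client)
        return self.files[fid]

    def store(self, client, fid, data):
        # A new version arrives: break the callbacks of every other client.
        self.files[fid] = data
        broken = self.callbacks[fid] - {client}
        self.callbacks[fid] = {client}     # assumption: the writer keeps its callback
        return sorted(broken)              # clients the server would notify
```

For instance, if clients "A" and "B" both fetch the same FID and "A" later stores a new version, `store` returns `["B"]`, the one client whose cached copy is now stale. Two FIDs that differ only in the uniquifier compare as different files, which is what makes identifier reuse safe.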

Callback/FID Example

Consistency Guarantees
- Update visibility: when will the server be updated with a new version of the file?
- Cache staleness: once the server has a new version, how long before clients see the new version instead of an older cached copy?
- Two cases to consider:
  - Consistency between processes on different machines
  - Consistency between processes on the same machine
- AFS uses a weak (but practical) consistency model
  - Guarantees that after an operation completes (for file data, the close that stores it back), the next operation performed anywhere in the network will see the updated file system state

Consistency Mechanisms
- Open/close granularity
  - New data written to a file is not stored back at the file server, and is not visible to other clients, until the file is closed
  - Typical UNIX read/write semantics on the same machine: writes to a file are immediately visible to other local processes
- Whole-file caching on local disk
  - Client nodes that already have the data cached on their local disks do not have to retrieve it from the file server
  - Minimizes network communication and file server load
- Write-through cache
  - Every time a file is modified on the client side (and closed), a copy is sent to the file server in addition to being stored in the local cache
  - Minimizes the effects of a client failure
- Callbacks
  - When the server becomes aware that a particular file has been modified by a client, it breaks callbacks (initiates invalidation callbacks) for any clients with locally cached copies of that file
  - Subsequent opens on those clients require a re-fetch of the new version of the file from the server
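The way these mechanisms interact can be shown with a small in-memory simulation. It is a sketch under simplifying assumptions: `ToyFileServer` and `ToyClient` are hypothetical names, files are keyed by path rather than FID, everything runs in one process, and failures are ignored.

```python
class ToyFileServer:
    """Toy stand-in for the file server: stores files and tracks callbacks."""

    def __init__(self):
        self.files = {}    # path -> contents
        self.holders = {}  # path -> set of clients holding a valid callback

    def fetch(self, client, path):
        self.holders.setdefault(path, set()).add(client)
        return self.files.get(path, "")

    def store(self, client, path, data):
        self.files[path] = data
        # Break the callback of every other client caching this file.
        for other in self.holders.get(path, set()) - {client}:
            other.break_callback(path)
        self.holders[path] = {client}


class ToyClient:
    """Toy stand-in for the client cache manager with open/close semantics."""

    def __init__(self, server):
        self.server = server
        self.cache = {}     # path -> whole-file copy on "local disk"
        self.valid = set()  # paths whose callback is still intact
        self.dirty = set()  # paths modified locally since the last store

    def open(self, path):
        if path not in self.valid:           # no callback: fetch the whole file
            self.cache[path] = self.server.fetch(self, path)
            self.valid.add(path)
        return self.cache[path]

    def write(self, path, data):
        self.cache[path] = data              # visible only on this machine
        self.dirty.add(path)

    def close(self, path):
        if path in self.dirty:               # store back to the server on close
            self.server.store(self, path, self.cache[path])
            self.dirty.discard(path)

    def break_callback(self, path):
        self.valid.discard(path)             # next open must re-fetch
```

With two clients that have both opened the same file, a `write` on the first is invisible to the second until the first calls `close`; at that point the server breaks the second client's callback, and its next `open` fetches the new version.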

Consistency Timeline

AFS Architecture (figure labels: Vice, RPC)

System Call Interception in AFS

Vice (File Server)

Synchronization and Caching Issues
- Callbacks depend on reliable delivery of messages from server to clients
- Issues:
  - The network is unreliable and may prevent delivery of callback-breaking messages
  - A client may crash
  - The file server may crash
- Fixes:
  - After recovery, treat all cache contents as suspect
  - Cache timeout / periodic polling (10 minutes)
- Synchronization: update and callback at the same time for the same file?
  - What happens when a client updates a file and sends it to the file server while, at the same time, the file server calls the client to break a callback promise associated with the same file?
  - Locking the cache entries? (Too complicated and may result in deadlock)
  - Add a callback sequence number at the file server? (Best solution)
  - Discard the callback? (Ad hoc solution)
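The two fixes listed above (suspect everything after recovery, and poll after a timeout) amount to a simple validity check on each cache entry. The following is a minimal sketch assuming a version number per file; `CacheEntry`, `needs_revalidation`, `mark_all_suspect` and `revalidate` are hypothetical names, and only the 10-minute interval comes from the slide.

```python
POLL_INTERVAL_S = 10 * 60  # the 10-minute timeout from the slide, in seconds


class CacheEntry:
    """Hypothetical record for one cached file."""

    def __init__(self, version, validated_at):
        self.version = version            # version of the cached copy
        self.validated_at = validated_at  # time the server last confirmed it
        self.suspect = False              # set after a crash or recovery


def needs_revalidation(entry, now):
    """True if the client should ask the server before using the cached copy."""
    return entry.suspect or (now - entry.validated_at) >= POLL_INTERVAL_S


def mark_all_suspect(cache):
    """After recovery, a callback break may have been missed: trust nothing."""
    for entry in cache.values():
        entry.suspect = True


def revalidate(entry, server_version, now):
    """Compare with the server's version; return True if the copy is usable."""
    fresh = entry.version == server_version
    if fresh:
        entry.validated_at = now
        entry.suspect = False
    return fresh
```

A client would call `needs_revalidation` on open; only when it returns True does the open cost a round trip to the server, so the common case still runs entirely from the local cache.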

Scalability and Performance
- Enterprise deployments may exceed 25,000 clients
- Able to support about 50 clients per server
- Files are commonly accessed locally; file reads usually come from the local disk cache
- Client-side performance came close to local performance

AFS vs NFS Performance
- Assumptions: small and medium files fit into the memory of a client; large files fit on a local disk but not in client memory
- Access across the network to the remote server for a file block takes Lnet time units
- Access to local memory takes Lmem
- Access to local disk takes Ldisk
- General assumption: Lnet > Ldisk > Lmem
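These assumptions are enough to work out one concrete case: re-reading a file that was read once before. The sketch below is illustrative only; `reread_cost` is a hypothetical helper, the latency values in the usage note are arbitrary units chosen to satisfy Lnet > Ldisk > Lmem, and it assumes AFS keeps the whole file on local disk while NFS caches blocks only in client memory.

```python
def reread_cost(n_blocks, fits_in_memory, l_net, l_disk, l_mem, system):
    """Cost of sequentially re-reading a file that has been read once already.

    system is "afs" (whole file cached on local disk) or "nfs" (blocks cached
    in client memory only); both are simplifying assumptions.
    """
    if not (l_net > l_disk > l_mem):
        raise ValueError("expected Lnet > Ldisk > Lmem")
    if fits_in_memory:
        # Small and medium files: both systems serve the re-read from memory.
        return n_blocks * l_mem
    if system == "afs":
        # Large file: no longer in memory, but still on the local disk.
        return n_blocks * l_disk
    if system == "nfs":
        # Large file: the memory cache could not hold it, so go to the server.
        return n_blocks * l_net
    raise ValueError("system must be 'afs' or 'nfs'")
```

With, say, `l_net=10`, `l_disk=5`, `l_mem=1` and a 100-block file that does not fit in memory, the AFS re-read costs 500 units against 1000 for NFS; for a file that fits in memory both cost 100.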

AFS vs NFS Qualitative
                        AFS                        NFS
Supported Clients       Requires disks             Diskless
Network Traffic         As required (much less)    Periodic (high)
Server Load
Client Response Times   Equal                      Equal
OS Support              Minimal                    More
Caching Protocol        Whole files                Blocks of files
Caching Mechanism       Client disks               Client memory
Security                Kerberos                   Weaker, more primitive
Admin Management        Simpler                    Difficult

References
- http://pages.cs.wisc.edu/~remzi/OSTEP/dist-afs.pdf
- http://ra.adm.cs.cmu.edu/anon/usr/ftp/home/ftp/itc/CMU-ITC-063.pdf
- http://ra.adm.cs.cmu.edu/anon/anon/usr0/ftp/itc/CMU-ITC-062.pdf
- http://tele.informatik.uni-freiburg.de/lehre/ws01/dsys/Lectures/Lecture19-1.pdf
- https://en.wikipedia.org/wiki/Andrew_File_System
- George Coulouris, Jean Dollimore, Tim Kindberg, Distributed Systems: Concepts and Design, Edition 4, Addison-Wesley, © Pearson Education 2005, Chapter 8.4 Andrew File System

Questions