Scale and Performance in a Distributed File System

Presentation transcript:

Scale and Performance in a Distributed File System. John H. Howard et al., ACM Transactions on Computer Systems, Vol. 6, No. 1, 1988. Presented by Jiwoo Bang and Moonsub Kim.

Presentation Outline
- Andrew File System Prototype
- Overview of the Andrew File System (AFS)
- Performance Evaluation of the AFS Prototype
- Changes for Performance
- Revised Andrew File System
- Effect of the Changes for Performance
- Performance Comparison of NFS and AFS
- Changes for Operability
- Conclusion

Part 1. Andrew File System Prototype

Andrew File System
- A distributed file system developed at CMU: a distributed implementation of a file system in which multiple users share files and storage resources
- The overall storage space managed by a DFS is composed of different, remotely located, smaller storage spaces
- Presents a homogeneous, location-transparent file name space
- The AFS architecture is based on several observations:
  - Shared files are infrequently updated
  - Files are normally accessed by only a single user
  - Local caches are therefore likely to remain valid for long periods

Andrew File System
- Caching in a distributed file system
  - Retain the most recently accessed disk blocks
  - Repeated accesses to a block in the cache can be handled without involving the disk
  - Reduces network traffic and server contention
- Scalability is the most important design goal of AFS
- Unusual design feature: whole-file caching

Andrew File System Architecture

Andrew File System Prototype
- Vice (server)
  - Serves files to Venus
  - A set of trusted servers, collectively called Vice
  - A process runs on the server side, with one Vice process dedicated to each Venus client
- Venus (client)
  - Caches files from Vice
  - Contacts Vice only when a file is opened or closed
  - Reading and writing are performed directly on the cached copy (client side)
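To make the open/close protocol above concrete, here is a minimal Go sketch of whole-file caching on the Venus side, assuming a flat local cache directory; fetchFromVice, storeToVice, and cacheDir are hypothetical stand-ins for the real Venus-Vice interface, which these slides do not spell out.

```go
// Package venus sketches the client-side whole-file caching idea.
package venus

import (
	"io"
	"os"
	"path/filepath"
)

// cacheDir is a hypothetical local directory where Venus keeps whole-file copies.
const cacheDir = "/var/afs/cache"

// fetchFromVice and storeToVice stand in for the Venus-Vice RPCs; the real
// protocol is not described on the slides.
func fetchFromVice(path string) ([]byte, error)  { /* RPC to the file server */ return nil, nil }
func storeToVice(path string, data []byte) error { /* RPC to the file server */ return nil }

// Open contacts the server only once, fetching the entire file into the local
// cache. All subsequent reads and writes go to the cached copy.
func Open(vicePath string) (*os.File, error) {
	local := filepath.Join(cacheDir, filepath.Base(vicePath))
	if _, err := os.Stat(local); err != nil { // not cached yet
		data, err := fetchFromVice(vicePath)
		if err != nil {
			return nil, err
		}
		if err := os.WriteFile(local, data, 0o600); err != nil {
			return nil, err
		}
	}
	return os.OpenFile(local, os.O_RDWR, 0o600)
}

// Close ships the (possibly modified) cached copy back to the server.
func Close(f *os.File, vicePath string) error {
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		return err
	}
	data, err := io.ReadAll(f)
	if err != nil {
		return err
	}
	if err := f.Close(); err != nil {
		return err
	}
	return storeToVice(vicePath, data)
}
```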

Andrew File System Prototype
- Vice maintains status information in separate files
- An .admin directory stores configuration data
- The directory hierarchy mirrors the original Vice file structure
- Stub directories represent portions of the name space located on other servers: if a file is not on a server, the search for its name ends in a stub directory

Andrew File System Prototype
- The Vice-Venus interface names files by their full pathname
- There is no notion of a low-level file name such as an inode
- As a result, a full pathname traversal is required on the server for each request

Andrew File System Prototype
- Vice creates a dedicated process for each client
- The dedicated process persists until its client terminates
- This leads to excessively frequent context switching

Andrew File System Prototype
- Venus verifies timestamps on every open
- Each open therefore includes at least one interaction with a server, even if the file is already in the cache and up to date!
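A rough Go sketch of why this is expensive: even a cache hit costs a server round trip in the prototype. testAuth and refetch below are hypothetical stand-ins for the prototype's TestAuth RPC and the whole-file fetch path.

```go
// Package venusproto sketches the prototype's per-open cache validation.
package venusproto

import "time"

type cacheEntry struct {
	localPath string
	fetchedAt time.Time // timestamp recorded when the file was cached
}

// testAuth asks the server whether a copy fetched at the given time is still
// valid. In the prototype this round trip happens on *every* open.
func testAuth(vicePath string, fetchedAt time.Time) (bool, error) {
	// RPC to Vice; not shown.
	return true, nil
}

var cache = map[string]*cacheEntry{}

// openPrototype validates the cache entry with the server even on a hit,
// which is why TestAuth dominated the prototype's call traffic.
func openPrototype(vicePath string) (string, error) {
	if e, ok := cache[vicePath]; ok {
		valid, err := testAuth(vicePath, e.fetchedAt)
		if err != nil {
			return "", err
		}
		if valid {
			return e.localPath, nil // cache hit, but we still paid for an RPC
		}
	}
	// Miss or stale: fetch the whole file again.
	return refetch(vicePath)
}

func refetch(vicePath string) (string, error) { /* fetch and cache; not shown */ return "", nil }
```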

AFS Prototype Benchmark Setup
- Load unit: the load placed on a server by a single client; one load unit is roughly five Andrew users
- Input: a read-only source subtree of 70 files, totaling 200 KB
- Synthetic benchmark that simulates real user actions in five phases: MakeDir, Copy, ScanDir, ReadAll, Make

Stand-alone Benchmark Performance
- MakeDir: constructs a target subtree identical in structure to the source subtree
- Copy: copies every file from the source subtree to the target subtree
- ScanDir: recursively traverses the target subtree and examines the status of every file in it
- ReadAll: scans every byte of every file in the target subtree once
- Make: compiles and links all the files in the target subtree
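A rough, self-contained Go sketch of the five phases, useful for seeing what kind of file-system operations each phase exercises; the source and target paths and the final make invocation are placeholders, not the actual Andrew benchmark code.

```go
package main

import (
	"io"
	"os"
	"os/exec"
	"path/filepath"
)

// Hypothetical locations for the read-only source subtree and the target.
const (
	srcRoot = "/afs/benchmark/src"
	dstRoot = "/tmp/benchmark/target"
)

func main() {
	// MakeDir: build a target subtree with the same directory structure.
	check(filepath.Walk(srcRoot, func(p string, info os.FileInfo, err error) error {
		if err != nil || !info.IsDir() {
			return err
		}
		rel, _ := filepath.Rel(srcRoot, p)
		return os.MkdirAll(filepath.Join(dstRoot, rel), 0o755)
	}))

	// Copy: copy every file from the source subtree to the target subtree.
	check(filepath.Walk(srcRoot, func(p string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		rel, _ := filepath.Rel(srcRoot, p)
		return copyFile(p, filepath.Join(dstRoot, rel))
	}))

	// ScanDir: traverse the target and stat every file without reading data.
	check(filepath.Walk(dstRoot, func(p string, info os.FileInfo, err error) error {
		return err // the stat itself is the work
	}))

	// ReadAll: read every byte of every file in the target once.
	check(filepath.Walk(dstRoot, func(p string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		_, err = os.ReadFile(p)
		return err
	}))

	// Make: compile and link the files in the target subtree (placeholder).
	check(exec.Command("make", "-C", dstRoot).Run())
}

func copyFile(src, dst string) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()
	out, err := os.Create(dst)
	if err != nil {
		return err
	}
	defer out.Close()
	_, err = io.Copy(out, in)
	return err
}

func check(err error) {
	if err != nil {
		panic(err)
	}
}
```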

Stand-alone Benchmark Performance
- Hit ratios of the two caches: 81% for the file cache, 82% for the status cache

Distribution of Vice Calls in the Prototype
- TestAuth and GetFileStat account for nearly 90% of the total calls!
- Caused by the frequency of cache validity checks

Prototype Benchmark Performance
- The benchmark took about 70% longer at a load of 1 than in the stand-alone case
- TestAuth times rose rapidly beyond a load of 5
- A load of only about 5 to 10 units per server was the maximum feasible

Prototype Server Usage
- CPU utilization is too high: the server CPU is the performance bottleneck
  - Caused by frequent context switching and traversing full pathnames
- Server loads were not evenly balanced, requiring load balancing between servers

Changes for Performance: Cache Management
- Cache the contents of directories and symbolic links in addition to files
- Require workstations to do pathname traversals themselves
- Callback: Venus assumes that cache entries are valid unless otherwise notified; the server promises to notify it before allowing a modification by any other workstation
- Result: dramatically reduces the number of cache validation requests received by servers
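A minimal Go sketch of the server side of the callback idea, assuming a simple in-memory table keyed by file; the method names and the notification hook are illustrative, not the actual Vice interface.

```go
// Package vice sketches callback bookkeeping: the server records a "promise"
// per cached file and breaks it on a conflicting store, so clients no longer
// need to validate their cache on every open.
package vice

import "sync"

type callbackTable struct {
	mu sync.Mutex
	// For each file, the set of client workstations holding a callback promise.
	holders map[string]map[string]bool // file path -> client id -> present
}

func newCallbackTable() *callbackTable {
	return &callbackTable{holders: map[string]map[string]bool{}}
}

// RegisterCallback is called when a client fetches a file: the server promises
// to notify that client before any other workstation modifies the file.
func (t *callbackTable) RegisterCallback(file, client string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.holders[file] == nil {
		t.holders[file] = map[string]bool{}
	}
	t.holders[file][client] = true
}

// BreakCallbacks is called when a store from `writer` commits: every other
// holder is told that its cached copy is now invalid.
func (t *callbackTable) BreakCallbacks(file, writer string, notify func(client string)) {
	t.mu.Lock()
	defer t.mu.Unlock()
	for client := range t.holders[file] {
		if client != writer {
			notify(client) // e.g. an RPC telling Venus to discard its copy
		}
	}
	delete(t.holders, file)
}
```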

Changes for Performance: Name Resolution
- The implicit namei operations caused by full-pathname naming were costly
- Introduce fids, 96 bits long:
  - 32-bit volume number: identifies a collection of files located on one server
  - 32-bit vnode number: used as an index into an array containing the file storage information for the files in a single volume
  - 32-bit uniquifier: allows reuse of vnode numbers
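A plausible Go rendering of a 96-bit fid with the three 32-bit fields listed above; the string encoding is illustrative. Because a fid carries no pathname or location information, a server can resolve it with a table lookup (volume, then vnode index) rather than a pathname traversal.

```go
// Package fid sketches the 96-bit file identifier described on the slide.
package fid

import "fmt"

// Fid identifies a file without embedding any pathname or location information.
type Fid struct {
	Volume     uint32 // which volume (collection of files on one server)
	Vnode      uint32 // index into the volume's file-storage array
	Uniquifier uint32 // lets a vnode slot be reused without ambiguity
}

func (f Fid) String() string {
	return fmt.Sprintf("%d.%d.%d", f.Volume, f.Vnode, f.Uniquifier)
}
```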

Changes for Performance: Communication and Server Process Structure
- A single process services all clients of a server
- Multiple lightweight processes (LWPs) are used within that one process
- Context switching between LWPs is cheap
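A sketch in Go of the single-process, many-lightweight-threads structure, with goroutines standing in for the LWP package used by the real servers; the port and request handling are placeholders.

```go
// One server process; one cheap, user-level thread of control per connection.
package main

import (
	"log"
	"net"
)

func main() {
	ln, err := net.Listen("tcp", ":7000") // placeholder port
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Print(err)
			continue
		}
		// All clients are handled inside a single server process: context
		// switches stay cheap and state can be shared in memory.
		go handle(conn)
	}
}

func handle(conn net.Conn) {
	defer conn.Close()
	// Parse a Vice request and reply; omitted.
}
```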

Changes for Performance: Low-Level Storage Representation
- Access files by their inodes rather than by pathnames
- This eliminated nearly all pathname lookups on workstations and servers

Part 2. Revised Andrew File System

Revised AFS Benchmark Result
- The design changes have improved scalability considerably
  - The revised Andrew is only 19% slower than a stand-alone workstation at a load of 1 (the prototype was 70% slower)
  - The benchmark takes less than twice as long at a load of 15 as at a load of 1
- Figure: comparison of the prototype (basic file-system architecture) and the revised AFS

AFS Server Utilization During the Benchmark
- CPU utilization rises steadily with load, exceeding 70% at a load of 20
  - Even at a load of 20 the system is still not saturated, but the performance bound is still the CPU
- Disk utilization stays below 20% even at a load of 20
- Better performance therefore requires more efficient server software or a faster server CPU

Andrew Server Usage
- Most servers show CPU utilizations between 15% and 25%
- Vice4: higher utilization was caused by system maintenance activities that happened to be performed during the measurement period
- Vice9: stores the bulletin boards, which are accessed and modified by many different users; its files exhibit poor locality (mostly a single read per bulletin-board item)
- Vice8: used by the operations staff
- Most frequent calls:
  - GetTime: the most frequently called operation; used by workstations to synchronize their clocks and as an implicit keepalive
  - FetchStatus: generated by users listing directories
  - RemoveCB: flushes a cache entry; a modification was made to perform RemoveCB on groups of entries

Comparison with a Remote-Open File System
- Caching entire files on the local disks of AFS workstations was motivated by several considerations:
  - Whole-file transfer contacts servers only on opens and closes; reads and writes cause no network traffic
  - Whole-file transfer can use efficient bulk data-transfer protocols
  - The amount of data fetched after a reboot is usually small
  - Caching entire files simplifies cache management (no individual pages to track)

Comparison with a Remote-Open File System
- Drawbacks of the entire-file caching approach in AFS:
  - Files that are larger than the local disk cache cannot be accessed at all
  - Strict emulation of 4.2BSD concurrent read and write semantics across workstations is impossible
- But the Andrew File System does scale well

Comparison with a Remote-Open File System
- Remote-open distributed file systems: Sun Microsystems NFS, AT&T RFS, Locus
  - The data in a file are not fetched en masse; instead, the remote site potentially participates in each individual read and write operation
  - Buffering and read-ahead are used to improve performance, but the remote site is still conceptually involved in every I/O

Sun Network File System
- Page caching: NFS caches inodes and individual pages of a file in memory
- Once a file is open, the remote site is treated like a local disk, with read-ahead and write-behind of pages
- Consistency semantics of NFS:
  - A new file may not be visible elsewhere for 30 seconds
  - Two processes writing to the same file could produce different results

Performance NFS’s performance degrades rapidly with increasing load

Performance
- Andrew scales much better than NFS, especially on the ScanDir and ReadAll phases

CPU/Disk Utilization
- CPU utilization of NFS is much higher than that of AFS
  - At a load of 1, CPU utilization is about 22% for NFS but 3% for Andrew
  - At a load of 18, CPU utilization saturates at 100% for NFS, but for Andrew it is only 38% with a cold cache and 42% with a warm cache
- NFS used both disks on the server, with utilizations rising from about 9% and 3% at a load of 1 to nearly 95% and 19% at a load of 18, respectively (disk 1: system libraries; disk 2: user data)
- Andrew used only one of the server disks, with utilization rising from about 4% at a load of 1 to about 33% at a load of 18 in the cold-cache case

Network Traffic NFS generates nearly three times as many packets as Andrew at a load of one

Changes for Operability
- Goal: build a system that would be easy for a small operational staff to run and monitor, with minimal inconvenience to users
- Problems in the prototype:
  - Inflexible mapping of Vice files to server disk storage: Vice was constructed out of collections of files glued together by the 4.2BSD mount mechanism
  - Movement of files across servers was difficult (file location information was embedded in the files)
  - It was not possible to implement a quota system
  - The mechanisms for file location and file replication were cumbersome (consistency problems)
  - Standard backup utilities were not convenient for use in a distributed environment; backup required taking an entire disk partition off-line

Changes for Operability
- Volume: a collection of files forming a partial subtree of the Vice name space
  - Volumes are glued together at mount points
  - A volume resides within a single disk partition on a server
  - Volumes provide a level of operational transparency
- Volume movement
  - A volume is moved by creating a frozen copy-on-write snapshot of the volume
  - The volume move operation is atomic
- Quotas are implemented on a per-volume basis
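A rough Go sketch of the volume abstraction as described above: one partition per volume, mount points gluing volumes into a name space, and a per-volume quota; all names and the resolution logic are illustrative.

```go
// Package volumes sketches volumes, mount points, and per-volume quotas.
package volumes

import "errors"

type Volume struct {
	ID         uint32
	Partition  string            // the single disk partition holding this volume
	QuotaBytes int64             // per-volume quota
	UsedBytes  int64
	MountPts   map[string]uint32 // name within this volume -> child volume ID
}

var ErrOverQuota = errors.New("volume quota exceeded")

// Charge accounts new data against the volume's quota.
func (v *Volume) Charge(n int64) error {
	if v.UsedBytes+n > v.QuotaBytes {
		return ErrOverQuota
	}
	v.UsedBytes += n
	return nil
}

// Resolve walks path components, crossing into a child volume whenever a
// component is a mount point, and returns the volume that finally holds the name.
func Resolve(root *Volume, byID map[uint32]*Volume, components []string) *Volume {
	cur := root
	for _, c := range components {
		if child, ok := cur.MountPts[c]; ok {
			cur = byID[child]
		}
	}
	return cur
}
```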

Changes for Operability
- Read-only replication
  - Improves availability and balances load
  - No callbacks are needed for read-only volumes
  - The volume location database specifies the server containing the read-write copy of a volume and a list of read-only replication sites
- Backup
  - Volumes form the basis of the backup and restoration mechanism
  - Create a read-only clone (a frozen snapshot) of the volume, then transfer it
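A small Go sketch of a volume location database entry as described on this slide, with a helper that sends reads to a read-only replica when one exists; the structure and function names are assumptions, not the actual Vice data structures.

```go
// Package vldb sketches the volume location database described on the slide.
package vldb

import "math/rand"

type Entry struct {
	ReadWriteSite string   // server holding the read-write copy
	ReadOnlySites []string // servers holding read-only clones (no callbacks needed)
}

var db = map[uint32]Entry{} // volume ID -> location entry

// pickServer chooses where to send a request for the given volume: writes go
// to the read-write site, reads prefer a read-only replica to spread load.
func pickServer(volume uint32, write bool) (string, bool) {
	e, ok := db[volume]
	if !ok {
		return "", false
	}
	if write || len(e.ReadOnlySites) == 0 {
		return e.ReadWriteSite, true
	}
	return e.ReadOnlySites[rand.Intn(len(e.ReadOnlySites))], true
}
```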

Conclusions
- The design changes to the prototype Andrew File System improved scalability considerably
  - At a load of 20 the system was still not saturated
  - A server running the revised Andrew File System can serve more than 50 users
  - Key changes: cache management, name resolution, server process structure, and low-level storage representation
- Volumes provide a level of operational transparency not supported by any other file system: quotas, read-only replication, and simple, efficient backup