Distributed File Systems, CS-4513, D-Term 2007


Distributed File Systems
CS-4513 Distributed Computing Systems
(Slides include materials from Operating System Concepts, 7th ed., by Silberschatz, Galvin, & Gagne; Modern Operating Systems, 2nd ed., by Tanenbaum; and Distributed Systems: Principles & Paradigms, 2nd ed., by Tanenbaum and Van Steen)

Distributed File Systems (DFS)
A special case of distributed system
Allows multi-computer systems to share files
    – Even when no other IPC or RPC is needed
Sharing devices
    – Special case of sharing files
E.g.,
    – NFS (Sun’s Network File System)
    – Windows NT, 2000, XP
    – Andrew File System (AFS) & others …

Distributed File Systems (continued)
One of the most common uses of distributed computing
Goal: provide the common view of a centralized file system, but with a distributed implementation.
    – Ability to open & update any file on any machine on the network
    – All of the synchronization issues and capabilities of shared local files

Naming of Distributed Files
Naming – mapping between logical and physical objects.
A transparent DFS hides the location in the network where the file is stored.
Location transparency – file name does not reveal the file’s physical storage location.
    – File name denotes a specific, hidden, set of physical disk blocks.
    – Convenient way to share data.
    – Could expose correspondence between component units and machines.
Location independence – file name does not need to be changed when the file’s physical storage location changes.
    – Better file abstraction.
    – Promotes sharing the storage space itself.
    – Separates the naming hierarchy from the storage-devices hierarchy.

DFS – Three Naming Schemes
1. Mount remote directories to local directories, giving the appearance of a coherent local directory tree
    – Mounted remote directories can be accessed transparently.
    – Unix/Linux with NFS; Windows with mapped drives
2. Files named by combination of host name and local name
    – Guarantees a unique system-wide name
    – Windows Network Places, Apollo Domain
3. Total integration of component file systems
    – A single global name structure spans all the files in the system.
    – If a server is unavailable, some arbitrary set of directories on different machines also becomes unavailable.

Mounting Remote Directories (NFS)

Mounting Remote Directories (continued)
Note: names of files are not unique
    – As represented by path names
E.g.,
    – Server A sees: /users/steen/mbox
    – Client A sees: /remote/vu/mbox
    – Client B sees: /work/me/mbox
Consequence: cannot pass file “names” around haphazardly
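
The non-uniqueness of path names can be sketched in a few lines of Python. This is only an illustration, assuming a hypothetical per-client mount table that maps a local mount point to a (server, exported directory) pair; the `resolve` helper and the table contents are made up for this example and are not part of NFS.

```python
# Hypothetical per-client mount tables: local mount point -> (server, exported dir).
CLIENT_A = {"/remote/vu": ("serverA", "/users/steen")}
CLIENT_B = {"/work/me": ("serverA", "/users/steen")}

def resolve(mount_table, path):
    """Translate a client-local path into (server, server-side path).

    Uses the longest mount point that prefixes the path and returns
    ("local", path) when no mount point matches.
    """
    best = None
    for mount_point in mount_table:
        if path == mount_point or path.startswith(mount_point + "/"):
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        return ("local", path)
    server, remote_dir = mount_table[best]
    return (server, remote_dir + path[len(best):])

print(resolve(CLIENT_A, "/remote/vu/mbox"))  # ('serverA', '/users/steen/mbox')
print(resolve(CLIENT_B, "/work/me/mbox"))    # ('serverA', '/users/steen/mbox')
print(resolve(CLIENT_A, "/work/me/mbox"))    # ('local', '/work/me/mbox')
```

Both clients reach the same server file, but a path that is valid on Client B means something else (or nothing) on Client A, which is why names cannot be passed around haphazardly.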

Mounting Remote Directories in NFS
More later …

DFS – File Access Performance
Reduce network traffic by retaining recently accessed disk blocks in local cache
Repeated accesses to the same information can be handled locally.
    – All accesses are performed on the cached copy.
If needed data is not already cached, a copy of the data is brought from the server to the local cache.
    – Copies of parts of a file may be scattered in different caches.
Cache-consistency problem – keeping the cached copies consistent with the master file.
    – Especially on write operations
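
A minimal sketch of a client block cache, to make the traffic saving concrete. The slides do not name an eviction policy; LRU is assumed here, and `BlockCache`, `fetch_from_server`, and the in-memory `server_disk` are hypothetical names used only for illustration.

```python
from collections import OrderedDict

class BlockCache:
    """Client-side cache of recently accessed blocks (LRU eviction assumed)."""

    def __init__(self, fetch_from_server, capacity=2):
        self.fetch = fetch_from_server   # callable: (file, block_no) -> bytes
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.server_requests = 0

    def read(self, file, block_no):
        key = (file, block_no)
        if key in self.blocks:
            self.blocks.move_to_end(key)     # mark as most recently used
            return self.blocks[key]
        self.server_requests += 1            # miss: one network round trip
        data = self.fetch(file, block_no)
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return data

# A dict stands in for the server's disk.
server_disk = {("f", 0): b"aaaa", ("f", 1): b"bbbb"}
cache = BlockCache(lambda f, n: server_disk[(f, n)])
cache.read("f", 0)
cache.read("f", 0)   # served from the local cache
cache.read("f", 1)
print(cache.server_requests)  # 2
```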

DFS – File Caches
In client memory
    – Performance speed-up; faster access
    – Good when local usage is transient
    – Enables diskless workstations
On client disk
    – Good when local usage dominates (e.g., AFS)
    – Caches larger files
    – Helps protect clients from server crashes

DFS – Cache Update Policies
When does the client update the master file?
    – I.e., when is cached data written from the cache to the file?
Write-through – write data through to disk ASAP
    – I.e., following write() or put(), same as on local disks.
    – Reliable, but poor performance.
Delayed-write – data is written to the cache and then written to the server later.
    – Write operations complete quickly; some data may be overwritten in cache, saving needless network I/O.
    – Poor reliability
        – Unwritten data may be lost when the client machine crashes
        – Inconsistent data
    – Variation – scan cache at regular intervals and flush dirty blocks.
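
The two policies can be contrasted in a short sketch. A plain dict stands in for the master file on the server, and the class names are hypothetical; a real client would trigger `flush()` from a timer or on close.

```python
class WriteThroughCache:
    def __init__(self, server):
        self.server = server          # dict standing in for the master file
        self.cache = {}

    def write(self, block_no, data):
        self.cache[block_no] = data
        self.server[block_no] = data  # every write goes to the server at once

class DelayedWriteCache:
    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.dirty = set()
        self.server_writes = 0

    def write(self, block_no, data):
        self.cache[block_no] = data   # completes without any network I/O
        self.dirty.add(block_no)

    def flush(self):
        """Write dirty blocks back to the server."""
        for block_no in sorted(self.dirty):
            self.server[block_no] = self.cache[block_no]
            self.server_writes += 1
        self.dirty.clear()

server = {}
c = DelayedWriteCache(server)
c.write(0, b"v1")
c.write(0, b"v2")                     # overwrites v1 in the cache only
lost_if_crash_now = 0 not in server   # True: nothing has reached the server yet
c.flush()
print(server, c.server_writes)        # {0: b'v2'} 1
```

Two writes cost one server write, which is the saving the slide mentions; the window before `flush()` is the reliability cost.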

DFS – File Consistency
Is the locally cached copy of the data consistent with the master copy?
Client-initiated approach
    – Client initiates a validity check with the server.
    – Server verifies local data against the master copy
        – E.g., time stamps, etc.
Server-initiated approach
    – Server records (parts of) files cached in each client.
    – When the server detects a potential inconsistency, it reacts
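
A sketch of the client-initiated approach, assuming the validity check compares a modification time stamp. A logical counter stands in for real time stamps, and `Server`, `Client`, and `get_mtime` are illustrative names, not any particular protocol's API.

```python
class Server:
    def __init__(self):
        self.files = {}   # name -> (data, mtime)
        self.clock = 0    # logical clock standing in for real time stamps

    def write(self, name, data):
        self.clock += 1
        self.files[name] = (data, self.clock)

    def get_mtime(self, name):
        return self.files[name][1]

    def read(self, name):
        return self.files[name]

class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}   # name -> (data, mtime at fetch time)

    def open(self, name):
        cached = self.cache.get(name)
        # Validity check: refetch if nothing is cached or the time stamps differ.
        if cached is None or cached[1] != self.server.get_mtime(name):
            self.cache[name] = self.server.read(name)
        return self.cache[name][0]

srv = Server()
srv.write("notes", b"old")
cl = Client(srv)
first = cl.open("notes")
srv.write("notes", b"new")   # another client updates the master copy
second = cl.open("notes")
print(first, second)         # b'old' b'new'
```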

DFS – Remote Service vs. Caching
Remote Service – all file actions implemented by server.
    – RPC functions
    – Use for small-memory diskless machines
    – Particularly applicable if large amount of write activity
Cached System
    – Many “remote” accesses handled efficiently by the local cache
        – Most served as fast as local ones.
    – Servers contacted only occasionally
        – Reduces server load and network traffic.
        – Enhances potential for scalability.
    – Reduces total network overhead

DFS – File Server Semantics
Stateless Service
    – Avoids state information in server by making each request self-contained.
    – Each request identifies the file and position in the file.
    – No need to establish and terminate a connection by open and close operations.
    – Poor support for locking or synchronization among concurrent accesses

DFS – File Server Semantics (continued)
Stateful Service
    – Client opens a file (as in Unix & Windows).
    – Server fetches information about the file from disk, stores it in server memory, and returns to the client a connection identifier unique to the client and open file.
    – Identifier used for subsequent accesses until the session ends.
    – Server must reclaim space used by clients that are no longer active.
    – Increased performance; fewer disk accesses.
    – Server retains knowledge about file
        – E.g., read ahead next blocks for sequential access
        – E.g., file locking for managing writes
    – Windows
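
The difference between the two service styles shows up in what a read request must carry. The sketch below is an assumption-laden toy: `FILES` stands in for the server's disk, and both server classes are hypothetical.

```python
FILES = {"/etc/motd": b"hello, world"}

class StatelessServer:
    def read(self, path, offset, count):
        # The request itself names the file and the position in the file.
        return FILES[path][offset:offset + count]

class StatefulServer:
    def __init__(self):
        self.sessions = {}   # connection id -> [path, current offset]
        self.next_id = 0

    def open(self, path):
        self.next_id += 1
        self.sessions[self.next_id] = [path, 0]
        return self.next_id

    def read(self, conn_id, count):
        path, offset = self.sessions[conn_id]
        data = FILES[path][offset:offset + count]
        self.sessions[conn_id][1] = offset + len(data)  # server tracks position
        return data

    def close(self, conn_id):
        del self.sessions[conn_id]   # reclaim per-client state

stateless = StatelessServer()
a = stateless.read("/etc/motd", 7, 5)

stateful = StatefulServer()
cid = stateful.open("/etc/motd")
b1 = stateful.read(cid, 5)
b2 = stateful.read(cid, 7)
stateful.close(cid)
print(a, b1, b2)   # b'world' b'hello' b', world'
```

If the stateful server restarts, `sessions` is gone and `cid` is meaningless; the stateless request would still work unchanged.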

DFS – Server Semantics Comparison
Failure Recovery: Stateful server loses all volatile state in a crash.
    – Restore state by recovery protocol based on a dialog with clients.
    – Server needs to be aware of crashed client processes
        – Orphan detection and elimination.
Failure Recovery: Stateless server failure and recovery are almost unnoticeable.
    – Newly restarted server responds to self-contained requests without difficulty.

DFS – Server Semantics Comparison (continued)
…
Penalties for using the robust stateless service:
    – longer request messages
    – slower request processing
Some environments require stateful service.
    – A server using server-initiated cache validation cannot be stateless, because it must record which files each client has cached.
    – File locking (one writer, many readers).

DFS – Replication
Replicas of the same file reside on failure-independent machines.
Improves availability and can shorten service time.
Naming scheme maps a replicated file name to a particular replica.
    – Existence of replicas should be invisible to higher levels.
    – Replicas must be distinguished from one another by different lower-level names.
Updates
    – Replicas of a file denote the same logical entity
    – Update to any replica must be reflected on all other replicas.
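
A toy sketch of the naming and update rules above. It assumes every replica is reachable at write time (a real system needs a protocol for replicas that miss an update); `ReplicatedFile` and the host names are hypothetical.

```python
class ReplicatedFile:
    def __init__(self, logical_name, replica_hosts):
        self.logical_name = logical_name
        # Lower-level names distinguish the replicas: (host, local name).
        self.replicas = {(h, logical_name): b"" for h in replica_hosts}
        self.down = set()   # hosts currently unavailable for reads

    def write(self, data):
        # Assumption: all replicas are reachable, so the update reaches each one.
        for key in self.replicas:
            self.replicas[key] = data

    def read(self):
        # Callers use only the logical name; any live replica will do.
        for (host, _name), data in self.replicas.items():
            if host not in self.down:
                return data
        raise OSError("no replica available")

f = ReplicatedFile("/doc", ["m1", "m2"])
f.write(b"x")
f.down.add("m1")    # one machine fails
print(f.read())     # b'x', served by the surviving replica
```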

Example Distributed File Systems
NFS – Sun’s Network File System (ver. 3)
    – Tanenbaum & van Steen, Chapter 11
NFS – Sun’s Network File System (ver. 4)
    – Tanenbaum & van Steen, Chapter 11
AFS – the Andrew File System
    – See Silberschatz §17.6

NFS
Sun Network File System (NFS) has become the de facto standard for distributed UNIX file access.
NFS runs over LAN
    – even WAN (slowly)
Any system may be both a client and server
Basic idea:
    – Remote directory is mounted onto local directory
    – Remote directory may contain mounted directories within

Mounting Remote Directories (NFS)

Nested Mounting (NFS)

NFS Implementation

NFS Operations
Lookup
    – Fundamental NFS operation
    – Takes a directory file handle and a name, returns a file handle; a pathname is resolved one component at a time
File Handle
    – Unique identifier of file within server
    – Persistent; never reused
    – Storable, but opaque to client
        – Up to 64 bytes in NFS v3; up to 128 bytes in NFS v4
Most other operations take file handle as argument
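
A sketch of how a client might turn a pathname into a file handle with repeated lookups. This is a toy model, not the NFS wire protocol: `NfsLikeServer`, its 8-byte counter-based handles, and the `resolve` helper are assumptions made for illustration.

```python
import itertools

class NfsLikeServer:
    def __init__(self):
        self._ids = itertools.count(1)
        self.root = self._new_handle()   # handle of the exported root directory
        self.dirs = {self.root: {}}      # directory handle -> {name: handle}
        self.data = {}                   # file handle -> contents
        self.lookups = 0

    def _new_handle(self):
        # Handles are opaque to clients; here they are just counter bytes.
        return next(self._ids).to_bytes(8, "big")

    def add(self, dir_handle, name, content=None):
        handle = self._new_handle()
        self.dirs[dir_handle][name] = handle
        if content is None:
            self.dirs[handle] = {}       # new directory
        else:
            self.data[handle] = content  # new regular file
        return handle

    def lookup(self, dir_handle, name):
        self.lookups += 1
        return self.dirs[dir_handle][name]

    def read(self, handle, offset, count):
        return self.data[handle][offset:offset + count]

def resolve(server, path):
    """Client-side path walk: one lookup per path component."""
    handle = server.root
    for component in path.strip("/").split("/"):
        handle = server.lookup(handle, component)
    return handle

srv = NfsLikeServer()
users = srv.add(srv.root, "users")
steen = srv.add(users, "steen")
srv.add(steen, "mbox", b"mail")

fh = resolve(srv, "/users/steen/mbox")
print(srv.read(fh, 0, 4), srv.lookups)   # b'mail' 3
```

Once the client holds `fh`, later reads need no further lookups, which matches the remark that most other operations take a file handle as argument.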

Other NFS Operations (version 3)
read, write
link, symlink
mknod, mkdir
rename, rmdir
readdir, readlink
getattr, setattr
create, remove
Conspicuously absent
    – open, close

NFS v3: A Stateless Service
Server retains no knowledge of client
Server crashes invisible to client
All hard work done on client side
Every operation provides file handle
Server caching
    – Performance only
    – Based on recent usage
Client caching
    – Client checks validity of cached files
    – Client responsible for writing out caches
…

NFS v3: A Stateless Service (continued)
…
No locking! No synchronization!
Unix file semantics not guaranteed
    – E.g., read after write
Session semantics not even guaranteed
    – E.g., open after close

NFS v3: A Stateless Service (continued)
Solution: global lock manager
    – Separate from NFS
Typical locking operations
    – Lock – acquire lock (non-blocking)
    – Lockt – test a lock
    – Locku – unlock a lock
    – Renew – renew lease on a lock
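
The four operations can be sketched as a lease-based lock table. This is a single-process illustration, not the real lock manager protocol: the 30-second lease, the explicit `now` argument (used instead of a real clock so the example is deterministic), and the `LockManager` class are all assumptions.

```python
class LockManager:
    def __init__(self, lease_seconds=30):
        self.lease = lease_seconds
        self.locks = {}   # file handle -> (owner, expiry time)

    def _holder(self, fh, now):
        entry = self.locks.get(fh)
        if entry and entry[1] > now:
            return entry[0]
        return None       # unlocked, or the lease has expired

    def lock(self, fh, owner, now):
        """Non-blocking acquire: returns False instead of waiting."""
        if self._holder(fh, now) not in (None, owner):
            return False
        self.locks[fh] = (owner, now + self.lease)
        return True

    def lockt(self, fh, now):
        """Test a lock: returns the current holder, or None."""
        return self._holder(fh, now)

    def locku(self, fh, owner, now):
        if self._holder(fh, now) == owner:
            del self.locks[fh]

    def renew(self, fh, owner, now):
        if self._holder(fh, now) != owner:
            return False
        self.locks[fh] = (owner, now + self.lease)
        return True

lm = LockManager(lease_seconds=30)
got_a = lm.lock("fh1", "A", now=0)       # True
got_b = lm.lock("fh1", "B", now=10)      # False: A holds the lock
renewed = lm.renew("fh1", "A", now=20)   # lease now runs until t=50
late_b = lm.lock("fh1", "B", now=51)     # True: A's lease has expired
```

The lease means a crashed client cannot hold a lock forever; the lock simply times out unless it is renewed.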

NFS Implementation
Remote procedure calls for all operations
    – Implemented in Sun ONC
    – XDR is interface definition language
Network communication is client-initiated
    – RPC based on UDP (unreliable protocol)
    – Response to remote procedure call is de facto acknowledgement
Lost requests are simply re-transmitted
    – As many times as necessary to get a response!
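
A sketch of the retransmission rule, with the network simulated in-process. `LossyServer` deliberately drops its first replies; `call_with_retry` and the `max_tries` cap are assumptions (the slide's client retries indefinitely, but a bounded loop keeps the example terminating).

```python
def call_with_retry(send, request, max_tries=5):
    """Retransmit until a reply arrives; the reply doubles as the ack."""
    for attempt in range(1, max_tries + 1):
        reply = send(request)   # returns None when no reply arrived
        if reply is not None:
            return reply, attempt
    raise TimeoutError("server not responding")

class LossyServer:
    """Executes every request but loses the first `drops` replies."""

    def __init__(self, data, drops):
        self.data = data
        self.drops = drops
        self.requests_seen = 0

    def handle(self, request):
        _op, offset, count = request
        self.requests_seen += 1
        result = self.data[offset:offset + count]   # the request is executed...
        if self.drops > 0:
            self.drops -= 1
            return None                             # ...but the reply is lost
        return result

srv = LossyServer(b"0123456789", drops=2)
reply, tries = call_with_retry(srv.handle, ("read", 2, 3))
print(reply, tries, srv.requests_seen)   # b'234' 3 3
```

The server executed the read three times, yet the client sees one correct answer. That works because a read at an explicit offset gives the same result however often it is repeated.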

NFS – Caching
On client open(), client asks server if its cached attribute blocks are up to date.
Once a file is open, different client processes can write it and get inconsistent data.
Modified data is flushed back to the server every 30 seconds.

NFS Failure Recovery
Server crashes are transparent to client
    – Each client request contains all information
    – Server can re-fetch from disk if not in its caches
    – Client retransmits request if interrupted by crash (i.e., no response)
Client crashes are transparent to server
    – Server maintains no record of which client(s) have cached files.

Summary: NFS
That was version 3 of NFS
    – Stateless file system
    – High performance, simple protocol
    – Based on UDP
Everything has changed in NFS version 4
    – First published in 2000
    – Clarifications published in 2003
    – Almost complete rewrite of NFS

NFS Version 4
Stateful file service
Based on TCP – reliable transport protocol
More ways to access server
Compound requests
    – I.e., multiple operations sent in a single RPC request
More emphasis on security
Mount protocol integrated with rest of NFS protocol
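
A rough sketch of the compound-request idea: several operations travel together, run in order, and evaluation stops at the first one that fails. The handler functions, the `state` dict holding a current file, and `run_compound` are hypothetical simplifications, not the NFS v4 operation set.

```python
FILES = {"motd": b"hello"}
state = {"current": None}   # the "current file" carried between operations

def op_lookup(name):
    if name not in FILES:
        raise KeyError(name)
    state["current"] = name
    return name

def op_read(offset, count):
    return FILES[state["current"]][offset:offset + count]

HANDLERS = {"lookup": op_lookup, "read": op_read}

def run_compound(ops, handlers):
    """Execute operations in order; stop at the first failure."""
    results = []
    for name, args in ops:
        try:
            results.append((name, "OK", handlers[name](*args)))
        except Exception as exc:
            results.append((name, "ERR", str(exc)))
            break           # later operations are not attempted
    return results

ok = run_compound([("lookup", ("motd",)), ("read", (0, 5))], HANDLERS)
bad = run_compound([("lookup", ("nope",)), ("read", (0, 5))], HANDLERS)
print(ok)         # [('lookup', 'OK', 'motd'), ('read', 'OK', b'hello')]
print(len(bad))   # 1
```

One round trip replaces two in the successful case, which matters most on high-latency links.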

NFS Version 4

NFS Version 4 (continued)
Additional RPC operations
    – Long list for managing files, caches, validating versions, etc.
    – Also security, permissions, etc.
Also
    – open() and close().
    – With a server crash, some information may have to be recovered
See
    – Silberschatz, p. 653

Questions?

Andrew File System (AFS)
Completely different kind of file system
Developed at CMU to support all student computing.
Consists of workstation clients and dedicated file server machines.

Andrew File System (AFS)
Stateful
Single name space
    – A file has the same name everywhere in the world.
Lots of local file caching
    – On workstation disks
    – For long periods of time
    – Originally whole files, now 64K file chunks.
Good for distant operation because of local disk caching

AFS
Need for scaling led to reduction of client-server message traffic.
    – Once a file is cached, all operations are performed locally.
    – On close, if the file is modified, it is replaced on the server.
The client assumes that its cache is up to date!
    – Unless it receives a callback message from the server saying otherwise.
Server knows about all cached copies of a file
…

AFS
On file open()
    – If the client has received a callback for the file, it must fetch a new copy
    – Otherwise it uses its locally cached copy.
Server crashes
    – Transparent to client if file is locally cached
    – Server must contact clients to find state of files
See Silberschatz §17.6
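
The callback mechanism can be sketched with two in-process objects. This is a whole-file toy model under stated assumptions: `AfsServer`, `AfsClient`, and `break_callback` are illustrative names, and direct method calls stand in for network messages.

```python
class AfsServer:
    def __init__(self):
        self.files = {}       # name -> data
        self.callbacks = {}   # name -> set of clients holding a cached copy

    def fetch(self, client, name):
        self.callbacks.setdefault(name, set()).add(client)   # remember the copy
        return self.files[name]

    def store(self, writer, name, data):
        self.files[name] = data
        for client in self.callbacks.get(name, set()) - {writer}:
            client.break_callback(name)                      # notify other holders
        self.callbacks[name] = {writer}

class AfsClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}       # name -> data (whole file)
        self.valid = set()    # cached names with no callback received
        self.fetches = 0

    def break_callback(self, name):
        self.valid.discard(name)

    def open(self, name):
        if name not in self.valid:          # callback received, or never cached
            self.cache[name] = self.server.fetch(self, name)
            self.valid.add(name)
            self.fetches += 1
        return self.cache[name]

    def close(self, name, new_data=None):
        if new_data is not None:            # file was modified: replace on server
            self.cache[name] = new_data
            self.valid.add(name)
            self.server.store(self, name, new_data)

srv = AfsServer()
srv.files["f"] = b"v1"
a, b = AfsClient(srv), AfsClient(srv)
a.open("f")
a.open("f")                # second open is served from a's local cache
b.open("f")
b.close("f", b"v2")        # server sends a callback to a
print(a.open("f"), a.fetches)   # b'v2' 2
```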

Distributed File Systems: Summary
Performance is always an issue
    – Tradeoff between performance and the semantics of file operations (especially for shared files).
Caching of file blocks is crucial in any file system, distributed or otherwise.
    – As memories get larger, most read requests can be serviced out of file buffer cache (local memory).
    – Maintaining coherency of those caches is a crucial design issue.
Current research is addressing disconnected file operation for mobile computers.

Reading Assignment
Silberschatz, Chapter 17
or
Tanenbaum, Modern Operating Systems
    – §8.3 and §
or
Tanenbaum & van Steen, Chapter 11

Questions?