Distributed File Systems: RPC, NFS, and AFS

Announcements
- Homework 6 available later tonight; due next Tuesday, December 2nd
- See me after class to pick up your prelim
Upcoming agenda:
- No class on Thursday: Happy Thanksgiving!
- Next week is the last week of classes: December 2nd and 4th
- Final: Thursday, December 18th at 2pm, Room 131 Warren Hall; length is 2 hours

Goals for Today
- Distributed file systems (DFS)
- Network file system (NFS)
- Remote procedure calls (RPC)
- Andrew file system (AFS)

Distributed File Systems (DFS)

Distributed File Systems
Goal: view a distributed system as a file system.
- Storage is distributed
- The web tries to make the world a collection of hyperlinked documents
Issues not common to ordinary file systems:
- Naming transparency
- Load balancing
- Scalability
- Location and network transparency
- Fault tolerance
We will look at some of these today.

Transfer Model
Upload/download model:
- The client downloads the file, works on it, and writes it back to the server
- Simple, with good performance
Remote access model:
- The file stays only on the server; the client sends commands to get work done

Naming Transparency
Naming is a mapping from logical to physical objects.
- Ideally the client interface should be transparent: it should not distinguish between remote and local files
- Naming like /machine/path, or mounting a remote FS into the local hierarchy, is not transparent
A transparent DFS hides the location of files in the system. There are two forms of transparency:
- Location transparency: the path gives no hint of the file's location
  - /server1/dir1/dir2/x tells us that x is on server1, but not where server1 is
- Location independence: files can be moved without changing their names
  - Separates the naming hierarchy from the storage-device hierarchy

File Sharing Semantics
Sequential consistency: reads see all previous writes.
- A single ordering of all system calls is seen by all processors
- Maintained trivially in single-processor systems
- Can be achieved in a DFS with one file server and no caching

Caching
Keep repeatedly accessed blocks in a cache; this improves the performance of further accesses.
How it works:
- If a needed block is not in the cache, it is fetched and cached
- Accesses are performed on the local copy
- One master copy of each file lives on the server; other copies are distributed across the DFS
- Cache consistency problem: how to keep a cached copy consistent with the master copy
Where to cache?
- Disk: more reliable, and data is present locally on recovery
- Memory: supports diskless workstations and gives quicker data access; servers maintain their caches in memory

File Sharing Semantics
Other approaches:
- Write-through caching: immediately propagate changes to cached files back to the server
  - Reliable, but poor performance
- Delayed write: writes are not propagated immediately, typically deferred until file close
  - Session semantics (AFS): write the file back on close
  - Alternative (NFS): scan the cache periodically and flush modified blocks
  - Better performance, but poorer reliability
File locking:
- The upload/download model locks a downloaded file
- Other processes wait for the file lock to be released

Network File System (NFS)

Network File System (NFS)
Developed by Sun Microsystems in 1984 to join the file systems on multiple computers into one logical whole. Still commonly used with UNIX systems today.
Assumptions:
- An arbitrary collection of users may share a file system
- Clients and servers might be on different LANs
- Machines can be clients and servers at the same time
Architecture:
- A server exports one or more of its directories to remote clients
- Clients access exported directories by mounting them
- The contents are then accessed as if they were local

Example

NFS Mount Protocol
- The client sends a path name to the server with a request to mount it
  - The client is not required to specify where to mount it
- If the path is legal and exported, the server returns a file handle
  - The handle contains the FS type, the disk, the i-node number of the directory, and security info
- Subsequent accesses from the client use the file handle
- Mounting can happen either at boot or via automount
  - With automount, directories are not mounted during boot; the OS sends a message to servers on the first remote file access
  - Automount is helpful since a remote directory might not be used at all
- Mounting only affects the client's view!
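What might such a handle contain? A minimal sketch in C, assuming illustrative field names and sizes (this is not the actual NFS wire format):

    /* Sketch of an NFS-style file handle; fields follow the slide's
     * description, but names and widths are assumptions. The handle
     * fully identifies the file on the server, so the server needs no
     * per-client state to interpret later requests that carry it. */
    #include <stdint.h>

    typedef struct {
        uint32_t fs_type;     /* which file system type on the server */
        uint32_t disk_id;     /* which disk/partition holds the FS */
        uint32_t inode_num;   /* i-node number of the directory */
        uint8_t  security[8]; /* opaque security/validation info */
    } nfs_fhandle;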

NFS Protocol
- Supports directory and file access via remote procedure calls (RPCs)
- All UNIX system calls are supported except open and close, which are intentionally not supported
- For a read, the client sends a lookup message to the server
  - The server looks up the file and returns a handle
  - Unlike open, lookup does not copy info into internal system tables
- Subsequently, each read carries the file handle, an offset, and a byte count
- Each message is self-contained
  - Pro: the server is stateless, i.e. it keeps no state about open files
  - Con: locking is difficult, and there is no concurrency control
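A sketch of what "self-contained" means for a read request, with illustrative names and sizes: because the handle, offset, and count travel in every message, the server can answer each request with no memory of earlier ones, so a server crash and reboot between two reads is invisible to the client.

    /* Illustrative NFS-style read request; not the real wire format. */
    #include <stdint.h>

    typedef struct {
        uint8_t  fhandle[32]; /* opaque handle returned by lookup */
        uint64_t offset;      /* where in the file to start reading */
        uint32_t count;       /* how many bytes to read */
    } nfs_read_request;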

NFS Implementation
Three main layers:
- System call layer: handles calls like open, read, and close
- Virtual File System (VFS) layer:
  - Maintains a table with one entry (a v-node) for each open file
  - The v-node indicates whether the file is local or remote
  - For remote files, it holds enough info to access them
  - For local files, the FS and i-node are recorded
- NFS service layer: this lowest layer implements the NFS protocol
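A minimal sketch of a v-node in C, with assumed field names, to show how one table entry can describe both cases:

    /* Illustrative v-node; names are assumptions. The VFS layer keeps
     * one of these per open file so the same system calls work whether
     * the file is local or remote. */
    #include <stdint.h>

    typedef struct {
        int is_remote;              /* local or remote file? */
        union {
            struct {                /* local: which FS, which i-node */
                uint32_t fs_id;
                uint32_t inode_num;
            } local;
            struct {                /* remote: enough to reach the server */
                uint32_t server_addr;
                uint8_t  fhandle[32];
            } remote;               /* (the r-node, in NFS terms) */
        } u;
    } vnode;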

NFS Layer Structure

How NFS Works
Mount:
- The sysadmin calls the mount program with the remote dir and the local dir
- The mount program parses out the name of the NFS server
- It contacts the server, asking for a file handle for the remote dir
- If the directory exists and is exported for remote mounting, the server returns a handle
- The client kernel constructs a v-node for the remote dir and asks the NFS client code to construct an r-node for the file handle
Open:
- The kernel realizes that the file is on a remotely mounted directory
- It finds the r-node in the v-node for the directory
- The NFS client code then opens the file, enters an r-node for the file in the VFS, and returns a file descriptor for the remote node

Cache Coherency
Clients cache file attributes and data; if two clients cache the same data, cache coherency is lost.
Solutions:
- Each cache entry has a timer (3 seconds for data, 30 seconds for directories); the entry is discarded when its timer expires
- On open of a cached file, its last-modify time on the server is checked; if the cached copy is stale, it is discarded
- Every 30 seconds a cache timer expires, and all dirty blocks are written back to the server
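A minimal sketch of this timer-plus-mtime validation in C; the names and exact policy details are assumptions for exposition:

    /* Illustrative NFS-style cache validation. Data entries expire
     * after 3 seconds, directory entries after 30, and open() rechecks
     * the server's last-modify time. */
    #include <stdbool.h>
    #include <time.h>

    typedef struct {
        time_t fetched_at;    /* when this entry was cached */
        time_t server_mtime;  /* server's modify time at fetch */
        bool   is_directory;
    } cache_entry;

    /* Usable without asking the server while its timer runs. */
    bool entry_fresh(const cache_entry *e, time_t now) {
        time_t ttl = e->is_directory ? 30 : 3;
        return (now - e->fetched_at) < ttl;
    }

    /* On open: compare against the server's current mtime. */
    bool still_valid(const cache_entry *e, time_t server_mtime_now) {
        return e->server_mtime == server_mtime_now;
    }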

Remote Procedure Call (RPC)

Procedure Call
A more natural way to communicate is the procedure call:
- Every language supports it
- Its semantics are well defined and understood
- It is natural for programmers to use
Basic idea: define the server as a module that exports a set of procedures callable by client programs. To use the server, the client just makes a procedure call, as if it were linked with the server.
[Figure: the client issues a call to the server; the server returns.]

(Remote) Procedure Call
So, we would like to use the procedure call as a model for distributed communication. This raises lots of issues:
- How do we make this invisible to the programmer?
- What are the semantics of parameter passing?
- How is binding done (locating the server)?
- How do we support heterogeneity (OS, architecture, language)?
- Etc.

Remote Procedure Call
The basic model for remote procedure call (RPC) was described by Birrell and Nelson, based on work done at Xerox PARC in the early 1980s. The goal was to make RPC as much like a local procedure call as possible, using compiler/language support.
There are three components on each side:
- a user program (client or server)
- a set of stub procedures
- RPC runtime support

RPC
Basic process for building a server:
- The server program defines the server's interface using an interface definition language (IDL)
  - The IDL specifies the names, parameters, and types for all client-callable server procedures
- A stub compiler reads the IDL and produces two stub procedures for each server procedure: a client-side stub and a server-side stub
- The server writer writes the server and links it with the server-side stubs; the client writer writes her program and links it with the client-side stubs
- The stubs are responsible for managing all details of the remote communication between client and server

RPC Stubs
- The client-side stub is a procedure that looks to the client as if it were a callable server procedure
- The server-side stub looks like a calling client to the server
- The client program thinks it is calling the server; in fact, it is calling the client stub
- The server program thinks it is called by the client; in fact, it is called by the server stub
- The stubs send messages to each other to make the RPC happen
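Here is a minimal sketch of a client-side stub in C for a hypothetical remote procedure int add(int a, int b). All names (send_to_server, recv_from_server, the wire layout) are assumptions for exposition; in practice a stub compiler generates this code from the IDL.

    #include <stdint.h>
    #include <string.h>

    /* Transport helpers assumed to be supplied by the RPC runtime. */
    void   send_to_server(const void *msg, size_t len);
    size_t recv_from_server(void *buf, size_t cap);

    /* Same signature as the remote procedure, so the caller cannot
     * tell this is not a local call. */
    int add(int a, int b) {
        uint8_t  msg[12];
        uint32_t proc_id = 1;                /* which remote procedure */
        int32_t  a32 = a, b32 = b;
        memcpy(msg,     &proc_id, 4);        /* marshal procedure id */
        memcpy(msg + 4, &a32, 4);            /* marshal argument a */
        memcpy(msg + 8, &b32, 4);            /* marshal argument b */
        send_to_server(msg, sizeof msg);     /* runtime ships the packet */

        int32_t result;
        recv_from_server(&result, sizeof result); /* unmarshal the reply */
        return result;
    }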

[Figure: RPC call structure. The client program calls foo(x,y), which actually invokes the client stub; the stub builds a message packet and inserts the parameters; the RPC runtime sends the message to the remote node; the server-side runtime receives the message and calls the server stub; the stub unpacks the parameters and makes the call; the server program's procedure foo(a,b) runs as if called locally.]

[Figure: RPC return structure. The server procedure returns to the server stub; the stub builds a result message with the output arguments; the runtime responds to the original message; the client-side runtime receives the reply and calls the client stub; the stub unpacks the message and returns to the caller; the client continues.]

[Figure: RPC information flow. On machine A, the client (caller) calls into the client stub, which marshals the arguments; the RPC runtime sends the packet across the network to machine B via mailboxes (mbox1, mbox2); the runtime there receives it, and the server stub unmarshals the arguments and calls the server (callee). Return values are marshaled by the server stub and flow back the same way, with the client stub unmarshaling them.]

RPC Binding
Binding is the process of connecting the client and the server.
- The server, when it starts up, exports its interface: it identifies itself to a network name server and tells the local runtime its dispatcher address
- The client, before issuing any calls, imports the server: the RPC runtime looks the server up through the name service and contacts it to set up a connection
- The import and export are explicit calls in the code

RPC Marshalling
Marshalling is the packing of procedure parameters into a message packet.
- The RPC stubs call type-specific procedures to marshall (or unmarshall) each parameter of the call
- On the client side, the client stub marshalls the parameters into the call packet; on the server side, the server stub unmarshalls the parameters and calls the server's procedure
- On return, the server stub marshalls the return parameters into a return packet; the client stub unmarshalls the return parameters and returns to the client
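A sketch of what a pair of type-specific marshalling routines might look like in C; the names and the wire format (big-endian 32-bit integers) are assumptions for exposition:

    #include <stdint.h>

    /* Marshal a 32-bit int into the packet in a machine-independent
     * (big-endian) byte order; returns the new write position. */
    uint8_t *marshal_i32(uint8_t *p, int32_t v) {
        uint32_t u = (uint32_t)v;
        p[0] = (uint8_t)(u >> 24);
        p[1] = (uint8_t)(u >> 16);
        p[2] = (uint8_t)(u >> 8);
        p[3] = (uint8_t)(u);
        return p + 4;
    }

    /* Unmarshal the same representation on the receiving side. */
    const uint8_t *unmarshal_i32(const uint8_t *p, int32_t *v) {
        uint32_t u = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                     ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
        *v = (int32_t)u;
        return p + 4;
    }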

Problems with RPC
Non-atomic failures:
- A distributed system has different failure modes than a single machine; consider the many different types of failures:
  - A user-level bug causes one address space to crash
  - A machine failure or kernel bug causes all processes on that machine to fail
  - Some machine is compromised by a malicious party
- Before RPC: the whole system would crash or die together
- After RPC: one machine can crash or be compromised while the others keep working
- This can easily result in an inconsistent view of the world
  - Did my cached data get written back or not?
  - Did the server do what I requested or not?
- The answer? Distributed transactions / Byzantine commit
Performance:
- Cost of a local procedure call << same-machine RPC << network RPC
- This means programmers must be aware that RPC is not free
- Caching can help, but may make failure handling more complex

Cross-Domain Communication / Location Transparency
How do address spaces communicate with one another?
- Shared memory with semaphores, monitors, etc.
- The file system
- Pipes (one-way communication)
- "Remote" procedure call (two-way communication)
RPCs can be used to communicate between address spaces on different machines or on the same machine:
- Services can be run wherever it is most appropriate
- Access to local and remote services looks the same
Examples of modern RPC systems:
- CORBA (Common Object Request Broker Architecture)
- DCOM (Distributed COM)
- RMI (Java Remote Method Invocation)

Microkernel Operating Systems
Example: split the kernel into application-level servers; the file system looks remote, even though it is on the same machine.
Why split the OS into separate domains?
- Fault isolation: bugs are more isolated (builds a firewall)
- Enforced modularity: allows incremental upgrades of pieces of software (client or server)
- Location transparency: a service can be local or remote
For example, in the X windowing system, each X client can be on a separate machine from the X server; neither has to run on the machine with the frame buffer.
[Figure: a monolithic structure (application over file system, windowing, networking, VM, threads in one kernel) versus a microkernel structure (file system, windows, address spaces, and threads as separate servers communicating via RPC).]

Andrew File System (AFS)

Andrew File System (AFS)
Named after Andrew Carnegie and Andrew Mellon. Transarc Corp. and then IBM took over development of AFS; in 2000 IBM made OpenAFS available as open source.
Features:
- Uniform name space
- Location-independent file sharing
- Client-side caching with cache consistency
- Secure authentication via Kerberos
- Server-side caching in the form of replicas
- High availability through automatic switchover to replicas
- Scalability to span 5000 workstations

AFS Overview
Based on the upload/download model:
- Clients download and cache files
- The server keeps track of the clients that cache each file
- Clients upload files at the end of a session
Whole-file caching is the central idea behind AFS:
- Later amended to block operations
- Simple and effective
AFS servers are stateful:
- They keep track of the clients that have cached files
- They recall files that have been modified

AFS Details
- Has dedicated server machines
- Clients have a partitioned name space: a local name space and a shared name space
- A cluster of dedicated servers (Vice) presents the shared name space
- Clients run the Virtue protocol to communicate with Vice
- Clients and servers are grouped into clusters; clusters are connected through the WAN
Other issues: scalability, client mobility, security, protection, heterogeneity

AFS: Shared Name Space
AFS's storage is arranged in volumes, usually associated with the files of a particular client.
An AFS directory entry maps Vice files/dirs to a 96-bit fid:
- Volume number
- Vnode number: an index into the i-node array of a volume
- Uniquifier: allows reuse of vnode numbers
Fids are location transparent:
- File movements do not invalidate fids
- Location information is kept in a volume-location database
- Volumes are migrated to balance available disk space and utilization
- Volume movement is atomic; the operation is aborted on a server crash
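The 96-bit fid described above is simply three 32-bit fields; a sketch in C (the field names follow the slide, the exact layout is an illustrative assumption):

    #include <stdint.h>

    /* Note what is absent: any server address. The volume number is
     * resolved through the volume-location database, which is what
     * makes fids location transparent. */
    typedef struct {
        uint32_t volume;      /* which volume holds the file */
        uint32_t vnode;       /* index into the volume's i-node array */
        uint32_t uniquifier;  /* distinguishes reuses of a vnode slot */
    } afs_fid;                /* 3 x 32 bits = 96 bits */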

AFS: Operations and Consistency
AFS caches entire files from servers, so a client interacts with servers only during open and close.
- The OS on the client intercepts file system calls and passes them to Venus, a client process that caches files from servers
- Venus contacts Vice only on open and close; it does not contact Vice if the file is already in the cache and has not been invalidated
- Reads and writes bypass Venus
This works due to callbacks:
- The server updates its state to record who is caching each file
- The server notifies clients before allowing another client to modify a file
- Clients lose their callback when someone writes the file
- Venus also caches directories and symbolic links for path translation
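A rough sketch, with assumed names, of the server-side bookkeeping behind callbacks: the server remembers who caches a file and breaks those callbacks before permitting a write, which is why clients can trust their caches in between.

    #include <stdint.h>

    #define MAX_CLIENTS 64

    typedef struct {
        uint32_t client_ids[MAX_CLIENTS]; /* who holds a callback */
        int      n_clients;
    } callback_list;

    /* Assumed runtime helper that notifies one client. */
    void notify_client(uint32_t client_id);

    /* Called before another client may modify the file: every cached
     * copy's callback is broken ("your copy is now stale"). */
    void break_callbacks(callback_list *cb) {
        for (int i = 0; i < cb->n_clients; i++)
            notify_client(cb->client_ids[i]);
        cb->n_clients = 0;  /* no one holds a callback anymore */
    }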

AFS Implementation
- The client cache is a local directory on the UNIX FS
- Venus and the server processes access files directly by UNIX i-node
- Venus has two caches, one for status and one for data
  - It uses LRU to keep them bounded in size

Summary
RPC:
- Call a procedure on a remote machine
- Provides the same interface as a local procedure call
- Automatic packing and unpacking of arguments, without user programming (done in the stubs)
NFS:
- Simple distributed file system protocol; no open/close
- Stateless server
- Has problems with cache consistency and its locking protocol
AFS:
- More complicated distributed file system protocol
- Stateful server
- Session semantics: consistency on close

Prelim II
Prelims are graded: mean 73 (median 76), stddev 13.3, high 98 out of 100. Good job!
Re-grade policy:
- Submit a written re-grade request to Nazrul; the entire prelim will be re-graded (we were generous the first time...)
- If still unhappy, submit another re-grade request; Nazrul will re-grade it herself
- If still unhappy, submit a third re-grade request; I will re-grade it
- The final grade is law

Grade distribution

Question #3
Hardlinks:
- Need a link count in the inode/file header
- Remove decrements the count and removes the file if the count reaches 0
- New syscall: Link(char *src, char *dst)
Softlinks:
- Need the file type indicated in the inode/file header
- Also need the path to the target file
- Open needs to change to perform a recursive lookup
- New syscall: SymLink(char *src, char *dst)
(A runnable POSIX comparison follows below.)
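For comparison, POSIX exposes exactly these two syscalls as link() and symlink(); a short, runnable illustration in C (the file names are made up for the example):

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        /* Hard link: a second directory entry for the same inode;
         * the inode's link count is incremented. */
        if (link("data.txt", "data-hard.txt") != 0)
            perror("link");

        /* Symbolic link: a new file whose contents are the target
         * path; open() follows it with a recursive lookup. */
        if (symlink("data.txt", "data-soft.txt") != 0)
            perror("symlink");

        /* unlink() decrements the link count; the file's data is
         * freed only when the count reaches zero. */
        if (unlink("data-hard.txt") != 0)
            perror("unlink");
        return 0;
    }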

Question #4
Concurrent writers:
- RAID 0, 1+0, 5, and 6
- Not RAID 1 or 4, because they cannot perform independent writes
Concurrent readers (but not concurrent writers):
- RAID 1 and 4
With 2*k-1 disks and wanting concurrent readers:
- Unavailable: RAID 1, which requires 2*k disks
- Undesirable: RAID 4, 5, and 6, which require complex controllers

Happy Thanksgiving!!!