Lecture 24 Sun’s Network File System

PA3: in clkinit.c

Local file systems: vsfs, FFS, LFS. Data integrity and protection: latent-sector errors (LSEs), block corruption, misdirected writes, lost writes.

Communication Abstractions. Raw messages: UDP. Reliable messages: TCP. PL: RPC, making remote calls look like normal local procedure calls. OS: a global FS.

Sun’s Network File System

NFS Architecture: clients talk to the file server over RPC; the server stores files in its local FS.

Primary Goal. Local FS: processes on the same machine access shared files. Network FS: processes on different machines access shared files in the same way. Benefits: sharing, centralized administration, security.

Subgoals. Fast and simple crash recovery: both clients and the file server may crash. Transparent access: can't tell it's over the network; normal UNIX semantics. Reasonable performance.

Overview: Architecture, API, Write Buffering, Cache.

NFS Architecture: clients talk to the file server over RPC; the server stores files in its local FS.

Main Design Decisions. What functions should be exposed via RPC? Think of NFS as more of a protocol than a particular file system. Many companies have implemented NFS: Oracle/Sun, NetApp, EMC, IBM, etc.

Today’s Lecture We’re looking at NFSv2. There is now an NFSv4 with many changes. Why look at an old protocol? To compare and contrast NFS with AFS.

General Strategy: Export FS. The server exports a local FS; the client mounts it and accesses it through its NFS layer (e.g., a read travels from the client's NFS layer over the network to the server's local FS).

Overview: Architecture, API, Write Buffering, Cache.

Strategy 1: wrap regular UNIX system calls using RPC. open() on the client calls open() on the server; open() on the server returns an fd back to the client. read(fd) on the client calls read(fd) on the server; read(fd) on the server returns data back to the client.

Strategy 1 problems: what about crashes? int fd = open("foo", O_RDONLY); read(fd, buf, MAX); … read(fd, buf, MAX); Imagine the server crashes and reboots between the reads: the server-side fd is gone.

Subgoals. Fast and simple crash recovery: both clients and the file server may crash. Transparent access: can't tell it's over the network; normal UNIX semantics. Reasonable performance.

Potential solutions: run a crash-recovery protocol upon reboot (complex), or persist fds on the server's disk (but what if the client crashes instead?).

Subgoals. Fast and simple crash recovery: both clients and the file server may crash. Transparent access: can't tell it's over the network; normal UNIX semantics. Reasonable performance.

Strategy 2: put all info in requests. Use a "stateless" protocol! The server maintains no state about clients (it still keeps other state, of course). This needs an API change. One possibility: pread(char *path, buf, size, offset); pwrite(char *path, buf, size, offset); Specify the path and offset each time, so the server need not remember anything. Pros/cons? Too many path lookups.

Strategy 3: inode requests. inode = open(char *path); pread(inode, buf, size, offset); pwrite(inode, buf, size, offset); This is pretty good! Any correctness problems? What if the file is deleted and the inode is reused?

Strategy 4: file handles. fh = open(char *path); pread(fh, buf, size, offset); pwrite(fh, buf, size, offset); File Handle = <volume ID, inode number, generation number>
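
A file handle packs enough information for the server to find the file without remembering anything about the client. A minimal sketch of such a handle is below; the struct and field names are illustrative, not the actual NFS wire format.

/* Illustrative sketch of an NFS-style file handle (struct/field names assumed). */
#include <stdint.h>
#include <stdio.h>

struct file_handle {
    uint32_t volume_id;   /* which file system (volume) on the server */
    uint32_t inode_num;   /* which file within that file system */
    uint32_t generation;  /* bumped when the inode is reused, so stale handles can be rejected */
};

int main(void) {
    struct file_handle fh = { .volume_id = 1, .inode_num = 1234, .generation = 7 };
    /* If the file is deleted and inode 1234 is reused, the server bumps the
       generation; an old handle carrying generation 7 no longer matches. */
    printf("fh = <vol %u, inode %u, gen %u>\n", fh.volume_id, fh.inode_num, fh.generation);
    return 0;
}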

NFS Protocol Examples
NFSPROC_GETATTR: expects file handle; returns attributes
NFSPROC_SETATTR: expects file handle, attributes; returns nothing
NFSPROC_LOOKUP: expects directory file handle, name of file/directory to look up; returns file handle
NFSPROC_READ: expects file handle, offset, count; returns data, attributes
NFSPROC_WRITE: expects file handle, offset, count, data; returns attributes

NFSPROC_CREATE: expects directory file handle, name of file, attributes; returns nothing
NFSPROC_REMOVE: expects directory file handle, name of file to be removed; returns nothing
NFSPROC_MKDIR: expects directory file handle, name of directory, attributes; returns file handle
NFSPROC_RMDIR: expects directory file handle, name of directory to be removed; returns nothing
NFSPROC_READDIR: expects directory handle, count of bytes to read, cookie; returns directory entries, cookie (to get more entries)
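
To see how these procedures compose, here is a hedged sketch of a client resolving and reading /foo/bar using only the stateless protocol: LOOKUP one path component at a time, then READ by handle. nfs_lookup() and nfs_read() are hypothetical stand-ins for the real RPC stubs and just return dummy values.

/* Sketch: resolving /foo/bar with LOOKUP, then reading it with READ. */
#include <stdio.h>
#include <string.h>

typedef struct { unsigned vol, ino, gen; } fhandle_t;

/* Pretend RPC: directory handle + name -> file handle. */
static int nfs_lookup(fhandle_t dir, const char *name, fhandle_t *out) {
    (void)name;
    *out = (fhandle_t){ dir.vol, dir.ino + 1, 0 };   /* dummy result */
    return 0;
}

/* Pretend RPC: handle + offset + count -> data. */
static int nfs_read(fhandle_t fh, long offset, size_t count, char *buf) {
    (void)fh; (void)offset;
    memset(buf, 'x', count);                          /* dummy data */
    return (int)count;
}

int main(void) {
    fhandle_t root = {1, 2, 0}, foo, bar;             /* root handle obtained at mount time */
    nfs_lookup(root, "foo", &foo);                    /* NFSPROC_LOOKUP on each path component */
    nfs_lookup(foo, "bar", &bar);
    char buf[4096];
    int n = nfs_read(bar, 0, sizeof buf, buf);        /* NFSPROC_READ: handle + offset, no server state */
    printf("read %d bytes\n", n);
    return 0;
}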

Client Logic. Build the normal UNIX API on the client side on top of the RPC-based API described above. A client open() creates a local fd object; the client tracks the state for file access: the mapping from fd to file handle, and the current file offset.

File Descriptors (example): a local read(5, 1024) uses the fd table entry for fd 5 (fh, offset = 123) and becomes an RPC pread(fh, 123, 1024) against the server's local FS.
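
A minimal sketch of this client-side bookkeeping, assuming a tiny fd table mapping each fd to a file handle plus the current offset; read() turns into an offset-carrying pread RPC and then advances the offset. nfs_pread() is a hypothetical stand-in for the NFSPROC_READ stub.

/* Sketch of client-side fd state: fd -> (file handle, offset). */
#include <stdio.h>
#include <string.h>

typedef struct { unsigned vol, ino, gen; } fhandle_t;
struct open_file { fhandle_t fh; long offset; };
static struct open_file fdtable[64];                  /* the client keeps this; the server keeps nothing */

static int nfs_pread(fhandle_t fh, long off, size_t n, char *buf) {
    (void)fh; (void)off;
    memset(buf, 'x', n);                              /* dummy data */
    return (int)n;
}

static int client_read(int fd, char *buf, size_t n) {
    struct open_file *f = &fdtable[fd];
    int got = nfs_pread(f->fh, f->offset, n, buf);    /* the offset travels in the request */
    if (got > 0) f->offset += got;                    /* only the client advances the offset */
    return got;
}

int main(void) {
    fdtable[5] = (struct open_file){ .fh = {1, 1234, 7}, .offset = 123 };
    char buf[1024];
    printf("read(5) returned %d, new offset %ld\n",
           client_read(5, buf, sizeof buf), fdtable[5].offset);
    return 0;
}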

Reading A File: Client-side And File Server Actions

NFS Server Failure Handling. When a client does not receive a reply within a certain amount of time, it just retries the request.
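
Because the protocol is stateless and (as discussed below) idempotent, the client's failure handling can be as simple as the retry loop sketched here; send_rpc() and wait_reply() are hypothetical placeholders for the transport layer.

/* Sketch of client retry-on-timeout. */
#include <stdio.h>

static void send_rpc(int req)            { (void)req; /* transmit the request, e.g. over UDP */ }
static int  wait_reply(int timeout_secs) { (void)timeout_secs; return 1; /* 1 if a reply arrived in time */ }

static void call_with_retry(int req) {
    for (;;) {
        send_rpc(req);
        if (wait_reply(/* timeout_secs = */ 1))
            return;                      /* got a reply: done */
        /* No reply: server slow, request lost, or server crashed and rebooted.
           The client cannot tell which, so it simply resends the same request. */
        printf("timeout, retrying\n");
    }
}

int main(void) { call_with_retry(42); return 0; }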

Append. fh = open(char *path); pread(fh, buf, size, offset); pwrite(fh, buf, size, offset); append(fh, buf, size); Would append() be a good idea? Problem: our client retries whenever it gets no ACK or return value, so what happens when an append is retried?

TCP Remembers Messages. Sender: [send message] [timeout] [resend message] [recv ack]. Receiver: [recv message] [send ack] [ignore duplicate message] [send ack].

Replica Suppression is Stateful. If the server crashes, it forgets which RPCs have been executed! Solution: design the API so that there is no harm in executing a call more than once. An API call with this property is "idempotent". If f() is idempotent, then f() has the same effect as f(); f(); …; f(). pwrite is idempotent; append is NOT idempotent.

Idempotence. Idempotent: any sort of read; pwrite. Not idempotent: append. What about these: mkdir, creat?
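
The difference is easy to demonstrate with plain POSIX calls: retrying a pwrite leaves the file unchanged, while retrying an append grows it. A small local demonstration of the semantics (this is not NFS code itself):

/* Local demonstration of idempotence: pwrite retried vs. append retried. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("demo.txt", O_RDWR | O_CREAT | O_TRUNC, 0644);

    pwrite(fd, "AAAA", 4, 0);                  /* first attempt */
    pwrite(fd, "AAAA", 4, 0);                  /* "retry": same offset, same data, same result */

    int afd = open("demo.txt", O_WRONLY | O_APPEND);
    write(afd, "BBBB", 4);                     /* first attempt */
    write(afd, "BBBB", 4);                     /* "retry": appends again, so the file grows */

    struct stat st;
    fstat(fd, &st);
    printf("file size = %lld (idempotent pwrite kept 4 bytes; retried append added 8)\n",
           (long long)st.st_size);
    close(afd);
    close(fd);
    return 0;
}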

Subgoals. Fast and simple crash recovery: both clients and the file server may crash. Transparent access: can't tell it's over the network; normal UNIX semantics. Reasonable performance.

Overview: Architecture, Network API, Write Buffering, Cache.

Write Buffers: both the client's NFS layer and the server's local FS may hold writes in an in-memory write buffer before they reach disk.

What if the server crashes? The server's write buffer is lost. Client issues: write A to 0, write B to 1, write C to 2; then write X to 0, write Y to 1, write Z to 2. If some of the second round of writes are still in server memory while others have reached server disk when the crash hits, can we get XBZ on the server? (Yes: X and Z made it to disk, but the buffered Y was lost, leaving the old B.)

Write Buffers on the Server Side. Solutions: don't use a volatile server write buffer (commit writes to disk before replying), or use a persistent write buffer (e.g., battery-backed memory).
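
A sketch of the first option: the server forces each WRITE's data to disk (e.g., with fsync) before sending the reply, so an acknowledged write can never be lost from a volatile buffer. handle_write() and send_reply() are hypothetical server-side helpers.

/* Sketch: server commits data to disk before replying to a WRITE RPC. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

static void send_reply(int ok) { printf("reply: %s\n", ok ? "OK" : "error"); }

static void handle_write(int fd, const char *data, size_t n, off_t off) {
    if (pwrite(fd, data, n, off) != (ssize_t)n) { send_reply(0); return; }
    if (fsync(fd) != 0)                         { send_reply(0); return; }
    send_reply(1);        /* ack only once the data is durable, so a crash cannot lose it */
}

int main(void) {
    int fd = open("export.dat", O_RDWR | O_CREAT, 0644);
    handle_write(fd, "XYZ", 3, 0);
    close(fd);
    return 0;
}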

Cache. We can cache data in different places, e.g.: server memory, client disk, client memory. How do we make sure all versions are in sync?

"Update visibility" problem: the server doesn't have the latest data. A client's NFS cache holds a newer version B while the server's local FS (and other clients' caches) still hold A, until the client flushes.

Update Visibility Solution. A client may buffer a write; how can the server and other clients see it? NFS solution: flush on fd close (not quite like UNIX). What is the performance implication for short-lived files?

"Stale cache" problem: a client doesn't have the latest data. The server and one client already hold B, while another client's NFS cache still holds A and keeps serving reads from it.

Stale Cache Solution. A client may have a cached copy that is obsolete; how can we get the latest? If we weren't trying to be stateless, the server could push out updates. NFS solution: clients recheck whether the cache is current before using it. Cache metadata records when the data was fetched; before it is used, the client sends a GETATTR request to the server, gets the last-modified timestamp, compares it to the cached one, and refetches if necessary.
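
A sketch of that client-side check, assuming the cache remembers the file's modification time at fetch: before serving cached data, issue a (stubbed) GETATTR, compare timestamps, and refetch when the server's copy is newer. getattr_rpc() and fetch_rpc() are hypothetical stubs.

/* Sketch of stale-cache checking on the client. */
#include <stdio.h>
#include <time.h>

struct cache_entry { char data[4096]; time_t mtime_when_fetched; int valid; };

static time_t getattr_rpc(void) { return 1000; }                 /* server's current mtime (dummy) */
static void   fetch_rpc(struct cache_entry *e) {                 /* refetch data + attributes */
    e->mtime_when_fetched = getattr_rpc();
    e->valid = 1;
}

static const char *cached_read(struct cache_entry *e) {
    time_t server_mtime = getattr_rpc();                         /* GETATTR before every use */
    if (!e->valid || server_mtime != e->mtime_when_fetched)
        fetch_rpc(e);                                            /* someone changed the file: refetch */
    return e->data;
}

int main(void) {
    struct cache_entry e = { .valid = 0 };
    cached_read(&e);
    printf("cache mtime = %ld\n", (long)e.mtime_when_fetched);
    return 0;
}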

Measure then Build. NFS developers found that GETATTR (stat) accounted for 90% of server requests. Why? Because clients frequently recheck the cache. Solution: cache the results of GETATTR calls. Why is this a terrible solution by itself? Also make the attribute cache entries expire after a given time (say, 3 seconds). Why is this better than putting expirations on the regular data cache?
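
A sketch of that fix, assuming attributes fetched by GETATTR are kept for a few seconds (3 in this example) before the client asks the server again; getattr_rpc() is again a hypothetical stub.

/* Sketch of an attribute cache with a 3-second expiry. */
#include <stdio.h>
#include <time.h>

#define ATTR_TTL_SECS 3

struct attr_cache { time_t mtime; time_t fetched_at; int valid; };

static time_t getattr_rpc(void) { return 1000; }      /* dummy server mtime */

static time_t cached_getattr(struct attr_cache *c) {
    time_t now = time(NULL);
    if (!c->valid || now - c->fetched_at >= ATTR_TTL_SECS) {
        c->mtime = getattr_rpc();                      /* go to the server only when the entry expired */
        c->fetched_at = now;
        c->valid = 1;
    }
    return c->mtime;                                   /* within the TTL, reuse the cached answer */
}

int main(void) {
    struct attr_cache c = { .valid = 0 };
    printf("mtime = %ld (first call hits the server; repeats within %ds do not)\n",
           (long)cached_getattr(&c), ATTR_TTL_SECS);
    return 0;
}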

Misc: the VFS interface is introduced; it defines the operations that can be performed on a file within a file system.

Summary. Robust APIs are often: stateless (servers don't remember clients' state) and idempotent (doing things twice never hurts). Supporting existing specs is a lot harder than building from scratch! Caching and write buffering are harder in distributed systems, especially with crashes.

Andrew File System. Main goal: scale. GETATTR is problematic for NFS scalability. AFS cache consistency is simple and readily understood. AFS is not stateless. AFS requires a local disk, and it does whole-file caching on that local disk.

AFS V1. open: the client-side code intercepts the open system call, decides whether the file is local or remote, and for remote files contacts a server (identified by the full path string in AFS-1). The server locates the file and sends the whole file to the client; the client stores the whole file on its local disk and returns a file descriptor to user level. read/write: operate on the client-side copy. close: send the entire file and pathname back to the server.
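
A sketch of that whole-file flow on the client, with fetch_whole_file() and store_whole_file() as hypothetical stand-ins for the AFS-1 Fetch/Store RPCs: open copies the whole file into a local cache file and hands back an ordinary local fd; reads and writes are purely local; close ships the whole file back.

/* Sketch of AFSv1-style whole-file caching on the client. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static void fetch_whole_file(const char *path, const char *cache_path) {
    /* Ask the server (by full path in AFS-1) for the entire file; store it at cache_path. */
    printf("fetching %s from server into %s\n", path, cache_path);
    int fd = open(cache_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    write(fd, "file contents from server\n", 26);      /* dummy contents */
    close(fd);
}

static void store_whole_file(const char *path, const char *cache_path) {
    /* Ship the entire (possibly modified) cached copy back to the server. */
    printf("storing %s back to server as %s\n", cache_path, path);
}

int main(void) {
    const char *path = "/afs/example/foo", *cache = "cache_foo";
    fetch_whole_file(path, cache);               /* open: whole file to local disk */
    int fd = open(cache, O_RDWR);                /* user gets a normal local fd */
    char buf[64];
    read(fd, buf, sizeof buf);                   /* reads/writes are purely local */
    close(fd);
    store_whole_file(path, cache);               /* close: whole file back to the server */
    return 0;
}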

Measure, then re-build. Main problems with AFSv1: path-traversal costs are too high, and the client issues too many TestAuth protocol messages.