Distributed File Systems Questions answered in this lecture: Why are distributed file systems useful? What is difficult about distributed file systems?

Slides:



Advertisements
Similar presentations
DESIGN AND IMPLEMENTATION OF THE SUN NETWORK FILESYSTEM R. Sandberg, D. Goldberg S. Kleinman, D. Walsh, R. Lyon Sun Microsystems.
Advertisements

Serverless Network File Systems. Network File Systems Allow sharing among independent file systems in a transparent manner Mounting a remote directory.
Distributed Storage March 12, Distributed Storage What is Distributed Storage?  Simple answer: Storage that can be shared throughout a network.
1 UNIX Internals – the New Frontiers Distributed File Systems.
CS-550: Distributed File Systems [SiS]1 Resource Management in Distributed Systems: Distributed File Systems.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts Amherst Operating Systems CMPSCI 377 Lecture.
Distributed File Systems
File System Implementation
Coda file system: Disconnected operation By Wallis Chau May 7, 2003.
Computer Science Lecture 18, page 1 CS677: Distributed OS Last Class: Fault tolerance Reliable communication –One-one communication –One-many communication.
Other File Systems: LFS and NFS. 2 Log-Structured File Systems The trend: CPUs are faster, RAM & caches are bigger –So, a lot of reads do not require.
Other File Systems: AFS, Napster. 2 Recap NFS: –Server exposes one or more directories Client accesses them by mounting the directories –Stateless server.
CS 333 Introduction to Operating Systems Class 18 - File System Performance Jonathan Walpole Computer Science Portland State University.
Distributed File System: Design Comparisons II Pei Cao Cisco Systems, Inc.
NFS. The Sun Network File System (NFS) An implementation and a specification of a software system for accessing remote files across LANs. The implementation.
Remote Files. Traditional Memory Interfaces Process Virtual Memory Virtual Memory File Management File Management Physical Memory Physical Memory Storage.
File Systems (2). Readings r Silbershatz et al: 11.8.
Storage System: RAID Questions answered in this lecture: What is RAID? How does one trade-off between: performance, capacity, and reliability? What is.
Frangipani: A Scalable Distributed File System C. A. Thekkath, T. Mann, and E. K. Lee Systems Research Center Digital Equipment Corporation.
DESIGN AND IMPLEMENTATION OF THE SUN NETWORK FILESYSTEM R. Sandberg, D. Goldberg S. Kleinman, D. Walsh, R. Lyon Sun Microsystems.
Distributed File Systems Sarah Diesburg Operating Systems CS 3430.
Lecture 23 The Andrew File System. NFS Architecture client File Server Local FS RPC.
1 Overview Assignment 12: hints  Distributed file systems Assignment 11: solution  File systems.
Sun NFS Distributed File System Presentation by Jeff Graham and David Larsen.
Distributed File Systems Concepts & Overview. Goals and Criteria Goal: present to a user a coherent, efficient, and manageable system for long-term data.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Distributed File Systems Steve Ko Computer Sciences and Engineering University at Buffalo.
CSC 456 Operating Systems Seminar Presentation (11/13/2012) Leon Weingard, Liang Xin The Google File System.
Networked File System CS Introduction to Operating Systems.
Distributed Systems. Interprocess Communication (IPC) Processes are either independent or cooperating – Threads provide a gray area – Cooperating processes.
Distributed File Systems
Distributed File Systems Overview  A file system is an abstract data type – an abstraction of a storage device.  A distributed file system is available.
What is a Distributed File System?? Allows transparent access to remote files over a network. Examples: Network File System (NFS) by Sun Microsystems.
Introduction to DFS. Distributed File Systems A file system whose clients, servers and storage devices are dispersed among the machines of a distributed.
1 Chap8 Distributed File Systems  Background knowledge  8.1Introduction –8.1.1 Characteristics of File systems –8.1.2 Distributed file system requirements.
1 File Systems: Consistency Issues. 2 File Systems: Consistency Issues File systems maintains many data structures  Free list/bit vector  Directories.
Presented By: Samreen Tahir Coda is a network file system and a descendent of the Andrew File System 2. It was designed to be: Highly Highly secure Available.
ITEC 502 컴퓨터 시스템 및 실습 Chapter 11-2: File System Implementation Mi-Jung Choi DPNM Lab. Dept. of CSE, POSTECH.
Computer Science Lecture 19, page 1 CS677: Distributed OS Last Class: Fault tolerance Reliable communication –One-one communication –One-many communication.
Network File System Protocol
Sun Network File System Presentation 3 Group A4 Sean Hudson, Syeda Taib, Manasi Kapadia.
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition File System Implementation.
GLOBAL EDGE SOFTWERE LTD1 R EMOTE F ILE S HARING - Ardhanareesh Aradhyamath.
Distributed File Systems Architecture – 11.1 Processes – 11.2 Communication – 11.3 Naming – 11.4.
Lecture 24 Sun’s Network File System. PA3 In clkinit.c.
EE324 INTRO TO DISTRIBUTED SYSTEMS. Distributed File System  What is a file system?
COT 4600 Operating Systems Fall 2009 Dan C. Marinescu Office: HEC 439 B Office hours: Tu-Th 3:00-4:00 PM.
Distributed File Systems Group A5 Amit Sharma Dhaval Sanghvi Ali Abbas.
© Oxford University Press 2011 DISTRIBUTED COMPUTING Sunita Mahajan Sunita Mahajan, Principal, Institute of Computer Science, MET League of Colleges, Mumbai.
Chapter Five Distributed file systems. 2 Contents Distributed file system design Distributed file system implementation Trends in distributed file systems.
Distributed Systems: Distributed File Systems Ghada Ahmed, PhD. Assistant Prof., Computer Science Dept. Web:
DISTRIBUTED FILE SYSTEM- ENHANCEMENT AND FURTHER DEVELOPMENT BY:- PALLAWI(10BIT0033)
1 Reliable Distributed Systems Stateless and Stateful Client- Server Systems Based on K. Birman’s of Cornell, Dusseu’s of Wisconsin.
Jonathan Walpole Computer Science Portland State University
Lecture 22 Sun’s Network File System
Distributed File Systems
File System Implementation
Journaling File Systems
NFS and AFS Adapted from slides by Ed Lazowska, Hank Levy, Andrea and Remzi Arpaci-Dussea, Michael Swift.
Distributed File Systems
Distributed File Systems
Distributed File Systems
Distributed File System
DESIGN AND IMPLEMENTATION OF THE SUN NETWORK FILESYSTEM
Distributed File Systems
Chapter 15: File System Internals
Outline Review of Quiz #1 Distributed File Systems 4/20/2019 COP5611.
Distributed File Systems
Distributed File Systems
Distributed File Systems
Presentation transcript:

Distributed File Systems Questions answered in this lecture: Why are distributed file systems useful? What is difficult about distributed file systems? How does NFS work? UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 537 Introduction to Operating Systems Andrea C. Arpaci-Dusseau Remzi H. Arpaci-Dusseau

What is a distributed file system? Client Server Network Examples: NFS, AFS

Motivation Why are distributed file systems useful? Access from multiple clients –Same user on different machines can access same files Simplifies sharing –Different users on different machines can read/write to same files Simplifies administration –One shared server to maintain (and backup) Improve reliability –Add RAID storage to server

Challenges Transparent access User sees single, global file system regardless of location Scalable performance Performance does not degrade as more clients are added Fault Tolerance Client and server identify and respond appropriately when other crashes Consistency See same directory and file contents on different clients at same time Security Secure communication and user authentication Tension across these goals Example: Caching helps performance, but hurts consistency

Case Study: NFS Sun’s Network File System Introduced in 1980s, multiple versions (v2, v3, v4) Key idea #1: Stateless server Server not required to remember anything (in memory) –Which clients are connected, which files are open,... Implication: All client requests have enough info to complete op –Example: Client specifies offset in file to write to One Advantage: Server state does not grow with more clients Key idea #2: Idempotent server operations Operation can be repeated with same result (no side effects) Example: a=b+1; Not a=a+1; Helps which Challenge??

NFS Overview Remote Procedure Calls (RPC) for communication between client and server Client Implementation Provides transparent access to NFS file system –UNIX contains Virtual File system layer (VFS) –Vnode: interface for procedures on an individual file Translates vnode operations to NFS RPCs Server Implementation Stateless: Must not have anything only in memory Implication: All modified data written to stable storage before return control to client –Servers often add NVRAM to improve performance

Basic NFS Protocol Operations lookup(dirfh, name) returns (fh, attributes) –Use mount protocol for root directory create(dirfh, name, attr) returns (newfs, attr) remove(dirfs, name) returns (status) read(fh, offset, count) returns (attr, data) write(fh, offset, count, data) returns attr gettattr(fh) returns attr What is missing??

Mapping UNIX System Calls to NFS Operations Unix system call: fd = open(“/dir/foo”) Traverse pathname to get filehandle for foo –dirfh = lookup(rootdirfh, “dir”); –fh = lookup(dirfh, “foo”); Record mapping from fd file descriptor to fh NFS filehandle Set initial file offset to 0 for fd Return fd file descriptor Unix system call: read(fd,buffer,bytes) Get current file offset for fd Map fd to fh NFS filehandle Call data = read(fh, offset, bytes) and copy data into buffer Increment file offset by bytes Unix system call: close(fd) Free resources assocatiated with fd

Client-side Caching Caching needed to improve performance Reads: Check local cache before going to server Writes: Only periodically write-back data to server Avoid contacting server –Avoid slow communication over network –Server becomes scalability bottleneck with more clients Two client caches data blocks attributes (metadata)

Cache Consistency Problem: Consistency across multiple copies (server and multiple clients) How to keep data consistent between client and server? –If file is changed on server, will client see update? –Determining factor: Read policy on clients How to keep data consistent across clients? –If write file on client A and read on client B, will B see update? –Determining factor: Write and read policy on clients

NFS Consistency: Reads Reads: How does client keep current with server state? Attribute cache: Used to determine when file changes –File open: Client checks server to see if attributes have changed If haven’t checked in past T seconds (configurable, T=3) –Discard entries every N seconds (configurable, N=60) Data cache –Discard all blocks of file if attributes show file has been modified Eg: Client cache has file A’s attributes and blocks 1, 2, 3 Client opens A: Client reads block 1 Client waits 70 seconds Client reads block 2 Block 3 is changed on server Client reads block 3 Client reads block 4 Client waits 70 seconds Client reads block 1

NFS Consistency: Writes Writes: How does client update server? Files –Write-back from client cache to server every 30 seconds –Also, Flush on close() Directories –Synchronously write to server Example: Client X and Y have file A (blocks 1,2,3) cached Clients X and Y open file A Client X writes to blocks 1 and 2 Client Y reads block 1 30 seconds later... Client Y reads block 2 40 seconds later... Client Y reads block 1

Conclusions Distributed file systems Important for data sharing Challenges: Fault tolerance, scalable performance, and consistency NFS: Popular distributed file system Key features: –Stateless server, idempotent operations: Simplifies fault tolerance –Crashed server appears as slow server to clients Client caches needed for scalable performance –Rules for invalidating cache entries and flushing data to server are not straight-forward –Data consistency very hard to reason about