Distributed File Systems

Slides:



Advertisements
Similar presentations
From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Edition 5, © Addison-Wesley 2012 Slides for Chapter 12: Distributed.
Advertisements

Slides for Chapter 8: Distributed File Systems
Lecture 6 – Google File System (GFS) CSE 490h – Introduction to Distributed Computing, Winter 2008 Except as otherwise noted, the content of this presentation.
The Google File System. Why? Google has lots of data –Cannot fit in traditional file system –Spans hundreds (thousands) of servers connected to (tens.
6/24/2015B.RamamurthyPage 1 File System B. Ramamurthy.
Slides for Chapter 1 Characterization of Distributed Systems From Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edition 3,
Northwestern University 2007 Winter – EECS 443 Advanced Operating Systems The Google File System S. Ghemawat, H. Gobioff and S-T. Leung, The Google File.
Case Study - GFS.
Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google∗
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Distributed File Systems Steve Ko Computer Sciences and Engineering University at Buffalo.
1 The Google File System Reporter: You-Wei Zhang.
CSC 456 Operating Systems Seminar Presentation (11/13/2012) Leon Weingard, Liang Xin The Google File System.
Distributed system Distributed File System Nguyen Huu Tuong Vinh Huynh Thi Thu Thuy Dang Trang Tri.
What is a Distributed File System?? Allows transparent access to remote files over a network. Examples: Network File System (NFS) by Sun Microsystems.
Chapter 10: File-System Interface Silberschatz, Galvin and Gagne ©2005 Operating System Concepts – 7 th Edition, Jan 1, 2005 Chapter 10: File-System.
Introduction. Readings r Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 m Note: All figures from this book.
Advanced Computer Networks Topic 2: Characterization of Distributed Systems.
Introduction to DFS. Distributed File Systems A file system whose clients, servers and storage devices are dispersed among the machines of a distributed.
1 Chap8 Distributed File Systems  Background knowledge  8.1Introduction –8.1.1 Characteristics of File systems –8.1.2 Distributed file system requirements.
MapReduce and GFS. Introduction r To understand Google’s file system let us look at the sort of processing that needs to be done r We will look at MapReduce.
Presenters: Rezan Amiri Sahar Delroshan
From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Edition 5, © Addison-Wesley 2012 Exercises for Chapter 12: Distributed.
DISTRIBUTED FILE SYSTEMS Pages - all 1. Topics  Introduction  File Service Architecture  DFS: Case Studies  Case Study: Sun NFS  Case Study: The.
From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Edition 5, © Addison-Wesley 2012 Design of Parallel and Distributed.
Lecture 27-1 Computer Science 425 Distributed Systems CS 425 / ECE 428 Fall 2013 Indranil Gupta (Indy) December 3, 2013 Lecture 27 Distributed File Systems.
Information Management NTU Distributed File Systems.
1 Reliable Distributed Systems Stateless and Stateful Client- Server Systems.
Presenter: Seikwon KAIST The Google File System 【 Ghemawat, Gobioff, Leung 】
Eduardo Gutarra Velez. Outline Distributed Filesystems Motivation Google Filesystem Architecture Chunkservers Master Consistency Model File Mutation Garbage.
Google File System Robert Nishihara. What is GFS? Distributed filesystem for large-scale distributed applications.
Distributed File System. Outline Basic Concepts Current project Hadoop Distributed File System Future work Reference.
Distributed Systems: Distributed File Systems Ghada Ahmed, PhD. Assistant Prof., Computer Science Dept. Web:
PERFORMANCE MANAGEMENT IMPROVING PERFORMANCE TECHNIQUES Network management system 1.
Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung
Distributed File Systems (DFS)
Lecture 25: Distributed File Systems
Google Filesystem Some slides taken from Alan Sussman.
Lecture 25: Distributed File Systems
Slides for Chapter 1 Characterization of Distributed Systems
File System B. Ramamurthy B.Ramamurthy 11/27/2018.
Slides for Chapter 8: Distributed File Systems
Distributed File Systems
Distributed File Systems
CSE 451: Operating Systems Winter Module 22 Distributed File Systems
Slides for Chapter 1 Characterization of Distributed Systems
Distributed File Systems
Distributed File Systems
Exercises for Chapter 8: Distributed File Systems
CSE 451: Operating Systems Spring Module 21 Distributed File Systems
Distributed File System
Distributed File Systems
Lecture 25: Distributed File Systems
File-System Interface
CSE 451: Operating Systems Winter Module 22 Distributed File Systems
Slides for Chapter 1 Characterization of Distributed Systems
CSE 451: Operating Systems Distributed File Systems
Slides for Chapter 15: Replication
Slides for Chapter 1 Characterization of Distributed Systems
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
Outline Review of Quiz #1 Distributed File Systems 4/20/2019 COP5611.
THE GOOGLE FILE SYSTEM.
by Mikael Bjerga & Arne Lange
Slides for Chapter 18: Replication
Distributed File Systems
Distributed File Systems
Network management system
Lecture 4: File-System Interface
Distributed File Systems
Presentation transcript:

Distributed File Systems 王晓阳 2013.11.15 Most materials from: Coulouris, Dollimore, Kindberg and Blair “Distributed Systems: Concepts and Design” Edition 5, © Addison-Wesley 2012

Outline Introduction Flat file system Three examples Summary NFS AFS GFS Summary

Logical Access Methods Shared Storage System Logical Description Data Characteristics Storage Requirements Storage System Logical Access Methods Access Pattern

Overview/Review: Storage systems and their properties Sharing Persis- Distributed Consistency Example tence cache/replicas maintenance Main memory 1 RAM File system 1 UNIX file system Distributed file system Sun NFS Web server Web Distributed shared memory Ivy (DSM, Ch. 18) Remote objects (RMI/ORB) 1 CORBA Persistent object store 1 CORBA Persistent Object Service Peer-to-peer storage system 2 OceanStore (Ch. 10) Types of consistency: 1: strict one-copy. : slightly weaker guarantees. 2: considerably weaker guarantees.

Logical Data Representation Stream of bytes (8-bits) Metadata (or attributes) like “owner”, “time stamp”, “access control list”, “file type”, “file length” etc.

Common Logical Data Access filedes = open(name, mode) filedes = creat(name, mode) Opens an existing file with the given name. Creates a new file with the given name. Both operations deliver a file descriptor referencing the open file. The mode is read, write or both. status = close(filedes) Closes the open file filedes. count = read(filedes, buffer, n) count = write(filedes, buffer, n) Transfers n bytes from the file referenced by filedes to buffer. Transfers n bytes to the file referenced by filedes from buffer. Both operations deliver the number of bytes actually transferred and advance the read-write pointer. pos = lseek(filedes, offset, whence) Moves the read-write pointer to offset (relative or absolute, depending on whence). status = unlink(name) Removes the file name from the directory structure. If the file has no other names, it is deleted. status = link(name1, name2) Adds a new name (name2) for a file (name1). status = stat(name, buffer) Gets the file attributes for file name into buffer.

Storage Requirements Transparency Concurrent updates File replication Hardware & software heterogeneity Fault tolerance Consistency Security Efficiency

Transparency Access transparency Location transparency Unaware of distribution of files Location transparency Uniform name space Mobility transparency No effect to clients when files are moved Performance transparency Perform with varied load Scaling transparency No effect to clients when servers are added

Common Software Structure for File Systems File system modules

Outline Introduction Flat file system Three examples NFS AFS GFS

Flat file system service architecture

(Stateless) Flat File Access

Directory Service Operations

Characteristics Repeatable operations Stateless servers Hierarchic file system Hierarchical names File groups Collection of files Group ID => (IP address + Date) “filesystem”

Outline Introduction Flat file system Three examples NFS AFS GFS

Sun Network File System (NFS) Making remote files as if local to the client

Virtual File System Adds an abstraction to all files, local or remote File handle Filesystem identifier + i-node number of file + i-node generation number Filesystem => Unix mountable filesystem

Figure 12.10 Local and remote file systems accessible on an NFS client Note: The file system mounted at /usr/students in the client is actually the sub-tree located at /export/people in Server 1; the file system mounted at /usr/staff in the client is actually the sub-tree located at /nfs/users in Server 2.

Caching Server caching Client caching Memory buffer Read ahead/delayed write/sync Write through/write/commit Client caching Read validation check Polling for update status Write => delayed/sync Consistency is weak

Summary of NFS Access transparency? Location transparency? Mobility transparency? Scalability? File replication? Hardware and software heterogeneity? Fault tolerance?

Outline Introduction Flat file system Three examples NFS AFS GFS

Andrew File System Difference from NFS Good when: Whole-file serving Whole-file caching Good when: Files are small Read much more than write Sequential access is common Most files are read and written by only one user, when shared, only one user who modifies it Files are referenced in bursts

Figure 12.12 File name space seen by clients of AFS Instructor’s Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 © Pearson Education 2012

Cache Consistency Call-back mechanism

Summary of AFS Access transparency? Location transparency? Mobility transparency? Scalability? File replication? Hardware and software heterogeneity? Fault tolerance?

Outline Introduction Flat file system Three examples NFS AFS GFS

What’s the problem, Google? Large set of cheap machines Huge files Sequential reads Append only writes (by many machines)

Figure 21.9 Overall architecture of GFS Instructor’s Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 © Pearson Education 2012

GFS operations Create Open Close Read Write Snapshot Record append

GFS Characteristics Three (or more) replicas of each data chunk No client caching Native Unix server caching Logging extensively

Consistency of “Mutation”? Master 1 (request) Client 2 2 (designate the primary) 4 (write) Replica (primary) 3 3 3 (write to buffer) Replica 5 (commit Write) Replica 6: Replicas report to the primary (success/fail) 7: The primary reports to client (success/fail)

Summary of GFS Access transparency? Location transparency? Mobility transparency? Scalability? File replication? Hardware and software heterogeneity? Fault tolerance?

Summary Distributed file systems NFS, AFS, GFS Different characteristics call for different strategies Data characteristics Access characteristics Transparency is important Performance tradeoffs Other kinds of data/access characteristics E.g.: Multimedia server Not discussed Name service Security Latency vs throughput