Distributed File Systems


1 Distributed File Systems
王晓阳 Most materials from: Coulouris, Dollimore, Kindberg and Blair “Distributed Systems: Concepts and Design” Edition 5, © Addison-Wesley 2012

2 Outline
Introduction
Flat file system
Three examples
  NFS
  AFS
  GFS
Summary

3 Shared Storage System: Logical Description
Data characteristics
Storage requirements
Storage system
Logical access methods
Access pattern

4 Overview/Review: Storage systems and their properties

                             Sharing  Persistence  Distributed     Consistency  Example
                                                   cache/replicas  maintenance
Main memory                  ✗        ✗            ✗               1            RAM
File system                  ✗        ✓            ✗               1            UNIX file system
Distributed file system      ✓        ✓            ✓               ✓            Sun NFS
Web                          ✓        ✓            ✓               ✗            Web server
Distributed shared memory    ✓        ✗            ✓               ✓            Ivy (DSM, Ch. 18)
Remote objects (RMI/ORB)     ✓        ✗            ✗               1            CORBA
Persistent object store      ✓        ✓            ✗               1            CORBA Persistent Object Service
Peer-to-peer storage system  ✓        ✓            ✓               2            OceanStore (Ch. 10)

Types of consistency: 1: strict one-copy. ✓: slightly weaker guarantees. 2: considerably weaker guarantees.

5 Logical Data Representation
Stream of bytes (8 bits each)
Metadata (or attributes) such as “owner”, “time stamp”, “access control list”, “file type”, “file length”, etc.

6 Common Logical Data Access
filedes = open(name, mode)
  Opens an existing file with the given name.
filedes = creat(name, mode)
  Creates a new file with the given name.
  Both operations deliver a file descriptor referencing the open file. The mode is read, write or both.
status = close(filedes)
  Closes the open file filedes.
count = read(filedes, buffer, n)
  Transfers n bytes from the file referenced by filedes to buffer.
count = write(filedes, buffer, n)
  Transfers n bytes to the file referenced by filedes from buffer.
  Both operations deliver the number of bytes actually transferred and advance the read-write pointer.
pos = lseek(filedes, offset, whence)
  Moves the read-write pointer to offset (relative or absolute, depending on whence).
status = unlink(name)
  Removes the file name from the directory structure. If the file has no other names, it is deleted.
status = link(name1, name2)
  Adds a new name (name2) for a file (name1).
status = stat(name, buffer)
  Gets the file attributes for file name into buffer.
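The calls listed on this slide can be tried out through Python's `os` module, which exposes thin wrappers around the same system calls. This is only a minimal sketch: the function name `demo_unix_file_ops` and the file names `notes.txt` and `alias.txt` are made up for illustration, and the example assumes a filesystem that supports hard links.

```python
import os
import tempfile


def demo_unix_file_ops():
    """Exercise open/creat, write, lseek, read, link, stat and unlink."""
    with tempfile.TemporaryDirectory() as workdir:
        name1 = os.path.join(workdir, "notes.txt")   # hypothetical file name
        name2 = os.path.join(workdir, "alias.txt")   # hypothetical second name

        # creat(name, mode) behaves like open() with O_CREAT | O_WRONLY | O_TRUNC.
        fd = os.open(name1, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
        written = os.write(fd, b"hello world")       # count = write(filedes, buffer, n)
        os.close(fd)                                 # status = close(filedes)

        fd = os.open(name1, os.O_RDONLY)             # filedes = open(name, mode)
        os.lseek(fd, 6, os.SEEK_SET)                 # absolute move of the read-write pointer
        tail = os.read(fd, 5)                        # count = read(filedes, buffer, n)
        os.close(fd)

        os.link(name1, name2)                        # add a second name for the same file
        links = os.stat(name1).st_nlink              # attributes now show two names
        os.unlink(name1)                             # remove one name; the file survives
        size_via_alias = os.stat(name2).st_size      # still reachable through name2
        os.unlink(name2)                             # last name removed, file deleted
        return written, tail, links, size_via_alias
```

The return value collects the byte count reported by `write`, the bytes read after the seek, the link count while both names exist, and the size seen through the remaining name after the first `unlink`.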

7 Storage Requirements
Transparency
Concurrent updates
File replication
Hardware & software heterogeneity
Fault tolerance
Consistency
Security
Efficiency

8 Transparency
Access transparency
  Clients are unaware of the distribution of files
Location transparency
  Uniform name space
Mobility transparency
  No effect on clients when files are moved
Performance transparency
  Continues to perform satisfactorily under varying load
Scaling transparency
  No effect on clients when servers are added

9 Common Software Structure for File Systems File system modules

10 Outline
Introduction
Flat file system
Three examples
  NFS
  AFS
  GFS

11 Flat file system service architecture

12 (Stateless) Flat File Access

13 Directory Service Operations

14 Characteristics
Repeatable operations
Stateless servers
Hierarchic file system
  Hierarchical names
File groups
  Collection of files
  Group ID => (IP address + Date)
  “filesystem”
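To make "repeatable operations" and "stateless servers" concrete, here is a minimal in-memory sketch. The class `FlatFileService` and its method names are illustrative assumptions loosely modeled on a flat file service interface; they are not taken from the slides. The point is that each read or write names the file and the position explicitly, so the server keeps no per-client read-write pointer and a retransmitted request has the same effect as the original.

```python
class FlatFileService:
    """Toy flat file service: every request carries all the state it needs."""

    def __init__(self):
        self._files = {}     # file_id -> bytearray holding the file contents
        self._next_id = 0

    def create(self):
        # Note: create is NOT repeatable; each call yields a fresh identifier.
        file_id = self._next_id
        self._next_id += 1
        self._files[file_id] = bytearray()
        return file_id

    def read(self, file_id, position, n):
        # The position is supplied by the caller, so no open-file state is kept.
        return bytes(self._files[file_id][position:position + n])

    def write(self, file_id, position, data):
        content = self._files[file_id]
        if len(content) < position:
            # Pad with zero bytes when writing past the current end of file.
            content.extend(b"\x00" * (position - len(content)))
        # Overwriting the same range with the same data is idempotent.
        content[position:position + len(data)] = data
```

Because the server holds no session state, a client that times out can simply resend `read` or `write`, and a restarted server can carry on without any recovery of client sessions.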

15 Outline
Introduction
Flat file system
Three examples
  NFS
  AFS
  GFS

16 Sun Network File System (NFS)
Makes remote files appear local to the client

17 Virtual File System
Adds an abstraction to all files, local or remote
File handle = Filesystem identifier + i-node number of file + i-node generation number
Filesystem => Unix mountable filesystem
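The three-part file handle can be sketched as a small immutable record. The names below (`FileHandle`, `ToyServer`, `StaleFileHandle`) are hypothetical, and the server is a toy, not NFS itself. The sketch shows one reason a generation number is useful: when an i-node number is reused for a new file, a handle issued for the old file no longer matches and can be rejected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileHandle:
    filesystem_id: int      # identifies the mountable filesystem
    inode_number: int       # i-node number of the file
    inode_generation: int   # incremented each time the i-node number is reused


class StaleFileHandle(Exception):
    """Raised when a handle refers to a file that no longer exists."""


class ToyServer:
    def __init__(self, filesystem_id):
        self.filesystem_id = filesystem_id
        self._generation = {}   # inode number -> latest generation issued
        self._data = {}         # inode number -> contents of the live file

    def create(self, inode_number, data):
        generation = self._generation.get(inode_number, 0) + 1
        self._generation[inode_number] = generation
        self._data[inode_number] = data
        return FileHandle(self.filesystem_id, inode_number, generation)

    def remove(self, handle):
        self._check(handle)
        del self._data[handle.inode_number]

    def read(self, handle):
        self._check(handle)
        return self._data[handle.inode_number]

    def _check(self, handle):
        # A handle is valid only if all three components still match a live file.
        if (handle.filesystem_id != self.filesystem_id
                or handle.inode_number not in self._data
                or self._generation[handle.inode_number] != handle.inode_generation):
            raise StaleFileHandle(handle)
```

In this sketch, a handle kept by a client across a remove-and-recreate of the same i-node number fails the check instead of silently reading the new file.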

18

19

20 Figure 12.10 Local and remote file systems accessible on an NFS client
Note: The file system mounted at /usr/students in the client is actually the sub-tree located at /export/people in Server 1; the file system mounted at /usr/staff in the client is actually the sub-tree located at /nfs/users in Server 2.
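The mapping described in the note can be expressed as a small mount table lookup. This is a sketch of the idea only, not of how an NFS client is implemented; the function name `resolve` and the example file path are illustrative assumptions, while the mount points and exported directories come from the note above.

```python
# Client mount point -> (server, exported directory), as given in the figure note.
MOUNT_TABLE = {
    "/usr/students": ("Server 1", "/export/people"),
    "/usr/staff": ("Server 2", "/nfs/users"),
}


def resolve(path, mount_table=MOUNT_TABLE):
    """Return (server, remote_path), or ("local", path) if no mount covers path."""
    best = None
    for mount_point in mount_table:
        # Match whole path components only, so /usr/studentsx is not a match.
        if path == mount_point or path.startswith(mount_point + "/"):
            if best is None or len(mount_point) > len(best):
                best = mount_point          # prefer the longest matching mount point
    if best is None:
        return ("local", path)
    server, exported = mount_table[best]
    return (server, exported + path[len(best):])
```

For example, a hypothetical client path `/usr/students/jon/notes.txt` resolves to `/export/people/jon/notes.txt` on Server 1, while a path outside both mount points stays local.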

21 Caching
Server caching
  Memory buffer
  Read ahead/delayed write/sync
  Write through/write/commit
Client caching
  Read validation check
  Polling for update status
  Write => delayed/sync
  Consistency is weak
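The client-side "read validation check" can be sketched with the timestamp condition given in the Coulouris et al. textbook: a cached entry is treated as valid at time T if (T - Tc < t) or (Tm_client = Tm_server), where Tc is the time the entry was last validated, t is a freshness interval, and Tm is the file's modification time. The class and callback names below are hypothetical, and the 3-second default interval is only an illustrative value.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    data: bytes
    tm_client: float   # modification time recorded when the data was fetched
    tc: float          # time at which the entry was last validated


class ToyNfsClientCache:
    def __init__(self, get_attr, fetch, freshness_interval=3.0):
        self._get_attr = get_attr     # name -> server modification time
        self._fetch = fetch           # name -> (data, server modification time)
        self._t = freshness_interval  # assumed value, for illustration only
        self._entries = {}

    def read(self, name, now):
        entry = self._entries.get(name)
        if entry is not None:
            if now - entry.tc < self._t:
                return entry.data               # fresh enough: no server contact
            if self._get_attr(name) == entry.tm_client:
                entry.tc = now                  # unchanged on server: revalidate
                return entry.data
        data, tm_server = self._fetch(name)     # missing or out of date: refetch
        self._entries[name] = CacheEntry(data, tm_server, now)
        return data
```

The sketch also shows why the slide calls consistency weak: within the freshness interval a client can return data that another client has already overwritten on the server.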

22 Summary of NFS
Access transparency?
Location transparency?
Mobility transparency?
Scalability?
File replication?
Hardware and software heterogeneity?
Fault tolerance?

23 Outline
Introduction
Flat file system
Three examples
  NFS
  AFS
  GFS

24 Andrew File System
Difference from NFS
  Whole-file serving
  Whole-file caching
Good when:
  Files are small
  Reads are much more frequent than writes
  Sequential access is common
  Most files are read and written by only one user; when a file is shared, usually only one user modifies it
  Files are referenced in bursts

25

26 Figure 12.12 File name space seen by clients of AFS
Instructor’s Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn © Pearson Education 2012

27

28 Cache Consistency Call-back mechanism
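A call-back mechanism can be sketched as follows: the server remembers which clients hold a cached copy of a file (a callback promise), and when one client stores a new version, the server notifies the others so that they refetch on their next open. The class names borrow the AFS component names Vice (server) and Venus (client), but the methods and data structures are simplifying assumptions, not the actual AFS protocol.

```python
class ToyViceServer:
    def __init__(self):
        self._files = {}       # name -> current contents
        self._promises = {}    # name -> set of clients holding a callback promise

    def create(self, name, data):
        self._files[name] = data

    def fetch(self, client, name):
        # Handing out a copy also registers a callback promise for the client.
        self._promises.setdefault(name, set()).add(client)
        return self._files[name]

    def store(self, client, name, data):
        self._files[name] = data
        # Break the promise of every other client that caches this file.
        for other in self._promises.get(name, set()) - {client}:
            other.break_callback(name)
        self._promises[name] = {client}


class ToyVenusClient:
    def __init__(self, server):
        self._server = server
        self._cache = {}       # name -> [data, promise_is_valid]

    def open(self, name):
        entry = self._cache.get(name)
        if entry is None or not entry[1]:
            # No copy, or the promise was broken: fetch the whole file again.
            entry = [self._server.fetch(self, name), True]
            self._cache[name] = entry
        return entry[0]

    def close(self, name, new_data):
        # Whole-file write-back on close, then the server notifies other clients.
        self._cache[name] = [new_data, True]
        self._server.store(self, name, new_data)

    def break_callback(self, name):
        if name in self._cache:
            self._cache[name][1] = False

    def has_valid_promise(self, name):
        return name in self._cache and self._cache[name][1]
```

Compared with polling, the server does the notifying here, so a client with a valid promise can open its cached copy without contacting the server at all.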

29

30 Summary of AFS
Access transparency?
Location transparency?
Mobility transparency?
Scalability?
File replication?
Hardware and software heterogeneity?
Fault tolerance?

31 Outline
Introduction
Flat file system
Three examples
  NFS
  AFS
  GFS

32 What’s the problem, Google?
Large set of cheap machines
Huge files
Sequential reads
Append-only writes (by many machines)

33 Figure 21.9 Overall architecture of GFS

34 GFS operations
Create
Open
Close
Read
Write
Snapshot
Record append

35 GFS Characteristics
Three (or more) replicas of each data chunk
No client caching
Native Unix server caching
Extensive logging

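36 Consistency of “Mutation”?
Participants: Master, Client, Replica (primary), Replica, Replica
1 (request): Client to Master
2 (designate the primary): Master to Client
3 (write to buffer): Client to all three replicas
4 (write): Client to the primary replica
5 (commit write): Primary to the other replicas
6: Replicas report to the primary (success/fail)
7: The primary reports to client (success/fail)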
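The seven steps of a mutation can be walked through in a small simulation. Everything here is a toy under stated assumptions: the classes `Replica` and `Master`, the `fail_on_commit` switch, and the rule that the first replica is the primary are all invented for illustration and do not reflect how GFS picks a primary or handles leases.

```python
class Replica:
    def __init__(self, name, fail_on_commit=False):
        self.name = name
        self.fail_on_commit = fail_on_commit   # hypothetical fault injection switch
        self.buffer = {}                       # data_id -> bytes waiting to be committed
        self.chunk = bytearray()               # committed chunk contents

    def push(self, data_id, data):
        self.buffer[data_id] = data            # step 3: write to buffer

    def commit(self, data_id):
        if self.fail_on_commit or data_id not in self.buffer:
            return False
        self.chunk += self.buffer.pop(data_id)  # step 5: commit the buffered write
        return True


class Master:
    def __init__(self, replicas):
        self._replicas = replicas

    def locate(self):
        # Steps 1-2: answer the client's request and designate the primary.
        # Assumption: the first replica acts as primary.
        return self._replicas[0], self._replicas[1:]


def primary_write(primary, secondaries, data_id):
    ok = primary.commit(data_id)                          # step 4: write at the primary
    reports = [s.commit(data_id) for s in secondaries]    # steps 5-6: commit and report
    return ok and all(reports)


def client_write(master, data_id, data):
    primary, secondaries = master.locate()                # steps 1-2
    for replica in [primary, *secondaries]:
        replica.push(data_id, data)                       # step 3
    return primary_write(primary, secondaries, data_id)   # step 7: success or failure
```

When one replica fails at commit time, the client is told the write failed, yet the other replicas have already applied it. That divergence is one way to read the question mark in the slide title.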

37 Summary of GFS
Access transparency?
Location transparency?
Mobility transparency?
Scalability?
File replication?
Hardware and software heterogeneity?
Fault tolerance?

38 Summary
Distributed file systems
  NFS, AFS, GFS
Different characteristics call for different strategies
  Data characteristics
  Access characteristics
Transparency is important
Performance tradeoffs
Other kinds of data/access characteristics
  E.g.: Multimedia server
Not discussed
  Name service
  Security
  Latency vs throughput

