
1 Distributed Systems: Distributed File Systems
Ghada Ahmed, PhD. Assistant Prof., Computer Science Dept.
Email: ghada@fcih.net
Web: www.fcih.net/ghada/teaching/operating-systems-2
CS 342: OS-2 Fall 2015
DISTRIBUTED SYSTEMS: PRINCIPLES AND PARADIGMS BY TANENBAUM, ANDREW S., VAN STEEN, MAARTEN 2E

2 Why do we need a DFS?

3 DFS: Distributed File Systems
- Allows multiple processes on multi-computer systems to share files.
- Redirects the user to the right copy of the data.
- How are DFSs organized?
  - Client-server architectures
  - Cluster-based architectures
  - Symmetric architectures

4 Client-Server Architectures (1)
- A distributed file system enables clients to access files stored on one or more remote file servers.
- File service
  - A specification of what the file system offers to clients.
  - Specified by the set of file operations available to the user to access the service.
- File server
  - The implementation of a file service; it runs on one or more machines.
- How are files accessed?
  - Upload/download model (entire files)
  - Remote access model (remote file operations)

5 Client-Server Architectures (2)
The remote access model (example: NFS)
- Work done at the server
- Consistent sharing (+)
- Server may be a bottleneck (-)
- High communication cost (-)
- Unaware of actual file location
The upload/download model (example: FTP)
- Work done at the client
- Consistency is harder to maintain (-)
- Low communication cost (+)

6 Server Types (some terms you have to know about)
Stateless servers
- Server does not maintain any client state.
- Client must specify the location for each read/write and re-authenticate for each request.
- Can easily recover from failure: no need to restore any state.
Stateful servers
- Server provides open and close operations and maintains client state (e.g., files opened by each client, current read/write pointer for each file).
- Client authenticates once at file open time and does not need to specify the location for read/write in the request message.
- Server must ensure that state can be recovered after a crash.
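The difference between the two server types can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `allowed_users` access list, and the in-memory `files` dictionary are hypothetical and do not come from the slides or from any real file server.

```python
class StatelessServer:
    """Keeps no per-client state: every request is self-contained."""

    def __init__(self, files, allowed_users):
        self.files = files                  # file name -> bytes
        self.allowed_users = allowed_users  # hypothetical access list

    def read(self, user, name, offset, count):
        # Permission is checked on every call; the client supplies the offset.
        if user not in self.allowed_users:
            raise PermissionError(user)
        return self.files[name][offset:offset + count]


class StatefulServer:
    """Remembers open files and a read pointer per descriptor."""

    def __init__(self, files, allowed_users):
        self.files = files
        self.allowed_users = allowed_users
        self.open_files = {}   # descriptor -> [file name, current offset]
        self.next_fd = 0

    def open(self, user, name):
        # Permission is checked once, at open time.
        if user not in self.allowed_users:
            raise PermissionError(user)
        fd = self.next_fd
        self.next_fd += 1
        self.open_files[fd] = [name, 0]
        return fd

    def read(self, fd, count):
        name, offset = self.open_files[fd]
        data = self.files[name][offset:offset + count]
        self.open_files[fd][1] = offset + len(data)  # server advances the pointer
        return data

    def close(self, fd):
        del self.open_files[fd]


files = {"notes.txt": b"distributed file systems"}
stateless = StatelessServer(files, {"alice"})
stateful = StatefulServer(files, {"alice"})

fd = stateful.open("alice", "notes.txt")
first = stateful.read(fd, 11)    # server tracks the position
second = stateful.read(fd, 5)
same = stateless.read("alice", "notes.txt", 11, 5)  # client tracks the position
```

If the stateful server crashes, `open_files` is lost and must be rebuilt somehow; the stateless server has nothing to rebuild, which is the recovery advantage the slide mentions.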

7 NFS: Network File System (1)
- NFS is a DFS that allows a user on a client computer to access files stored on a remote server as though they were on the user's own computer.
- Developed by Sun Microsystems in 1984.
- Client/server architecture
  - Client requests are forwarded to the remote server.
  - Client requests are implemented as remote procedure calls (RPCs).
- NFS is OS-independent: client and server implementations exist for almost all operating systems and platforms.
  - This allows a heterogeneous collection of processes, possibly running on different operating systems and machines, to share a common file system.

8 NFS: Network File System (2)

9 NFS: Network File System (3)
- The virtual file system (VFS) layer is added to the UNIX kernel to allow applications to access different types of file systems in a uniform way.
- VFS provides a standard file system interface and hides the difference between accessing local and remote file systems.

10 NFS File System Model
- Files are hierarchically organized into a naming graph.
- A directory node contains the mappings between file names and file handles (i.e., unique file identifiers).
- To access a file, a client must first look up its name and obtain the associated file handle.
- In NFSv3, servers are stateless.
  - No open and close operations.
  - Server must check permission on each read and write call.
- In NFSv4, servers are stateful.
  - open and close operations are provided.
  - Server checks permission at file open time.
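The name-to-handle lookup described above can be illustrated with a small in-memory model. This is a sketch under assumptions, not NFS itself: the class `NfsLikeServer`, the helper `resolve`, and the use of small integers as file handles are all hypothetical.

```python
import itertools


class NfsLikeServer:
    """Toy model: directories map names to handles, files are found by handle."""

    def __init__(self):
        self._next_handle = itertools.count(1)
        self.root = next(self._next_handle)
        self.directories = {self.root: {}}   # directory handle -> {name: handle}
        self.contents = {}                   # file handle -> bytes

    def mkdir(self, dir_handle, name):
        handle = next(self._next_handle)
        self.directories[dir_handle][name] = handle
        self.directories[handle] = {}
        return handle

    def create(self, dir_handle, name, data):
        handle = next(self._next_handle)
        self.directories[dir_handle][name] = handle
        self.contents[handle] = data
        return handle

    def lookup(self, dir_handle, name):
        # One lookup resolves one path component to a handle.
        return self.directories[dir_handle][name]

    def read(self, handle, offset, count):
        # Reads are addressed by handle, never by name.
        return self.contents[handle][offset:offset + count]


def resolve(server, path):
    """Client-side path resolution: one lookup per component."""
    handle = server.root
    for component in path.strip("/").split("/"):
        handle = server.lookup(handle, component)
    return handle


server = NfsLikeServer()
home = server.mkdir(server.root, "home")
server.create(home, "todo.txt", b"read chapter 11")

handle = resolve(server, "/home/todo.txt")
data = server.read(handle, 0, 4)
```

Note that `read` takes an explicit offset and needs no prior open call, which matches the stateless style the slide attributes to NFSv3.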

11 NFS: An incomplete list of file system operations (1)
Annotations on the table of operations:
- Replaced by open
- Subsumed by create
- Subsumed by remove

12 NFS: An incomplete list of file system operations (2)
Annotation on the table of operations:
- Stateful server

13 Cluster-Based File Systems
- With very large data collections, a simple client-server approach is not going to work.
- To speed up file accesses, apply striping techniques by which parts of a file can be fetched in parallel.
Figure panels: distributing whole files across several servers; striping files for parallel access.
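A minimal sketch of striping, assuming fixed-size stripe units placed round-robin across servers. The function names `stripe` and `reassemble` and the list-of-lists stand-in for servers are illustrative assumptions; a real cluster file system would fetch the units from the servers concurrently.

```python
def stripe(data, num_servers, unit):
    """Split data into fixed-size units and place them round-robin."""
    servers = [[] for _ in range(num_servers)]
    for start in range(0, len(data), unit):
        index = (start // unit) % num_servers
        servers[index].append(data[start:start + unit])
    return servers


def reassemble(servers):
    """Read the units back in round-robin order to rebuild the file."""
    chunks = []
    longest = max(len(units) for units in servers)
    for row in range(longest):
        for units in servers:
            if row < len(units):
                chunks.append(units[row])
    return b"".join(chunks)


layout = stripe(b"abcdefghij", num_servers=3, unit=2)
restored = reassemble(layout)
```

Here every server holds only part of the file, so a large read can be served by all three servers at once instead of by one.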

14 The organization of a Google cluster of servers
- The master maintains only a (file name, chunk server) table in main memory.
- The master does not try to stay consistent all the time; it updates its information by polling.
- Chunks are replicated.
- Updates are made by appending.
- Massive numbers of servers = highly likely that one or more is down!

15 Symmetric Architectures
- Depends on peer-to-peer technology.
- Store data blocks in the underlying P2P system (DHT):
  - Every data block with content D is stored on a node with hash h(D).
  - Allows for integrity check.
  - Public-key blocks are signed with the associated private key and looked up with the public key.
- A local log of file operations to keep track of > pairs.
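Storing a block under the hash of its own content is what makes the integrity check possible: whoever fetches the block can recompute the hash and compare it with the key. The sketch below assumes SHA-1 as h and uses a hypothetical `BlockStore` class with a plain dictionary standing in for the P2P system.

```python
import hashlib


class BlockStore:
    """Content-addressed store: the key of a block is the hash of its content."""

    def __init__(self):
        self.blocks = {}   # hex digest -> bytes

    def put(self, content):
        key = hashlib.sha1(content).hexdigest()   # h(D), SHA-1 assumed here
        self.blocks[key] = content
        return key

    def get(self, key):
        content = self.blocks[key]
        # Integrity check: the content must still hash to the key.
        if hashlib.sha1(content).hexdigest() != key:
            raise ValueError("block failed integrity check")
        return content


store = BlockStore()
key = store.put(b"block contents")
fetched = store.get(key)
```

A node that returns altered data is detected on `get`, because the altered content no longer hashes to the requested key.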

16 DHT
- Quickly find any given item.
- Can also distribute responsibility for data storage.
- Key/value pairs
  - The key controls which node(s) store the value.
  - Each node is responsible for some section of the key space.
- Basic operations: Store(key, val) … val = Retrieve(key)
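The Store/Retrieve interface can be sketched with a single-process stand-in for a DHT. Everything here is an assumption made for illustration: the `Dht` class, the 16-bit identifier space, SHA-1 as the hash, and the rule that a key belongs to the first node whose identifier is greater than or equal to the key's identifier (wrapping around the ring). A real DHT would route between nodes instead of consulting a global sorted list.

```python
import bisect
import hashlib


def key_id(key, bits=16):
    """Map a string into a small identifier space (size is an assumption)."""
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class Dht:
    def __init__(self, node_names):
        # Each node owns the section of the ring that ends at its identifier.
        self.ring = sorted((key_id(name), name) for name in node_names)
        self.ids = [node_id for node_id, _ in self.ring]
        self.storage = {name: {} for name in node_names}

    def responsible_node(self, key):
        # First node at or after the key's identifier, wrapping around.
        index = bisect.bisect_left(self.ids, key_id(key)) % len(self.ring)
        return self.ring[index][1]

    def store(self, key, value):
        self.storage[self.responsible_node(key)][key] = value

    def retrieve(self, key):
        return self.storage[self.responsible_node(key)][key]


dht = Dht(["node-a", "node-b", "node-c"])
dht.store("report.txt", "block-17")
value = dht.retrieve("report.txt")
```

Because the responsible node is computed from the key alone, any client that knows the ring can find the value without asking a central server.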

17 The organization of the Ivy distributed file system
Annotations on the figure:
- DHash only knows about data blocks
- NFS-like

18 NFS

19 DFS: Implementations
- CIFS (Microsoft Common Internet File System, based on the SMB protocol). Widely used in Microsoft Windows networks and in heterogeneous environments.
- NFS (Sun Microsystems' initial implementation). Widely used in Unix environments.
- Andrew File System (Carnegie Mellon University implementation). Widely used in distributed and academic environments.

20 NFS: RPCs
- In NFSv3, every operation is implemented as an RPC.
- NFSv4 supports compound procedures, by which several operations can be grouped into a single RPC.
  - Better performance in wide-area networks.
Figure panels: reading data from a file in v3; reading data using a compound procedure in v4.
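The saving from compound procedures is in round trips. The sketch below counts them for a lookup followed by a read. It is a toy model under assumptions: `NfsServerSketch`, its method names, and the handle value 7 are hypothetical, the open step of a real NFSv4 sequence is omitted, and the only NFSv4 behaviour modelled is that operations run in order, share a current file handle, and stop at the first failure.

```python
class NfsServerSketch:
    def __init__(self):
        self.names = {"notes.txt": 7}    # hypothetical name -> file handle
        self.data = {7: b"hello, NFS"}   # file handle -> contents
        self.rpc_count = 0               # number of round trips served

    # v3 style: one RPC per operation.
    def rpc_lookup(self, name):
        self.rpc_count += 1
        return self.names[name]

    def rpc_read(self, handle, offset, count):
        self.rpc_count += 1
        return self.data[handle][offset:offset + count]

    # v4 style: several operations travel in a single RPC.
    def rpc_compound(self, operations):
        self.rpc_count += 1
        current = None   # current file handle, shared by the operations
        results = []
        for op, args in operations:
            try:
                if op == "lookup":
                    current = self.names[args[0]]
                    results.append(current)
                elif op == "read":
                    offset, count = args
                    results.append(self.data[current][offset:offset + count])
                else:
                    raise ValueError(op)
            except (KeyError, ValueError) as error:
                results.append(error)
                break   # remaining operations are not executed
        return results


v3 = NfsServerSketch()
handle = v3.rpc_lookup("notes.txt")
v3_data = v3.rpc_read(handle, 0, 5)

v4 = NfsServerSketch()
v4_results = v4.rpc_compound([("lookup", ("notes.txt",)), ("read", (0, 5))])
```

Both variants return the same bytes, but the first costs two round trips and the second one, which is why the gain grows with network latency.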

21 NFS: RPC2 subsystem (1)
- An enhancement of RPC developed as part of the Coda file system.
- Supports side effects: a mechanism by which the client and server can communicate using an application-specific protocol.
- Example: a client opening a video file at a server.
  - They need to set up a continuous data stream.
  - RPC2 allows the client and server to set up a separate connection for transferring the video on time.

22 NFS: RPC2 subsystem (2)
Figure: side effects in Coda's RPC2 system.
Annotations on the figure:
- Bulk transfer of a large file
- Pack the parameters into the message

23 NFS: RPC2 subsystem (3)
- When a file is modified, the server invalidates local copies by notifying the clients that hold them.
- Invalidation messages can be sent one at a time or in parallel.
- If they are sent one at a time and C1 has crashed, the message to C2 can be delayed.
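The delay can be made concrete with a small timing model. No messages are actually sent here; the function names and the latency figures (a 5000 ms timeout for the crashed client C1, 100 ms for C2) are assumptions chosen only to show the effect.

```python
def notify_times_sequential(latencies_ms):
    """Each invalidation starts only after the previous one finished or timed out."""
    times, clock = {}, 0
    for client, latency in latencies_ms.items():
        clock += latency
        times[client] = clock
    return times


def notify_times_parallel(latencies_ms):
    """All invalidations start at time 0, so each finishes after its own latency."""
    return dict(latencies_ms)


# Assumed figures: C1 has crashed, so the server waits for a timeout.
latencies_ms = {"C1": 5000, "C2": 100}
sequential = notify_times_sequential(latencies_ms)
parallel = notify_times_parallel(latencies_ms)
```

In the sequential case C2 learns about the update only after C1's timeout has expired; in the parallel case it is notified after its own 100 ms.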

24 NFS: Naming (1)
- NFS provides clients with transparent access to a remote file system by letting a client mount (part of) a remote file system into its own local file system.
- A server can export a directory (i.e., make a directory and its entries available to clients).
- An exported directory can be mounted into a client's local name space.
- Each file on the server is identified by a file handle, which clients use to access the file.
- The FreeBSD NFS implementation creates file handles from the inode, the file system ID, and a generation number. The aim is to make each file handle globally unique.
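The role of the generation number can be sketched as follows. The byte layout (`HANDLE_FORMAT`), the `ExportedFs` class, and its methods are hypothetical and are not the actual FreeBSD handle format; the sketch only shows why combining file system ID, inode, and generation keeps an old handle from silently pointing at a new file that reuses the same inode.

```python
import struct

# Hypothetical layout: 4-byte file system ID, 8-byte inode, 4-byte generation.
HANDLE_FORMAT = ">IQI"


def make_handle(fsid, inode, generation):
    return struct.pack(HANDLE_FORMAT, fsid, inode, generation)


class ExportedFs:
    def __init__(self, fsid):
        self.fsid = fsid
        self.generations = {}   # inode -> generation of its current file

    def allocate(self, inode):
        # Reusing an inode for a new file bumps its generation number.
        self.generations[inode] = self.generations.get(inode, 0) + 1
        return make_handle(self.fsid, inode, self.generations[inode])

    def is_valid(self, handle):
        fsid, inode, generation = struct.unpack(HANDLE_FORMAT, handle)
        return fsid == self.fsid and self.generations.get(inode) == generation


fs = ExportedFs(fsid=1)
old_handle = fs.allocate(inode=42)   # first file stored in inode 42
new_handle = fs.allocate(inode=42)   # inode 42 reused for a different file
```

A client still holding `old_handle` is rejected as stale instead of reading the unrelated file that now occupies inode 42.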

25 NFS: Naming (2)
Mounting (part of) a remote file system in NFS:
    mount -t nfs Server1:/export/people /usr/students
    mount -t nfs Server2:/nfs/users /usr/staff

26 NFS: Naming (Mount)
When to mount a remote file system?
- Boot time
  - (+) Consistent view of the FS
  - (-) May do unnecessary work
  - (-) Takes longer to boot
- On explicit command by the user
  - (+) Gives the user control
  - (-) Requires the user to know and do things
- Automount
  - (+) "Subdirectories magically appear"
  - (-) "Subdirectories magically appear"

27 NFS: Naming (Automount)

28 NFS: caching
- Cached data that has been modified must be flushed back to the server.

29 NFS: caching (callback mechanism to recall file delegation)

30 Thanks

