Ch 11 Distributed File System
Ch 11.1 Architecture
Srithi Reddy Muthyala
Oct 6, 2017
Three Archs to Introduce
- Client-Server Arch (Centralized): NFS (Network File System)
- Cluster-based Arch (Less Centralized): GFS (Google File System)
- Symmetric Arch (Fully Distributed): DHT-based (Distributed Hash Table)
Intro to NFS
Two ways to build a client-server architecture:
- The naive way
- RPC (remote procedure call)
Intro to NFS: basics
- Although originally developed by Sun Microsystems, NFS is the predominant distributed file system on Unix systems
- Layered structure
  - VFS (Virtual File System): a common interface for local and remote files
  - RPC: used for data transport
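The layering above can be sketched roughly as follows. This is an illustration only, not actual NFS or VFS code: the class names (`FileSystem`, `LocalFS`, `RemoteFS`, `VFS`), the mount-prefix dispatch, and the in-process `FakeServer` that stands in for the RPC transport are all assumptions made for the example.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Common interface the VFS layer expects from any file system."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...


class LocalFS(FileSystem):
    """Stands in for a local disk file system (here: an in-memory dict)."""

    def __init__(self, files):
        self.files = files

    def read(self, path: str) -> bytes:
        return self.files[path]


class FakeServer:
    """Hypothetical file server; a real one would be reached over RPC."""

    def __init__(self, files):
        self.files = files

    def handle(self, procedure: str, path: str) -> bytes:
        if procedure == "read":
            return self.files[path]
        raise ValueError(f"unknown procedure: {procedure}")


class RemoteFS(FileSystem):
    """Client-side stub: turns each operation into a request to the server."""

    def __init__(self, server: FakeServer):
        self.server = server

    def read(self, path: str) -> bytes:
        # In a real system this call would be marshalled and sent via RPC.
        return self.server.handle("read", path)


class VFS:
    """Dispatches a path to whichever file system is mounted at its prefix."""

    def __init__(self):
        self.mounts = {}

    def mount(self, prefix: str, fs: FileSystem) -> None:
        self.mounts[prefix] = fs

    def read(self, path: str) -> bytes:
        # The longest matching mount prefix wins.
        prefix = max((p for p in self.mounts if path.startswith(p)), key=len)
        return self.mounts[prefix].read(path[len(prefix):])


vfs = VFS()
vfs.mount("/local/", LocalFS({"a.txt": b"on local disk"}))
vfs.mount("/remote/", RemoteFS(FakeServer({"b.txt": b"on the server"})))
```

The caller uses `vfs.read(...)` the same way for both paths; only the mounted object decides whether the data comes from local storage or from the server.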
NFS API Interfaces
Cluster-Based Distributed File Systems
- Downsides of a client-server architecture
  - Performance bottleneck
  - Single point of failure
- Solution: files (resources) can be stored across several servers
  - One big file across multiple servers: file striping, for big structured files
  - Many files on different servers: most files are not well structured
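File striping can be sketched as dealing fixed-size pieces of a file out to the servers in round-robin order. The function names and the tiny stripe size below are assumptions chosen to keep the example readable; real systems use much larger stripes and store them on separate machines rather than in lists.

```python
def stripe(data: bytes, num_servers: int, stripe_size: int):
    """Split data into fixed-size stripes and deal them out round-robin."""
    servers = [[] for _ in range(num_servers)]
    for i in range(0, len(data), stripe_size):
        stripe_index = i // stripe_size
        servers[stripe_index % num_servers].append(data[i:i + stripe_size])
    return servers


def reassemble(servers) -> bytes:
    """Read the stripes back in round-robin order to rebuild the file."""
    out = []
    longest = max((len(s) for s in servers), default=0)
    for row in range(longest):
        for s in servers:
            if row < len(s):
                out.append(s[row])
    return b"".join(out)


data = b"ABCDEFGHIJ"
placed = stripe(data, num_servers=3, stripe_size=2)
# placed == [[b"AB", b"GH"], [b"CD", b"IJ"], [b"EF"]]
```

Because consecutive stripes sit on different servers, a large sequential read can be served by several servers in parallel.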
Cluster-Based Distributed File Systems
- How to support file access in a data center?
  - Files grow continuously
  - File sizes may reach multiple gigabytes
  - A server might malfunction
  - A file access request from any client should be answered under any condition
Cluster-Based Distributed File Systems: GFS, how does it work?
- A cluster has one master node, which keeps ONLY the metadata of files
- A big file is split into chunks of 64 MB each
- Chunks are spread across many chunk servers
More details on GFS
- Chunks are replicated for redundancy
- The master does not keep chunk locations strictly up to date
- A chunk server knows exactly what it stores
- If a client's retrieval fails (low probability), it asks the master again, and the master refreshes its information from the chunk servers
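The read path described above can be sketched as follows. This is a toy model under stated assumptions, not the GFS implementation: the class and method names (`ChunkServer`, `Master`, `locate`, `refresh`, `client_read`) and the chunk handles `"h0"` and `"h1"` are invented for illustration, and everything runs in one process instead of over a network.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks, as stated above


class ChunkServer:
    def __init__(self, name):
        self.name = name
        self.chunks = {}  # chunk handle -> bytes; the server knows what it stores

    def read(self, handle):
        return self.chunks[handle]  # raises KeyError if the chunk is not here


class Master:
    """Keeps metadata only: file -> chunk handles, plus cached locations."""

    def __init__(self, servers):
        self.servers = servers
        self.file_chunks = {}  # filename -> ordered list of chunk handles
        self.locations = {}    # chunk handle -> list of servers (may be stale)

    def refresh(self):
        # Rebuild the location table by asking each chunk server what it holds.
        self.locations = {}
        for server in self.servers:
            for handle in server.chunks:
                self.locations.setdefault(handle, []).append(server)

    def locate(self, filename, offset):
        # The byte offset determines which chunk of the file is needed.
        handle = self.file_chunks[filename][offset // CHUNK_SIZE]
        return handle, self.locations.get(handle, [])


def client_read(master, filename, offset):
    for _ in range(2):
        handle, servers = master.locate(filename, offset)
        for server in servers:
            try:
                return server.read(handle)
            except KeyError:
                continue  # stale location; try the next replica
        master.refresh()  # retrieval failed: the master updates its view
    raise IOError("chunk unavailable")


cs1, cs2 = ChunkServer("cs1"), ChunkServer("cs2")
master = Master([cs1, cs2])
master.file_chunks["big.log"] = ["h0", "h1"]
cs2.chunks["h1"] = b"second chunk"
master.locations = {"h1": [cs1]}  # deliberately stale entry

result = client_read(master, "big.log", CHUNK_SIZE + 10)
```

The first attempt fails because the master's cached location is stale; after `refresh()` the second attempt is directed to the server that actually holds the chunk.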
Cluster-Based Distributed File Systems: GFS, how does it work?
- File update
  - The client pushes the updated file chunk back to the corresponding chunk server
  - The chunk server carries out the backup/replication
  - The master node is kept out of this loop, which removes the bottleneck
- I/O performance of GFS is good, and it scales well
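A minimal sketch of the update path, under the same caveats as any toy model: the names (`ChunkServer.update`, `Master.locate`, the `lookups` counter, the handle `"h0"`) are hypothetical, and ordering of concurrent updates and failure handling are ignored. It only illustrates that chunk data flows between the client and the chunk servers, while the master is consulted for metadata alone.

```python
class ChunkServer:
    def __init__(self, name):
        self.name = name
        self.chunks = {}  # chunk handle -> bytes
        self.peers = []   # chunk servers holding the other replicas

    def store(self, handle, data):
        self.chunks[handle] = data

    def update(self, handle, data):
        """Accept an updated chunk from a client, then replicate it."""
        self.store(handle, data)
        for peer in self.peers:
            peer.store(handle, data)


class Master:
    def __init__(self, locations):
        self.locations = locations  # chunk handle -> chunk server
        self.lookups = 0            # how often clients contacted the master

    def locate(self, handle):
        self.lookups += 1
        return self.locations[handle]


cs1, cs2, cs3 = ChunkServer("cs1"), ChunkServer("cs2"), ChunkServer("cs3")
cs1.peers = [cs2, cs3]
master = Master({"h0": cs1})

target = master.locate("h0")        # one metadata lookup
target.update("h0", b"version 1")   # data goes straight to the chunk server
target.update("h0", b"version 2")   # no further master involvement
```

After both updates, every replica holds the latest version and the master has been contacted only once.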
Symmetric Arch
- Peer-to-peer
- No client, no server, no master, no chunk server
- The first realization is Ivy (multi-user read/write)
What is a DHT?
- Hash table
  - Data structure that maps “keys” to “values”
  - Essential building block in software systems
- Distributed Hash Table (DHT)
  - Similar, but spread across many hosts
- Interface
  - insert(key, value)
  - lookup(key)
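The `insert`/`lookup` interface can be sketched with a small in-process model. The assumptions here are the ring layout (each key belongs to the first node whose identifier is at or after the key's hash, wrapping around), the use of SHA-1 for identifiers, and the `Node`/`DHT` class names. A real DHT has no central object that knows every node; requests are routed from node to node instead.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    """Map a string to a position on the identifier ring."""
    return int.from_bytes(hashlib.sha1(value.encode()).digest(), "big")


class Node:
    def __init__(self, name):
        self.name = name
        self.id = ring_hash(name)
        self.store = {}  # the part of the table this host is responsible for


class DHT:
    def __init__(self, names):
        self.nodes = sorted((Node(n) for n in names), key=lambda n: n.id)
        self.ids = [n.id for n in self.nodes]

    def node_for(self, key):
        # First node whose id is >= the key's hash, wrapping around the ring.
        i = bisect.bisect_left(self.ids, ring_hash(key)) % len(self.nodes)
        return self.nodes[i]

    def insert(self, key, value):
        self.node_for(key).store[key] = value

    def lookup(self, key):
        return self.node_for(key).store.get(key)


dht = DHT(["host-a", "host-b", "host-c"])
dht.insert("report.txt", "block 17")
```

`dht.lookup("report.txt")` hashes the key the same way as `insert` did, so it reaches the same host and finds the value there; no other host stores it.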
Symmetric Arch: Ivy details
- Data storage
  - A file is composed of 8 KB data blocks
  - Content-hash data blocks
  - Public-key based blocks
- Replication
  - Every block B is stored on its K immediate successors, for better availability
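The block layout and replica placement above can be sketched as below. This is not Ivy's code: the helper names, the choice of SHA-1 for the content hash, and the small integer node identifiers are assumptions for illustration. Public-key based blocks are omitted.

```python
import hashlib

BLOCK_SIZE = 8 * 1024  # 8 KB blocks, as stated above


def split_into_blocks(data: bytes):
    """Return ({content-hash key: block}, ordered list of keys) for a file."""
    blocks, order = {}, []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        key = hashlib.sha1(block).hexdigest()  # the key is the hash of the content
        blocks[key] = block
        order.append(key)
    return blocks, order


def successors(node_ids, position, k):
    """Pick the k nodes that follow `position` on a ring of node identifiers."""
    ids = sorted(node_ids)
    # Index of the first node at or after the position; wrap to 0 if none.
    start = next((i for i, n in enumerate(ids) if n >= position), 0)
    return [ids[(start + j) % len(ids)] for j in range(min(k, len(ids)))]


data = b"x" * (BLOCK_SIZE + 1)
blocks, order = split_into_blocks(data)
replicas = successors([10, 20, 30, 40], position=35, k=3)  # [40, 10, 20]
```

With content-hash keys, a reader can check a fetched block by hashing it again and comparing the result with the key it asked for. Storing each block on K successors means the block stays reachable as long as at least one of those nodes is up.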
DHT: basic idea
[Figure: key-value (K, V) pairs spread across the nodes]
Operation: take a key as input; route messages to the node holding that key
Future Developments
- Client-Server Arch (Centralized): NFS
- Cluster-based Arch (Less Centralized): GFS
- Symmetric Arch (Fully Distributed): DHT-based
Reference
- Ghemawat, Sanjay, Howard Gobioff, and Shun-Tak Leung. "The Google file system." ACM SIGOPS operating systems review. Vol. 37. No. 5. ACM, 2003.
- Sandberg, Russel, et al. "Design and implementation of the Sun network filesystem." Proceedings of the Summer USENIX conference. 1985.
- Muthitacharoen, Athicha, et al. "Ivy: A read/write peer-to-peer file system." ACM SIGOPS Operating Systems Review 36.SI (2002): 31-44.
- Naor, Moni, and Udi Wieder. "A simple fault tolerant distributed hash table." Peer-to-Peer Systems II. Springer Berlin Heidelberg, 2003. 88-97.
- Cai, Min, Ann Chervenak, and Martin Frank. "A peer-to-peer replica location service based on a distributed hash table." Proceedings of the 2004 ACM/IEEE conference on Supercomputing. IEEE Computer Society, 2004.
- Kleiman, Steve R. "Vnodes: An Architecture for Multiple File System Types in Sun UNIX." USENIX Summer. Vol. 86. 1986.