Abstract HyFS: A Highly Available Distributed File System Jianqiang Luo, Mochan Shrestha, Lihao Xu Department of Computer Science, Wayne State University Motivation HyFS Architecture HyFS Components File System Interface (FUSE) FUSE is a user space file system, which is a good tool to quickly develop a file system prototype File Operation Lib (fopen, fread, fwrite, fseek, fclose) This library supports fault tolerance ability for a single file. It provides POSIX file operation API. It can be used independently from HyFS Erasure Codes Lib (encoding, decoding) All encoding/decoding functions related with erasure codes are implemented in this library. This library can be also used outside of HyFS Network File System (NFS) HyFS is a stackable file system. To be a highly distributed file system, HyFS is configured to run on NFS HyFS Features High Flexibility A general framework is designed to support any erasure codes, and applications decide which code to use Easy to configure POSIX File API POSIX file operation API is supported by a library which is independent from HyFS, and it can be readily used by applications File System Level HyFS is a Linux file system, which can be installed in any popular Linux system. It has been tested on Ubuntu and SUSE Erasure Codes Support Some erasure codes are so far supported for academic research Building highly available distributed file systems is crucial to any commercial or scientific application. HyFS is designed to achieve this goal by employing erasure codes to implement distributed file system. HyFS provides a general framework to support any erasure code to be used. Thus, by applying different erasure codes, HyFS offers high flexibility for customizations to meet various application requirements. High Availability Requirement Storage devices are not as high available as we expect In large data centers, hardware failure is a common thing Proper data redundancy is the key to provide high reliability, availability and survivability Existing Solutions Most current fault tolerance file systems use replication as the redundancy scheme, which suffers from: High cost of hardware purchase and maintenance Performance in writing data, multiple replication of the same data Future Work Performance Test Overhead of encoding and decoding Examine key factors for the HyFS performance Scalability Study HyFS scalability when deployed to a large network Support More Functions Latent error recovery: latent sector error correction decoding algorithms are to be included Data modification detection: error detection algorithms are to be integrated to check data integrity HyFS and its supported erasure codes HyFS Demo HyFS Ideas Erasure Codes and File System Novel use of erasure codes in file system to achieve high availability with affordable cost File data will be stored to multiple storage nodes by employing erasure codes (encoding process) File data can be constructed from some of these storage nodes (decoding process) Erasure Codes MDS Erasure Codes (n, k) A message has k bits, we store it as n bits by adding (n-k) redundancy bits. To recover the message, we only need to have ANY k bits among these n bits Erasure Codes in HyFS When saving a file, we store it to n storage nodes. When reading the file, we only need to have ANY k accessible storage nodes to recover the file, thus we can tolerate up to (n-k) storage node failures Demo Steps Data Preparation A static html file with its pictures files is copied to a HyFS file system. HyFS is configured to use EVENODD codes (5, 3) as the erasure code and five flash drives as the storage nodes Fault Tolerance Test 1:Startup HyFS file system in a command console 2:All five flash drives are connected. Launch Firefox to check the static html file stored in HyFS. It succeeds 3:Detach one flash drive, and open Firefox to check the html files availability in HyFS. It succeeds 4: Detach two flash drives, and open Firefox again to check the html files availability in HyFS. It still succeeds HyFSDevelopment Tips It is being developed in the NISL lab at Wayne State University It starts from scratch It is developed in C language to get the best performance It has 17,000 lines of code so far, and it is still under active development HyFS Architecture (1)(2) (3) (4)