A Low-bandwidth Network File System Presentation by Joseph Thompson.


Problem
Without a network file system, people have two ways to edit files remotely:
–Make and edit local copies of the files, at the risk of update conflicts.
–Use remote login, where high network latency makes interactive applications unresponsive.
Most network file systems are designed for local, high-bandwidth networks.

Goal
Create a network file system that can operate over a WAN by consuming less bandwidth than most current network file systems.

Plan
Provide traditional file system semantics and consistency.
Exploit cross-file similarities.
Borrow the strengths of existing systems:
–NFS, AFS, Echo, JetFile, Coda, Bayou, OceanStore, TACT, rsync.

Provide traditional file system semantics and consistency
LBFS provides close-to-open consistency.
–After a client has successfully written and closed a file, the data is safely stored on the server, and another client that then opens the file sees the new contents.
The aim was a file system that could directly replace a widely used network file system.

Consistency cont’d (more detail)
The server issues read leases to clients accessing a file.
–“The lease is a commitment on the part of the server to notify the client of any modifications made to that file during the term of the lease”
Files are committed atomically.
–“If multiple clients are writing the same file, then the last one to close the file will win and overwrite changes from the others”

Exploit cross-file similarities
LBFS keeps a large persistent file cache on the client.
When an application reads a file, the client checks its cache for chunks it has already downloaded and can reuse.
When it writes data, it sends only the chunks that the server does not already have.

Indexing file chunks
Uses the SHA-1 hash function to hash file chunks.
Assumes that there are no hash collisions between different file chunks, so any two chunks with the same hash are taken to contain the same data.
Client and server can therefore exchange chunk hashes first and transfer the chunk data only when the other side does not already have it.
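The indexing idea above can be sketched in a few lines of Python. This is a minimal illustration, not LBFS code: the names `ChunkIndex` and `chunks_to_send` are hypothetical, and the index is an in-memory dictionary, which is an assumption made for brevity.

```python
import hashlib


def chunk_id(chunk: bytes) -> bytes:
    """Return the 20-byte SHA-1 digest used as the chunk's index key."""
    return hashlib.sha1(chunk).digest()


class ChunkIndex:
    """Hypothetical in-memory index mapping SHA-1 digest -> chunk data."""

    def __init__(self):
        self._by_hash = {}

    def add(self, chunk: bytes) -> bytes:
        digest = chunk_id(chunk)
        self._by_hash[digest] = chunk
        return digest

    def has(self, digest: bytes) -> bool:
        return digest in self._by_hash


def chunks_to_send(local_chunks, remote_index):
    """Keep only the chunks whose hash is unknown to the other side.

    Relies on the no-collision assumption: equal hash means equal data.
    """
    return [c for c in local_chunks if not remote_index.has(chunk_id(c))]


server_index = ChunkIndex()
server_index.add(b"chunk one")
server_index.add(b"chunk two")

pending = chunks_to_send([b"chunk one", b"chunk three"], server_index)
print(pending)
```

Only the chunk the server has never seen remains in `pending`; the other one can be named by its hash alone.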

Indexing Woes
Imagine chunks at fixed offsets (for example, every 8 KB).
–If you insert one byte at the front of the file, every chunk boundary shifts by one byte, no chunk matches its old hash, and all potential savings are lost.
Rsync compares two files with the same name to see which parts need to be resent.
–This approach loses the benefit for renamed files, files that are built from other files, and files that share segments because they were written by the same application.
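The fixed-offset problem is easy to reproduce. The sketch below uses a tiny 8-byte chunk size and synthetic data purely to keep the example small; both are assumptions for illustration, and the helper names are hypothetical.

```python
import hashlib


def fixed_chunks(data: bytes, size: int) -> list:
    """Split data into chunks at fixed offsets 0, size, 2*size, ..."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def shared_chunk_count(old: bytes, new: bytes, size: int) -> int:
    """Count chunks of `new` whose SHA-1 hash already appears in `old`."""
    old_hashes = {hashlib.sha1(c).digest() for c in fixed_chunks(old, size)}
    return sum(
        hashlib.sha1(c).digest() in old_hashes for c in fixed_chunks(new, size)
    )


old = bytes(range(256)) * 4          # 1024 bytes of sample data
appended = old + b"X"                # one byte added at the end
prepended = b"X" + old               # one byte inserted at the front

print(shared_chunk_count(old, appended, 8))
print(shared_chunk_count(old, prepended, 8))
```

Appending leaves every existing chunk reusable, while the one-byte insertion at the front shifts every boundary so that no chunk of the new file matches an old one.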

LBFS’ Solution
Divide files into variable-size chunks whose boundaries depend on the file contents rather than on fixed offsets.
Examine every overlapping 48-byte region of the file and use its Rabin fingerprint to choose breakpoints (chunk boundaries).
–Rabin fingerprints are used because they are efficient to compute and are highly uniformly distributed.
–Given the probability that a region is selected as a breakpoint, the expected chunk size is about 8 KB.
–To avoid inefficiencies from very small or very large chunks, minimum and maximum chunk sizes are enforced: 2 KB and 64 KB.
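A rough sketch of content-defined chunking with the parameters from this slide (48-byte window, 2 KB minimum, 64 KB maximum). Two things here are assumptions rather than LBFS's actual method: a simple polynomial rolling hash stands in for the Rabin fingerprint, and the breakpoint test (low 13 bits equal to an arbitrary constant `MAGIC`) is chosen only because a 1-in-8192 chance per position gives an expected chunk size near 8 KB.

```python
WINDOW = 48                  # bytes in the sliding window
MIN_CHUNK = 2 * 1024         # minimum chunk size
MAX_CHUNK = 64 * 1024        # maximum chunk size
MASK = (1 << 13) - 1         # 13 low bits: 1-in-8192 chance per position
MAGIC = 0x78                 # arbitrary target value (assumption)
BASE = 257                   # rolling-hash base (assumption, not Rabin)
MOD = (1 << 61) - 1          # rolling-hash modulus (assumption)
BASE_POW = pow(BASE, WINDOW - 1, MOD)


def chunk_boundaries(data: bytes) -> list:
    """Return the exclusive end offset of every chunk in `data`."""
    boundaries = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Drop the byte that just left the 48-byte window.
            h = (h - data[i - WINDOW] * BASE_POW) % MOD
        h = (h * BASE + byte) % MOD
        size = i + 1 - start
        at_breakpoint = i + 1 >= WINDOW and (h & MASK) == MAGIC
        # Breakpoints inside the minimum size are ignored; a boundary is
        # forced once the chunk reaches the maximum size.
        if (at_breakpoint and size >= MIN_CHUNK) or size >= MAX_CHUNK:
            boundaries.append(i + 1)
            start = i + 1
    if start < len(data):
        boundaries.append(len(data))  # final partial chunk
    return boundaries


def split(data: bytes) -> list:
    """Cut `data` at the boundaries chosen by chunk_boundaries."""
    chunks, start = [], 0
    for end in chunk_boundaries(data):
        chunks.append(data[start:end])
        start = end
    return chunks
```

Because the hash depends only on the last 48 bytes, inserting data near the start of a file changes the chunks around the insertion while later breakpoints usually fall on the same content as before, so later chunks keep their hashes.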

Example Explained
The example shows how chunks are created and destroyed as a file is modified.

File Reads
A new RPC, GETHASH, returns a vector (array) containing the hash values of all chunks in a file. The client then fetches from the server only the chunks that are missing from its cache.
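The read path can be sketched as a toy client and server. Everything below is illustrative: the class names, the method signatures, and the choice to have `gethash` return (hash, size) pairs are assumptions, and chunks are stored pre-split on the server for simplicity.

```python
import hashlib


def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


class ToyServer:
    """Hypothetical server; each file is stored as a list of chunks."""

    def __init__(self, files):
        self.files = files  # name -> list of chunk bytes

    def gethash(self, name):
        # One (hash, size) pair per chunk, in file order.
        return [(sha1(c), len(c)) for c in self.files[name]]

    def read(self, name, offset, count):
        # Ordinary read of a byte range.
        return b"".join(self.files[name])[offset:offset + count]


class ToyClient:
    """Hypothetical client with a chunk cache keyed by SHA-1 digest."""

    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.bytes_fetched = 0

    def read_file(self, name):
        parts, offset = [], 0
        for digest, size in self.server.gethash(name):
            chunk = self.cache.get(digest)
            if chunk is None:
                # Cache miss: transfer this chunk's data over the network.
                chunk = self.server.read(name, offset, size)
                self.bytes_fetched += size
                self.cache[digest] = chunk
            parts.append(chunk)
            offset += size
        return b"".join(parts)


server = ToyServer({"notes.txt": [b"hello ", b"world"]})
client = ToyClient(server)
client.cache[sha1(b"hello ")] = b"hello "   # chunk seen in some earlier file
print(client.read_file("notes.txt"), client.bytes_fetched)
```

The cache is keyed by content hash rather than by file name, which is why a chunk first seen in one file can satisfy a read of a different file.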

File Writes
“LBFS uses temporary files to implement atomic updates. The server first creates a unique temporary file, writes the temporary file, and only then atomically commits the contents to the real file being updated”
Four new RPC functions:
–MKTMPFILE: creates a temporary file for use in the atomic update.
–TMPWRITE: writes to the temporary file on the server instead of the permanent one.
–CONDWRITE: carries a hash value for the server to check; if the server does not have the chunk and its data must be sent, the server returns a HASHNOTFOUND message.
–COMMITTMP: if no errors occurred during the previous calls, commits the contents of the temporary file to the permanent file and updates the file's chunk index.
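The four write RPCs fit together roughly as follows. This is a toy model under stated assumptions: the method signatures are hypothetical, the server keeps everything in memory, and it indexes new chunks as soon as they arrive, which is a simplification made for this sketch.

```python
import hashlib

HASHNOTFOUND = "HASHNOTFOUND"
OK = "OK"


def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


class ToyServer:
    def __init__(self):
        self.files = {}      # name -> committed contents
        self.chunk_db = {}   # SHA-1 digest -> chunk data
        self.tmp = {}        # temp-file handle -> bytearray
        self._next_fd = 0

    def _write_at(self, fd, offset, data):
        buf = self.tmp[fd]
        end = offset + len(data)
        if len(buf) < end:
            buf.extend(b"\0" * (end - len(buf)))
        buf[offset:end] = data

    def mktmpfile(self):
        fd = self._next_fd
        self._next_fd += 1
        self.tmp[fd] = bytearray()
        return fd

    def condwrite(self, fd, offset, digest):
        # Write the chunk only if the server already holds its data.
        chunk = self.chunk_db.get(digest)
        if chunk is None:
            return HASHNOTFOUND
        self._write_at(fd, offset, chunk)
        return OK

    def tmpwrite(self, fd, offset, data):
        self._write_at(fd, offset, data)
        self.chunk_db[sha1(data)] = bytes(data)  # simplification: index now

    def committmp(self, fd, name):
        # Replace the target's contents in a single step.
        self.files[name] = bytes(self.tmp.pop(fd))


def write_file(server, name, chunks):
    """Client side: send hashes first, data only when asked. Returns bytes sent."""
    fd = server.mktmpfile()
    sent, offset = 0, 0
    for chunk in chunks:
        if server.condwrite(fd, offset, sha1(chunk)) == HASHNOTFOUND:
            server.tmpwrite(fd, offset, chunk)
            sent += len(chunk)
        offset += len(chunk)
    server.committmp(fd, name)
    return sent


server = ToyServer()
server.chunk_db[sha1(b"AAAA")] = b"AAAA"          # chunk from an older file
print(write_file(server, "f", [b"AAAA", b"BBBB"]), server.files["f"])
```

Other clients never observe a half-written file in this model, because the target name is only updated inside `committmp`.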


Graphs (three slides of figures; the images are not included in this transcript)

Paper’s Summary
“In many situations, LBFS makes transparent remote file access a viable and less frustrating alternative to running interactive programs on remote machines”