A Low-Bandwidth Network File System A. Muthitacharoen, MIT B. Chen, MIT D. Mazieres, NYU.


Key Ideas: A network file system for slow or wide-area networks. Exploits similarities between files or versions of the same file. Avoids sending data that can be found in the server’s file system or the client’s cache. Also uses conventional compression and caching. Requires 90% less bandwidth than traditional network file systems.

Working on slow networks: Make local copies (must worry about update conflicts). Use remote login (only for text-based applications). Use LBFS instead: better than remote login, but it must deal with issues like auto-saves blocking the editor for the duration of the transfer.

LBFS: Exploits cross-file similarities, especially with previous versions of the same file (auto-save files, …). The LBFS file server divides the files it stores into chunks and indexes the chunks by hash value. The LBFS client similarly indexes a large persistent file cache. LBFS never transfers chunks that the recipient already has.

Previous Work (I): AFS callbacks require the server to notify clients when a cached file has been modified. Leases achieve the same goal but have an expiration time. Coda supports slow networks and even disconnected operation, and defers some updates to save bandwidth. OceanStore applies Bayou’s conflict resolution mechanisms to a file system.

Previous Work (II): Operation-based updates (Lee et al.): a proxy client close to the server duplicates the client’s computations in the hope of duplicating its output files. Spring and Wetherall propose using two large cooperating caches that store identical copies of the last n megabytes of network traffic. Rsync uses directory tree mirroring at client and server.

LBFS: Provides close-to-open consistency, similar to AFS session consistency. Assumes clients will have a cache large enough to contain a user’s entire working set of files. When possible, LBFS reconstitutes files using chunks of existing data in the file system and client cache instead of transmitting those chunks over the network.

Indexing Issues: The major challenge is keeping the index a reasonable size while dealing with shifting offsets. Indexing conventional file blocks would not work, because a single inserted byte shifts the contents of every later block. Indexing and hashing overlapping file blocks at all offsets would require too much space.

LBFS Solution: Considers only non-overlapping chunks of files. Sets chunk boundaries based on file contents to avoid sensitivity to shifting file offsets. Examines every overlapping 48-byte region of the file to select boundary regions, or breakpoints, using Rabin fingerprints. Expected chunk size is 8 KB plus the size of the 48-byte breakpoint window.
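A minimal sketch of content-defined chunking may make the idea concrete. It is not the LBFS implementation: a simple polynomial rolling hash stands in for Rabin fingerprints, and the constants `MAGIC`, `BASE`, and `MOD` are hypothetical values chosen for illustration. Only the 48-byte window and the 13-bit test (one breakpoint per 8 KB on average) follow the slide.

```python
import random

WINDOW = 48            # bytes per sliding window, as on the slide
MASK = (1 << 13) - 1   # test the low 13 bits: ~1 breakpoint per 8 KB
MAGIC = 0x78           # hypothetical target value for those 13 bits
BASE = 257             # hypothetical polynomial base
MOD = (1 << 61) - 1    # hypothetical modulus
TOP = pow(BASE, WINDOW - 1, MOD)   # weight of the byte leaving the window


def chunk(data):
    """Split data after every 48-byte window whose hash matches MAGIC."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * TOP) % MOD   # drop the oldest byte
        h = (h * BASE + byte) % MOD                  # add the newest byte
        if i + 1 >= WINDOW and (h & MASK) == MAGIC:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])                  # trailing partial chunk
    return chunks


# Insert a few bytes near the start and compare the two chunk lists.
rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(200_000))
edited = original[:1000] + b"inserted text" + original[1000:]
a, b = chunk(original), chunk(edited)
shared = set(a) & set(b)
print(len(a), "chunks before,", len(b), "after,", len(shared), "shared")
```

Because boundaries depend only on the 48 bytes inside the window, the insertion disturbs only the chunk or chunks around offset 1000; every later chunk keeps its contents even though its file offset has moved.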

Handling Insertions

More Indexing Issues: Pathological cases: very small chunks (sending the hashes of the chunks would consume as much bandwidth as just sending the file) and very large chunks (cannot be sent in a single RPC). LBFS imposes minimum and maximum chunk sizes.
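One way to apply such limits is to post-process the content-defined breakpoints. The sketch below is an assumption-laden illustration: the function name `enforce_limits` is hypothetical, and the 2 KB and 64 KB defaults are assumed values, since the slide gives no numbers.

```python
def enforce_limits(breakpoints, length, min_size=2048, max_size=65536):
    """Return chunk end offsets that respect min_size and max_size.

    breakpoints: sorted candidate end offsets chosen from file contents.
    length: total file length in bytes.
    """
    ends, start = [], 0
    for bp in list(breakpoints) + [length]:
        # Force a cut every max_size bytes when no usable breakpoint appears.
        while bp - start > max_size:
            start += max_size
            ends.append(start)
        # Skip breakpoints that would create a chunk below the minimum,
        # except at end of file, which must always close the last chunk.
        if bp - start >= min_size or (bp == length and bp > start):
            ends.append(bp)
            start = bp
    return ends


ends = enforce_limits([100, 3000, 3500, 200_000], 210_000)
print(ends)   # [3000, 68536, 134072, 199608, 210000]
```

The breakpoints at 100 and 3500 are suppressed as too close to the previous cut, and the long run without breakpoints is split into 64 KB pieces.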

The Chunk Database: Indexes each chunk by the first 64 bits of its SHA-1 hash. To avoid synchronization problems, LBFS always recomputes the SHA-1 hash of any data chunk before using it; this simplifies crash recovery. Recomputed SHA-1 values are also used to detect hash collisions in the database.
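The lookup discipline described here can be sketched with an in-memory dictionary. The class `ChunkIndex` and its `files` store are hypothetical stand-ins for the real on-disk database; only the 64-bit key and the recompute-before-use rule come from the slide.

```python
import hashlib


class ChunkIndex:
    """Maps the first 64 bits of a chunk's SHA-1 hash to the chunk's location."""

    def __init__(self, files):
        self.files = files   # hypothetical store: file name -> bytes
        self.index = {}      # 8-byte key -> (file name, offset, length)

    @staticmethod
    def key(sha1_digest):
        return sha1_digest[:8]   # first 64 bits of the 160-bit digest

    def add(self, name, offset, length):
        data = self.files[name][offset:offset + length]
        digest = hashlib.sha1(data).digest()
        self.index[self.key(digest)] = (name, offset, length)

    def lookup(self, sha1_digest):
        """Return the chunk's bytes, or None if the entry is missing or stale."""
        entry = self.index.get(self.key(sha1_digest))
        if entry is None:
            return None
        name, offset, length = entry
        data = self.files.get(name, b"")[offset:offset + length]
        # Recompute the full hash before trusting the entry: this catches
        # stale index entries and collisions on the 64-bit key.
        if hashlib.sha1(data).digest() != sha1_digest:
            return None
        return data


files = {"a.txt": b"hello world, hello LBFS"}
db = ChunkIndex(files)
db.add("a.txt", 0, 11)
wanted = hashlib.sha1(b"hello world").digest()
fresh = db.lookup(wanted)                       # index and file agree
files["a.txt"] = b"HELLO world, hello LBFS"     # file changes behind the index
stale = db.lookup(wanted)                       # entry is now out of date
```

Since every hit is verified against the data itself, the index can lag behind the file system without ever returning wrong bytes; a stale entry just looks like a miss.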

Protocol: Based on NFS version 3. Adds extensions to exploit inter-file commonality (GETHASH) and leases. Compresses all traffic using conventional gzip.
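How a hash-listing call such as GETHASH can save bandwidth on reads is sketched below with a toy in-memory server. The classes and method signatures are hypothetical simplifications, not the actual RPC definitions: files are stored pre-chunked, and compression and leases are left out.

```python
import hashlib


def sha1(data):
    return hashlib.sha1(data).digest()


class Server:
    """Toy in-memory server; each file is stored as a list of chunks."""

    def __init__(self, files):
        self.files = files   # file name -> list of chunk bytes

    def gethash(self, name):
        # Stand-in for GETHASH: return (hash, size) for each chunk of the file.
        return [(sha1(c), len(c)) for c in self.files[name]]

    def read_chunk(self, name, i):
        # Stand-in for an ordinary READ of one chunk's byte range.
        return self.files[name][i]


def fetch(server, name, cache):
    """Rebuild a file, transferring only chunks that are absent from cache."""
    parts, transferred = [], 0
    for i, (digest, size) in enumerate(server.gethash(name)):
        if digest not in cache:
            cache[digest] = server.read_chunk(name, i)
            transferred += size
        parts.append(cache[digest])
    return b"".join(parts), transferred


server = Server({
    "v1": [b"A" * 100, b"B" * 100, b"C" * 100],
    "v2": [b"A" * 100, b"X" * 40, b"C" * 100],   # only the middle chunk differs
})
cache = {}
data1, sent1 = fetch(server, "v1", cache)   # cold cache: all 300 bytes move
data2, sent2 = fetch(server, "v2", cache)   # warm cache: only 40 bytes move
```

The second fetch pays for the hash list plus the one chunk the client has never seen; the unchanged chunks come from the client's cache.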

File Consistency (I): Whenever a client makes any RPC on an LBFS file, it gets back a read lease on the file. If a user opens a file whose lease has expired, the client asks the server for the attributes of the file; this request grants the client a lease on the file. The client can then check whether it has the current version of the file in its cache. If the file times have changed, the client must obtain the new contents of the file from the server.
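The open-time check can be sketched as follows. Everything here is illustrative: `Client`, `FakeServer`, the single `mtime` attribute, and the 60-second `LEASE_SECONDS` are assumptions, and server-side notification of lease holders is omitted.

```python
import time

LEASE_SECONDS = 60.0   # hypothetical lease duration; the slide gives no value


class FakeServer:
    def __init__(self):
        self.files = {}    # name -> (data, mtime)
        self.calls = []    # log of requests, to show what crossed the network

    def write(self, name, data, mtime):
        self.files[name] = (data, mtime)

    def getattr(self, name):
        self.calls.append("getattr")
        return self.files[name][1]

    def read(self, name):
        self.calls.append("read")
        return self.files[name][0]


class Client:
    def __init__(self, server, clock=time.monotonic):
        self.server, self.clock = server, clock
        self.cache = {}    # name -> {"data", "mtime", "lease_expiry"}

    def open(self, name):
        entry, now = self.cache.get(name), self.clock()
        if entry and now < entry["lease_expiry"]:
            return entry["data"]                 # lease valid: no server traffic
        mtime = self.server.getattr(name)        # attribute request renews lease
        if entry and entry["mtime"] == mtime:
            entry["lease_expiry"] = now + LEASE_SECONDS
            return entry["data"]                 # cached copy is still current
        data = self.server.read(name)            # file times changed: refetch
        self.cache[name] = {"data": data, "mtime": mtime,
                            "lease_expiry": now + LEASE_SECONDS}
        return data


now = [0.0]
server = FakeServer()
client = Client(server, clock=lambda: now[0])
server.write("f", b"v1", mtime=1)
r1 = client.open("f")               # cold cache: getattr + read
now[0] = 10.0
r2 = client.open("f")               # within lease: served locally
now[0] = 100.0
r3 = client.open("f")               # lease expired, file unchanged: getattr only
server.write("f", b"v2", mtime=2)
now[0] = 200.0
r4 = client.open("f")               # lease expired, file changed: getattr + read
```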

File Consistency (II): No need for write leases: LBFS provides close-to-open consistency, and the server never demands back a dirty file. If multiple clients are writing the same file, the last one to close the file will overwrite changes from the others. File updates are atomic, which limits the damage caused by concurrent updates.

Security Issues: LBFS uses the SFS security infrastructure: servers have public keys and messages are encrypted. Specific security issue: a user could check whether the file system contains a particular chunk of data by observing subtle timing differences in the server’s answer to a CONDWRITE request.

Implementation (I)

Implementation (II): Uses NFS. Two NFS-related issues: when the server commits a temporary file to a target file, it must copy the contents of the temporary file onto the target file to preserve the target file’s i-node; and it is hard to preserve the previous contents of a truncated file. Message order is guaranteed by TCP.

Evaluation (I): Commonality of data in /usr/local

Evaluation (II): Normalized bandwidth consumption (2 of 3 benchmarks)

Key: The first four bars of each workload show upstream bandwidth, the second four downstream bandwidth. CIFS is Windows’ native network file system. “Leases+Gzip” uses LBFS file caching, leases, and data compression but not its chunking scheme. “LBFS, new DB” is LBFS starting with a new database.

Evaluation (III): Normalized application times

Key: Execution times were normalized. Measurements were made over a cable modem link with a 384 Kb/s uplink and a 1.5 Mb/s downlink. LAN data were obtained on a 100 Mb/s full-duplex LAN.
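For a rough sense of scale on such a link, the back-of-the-envelope calculation below estimates raw transfer times. The 1 MB file size and the one-tenth fraction are hypothetical inputs, and latency, protocol overhead, and compression are ignored.

```python
UPLINK_BPS = 384_000        # 384 Kb/s uplink, from the slide
DOWNLINK_BPS = 1_500_000    # 1.5 Mb/s downlink, from the slide


def transfer_seconds(num_bytes, bits_per_second):
    """Time to move num_bytes over a link, ignoring latency and overhead."""
    return num_bytes * 8 / bits_per_second


file_bytes = 1_000_000      # hypothetical 1 MB file being saved
full_upload = transfer_seconds(file_bytes, UPLINK_BPS)
tenth_upload = transfer_seconds(file_bytes / 10, UPLINK_BPS)
full_download = transfer_seconds(file_bytes, DOWNLINK_BPS)

print(f"whole file up:   {full_upload:.1f} s")
print(f"one tenth up:    {tenth_upload:.1f} s")
print(f"whole file down: {full_download:.1f} s")
```

Sending the whole file upstream takes about 21 seconds, long enough for an auto-save to stall an editor noticeably; sending a tenth of the bytes takes about 2 seconds.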

Conclusion: Under normal circumstances, LBFS consumes 90% less bandwidth than traditional file systems. It makes transparent remote file access a viable and less frustrating alternative to running interactive programs on remote machines.