Network File Systems. Victoria Krafft, CS 614, 10/4/05.

General Idea: People move around. Machines may want to share data. Want a system with: no new interface for applications; no need to copy all the data; no space-consuming version control.

Network File Systems Diagram from

A Brief History: Network File System (NFS) developed in 1984: simple client-server model, some problems. Andrew File System (AFS) developed in 1985: better performance, more client-side caching. SFS (1999): NFS can be run on untrusted networks.

Lingering Issues: Central server is a major bottleneck. All choices still require lots of bandwidth. LANs are getting faster and lower latency. Remote memory is faster than local disk. ATM is faster with more nodes sending data.

Cooperative Caching: Michael D. Dahlin, Randolph Y. Wang, Thomas E. Anderson, and David A. Patterson, 1994. ATM and Myrinet provide faster, low-latency networks. This makes remote memory 10-20x faster than disk. Want to get data from the memory of other clients rather than from the server disk.

Cooperative Caching: Data can be found in: 1. Local memory; 2. Server memory; 3. Other client memory; 4. Server disk. How should we distribute cache data?
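The four places a block can be found suggest a simple lookup order. The following is a minimal sketch, assuming each cache is a plain dictionary; `read_block` and its parameters are hypothetical names for illustration and do not come from the paper.

```python
def read_block(block_id, local_cache, server_cache, peer_caches, disk):
    """Return (data, source), trying the four locations in the slide's order."""
    # 1. Local memory: cheapest, no network traffic.
    if block_id in local_cache:
        return local_cache[block_id], "local memory"
    # 2. Server memory: one network round trip, no disk access.
    if block_id in server_cache:
        data, source = server_cache[block_id], "server memory"
    else:
        # 3. Other client memory: still avoids the disk.
        for peer in peer_caches:
            if block_id in peer:
                data, source = peer[block_id], "other client memory"
                break
        else:
            # 4. Server disk: slowest path.
            data, source = disk[block_id], "server disk"
    # Keep a copy locally so the next read is a local hit (assumed policy).
    local_cache[block_id] = data
    return data, source
```

How a real system finds the peer holding a block (step 3) is one of the design decisions discussed next; the linear scan here is only a placeholder.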

Design Decisions (decision-tree figure). Questions: Private/Global Coop. Cache? Coordinated Cache Entries? Static/Dynamic Partition? Block Location? Branch labels: Private, Global; No Coordination, Coordination; Static, Dynamic; Any Client, Fixed. Algorithms at the leaves: Direct Client Cooperation, Greedy Forwarding, Cent. Coord, N-Chance, Weighted LRU, Hash.

Direct Client Cooperation: Active clients use idle client memory as a backing store. Simple. Clients don’t get data from other active clients.

Greedy Forwarding: Each client manages its local cache greedily. Server keeps track of the contents of client caches. Still potentially large amounts of data duplication. No major cost for performance improvements.

Centrally Coordinated Caching: Client cache split into two parts, local and global.

N-Chance Forwarding: Clients prefer to cache singlets, blocks stored in only one client cache. Instead of discarding a singlet, set recirculation count to n and pass on to a random other client.
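The eviction rule can be sketched as a toy simulation. This is a minimal sketch under stated assumptions, not the paper's implementation: the `Cluster` class and its method names are hypothetical, each client cache is a plain LRU list, a forwarded singlet has its count decremented each time it is evicted again without being referenced, and a client receiving a forwarded block simply drops its own least recently used block to make room.

```python
import random
from collections import OrderedDict

class Cluster:
    def __init__(self, names, capacity, n=2, seed=0):
        # Each cache maps block id -> recirculation count, in LRU order.
        # A count of None marks a block placed by a local reference.
        self.caches = {name: OrderedDict() for name in names}
        self.capacity = capacity
        self.n = n
        self.rng = random.Random(seed)

    def holders(self, block):
        """Names of the clients currently caching this block."""
        return [name for name, cache in self.caches.items() if block in cache]

    def access(self, name, block):
        """Local reference: move the block to the MRU end, clearing its count."""
        cache = self.caches[name]
        cache.pop(block, None)
        while len(cache) >= self.capacity:
            victim, count = cache.popitem(last=False)  # evict the LRU block
            self._evict(name, victim, count)
        cache[block] = None

    def _evict(self, name, block, count):
        if self.holders(block):
            return  # another client still caches it: not a singlet, discard
        # First eviction of a singlet sets the count to n; later ones decrement.
        count = self.n if count is None else count - 1
        if count <= 0:
            return  # out of chances, discard
        peer = self.rng.choice([p for p in self.caches if p != name])
        target = self.caches[peer]
        if len(target) >= self.capacity:
            target.popitem(last=False)  # assumed: receiver drops its LRU block
        target[block] = count
```

With two clients of capacity 2 and n = 2, a block evicted from A moves to B with count 2, returns to A with count 1 when B evicts it, and is discarded the next time it reaches the end of an LRU list.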

Sensitivity (figures): Variation in Response Time with Client Cache Size; Variation in Response Time with Network Latency.

Simulation results (figures): Average read response time; Server load.

Simulation results (figure): Slowdown.

Results: N-Chance forwarding is close to the best possible performance. Requires clients to trust each other. Requires a fast network.

Serverless NFS: Thomas E. Anderson, Michael D. Dahlin, Jeanna M. Neefe, David A. Patterson, Drew S. Roselli, and Randolph Y. Wang, 1995. Eliminates the central server. Takes advantage of ATM and Myrinet.

Starting points: RAID: redundancy if nodes leave or fail. LFS: recovery when the system fails. Zebra: combines LFS and RAID for distributed systems. Multiprocessor cache consistency: invalidating stale cache info.

To Eliminate Central Servers: Scalable distributed metadata, which can be reconfigured after a failure. Scalable division into groups for efficient storage. Scalable log cleaning.

How it works: Each machine has one or more roles: 1. Client; 2. Storage Server; 3. Manager; 4. Cleaner. Management is split among metadata managers. Disks are clustered into stripe groups for scalability. Cooperative caching among clients.

xFS: xFS is a prototype of the serverless network file system. It lacks a couple of features: recovery is not completed; it doesn’t calculate or distribute new manager or stripe group maps; no distributed cleaner.

File Read

File Write: Buffered into segments in local memory. Client commits to storage. Client notifies managers of modified blocks. Managers update index nodes & imaps. Periodically, managers log changes to stable storage.
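The write path above can be sketched as a small in-memory model. This is a hedged illustration only: the classes `Storage`, `Manager`, and `Client`, the `segment_blocks` parameter, and the single manager and single storage server are all assumptions made for brevity, and the sketch leaves out striping, parity, and index nodes.

```python
class Storage:
    """Stands in for the storage servers; holds committed segments."""
    def __init__(self):
        self.segments = {}

    def commit(self, seg_id, blocks):
        self.segments[seg_id] = list(blocks)

class Manager:
    """Tracks where each block lives (a simplified imap)."""
    def __init__(self):
        self.imap = {}    # (file_id, block_no) -> (segment id, slot)
        self.dirty = []   # changes not yet logged
        self.log = []     # stands in for stable storage

    def notify(self, updates):
        for key, location in updates:
            self.imap[key] = location
            self.dirty.append((key, location))

    def checkpoint(self):
        # Periodically log the accumulated changes.
        self.log.extend(self.dirty)
        self.dirty.clear()

class Client:
    def __init__(self, storage, manager, segment_blocks=4):
        self.storage = storage
        self.manager = manager
        self.segment_blocks = segment_blocks
        self.buffer = []  # (file_id, block_no, data) awaiting commit
        self.next_seg = 0

    def write(self, file_id, block_no, data):
        # Buffer the write in local memory until a segment fills up.
        self.buffer.append((file_id, block_no, data))
        if len(self.buffer) >= self.segment_blocks:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        seg_id, self.next_seg = self.next_seg, self.next_seg + 1
        # Commit the segment to storage, then tell the manager what moved.
        self.storage.commit(seg_id, [data for _, _, data in self.buffer])
        self.manager.notify(
            [((f, b), (seg_id, slot)) for slot, (f, b, _) in enumerate(self.buffer)]
        )
        self.buffer = []
```

Batching writes into segments is what lets the client commit one large sequential write instead of many small ones, which is the property xFS inherits from LFS and Zebra.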

Distributing File Management: First Writer policy: management of a file goes to whoever created the file. (Table footnote: *does not include all local hits.)

Cleaning: Segment utilization is maintained by the segment writer. Segment utilization is stored in s-files. Cleaning is controlled by the stripe group leader. Optimistic concurrency control resolves cleaning/writing conflicts.

Recovery: Several steps are O(N^2), but can be run in parallel. (Figure: Steps for Recovery.)

xFS Performance (figures): Aggregate Bandwidth Writing 10 MB files; Aggregate Bandwidth Reading 10 MB files. Plot annotations: NFS max with 2 clients; AFS max with 32 clients; AFS max with 12 clients; NFS max with 2 clients.

xFS Performance (figure): Average time to complete the Andrew benchmark, varying the number of simultaneous clients.

System Variables (figures): Aggregate Large-Write Bandwidth with Different Storage Server Configurations; Variation in Average Small File Creation Speed with More Managers.

Possible Problems: System relies on a secure network between machines and on trusted kernels on distributed nodes. Testing was done on Myrinet.

Low-Bandwidth NFS: Want efficient remote access over slow or wide area networks. A file system is better than CVS or copying all the data over. Want close-to-open consistency.

LBFS: Large client cache containing the user’s working set of files. Don’t send all the data: reconstitute files from previous data, and only send changes.

File indexing: Files are divided into non-overlapping chunks between 2K and 64K. Chunk boundaries are chosen using Rabin fingerprints of 48-byte regions. Chunks are identified by SHA-1 hash, indexed on the first 64 bits. Index is stored in a database; hashes are recomputed before use to avoid synchronization issues.
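The chunking scheme can be illustrated with a short sketch. To stay self-contained it substitutes a simple polynomial rolling hash for true Rabin fingerprints, and the boundary rule (low 13 bits of the hash all set), `BASE`, `MOD`, and the function names are assumptions chosen for illustration. Only the 48-byte window, the 2K and 64K limits, and the 64-bit SHA-1 index key come from the slide.

```python
import hashlib

WINDOW = 48               # bytes covered by the rolling hash
MIN_CHUNK = 2 * 1024      # no boundary before 2K
MAX_CHUNK = 64 * 1024     # forced boundary at 64K
MASK = (1 << 13) - 1      # assumed boundary test: low 13 bits all ones
BASE = 257                # assumed rolling-hash parameters
MOD = (1 << 61) - 1

def chunks(data: bytes):
    """Split data into content-defined, non-overlapping chunks."""
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the oldest byte in the window
    out, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Slide the window: drop the byte that fell out of it.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        length = i + 1 - start
        if length >= MAX_CHUNK or (length >= MIN_CHUNK and (h & MASK) == MASK):
            out.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        out.append(data[start:])  # trailing chunk may be shorter than 2K
    return out

def index_key(chunk: bytes) -> bytes:
    """Database key: the first 64 bits of the chunk's SHA-1 hash."""
    return hashlib.sha1(chunk).digest()[:8]
```

Because boundaries depend only on nearby content, an insertion near the start of a file shifts byte offsets but tends to leave later chunk boundaries, and therefore later chunk hashes, unchanged.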

Protocol: Based on NFS, with added GETHASH, MKTMPFILE, TMPWRITE, CONDWRITE, COMMITTMP. Security infrastructure from SFS. Whole-file caching. Retrieve from server on read unless a valid copy is in the cache. Write back to server when the file is closed.
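A toy model of the write-back path shows how these calls fit together. It is a sketch under stated assumptions, not the wire protocol: `ToyServer`, `write_back`, and the method signatures are hypothetical, chunks are fixed-size rather than content-defined, lookups use the full SHA-1 digest, and calls are issued one at a time. The client offers each chunk by hash with CONDWRITE, sends data with TMPWRITE only when the server does not already have the chunk, and finishes with COMMITTMP.

```python
import hashlib

CHUNK = 4  # toy fixed-size chunks; LBFS itself uses content-defined chunks

def split(data):
    return [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]

def sha1(chunk):
    return hashlib.sha1(chunk).digest()

class ToyServer:
    def __init__(self):
        self.files = {}     # path -> file contents
        self.chunk_db = {}  # SHA-1 digest -> chunk bytes
        self.tmp = {}       # temp file id -> {offset: chunk}

    def store(self, path, data):
        self.files[path] = data
        for chunk in split(data):
            self.chunk_db[sha1(chunk)] = chunk

    def mktmpfile(self, fd):
        self.tmp[fd] = {}

    def condwrite(self, fd, offset, digest):
        # Write the chunk from the database if the hash is known.
        chunk = self.chunk_db.get(digest)
        if chunk is None:
            return False  # stands in for a "hash not found" reply
        self.tmp[fd][offset] = chunk
        return True

    def tmpwrite(self, fd, offset, data):
        self.tmp[fd][offset] = data

    def committmp(self, fd, path):
        # Replace the target file with the assembled temp file.
        parts = self.tmp.pop(fd)
        self.store(path, b"".join(parts[o] for o in sorted(parts)))

def write_back(server, path, data, fd=1):
    """Write data to path on close; return the number of data bytes sent."""
    sent, offset = 0, 0
    server.mktmpfile(fd)
    for chunk in split(data):
        if not server.condwrite(fd, offset, sha1(chunk)):
            server.tmpwrite(fd, offset, chunk)
            sent += len(chunk)
        offset += len(chunk)
    server.committmp(fd, path)
    return sent
```

For example, rewriting a 12-byte file in which only the middle 4-byte chunk changed sends 4 bytes of data plus three hashes, which is the saving LBFS aims for on slow links.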

File Reads

File Writes

Implementation: LBFS server accesses the file system as an NFS client. Server creates a trash directory for temporary files. Server is inefficient when files are overwritten or truncated, which could be fixed by lower-level access. Client uses the xfs driver.

Evaluation

Bandwidth consumption: Much higher bandwidth for first build.

Application Performance

Bandwidth and Round Trip Time

Conclusions: New technologies open up new possibilities for network file systems. Cost of increased traffic over Ethernet may cause problems for xFS and cooperative caching.