Enhancements to NFS 王信富 R88725039 2000/11/6

Introduction
File system modules
–Directory module
–File module
–Access control module
–File access module
–Block module

Introduction
Distributed file system requirements
–Transparency
»Access transparency
»Location transparency
»Scaling transparency
–Consistency
–Security

Sun NFS architecture

Andrew file system architecture

Mobility enhancement
Mobile file system (MFS)

Mobile file system (MFS)
Client modules

Mobile file system (MFS)
Proxy modules
»Source: Maria-Teresa Segarra, IRISA Research Institute, Campus de Beaulieu

Mobility enhancement (cont.)
NFS/M
–Enables the mobile user to access information regardless of
»the location of the user
»the state of the communication channel
»the state of the data server

NFS/M architecture

NFS/M modules

NFS/M modules
Cache Manager (CM)
–All file system operations on cached objects in the local disk cache are managed by the CM
–It functions only in the connected phase

NFS/M modules
Proxy Server (PS)
–Emulates the functionality of the remote NFS server using the cached file system objects in the local disk cache
–It functions in the disconnected phase

NFS/M modules
Reintegrator (RI)
–Propagates the changes made to data objects in the local disk cache during the disconnected period back to the NFS server
–Three tasks for the RI:
»Conflict detection
»Update propagation
»Conflict resolution
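
As a rough illustration of how the RI's first two tasks fit together, here is a minimal Python sketch of version-based reintegration; the Server stand-in, the record layout, and all names are assumptions for illustration, not the NFS/M implementation (conflict resolution is left as a policy hook).

    class Conflict(Exception):
        pass

    class Server:
        """Stand-in for the remote NFS server: path -> (version, data)."""
        def __init__(self):
            self.store = {}

        def version(self, path):
            return self.store.get(path, (0, b""))[0]

        def write(self, path, data):
            v = self.version(path) + 1
            self.store[path] = (v, data)
            return v

    def reintegrate(dirty_cache, server):
        """Propagate updates made while disconnected back to the server."""
        for path, entry in dirty_cache.items():
            # Conflict detection: the server copy changed while we were away.
            if server.version(path) != entry["base_version"]:
                raise Conflict(path)   # handed to a conflict-resolution policy
            # Update propagation: replay the cached write on the server.
            entry["base_version"] = server.write(path, entry["data"])

    server = Server()
    server.write("/doc", b"v1")
    cache = {"/doc": {"base_version": 1, "data": b"edited while offline"}}
    reintegrate(cache, server)
    assert server.store["/doc"] == (2, b"edited while offline")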

NFS/M modules
Data Prefetcher (DP)
–Improves data access performance
–Data prefetching techniques can be classified into two categories:
»Informed prefetching
»Predictive prefetching
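
The two categories can be contrasted with a small sketch: informed prefetching consumes explicit hints from the application, while predictive prefetching guesses from access history. The deliberately naive successor-counting predictor and all names here are illustrative assumptions.

    from collections import defaultdict

    def informed_prefetch(cache, fetch, hints):
        """Informed: the application declares which files it will need."""
        for path in hints:
            if path not in cache:
                cache[path] = fetch(path)

    class PredictivePrefetcher:
        """Predictive: guess the next file from past access pairs."""
        def __init__(self):
            self.successors = defaultdict(lambda: defaultdict(int))
            self.last = None

        def record(self, path):
            if self.last is not None:
                self.successors[self.last][path] += 1
            self.last = path

        def predict(self, path):
            followers = self.successors.get(path)
            return max(followers, key=followers.get) if followers else None

    p = PredictivePrefetcher()
    for path in ["/a", "/b", "/a", "/b", "/a", "/c"]:
        p.record(path)
    assert p.predict("/a") == "/b"   # "/a" was followed by "/b" most often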

NFS/M modules

Phases of NFS/M
The NFS/M client maintains an internal state, termed the phase, which indicates how file system service is provided under different conditions of network connectivity
Three phases:
–Connected phase
–Disconnected phase
–Reintegration phase
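
A minimal sketch of phase-based dispatch, assuming the module split described on the preceding slides (CM serves requests while connected, PS while disconnected, RI runs on reconnection); the class names and transition triggers are simplified assumptions.

    from enum import Enum, auto

    class Phase(Enum):
        CONNECTED = auto()
        DISCONNECTED = auto()
        REINTEGRATION = auto()

    class NFSMClient:
        def __init__(self, cache_manager, proxy_server, reintegrator):
            self.phase = Phase.CONNECTED
            self.cm, self.ps, self.ri = cache_manager, proxy_server, reintegrator

        def read(self, path):
            # CM handles cached objects while connected; PS emulates the
            # server from the local disk cache while disconnected.
            if self.phase is Phase.CONNECTED:
                return self.cm.read(path)
            return self.ps.read(path)

        def on_disconnect(self):
            self.phase = Phase.DISCONNECTED

        def on_reconnect(self):
            self.phase = Phase.REINTEGRATION
            self.ri.run()                 # replay disconnected-period updates
            self.phase = Phase.CONNECTED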

Phases of NFS/M
»Source: John C.S. Lui, Oldfield K.Y. So, T.S. Tam, Department of Computer Science & Engineering

Case: Wireless Andrew
It builds on the university's wired network infrastructure, which currently provides 10/100 Mb/s Ethernet service
To supply high-speed wireless service to the campus, Lucent WaveLAN equipment has been installed
For wireless access off campus, or otherwise out of range of the WaveLAN network, cellular digital packet data (CDPD) is used

Case: Wireless Andrew
»Source: "Wireless Andrew [mobile computing for university campus]", IEEE, 1999


Scalability of NFS Student: 朱漢農 R 2000/11/6

NFS - Scalability
AFS - Scalability
NFS Enhancements - Spritely NFS, NQNFS, WebNFS, NFS Version 4
AFS Enhancements - RAID, LFS, xFS
Frangipani

NFS - Scalability
The performance of a single server can be increased by the addition of processors, disks and controllers
When the limits of that process are reached, additional servers must be installed and the filesystems must be reallocated between them

NFS - Scalability (cont'd)
The effectiveness of that strategy is limited by the existence of 'hot spot' files
When loads exceed the maximum performance, a distributed file system that supports replication of updatable files, or one that reduces the protocol traffic by the caching of whole files, may offer a better solution

AFS - Scalability
The differences between AFS and NFS are attributable to the identification of scalability as the most important design goal
The key strategy is the caching of whole files in client nodes

AFS - Scalability (cont'd)
Whole-file serving: the entire contents of directories and files are transmitted to client computers by AFS servers
Whole-file caching: once a copy of a file has been transferred to a client computer, it is stored in a cache on the local disk
The cache is permanent, surviving reboots of the client computer
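
A minimal sketch of whole-file caching, assuming a fetch_file(path) callable that returns the file's entire contents; the cache directory stands in for the permanent on-disk cache that survives client reboots. All names are illustrative.

    import hashlib
    from pathlib import Path

    CACHE_DIR = Path("/var/tmp/afs-cache")     # illustrative location

    def open_cached(path, fetch_file):
        """Return the file's contents, transferring the whole file once."""
        CACHE_DIR.mkdir(parents=True, exist_ok=True)
        local = CACHE_DIR / hashlib.sha1(path.encode()).hexdigest()
        if not local.exists():            # first access: whole-file transfer
            local.write_bytes(fetch_file(path))
        return local.read_bytes()         # later accesses are purely local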

NFS enhancement - Spritely NFS
An implementation of the NFS protocol with the addition of open and close calls
The parameters of the Sprite open operation specify a mode and include counts of the number of local processes that currently have the file open for reading and for writing
Spritely NFS implements a recovery protocol that interrogates a list of clients to recover the full open-files table
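
A sketch of the server-side open-files table this implies; the call shape (client id plus reader and writer counts) follows the slide, while the write-sharing test and all names are simplified assumptions (the callbacks that disable caching at existing holders are omitted).

    class SpritelyServer:
        def __init__(self):
            # path -> {client_id: (n_readers, n_writers)}
            self.open_table = {}

        def open(self, client_id, path, n_readers, n_writers):
            entry = self.open_table.setdefault(path, {})
            entry[client_id] = (n_readers, n_writers)
            others = [rw for c, rw in entry.items() if c != client_id]
            others_write = any(w > 0 for _, w in others)
            # Write-sharing: someone else writes, or we write while others
            # have the file open; in either case caching must be bypassed.
            cacheable = not (others_write or (n_writers > 0 and others))
            return {"cacheable": cacheable}

        def close(self, client_id, path):
            self.open_table.get(path, {}).pop(client_id, None)

    srv = SpritelyServer()
    assert srv.open("A", "/f", 1, 0)["cacheable"]        # lone reader
    assert not srv.open("B", "/f", 0, 1)["cacheable"]    # write-sharing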

NFS enhancement - NQNFS
Maintains similar client-related state concerning open files, but uses leases to aid recovery after a server crash
Callbacks are used in a similar manner to Spritely NFS to request that clients flush their caches when a write request occurs
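
A sketch of lease-keeping on the server: because a lease is soft state with a fixed term, a restarted server only has to wait out the longest possible lease before granting new ones. The duration and names are illustrative assumptions.

    import time

    LEASE_SECONDS = 30                    # illustrative term

    class Lease:
        def __init__(self, client_id, path):
            self.client_id, self.path = client_id, path
            self.expires = time.monotonic() + LEASE_SECONDS

        def valid(self):
            return time.monotonic() < self.expires

    class LeaseTable:
        def __init__(self):
            self.leases = []

        def grant(self, client_id, path):
            self.leases = [l for l in self.leases if l.valid()]
            lease = Lease(client_id, path)
            self.leases.append(lease)
            return lease

        def holders(self, path):
            """Clients to call back (cache flush) before a write proceeds."""
            return [l.client_id for l in self.leases
                    if l.path == path and l.valid()]

    table = LeaseTable()
    table.grant("clientA", "/f")
    assert table.holders("/f") == ["clientA"]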

NFS enhancement - WebNFS
Makes it possible for application programs to become clients of NFS servers anywhere in the Internet (using the NFS protocol directly)
Enables Internet applications that share data directly, such as multi-user games or clients of large dynamic databases

NFS enhancement - NFS version 4
Will include the features of WebNFS
The use of callbacks or leases to maintain consistency
On-the-fly recovery
Scalability will be improved by using proxy servers in a manner analogous to their use in the Web

AFS enhancements
RAID
Log-structured file storage
xFS
–Implements a software RAID storage system, striping file data across disks on multiple computers, together with a log-structuring technique

Frangipani
A highly scalable distributed file system developed and deployed at the Digital Systems Research Center

Frangipani (cont'd)
The responsibility for managing files and associated tasks is assigned to hosts dynamically
All machines see a unified file name space with coherent access to shared, updatable files

Frangipani - System Structure
Two totally independent layers:
1. Petal distributed virtual disk system
- Data is stored in a log-structured and striped format in the virtual disk store
- Provides a storage repository
- Provides highly available storage that can scale in throughput and capacity as resources are added to it
- Petal implements data replication for high availability, obviating the need for Frangipani to do so

Frangipani - System Structure (cont'd)
2. Frangipani server modules
- Provide names, directories, and files
- Provide a file system layer that makes Petal useful to applications while retaining and extending its good properties

Frangipani

Frangipani - Logging and Recovery
Uses write-ahead redo logging of metadata to simplify failure recovery and improve performance
User data is not logged
Each Frangipani server has its own private log in Petal
As long as the underlying Petal volume remains available, the system tolerates an unlimited number of Frangipani server failures
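
A minimal write-ahead redo-log sketch: each metadata update is logged, together with its new version number, before being applied in place, and user data bypasses the log entirely. The record layout and names are assumptions for illustration, not Frangipani's actual format.

    class RedoLog:
        """Stands in for one server's private log region in Petal."""
        def __init__(self):
            self.records = []

        def append(self, record):
            self.records.append(record)   # must be durable before proceeding

    class MetadataStore:
        def __init__(self, log):
            self.log = log
            self.blocks = {}              # block_id -> (version, contents)

        def update(self, block_id, new_contents):
            version = self.blocks.get(block_id, (0, None))[0] + 1
            # 1. Write-ahead: log the redo record first.
            self.log.append({"block": block_id, "version": version,
                             "contents": new_contents})
            # 2. Only then apply the change in place.
            self.blocks[block_id] = (version, new_contents)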

Frangipani - Logging and Recovery
Frangipani's locking protocol ensures that updates requested to the same data by different servers are serialized
Frangipani ensures that recovery applies only updates that were logged since the server acquired the locks that cover them, and for which it still holds the locks

Frangipani - Logging and Recovery
Recovery never replays a log record describing an update that has already been completed
For each block that a log record updates, the record contains a description of the changes and the new version number
During recovery, the changes to a block are applied only if the block version number is less than the record version number
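
The version test makes replay idempotent, as this standalone sketch shows; the record fields mirror the logging sketch above and are likewise illustrative.

    def recover(log_records, blocks):
        """Replay redo records, skipping updates already on disk."""
        for rec in log_records:
            current = blocks.get(rec["block"], (0, None))[0]
            # Apply only if the on-disk block is older than the record.
            if current < rec["version"]:
                blocks[rec["block"]] = (rec["version"], rec["contents"])

    log_records = [
        {"block": "inode-7", "version": 1, "contents": "len=0"},
        {"block": "inode-7", "version": 2, "contents": "len=4096"},
    ]
    disk = {"inode-7": (2, "len=4096")}   # update 2 already reached the disk
    recover(log_records, disk)
    assert disk["inode-7"] == (2, "len=4096")   # replay did not regress it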

Frangipani - Logging and Recovery
Frangipani reuses freed metadata blocks only to hold new metadata
At any time, only one recovery daemon is trying to replay the log region of a specific server
If a sector is damaged such that reading it returns a CRC error, Petal's built-in replication can recover it

Frangipani - Logging and Recovery
In both local UNIX file systems and Frangipani, a user can get better consistency semantics by calling fsync at suitable checkpoints

Frangipani - Synchronization and Cache Coherence
Frangipani uses multiple-reader/single-writer locks to implement the necessary synchronization
When the lock service detects conflicting lock requests, the current holder of the lock is asked to release or downgrade it to remove the conflict
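
A sketch of that conflict step: on a conflicting request the current holder is asked to flush its dirty data and then release the lock, or merely downgrade it when a downgrade suffices. The callback shape and names are assumptions, not the actual Frangipani lock protocol.

    READ, WRITE = "read", "write"

    class LockService:
        def __init__(self):
            self.holders = {}   # lock_id -> {server_id: mode}

        def acquire(self, lock_id, server_id, mode, ask_holder):
            holders = self.holders.setdefault(lock_id, {})
            for other, held in list(holders.items()):
                if other == server_id:
                    continue
                if held == WRITE or mode == WRITE:
                    ask_holder(other, lock_id)   # holder flushes dirty data
                    if mode == READ and held == WRITE:
                        holders[other] = READ    # downgrade removes conflict
                    else:
                        del holders[other]       # full release required
            holders[server_id] = mode

    svc = LockService()
    svc.acquire("inode-7", "S1", WRITE, ask_holder=lambda s, l: None)
    svc.acquire("inode-7", "S2", READ, ask_holder=lambda s, l: None)
    assert svc.holders["inode-7"] == {"S1": READ, "S2": READ}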

Frangipani - Synchronization and Cache Coherence
When a Frangipani server crashes, the locks that it owns cannot be released until appropriate recovery actions have been performed
When a Frangipani server's lease expires, the lock service will ask the clerk on another machine to perform recovery and release all locks belonging to the crashed server
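
A sketch of that crash path, with all names (the clerk objects, recover, release_all, private_logs) as illustrative assumptions:

    def on_lease_expired(crashed_server, lock_service, clerks, private_logs):
        """Run when a Frangipani server's lease lapses without renewal."""
        clerk = next(c for c in clerks if c.machine != crashed_server)
        # Recovery first: replay the dead server's private log so the shared
        # disk is consistent before its locks become available again.
        clerk.recover(private_logs[crashed_server])
        lock_service.release_all(crashed_server)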

Frangipani - Synchronization and Cache Coherence
Petal can continue operation in the face of network partitions, as long as a majority of the Petal servers remain up and in communication
The lock service continues operation as long as a majority of lock servers are up and in communication

Frangipani - Synchronization and Cache Coherence
If a Frangipani server is partitioned away from the lock service, it will be unable to renew its lease
If a Frangipani server is partitioned away from Petal, it will be unable to read or write the virtual disk

Frangipani - Adding Servers
The new server need only be told which Petal virtual disk to use and where to find the lock service
The new server contacts the lock service to obtain a lease, then determines which portion of the log space to use from the lease identifier

Frangipani - Removing Servers
Simply shut the server off
It is preferable for the server to flush all its dirty data and release its locks before halting, but this is not strictly needed

Frangipani - Server Halts Abruptly
Recovery will run on its log the next time one of its locks is needed, bringing the shared disk into a consistent state
Petal servers can also be added and removed transparently; lock servers are added and removed in a similar manner

Frangipani - Scaling
Operational latencies are unchanged and throughput scales linearly as servers are added

Frangipani - Scaling

Frangipani - Scaling (cont'd)
Performance is seen to scale well because there is no contention until the ATM links to the Petal servers are saturated
Since the virtual disk is replicated, each write from a Frangipani server turns into two writes to Petal

Frangipani - Conclusions
Provides its users with coherent, shared access to the same set of files, yet is scalable to provide more storage space, higher performance, and load balancing
It was feasible to build because of its two-layer structure, consisting of multiple file servers running the same file system code on top of a shared Petal virtual disk

Reference
C. A. Thekkath, T. Mann, and E. K. Lee, "Frangipani: A Scalable Distributed File System", Proc. 16th ACM Symposium on Operating Systems Principles (SOSP), 1997