CSE 542: Operating Systems

Slides:



Advertisements
Similar presentations
Consistency and Replication Chapter 7 Part II Replica Management & Consistency Protocols.
Advertisements

CS-550: Distributed File Systems [SiS]1 Resource Management in Distributed Systems: Distributed File Systems.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts Amherst Operating Systems CMPSCI 377 Lecture.
Distributed Systems 2006 Styles of Client/Server Computing.
Other File Systems: AFS, Napster. 2 Recap NFS: –Server exposes one or more directories Client accesses them by mounting the directories –Stateless server.
Computer Science Lecture 21, page 1 CS677: Distributed OS Today: Coda, xFS Case Study: Coda File System Brief overview of other recent file systems –xFS.
EEC-681/781 Distributed Computing Systems Lecture 3 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
Concurrency Control & Caching Consistency Issues and Survey Dingshan He November 18, 2002.
University of Pennsylvania 11/21/00CSE 3801 Distributed File Systems CSE 380 Lecture Note 14 Insup Lee.
Academic Year 2014 Spring. MODULE CC3005NI: Advanced Database Systems “DATABASE RECOVERY” (PART – 1) Academic Year 2014 Spring.
Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google∗
Presented by: Alvaro Llanos E.  Motivation and Overview  Frangipani Architecture overview  Similar DFS  PETAL: Distributed virtual disks ◦ Overview.
Distributed File Systems Concepts & Overview. Goals and Criteria Goal: present to a user a coherent, efficient, and manageable system for long-term data.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Distributed File Systems Steve Ko Computer Sciences and Engineering University at Buffalo.
CSC 456 Operating Systems Seminar Presentation (11/13/2012) Leon Weingard, Liang Xin The Google File System.
Jan 17, 2001CSCI {4,6}900: Ubiquitous Computing1 Announcements I will be out of town Monday and Tuesday to present at Multimedia Computing and Networking.
Advanced Operating Systems - Spring 2009 Lecture 21 – Monday April 6 st, 2009 Dan C. Marinescu Office: HEC 439 B. Office.
Distributed File Systems
Switch off your Mobiles Phones or Change Profile to Silent Mode.
Distributed File Systems Overview  A file system is an abstract data type – an abstraction of a storage device.  A distributed file system is available.
Distributed File System By Manshu Zhang. Outline Basic Concepts Current project Hadoop Distributed File System Future work Reference.
Introduction to DFS. Distributed File Systems A file system whose clients, servers and storage devices are dispersed among the machines of a distributed.
Introduction to Database Systems1. 2 Basic Definitions Mini-world Some part of the real world about which data is stored in a database. Data Known facts.
Eduardo Gutarra Velez. Outline Distributed Filesystems Motivation Google Filesystem Architecture The Metadata Consistency Model File Mutation.
Feb 1, 2001CSCI {4,6}900: Ubiquitous Computing1 Eager Replication and mobile nodes Read on disconnected clients may give stale data Eager replication prohibits.
GLOBAL EDGE SOFTWERE LTD1 R EMOTE F ILE S HARING - Ardhanareesh Aradhyamath.
HADOOP DISTRIBUTED FILE SYSTEM HDFS Reliability Based on “The Hadoop Distributed File System” K. Shvachko et al., MSST 2010 Michael Tsitrin 26/05/13.
Transaction Processing Concepts Muheet Ahmed Butt.
THE EVOLUTION OF CODA M. Satyanarayanan Carnegie-Mellon University.
Dsitributed File Systems
Distributed File System. Outline Basic Concepts Current project Hadoop Distributed File System Future work Reference.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
System Models Advanced Operating Systems Nael Abu-halaweh.
Mobility Victoria Krafft CS /25/05. General Idea People and their machines move around Machines want to share data Networks and machines fail Network.
Storage Systems CSE 598d, Spring 2007 Lecture 13: File Systems March 8, 2007.
OceanStore : An Architecture for Global-Scale Persistent Storage Jaewoo Kim, Youngho Yi, Minsik Cho.
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment, Enhanced Chapter 7: Advanced File System Management.
File-System Management
Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung
File System Implementation
Distributed File Systems
File System Implementation
NFS and AFS Adapted from slides by Ed Lazowska, Hank Levy, Andrea and Remzi Arpaci-Dussea, Michael Swift.
Today: Coda, xFS Case Study: Coda File System
A Redundant Global Storage Architecture
CSE 451: Operating Systems Winter Module 22 Distributed File Systems
Printed on Monday, December 31, 2018 at 2:03 PM.
Distributed File Systems
Distributed File Systems
Outline Announcements Lab2 Distributed File Systems 1/17/2019 COP5611.
CSE 451: Operating Systems Spring Module 21 Distributed File Systems
Chapter 2: Operating-System Structures
Distributed File Systems
Distributed File Systems
CSE 542: Operating Systems
CSE 451: Operating Systems Winter Module 22 Distributed File Systems
CSE 542: Operating Systems
CSE 451: Operating Systems Distributed File Systems
Chapter 15: File System Internals
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
Outline Review of Quiz #1 Distributed File Systems 4/20/2019 COP5611.
Mid term grades Mean = 48.59, Median = 48.5, Min = 40, Max = 56.
Review Stateless (NFS) vs Statefull (AFS)
File system implementation
Distributed File Systems
Outline for today Feasibility of a Serverless Distributed File System deployed on an Existing set of Desktop PCs – Microsoft research. ACM SIGMETRICS 2000.
Content Distribution Network
Chapter 2: Operating-System Structures
Distributed File Systems
Outline for today Oceanstore: An architecture for Global-Scale Persistent Storage – University of California, Berkeley. ASPLOS 2000 Feasibility of a Serverless.
Presentation transcript:

CSE 542: Operating Systems Outline Chapter 15: Distributed System Structures Chapter 16: Distributed File Systems AFS paper Should be familiar to you - ND uses AFS for most of its file storage 24-Feb-19 CSE 542: Operating Systems

Advantages of Distributed Systems Resource sharing Computation speedup Load sharing Reliability Replicated services - e.g. web services (yahoo.com) Network Operating Systems Explicit network service access Distributed Systems - transparent Data migration Computation migration Process migration 24-Feb-19 CSE 542: Operating Systems

CSE 542: Operating Systems Network constraints Specific system design depends on the network constraints LAN vs WAN (latency, reliability, available bandwidth, etc.) Naming and Name resolution (Internet address) Routing, data transmission, connection and other networking strategies Distributed File System as a Distributed “Operating” system service 24-Feb-19 CSE 542: Operating Systems

Distributed File System Naming and transparency: Location transparency: Name does not hint on the file’s physical storage location (/net/wizard/tmp is not location transparent) Location independence: Name does not have to be changed when the physical storage location changes AFS provides location independence (/afs/nd.edu/user37/surendar) 24-Feb-19 CSE 542: Operating Systems

CSE 542: Operating Systems Remote file access Caching scheme Cache consistency problem Blocks (NFS) to files (AFS) Cache location Main memory vs disk vs remote memory Cache update policy Write-through policy, delayed-write policy (consistency vs performance) Consistency (client initiated or server initiated) Depends on who maintains state 24-Feb-19 CSE 542: Operating Systems

Stateful vs stateless service Either server tracks each file access or it provides block service (stateless) AFS vs NFS Server crash looks like a slow server to stateless client. Server crash means that state has to be rebuilt in stateful server Server needs to perform orphan detection and elimination to detech “dead” clients in stateful service Stateless servers: larger requests packets, as each request carrys the complete state Replication - to improve availability 24-Feb-19 CSE 542: Operating Systems

CSE 542: Operating Systems AFS Developed in mid 80’s at CMU to support about 5000 workstations on campus Stateful server with call backs for invalidation Shared global name space Clusters of servers implement this name space at the granularity of volumes All client requests are encrypted AFS uses ACLs for directories and UNIX protection for files 24-Feb-19 CSE 542: Operating Systems

File operations and consistency semantics Each client provides a local disk cache Clients cache entire files (for the most part - AFS3 allows blocks) Large files pose problems with local cache and initial latency Clients register call back with server & Server notifies clients on a conflict read-write conflict to invalidate cache On close, data is written back to the server Directory and symbolic links are also cached in later versions AFS coexists with UNIX file systems and uses UNIX calls for cached copies 24-Feb-19 CSE 542: Operating Systems

Design principles for AFS and Coda Workstations have cycles to burn - use them Cache whenever possible Exploit file usage properties Temporary files are not stored in AFS Systems files use read-only replication Minimize system wide knowledge and change Trust the fewest possible entities Batch if possible 24-Feb-19 CSE 542: Operating Systems

CSE 542: Operating Systems Extra material Oceanstore: An architecture for Global-Scale Persistent Storage – University of California, Berkeley. ASPLOS 2000 Chord Content Distribution Network 24-Feb-19 CSE 542: Operating Systems

Content Distribution Networks (slides courtesy Girish Borkar: Udel) original content Replica congested Replica Not congested Client 24-Feb-19 CSE 542: Operating Systems

CSE 542: Operating Systems Persistent store E.g. files (traditional operating systems), persistent objects (in a object based system) Applications operate on objects in persistent store Powerpoint operates on a persistent .ppt file, mutating its contents Palm calendar operates on my calendar which is replicated in myYahoo, Palm Desktop and the Pilot itself Storage is cheap but maintenance is not ~ 4 $/GB 24-Feb-19 CSE 542: Operating Systems

Global Persistent Store Persistent store is fundamental for future ubiquitous computing because it allows "devices" to operate transparently, consistently and reliably on data. Transparent: Permits behavior to be independent of the device themselves Consistently: Allows users to safely access the same information from many different devices simultaneously. Reliably: Devices can be rebooted or replaced without losing vital configuration information 24-Feb-19 CSE 542: Operating Systems

Persistent store on a wide-scale 10 billion users, 10,000 files per user = 100 trillion files!! Information: should be separated from location. To achieve uniform and highly-available access to information, servers must be geographically distributed, but exploit caching close to clients for performance must be secure must be durable must be consistent 24-Feb-19 CSE 542: Operating Systems

Oceanstore system model: Data Utility CaliforniaStore IndianaStore USAStore SanJoseStore Ameritech End User with roaming access 24-Feb-19 CSE 542: Operating Systems

Oceanstore system model: Data Utility CaliforniaStore IndianaStore USAStore SanJoseStore Ameritech End User with roaming access 24-Feb-19 CSE 542: Operating Systems

CSE 542: Operating Systems Oceanstore Goals Untrusted infrastructure (utility model – telephone) Only clients can be trusted Servers can crash, or leak information to third parties Most of the servers are working correctly most of the time Class of trusted servers that can carry out protocols on the clients behalf (financially liable for integrity of data) Nomadic Data Access Data can be cached anywhere, anytime (promiscuous caching) Continuous introspective monitoring to locate data close to the user 24-Feb-19 CSE 542: Operating Systems

Oceanstore Persistent Object Named by a globally unique id (GUID) Such GUIDs are hard to use. If you are expecting 10 trillion files, your GUID will have to be a long (say 128 bit) ID rather than a simple name passwd vs 12agfs237dfdfhj459uxzozfk459ldfnhgga self-certifying names secureHash(/id=surendar,ou=uga,key=<SecureKey>/etc/passwd) -> uniqueId Map uniqueId->GUID Users would use symbolic links for easy usage /etc/passwd -> uniqueId 24-Feb-19 CSE 542: Operating Systems

CSE 542: Operating Systems SecureHash Pros: The self-certifying name specifies my access rights Cons: If I lose the key, the data is lost Key management issues Keys can be upgraded Keys can be revoked How do we share data? 24-Feb-19 CSE 542: Operating Systems

CSE 542: Operating Systems Access Control All read-shared-users share an encryption key Revocation: Data should be deleted from all replicas Data should be re-encrypted New keys should be distributed Clients can still access old data till it is deleted in all replicas All writes are signed Validity checked by Access Control Lists (ACLs) If A says trust B, B says trust C, C says trust D, what can you infer about A ? D 24-Feb-19 CSE 542: Operating Systems

Oceanstore Persistent Object Objects are replicated on multiple servers. Replicated objects are not tied to particular servers i.e. floating replicas Replicas located by a probabilistic algorithm first before using a deterministic algorithm Data can be active or archival. Archival data is read-only and spread over multiple servers – deep archival storage 24-Feb-19 CSE 542: Operating Systems

CSE 542: Operating Systems Updates Objects are modified through updates (data is never overwritten) i.e. versioning system Application level conflict resolution Updates consist of a predicate and value pair. If a predicate evaluates to true, the corresponding value is applied. <room 453 free?>, <reserve room> <room 527 free?>, <reserve room> <else> <go to Jittery Joes> This is similar to Bayou 24-Feb-19 CSE 542: Operating Systems

CSE 542: Operating Systems Introspection Oceanstore uses introspection to monitor system behavior Use this information for cluster recognition Use this information for replica management 24-Feb-19 CSE 542: Operating Systems

MSR Serverless Distributed File System They’ve actually implemented this system within Microsoft and hence have real results Assumption 1: not-fully-trusted environment Assumption 2: Disk space is not that free Each disk is partitioned into three areas: Scratch area – for local computations Global storage area Local cache for global storage 24-Feb-19 CSE 542: Operating Systems

Efficiency consideration Compress data in storage Coalesce distinct files that have identical contents Probably an artifact of Windows environment that stores files in specific locations e.g. c:\windows\system\ File are replicated Machines that are topologically close Machines that are lightly loaded Non-cache reads and writes to prevent buffer cache pollution 24-Feb-19 CSE 542: Operating Systems

CSE 542: Operating Systems Replica management Files in a directory are replicated together When new machines join, its data is replicated to other machines Replicas of other files are moved into the new machine When machine leaves, the data in that machine is replicated in other machines from other replicas 24-Feb-19 CSE 542: Operating Systems

CSE 542: Operating Systems Security File updates are digitally signed File contents are encrypted before replication Convergent encryption to coalesce encrypted file Encryption: Hash(file contents) -> uniqueHash Encrypt(unencrypterfile, uniqueHash)->encryptedfile User1: encrypt(UserKey1, uniqueHash) -> Key1 User2: encrypt(UserKey2, uniqueHash) -> Key2 Decryption User1: decrypt(UserKey1, Key1) -> uniqueHash Decrypt(encryptedfile, uniqueHash) -> unencryptedfile 24-Feb-19 CSE 542: Operating Systems

CSE 542: Operating Systems Application API Related read, write operations to objects form a session (defined by the application developer) Users specify the session guarantees required for each session Applications can register call back functions for exceptions 24-Feb-19 CSE 542: Operating Systems

Transactions (Database technology) A transaction is a program unit that must be executed atomically; either the entire unit is executed or none at all. The transaction either completes in its entirety, or it does not (or at least, nothing appears to have happened). A transaction can generally be thought of as a sequence of reads and writes, which is either committed or aborted. A committed transaction is one that has been completed entirely and successfully, whereas an aborted transaction is one that has not. If a transaction is aborted, then the state of the system must be rolled-back to the state it had before the aborted transaction began. 24-Feb-19 CSE 542: Operating Systems

CSE 542: Operating Systems ACID semantics Atomicity – each transaction is atomic, every operation succeeds or none at all Consistency – maintaining correct invariants across the data before and after the transaction Isolation - either has the value before the atomic action or after it, but never intermediate Durability – persistent on stable storage (backups, transaction logging, checkpoints) 24-Feb-19 CSE 542: Operating Systems

CSE 542: Operating Systems Relaxed semantics Relax the ACID constraints We could relax consistency for better performance (ala Bayou) where you are willing to tolerate inconsistent data for better performance. For example, you are willing to work with partial calendar update and are willing to work with partial information rather than wait for confirmed data. More on this later on in the course. 24-Feb-19 CSE 542: Operating Systems