Separating Abstractions from Resources in a Tactical Storage System Douglas Thain University of Notre Dame


Abstract
Users of distributed systems encounter many practical barriers between their jobs and the data they wish to access.
Problem: Users have access to many resources (disks), but are stuck with the abstractions (cluster NFS) provided by administrators.
Solution: Tactical Storage Systems allow any user to create, reconfigure, and tear down abstractions without bugging the administrator.

The Standard Model
[Figure: a transparent distributed filesystem layered over a shared disk.]

[Figure: two transparent distributed filesystems, each over its own shared disk, and four private disks, connected by FTP, SCP, RSYNC, HTTP, ...]

Problems with the Standard Model
Users encounter partitions in the WAN.
– Easy to access data inside cluster, hard outside.
– Must use different mechanisms on different links.
– Difficult to combine resources together.
Resources go unused.
– Disks on each node of a cluster.
– Unorganized resources in a department/lab.
Unnecessary cross-talk between users.
– User A demands async NFS for performance.
– User B demands sync NFS for consistency.
A global file system is not possible!

What if...
Users could easily access any storage?
I could borrow an unused disk for NFS?
An entire cluster could be used as storage?
Multiple clusters could be combined?
I could reconfigure structures without root?
– (Or bugging the administrator daily.)
Solution: Tactical Storage System (TSS)

Outline
Problems with the Standard Model
Tactical Storage Systems
– File Servers, Catalogs, Abstractions, Adapters
Applications:
– Remote Dynamic Linking in HEP Simulation
– Remote Database Access in HEP Simulation
– Expandable Filesystem for Experimental Data
– Expandable Database for Bioinformatics Simulation
Ongoing Work
– Malloc, Dynamic Views, DACLs, PINS
Final Thoughts

Tactical Storage Systems (TSS)
A TSS allows any node to serve as a file server or as a file system client.
All components can be deployed without special privileges, but with security.
Users can build up complex structures.
– Filesystems, databases, caches, ...
Two Independent Concepts:
– Resources: the raw storage to be used.
– Abstractions: the organization of storage.

[Figure: applications reach storage through adapters and three abstractions (Central Filesystem, Distributed Filesystem, Distributed Database), each built on file servers that export local file systems. The cluster administrator controls policy on all storage in the cluster; UNIX workstation owners control policy on each machine.]

Components of a TSS:
1 – File Servers
2 – Catalogs
3 – Abstractions
4 – Adapters

1 – File Servers
Unix-Like Interface
– open/close/read/write
– getfile/putfile to stream whole files
– opendir/stat/rename/unlink
Complete Independence
– choose friends
– limit bandwidth/space
– evict users?
Trivial to Deploy
– run server + setacl
– no privilege required
– can be thrown into a grid system
Flexible Access Control
[Figure: file server A and file server B, each exporting a local file system over the Chirp protocol and each controlled by its own owner.]

Access Control in File Servers
Unix Security is not Sufficient
– No global user database possible/desirable.
– Mapping external credentials to Unix gets messy.
Instead, Make External Names First-Class
– Perform access control on remote, not local, names.
– Types: Globus, Kerberos, Unix, Hostname, Address
Each directory has an ACL: globus:/O=NotreDame/CN=DThain RWLA RWL hostname:*.cs.nd.edu RL address: * RWLA
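The per-directory ACL idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `rights_for` and `allowed` are hypothetical, the letters R, W, L, A are treated as opaque rights (presumably read, write, list, admin), and wildcard matching uses shell-style patterns. It is not the Chirp server's actual implementation.

```python
from fnmatch import fnmatchcase

# Hypothetical ACL for one directory: (subject pattern, rights) pairs.
# Subjects are external names of the form "type:name", as on the slide.
ACL = [
    ("globus:/O=NotreDame/CN=DThain", "RWLA"),
    ("hostname:*.cs.nd.edu", "RL"),
]

def rights_for(subject, acl):
    """Return the union of rights from every entry whose pattern matches."""
    granted = set()
    for pattern, rights in acl:
        if fnmatchcase(subject, pattern):
            granted.update(rights)
    return granted

def allowed(subject, needed, acl):
    """True if the subject holds every right in `needed` (for example "RW")."""
    return set(needed) <= rights_for(subject, acl)
```

Because the check is made against the remote name itself, no mapping to a local Unix account is needed.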

Problem: Shared Namespace
[Figure: a file server whose ACL grants globus:/O=NotreDame/* the rights RWLAX; the files a.out, test.c, test.dat, and cms.exe all sit in one shared namespace.]

Solution: Reservation (V) Right
[Figure: the top-level ACL on the file server is O=NotreDame/CN=* V(RWLA), which allows mkdir only. After /O=NotreDame/CN=Monk and /O=NotreDame/CN=Ted each run mkdir, each new directory carries an ACL granting its creator RWLA and holds that user's own a.out and test.c.]
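A minimal sketch of how a reservation right might behave, assuming a toy in-memory server. The class `Server`, its methods, and the rule that a directory created with the W right inherits the parent's ACL are all assumptions for illustration; only the V(RWLA) behavior is taken from the slide.

```python
from fnmatch import fnmatchcase

class Server:
    """Toy directory tree mapping each directory path to its ACL.

    ACL entries are (pattern, rights, reserved_rights) triples; the third
    field is what V(...) hands to the creator of a new directory.
    """
    def __init__(self, root_acl):
        self.acls = {"/": list(root_acl)}

    def _entry(self, subject, path):
        for entry in self.acls[path]:
            if fnmatchcase(subject, entry[0]):
                return entry
        return None

    def mkdir(self, subject, parent, name):
        entry = self._entry(subject, parent)
        if entry is None:
            raise PermissionError(subject)
        _, rights, reserved = entry
        child = parent.rstrip("/") + "/" + name
        if "W" in rights:
            # Assumed behavior: an ordinary mkdir inherits the parent's ACL.
            self.acls[child] = list(self.acls[parent])
        elif "V" in rights:
            # Reservation: the new directory belongs to its creator alone.
            self.acls[child] = [(subject, reserved, "")]
        else:
            raise PermissionError(subject)
        return child

    def can_write(self, subject, path):
        entry = self._entry(subject, path)
        return entry is not None and "W" in entry[1]
```

With a root ACL of `("/O=NotreDame/CN=*", "V", "RWLA")`, any matching user can create a private directory but cannot write into the root or into another user's directory.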

2 - Catalogs
[Figure: file servers send periodic UDP updates to the catalog servers; the catalog servers publish the listing over HTTP as XML, TXT, or ClassAds.]
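The catalog's role can be sketched as a table of recent reports that silently forgets servers that stop reporting. This is an in-memory illustration only: the class `Catalog`, the `lifetime` parameter, and the report field `free_gb` are hypothetical, and the UDP and HTTP transports from the slide are deliberately left out.

```python
import time

class Catalog:
    """Remembers the latest report from each file server for a limited time."""
    def __init__(self, lifetime=300.0):
        self.lifetime = lifetime      # seconds before a silent server is dropped
        self.servers = {}             # name -> (report dict, time received)

    def update(self, name, report, now=None):
        """Record a periodic update from a file server."""
        now = time.time() if now is None else now
        self.servers[name] = (dict(report), now)

    def listing(self, now=None):
        """Return reports from servers heard from within `lifetime`."""
        now = time.time() if now is None else now
        return {name: report
                for name, (report, seen) in self.servers.items()
                if now - seen <= self.lifetime}

    def as_text(self, now=None):
        """Render the listing as one plain-text line per server."""
        lines = []
        for name, report in sorted(self.listing(now).items()):
            fields = " ".join(f"{k}={v}" for k, v in sorted(report.items()))
            lines.append(f"{name} {fields}")
        return "\n".join(lines)
```

Expiring stale entries means a server that is switched off simply disappears from the listing, with no explicit deregistration step.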

3 - Abstractions
An abstraction is an organizational layer built on top of one or more file servers.
End users choose what abstractions to employ.
Working Examples:
– CFS: Central File System
– DSFS: Distributed Shared File System
– DSDB: Distributed Shared Database
Others Possible?
– Distributed Backup System
– Striped File System (RAID/Zebra)

CFS: Central File System
[Figure: the application goes through the adapter and the CFS abstraction to a single file server that holds the file.]

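DSFS: Dist. Shared File System
[Figure: through the adapter and the DSFS abstraction, the application first looks up the file location via a pointer (ptr) on one file server, then accesses the data on the file server that holds the file.]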
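The lookup-then-access pattern can be sketched as follows, assuming pointers are stored as small "host:path" strings on a directory server. The classes `FileServer` and `DSFS`, the round-robin placement, and the `/data/N` naming are hypothetical choices made for this illustration; overwriting an existing name is not handled.

```python
class FileServer:
    """Toy file server offering whole-file putfile/getfile."""
    def __init__(self, name):
        self.name = name
        self.files = {}

    def putfile(self, path, data):
        self.files[path] = data

    def getfile(self, path):
        return self.files[path]

class DSFS:
    """Directory server holds pointers; data servers hold file contents."""
    def __init__(self, directory, data_servers):
        self.directory = directory
        self.data_servers = {s.name: s for s in data_servers}
        self._count = 0

    def write(self, path, data):
        names = sorted(self.data_servers)
        target = names[self._count % len(names)]   # round-robin placement
        self._count += 1
        remote = f"/data/{self._count}"
        self.data_servers[target].putfile(remote, data)
        self.directory.putfile(path, f"{target}:{remote}")   # the pointer

    def read(self, path):
        host, remote = self.directory.getfile(path).split(":", 1)  # lookup
        return self.data_servers[host].getfile(remote)             # access
```

Only the small pointer lives on the directory server, so adding data servers expands capacity without changing the namespace.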

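DSDB: Dist. Shared Database
[Figure: through the adapter and the DSDB abstraction, an insert creates a file on a file server and records it in the file index on the database server; a query goes to the database server, after which the application gets direct access to the file on its file server.]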
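A minimal sketch of the insert/query/direct-access flow, assuming the index is a plain list of metadata records. The class `DSDB`, the least-loaded placement rule, and the metadata keys `temp` and `mol` are hypothetical and stand in for whatever the real database server stores.

```python
class DSDB:
    """Index of metadata on a 'database server'; file bodies on file servers."""
    def __init__(self, servers):
        self.servers = servers    # host name -> {path: bytes}
        self.index = []           # (metadata dict, host, path) records

    def insert(self, metadata, data):
        """Create the file on the emptiest server, then record it in the index."""
        host = min(self.servers, key=lambda h: len(self.servers[h]))
        path = f"/file{len(self.index)}"
        self.servers[host][path] = data
        self.index.append((dict(metadata), host, path))
        return host, path

    def query(self, predicate):
        """Return the locations of every file whose metadata satisfies predicate."""
        return [(host, path) for meta, host, path in self.index if predicate(meta)]

    def fetch(self, host, path):
        """Direct access to a file server, bypassing the index."""
        return self.servers[host][path]
```

The database only answers "where", so bulk data never flows through it.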

4 - Adapter: Parrot
Like an OS Kernel
– Tracks procs, files, etc.
– Adds new capabilities.
– Enforces owner’s policies.
Delegated Syscalls
– Trapped via ptrace interface.
– Action taken by Parrot.
– Resources charged to Parrot.
User Chooses Abstraction
– Appears as a filesystem.
– Option: Timeout tolerance.
– Option: Consistency semantics.
– Option: Servers to use.
– Option: Auth mechanisms.
Abstractions: CFS, DSFS, DSDB
[Figure: system calls from tcsh, cat, and vi are trapped via ptrace by the adapter, which keeps a process table and a file table.]

Performance Summary
Nothing comes for free!
– System calls: order of magnitude slower.
– Memory bandwidth overhead: extra copies.
– TSS can drive network/switch to limits.
Compared to NFS Protocol:
– TSS slightly better on small operations. (no lookup)
– TSS much better in network bandwidth. (TCP)
– NFS caches, TSS doesn’t (today), mixed blessing.
On real applications:
– Measurable slowdown
– Benefit: far more flexible and scalable.

Outline
Problems with the Standard Model
Tactical Storage Systems
– File Servers, Catalogs, Abstractions, Adapters
Applications:
– Remote Dynamic Linking in HEP Simulation
– Remote Database Access in HEP Simulation
– Expandable Filesystem for Astrophysics Data
– Expandable Database for Mol. Dynamics Simulation
Ongoing Work
– Malloc, Dynamic Views, DACLs, PINS
Final Thoughts

Remote Dynamic Linking
Credit: Igor Sfiligoi, Fermi National Lab
Modular Simulation Needs Many Libraries
– Developed on workstations, then ported to grid.
– Selection of library depends on analysis technique.
Solution: Dynamic Link with TSS and FTP:
– LD_LIBRARY_PATH=/ftp/server.name/libs
– Send adapter along with job.
[Figure: on the grid node, the application's ld.so goes through the adapter's FTP driver, across the WAN with anonymous login, to an FTP server whose file system holds liba.so, libb.so, and libc.so. The job selects several MB from 60 GB of libraries.]

Related Work
Lots of file services for the Grid:
– GridFTP, Freeldr, NeST, IBP, SRB, RFIO, ...
– Adapter interfaces with many of these!
Why have another file server?
– Reason 1: Must have precise Unix semantics! Apps distinguish ENOENT vs EACCES vs EISDIR. FTP always returns error 550, regardless of error.
– Reason 2: TSS focused on easy deployment. No privilege required, no config files, no rebuilding, flexible access control, ...
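Why precise error codes matter can be shown with a small local experiment. This sketch assumes a Unix-like system (on Linux, opening a directory for reading fails with EISDIR); the helper names `open_errno` and `ftp_style` are hypothetical, and `ftp_style` merely imitates a protocol that reports every failure with the single code 550.

```python
import errno
import os
import tempfile

def open_errno(path):
    """Try to open `path` for reading; return the symbolic errno, or None."""
    try:
        with open(path, "rb"):
            return None
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))

def ftp_style(path):
    """What a client learns when every failure is collapsed into one code."""
    return None if open_errno(path) is None else "550"

if __name__ == "__main__":
    d = tempfile.mkdtemp()
    print(open_errno(os.path.join(d, "missing")))   # a missing file
    print(open_errno(d))                            # a directory
    print(ftp_style(d), ftp_style(os.path.join(d, "missing")))
```

An application that retries on one error and gives up on another cannot make that decision once both cases look identical.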

Remote Database Access
Credit: Sander Klous, NIKHEF
HEP Simulation Needs Direct DB Access
– App linked against Objectivity DB.
– Objectivity accesses filesystem directly.
– How to distribute application securely?
Solution: Remote Root Mount via TSS:
    parrot -M /=/chirp/fileserver/rootdir
DB code can read/write/lock files directly.
[Figure: the job script runs sim.exe under the adapter, which uses CFS with GSI authentication across the WAN to reach a TSS file server whose file system holds the DB data, libdb.so, and sim.exe.]

Performance on EDG Testbed
[Table: Setup (Unix, LAN/NFS, LAN/TSS, WAN/TSS) against Time to Init and Time/Event.]

Expandable Filesystem for Experimental Data
Project GRAND. Credit: John Poirier, Notre Dame Astrophysics Dept.
[Figure: data lands on a buffer disk at 10 GB/day today (could be lots more) and is written to daily tapes that make up a 25-year archive. The analysis code reads the buffer disk, so it can only analyze the most recent data.]

Expandable Filesystem for Experimental Data (with TSS)
Project GRAND. Credit: John Poirier, Notre Dame Astrophysics Dept.
[Figure: the same buffer disk, daily tapes, and 25-year archive, now backed by four file servers joined into a Distributed Shared Filesystem. The analysis code reaches it through the adapter and can analyze all data over large time scales.]

Appl: Distributed MD Database
State of Molecular Dynamics Research:
– Easy to run lots of simulations!
– Difficult to understand the “big picture”.
– Hard to systematically share results and ask questions.
Desired Questions and Activities:
– “What parameters have I explored?”
– “How can I share results with friends?”
– “Replicate these items five times for safety.”
– “Recompute everything that relied on this machine.”
GEMS: Grid Enabled Molecular Sims
– Distributed database for MD simulations at Notre Dame.
– XML database for indexing, TSS for storage/policy.

GEMS Distributed Database
Credit: Jesus Izaguirre and Aaron Striegel, Notre Dame CSE Dept.
[Figure: a database server maps XML records to file locations (host1:fileA, host7:fileB, host3:fileC; host6:fileX, host2:fileY, host5:fileZ), with catalog servers tracking the file servers. A query such as Temp>300K, Mol==CH4 returns locations such as host5:fileZ and host6:fileX, and the data is then fetched from those hosts.]

Active Recovery in GEMS

GEMS and Tactical Storage
Dynamic System Configuration
– Add/remove servers, discovered via catalog
Policy Control in File Servers
– Groups can Collaborate within Constraints
– Security Implemented within File Servers
Direct Access via Adapters
– Unmodified Simulations can use Database
– Alternate Web/Viz Interfaces for Users.

Ongoing Work
Malloc() for the Filesystem
– Resource owners want to limit users. (quota)
– End users need space assurance. (alloc)
– Need per-user allocations, not just global limits.
Dynamic Data Views
– Convert from DB to FS and back again.
Distributed Access Control
– ACLs refer to group definitions elsewhere.
– What’s new? Fault tolerance / policy management.
Processing in Storage (PINS)
– Move computation to data.
– Needs new programming (scripting) model.

Malloc in the Filesystem
Paper: “Grid3: Principles and Practice”
– 90% of jobs would fail, most due to disk!
Users need to alloc disk like anything else.
– (Not accessible to user: quotas, loopback)
– Allocation integrated with directory tree:
[Figure: a directory tree annotated with allocations: scratch 100 GB; job1 10 GB; job2 80 GB; job3 20 GB; input; output; taska 40 GB; taskb 40 GB.]
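One way to picture allocation integrated with the directory tree is a node that can hand out space only from what it still holds. This is a sketch under assumptions: the class `Allocation`, the strict rule that child allocations are carved out of the parent, and the use of ENOSPC for failure are choices made for illustration, with sizes in GB loosely borrowed from the slide.

```python
import errno

class Allocation:
    """A directory with a fixed space budget, optionally carved from a parent."""
    def __init__(self, name, size, parent=None):
        if parent is not None:
            parent._reserve(size)        # the parent must have room for us
        self.name, self.size, self.parent = name, size, parent
        self.committed = 0               # space given to children or to data

    def _reserve(self, amount):
        if self.committed + amount > self.size:
            raise OSError(errno.ENOSPC, f"no space left in {self.name}")
        self.committed += amount

    def suballoc(self, name, size):
        """Create a child directory with its own guaranteed allocation."""
        return Allocation(name, size, parent=self)

    def write(self, amount):
        """Charge newly written data against this directory's allocation."""
        self._reserve(amount)

    @property
    def free(self):
        return self.size - self.committed
```

Because space is reserved at allocation time, a job learns that the disk is full before it starts rather than partway through its output.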

Dynamic Data Views
The same data can be perceived as either a file system or a database.
Example:
– DB: get files s.t. (T>300K) && (Mol==“CH4”)
– FS: then process using scripts and shell
– DB: associate derived files with original
– FS: export and tar files for others.

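Dynamic Data Views
[Figure: a query (Temp>300K, Mol==CH4) against the database server's XML index (host6:fileX, host2:fileY, host5:fileZ) selects files that are then presented to the application through the Distributed Filesystem Abstraction.]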

Distributed Access Control Lists
Users are very comfortable with the ACL and group model. Can it be adapted to a grid environment?
– Yes, can let an ACL refer to remote server.
– Challenges: failures, caching, sharing policy.
Access Control List on file server A:
– hostname:*.nd.edu RL
– group:serverB/presidents RWL
Group “Presidents” on file server B:
– /O=NotreDame/CN=Jenkins
– /O=Purdue/CN=Jischke
– /O=Indiana/CN=Herbert
[Figure: a TSS client contacts file server A, whose ACL refers to the group defined on file server B.]
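The remote-group idea, together with the caching and failure challenges it raises, can be sketched as below. Everything here is an assumption for illustration: the class `GroupResolver`, the `fetch` callable standing in for a network request to the other server, the time-to-live cache, and the fail-closed policy when the remote server cannot be reached.

```python
import time
from fnmatch import fnmatchcase

class GroupResolver:
    """Looks up remote group membership, caching answers for `ttl` seconds."""
    def __init__(self, fetch, ttl=60.0):
        self.fetch = fetch     # callable(group_ref) -> members; may raise ConnectionError
        self.ttl = ttl
        self.cache = {}        # group_ref -> (members, time fetched)

    def members(self, ref, now=None):
        now = time.time() if now is None else now
        cached = self.cache.get(ref)
        if cached and now - cached[1] <= self.ttl:
            return cached[0]
        try:
            found = frozenset(self.fetch(ref))
        except ConnectionError:
            return frozenset()   # fail closed: an unreachable group grants nothing
        self.cache[ref] = (found, now)
        return found

def rights_for(subject, acl, resolver, now=None):
    """Union of rights from direct pattern entries and remote group entries."""
    granted = set()
    for who, rights in acl:
        if who.startswith("group:"):
            if subject in resolver.members(who[len("group:"):], now):
                granted.update(rights)
        elif fnmatchcase(subject, who):
            granted.update(rights)
    return granted
```

Failing closed is only one possible policy; serving a stale cached answer during an outage would trade safety for availability.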

PINS: Processing in Storage
Observation:
– Traditional clusters separate CPU and storage into two distinct systems/problems.
– Distributed computing is always some direct combination of CPU and I/O needs.
Idea: PINS
– Cluster HW is already a tightly integrated complex of CPU and I/O. Make the SW reflect the HW.
– Key: Always compute in the same place that the data is located. Leave newly created data in place.

Processing in Storage
1. Compute Y = F(X).
2. Dispatch F to S3.
3. Y is stored on S3.
[Figure: a database server holds an XML index of data files spread across file servers S1, S2, S3, and S4; X resides on S3, so F runs there and Y stays there.]
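The three steps can be sketched as an index that sends the function to wherever the input lives. The classes `StorageNode` and `Index` and their methods are hypothetical, and "dispatch" is simulated by an ordinary method call rather than a remote job submission.

```python
class StorageNode:
    """A file server that can also run a function on its own files."""
    def __init__(self, name):
        self.name = name
        self.files = {}

    def run(self, func, src, dst):
        # The computation happens where the data is; the result stays here too.
        self.files[dst] = func(self.files[src])

class Index:
    """Records which node holds each file and dispatches work accordingly."""
    def __init__(self, nodes):
        self.nodes = {n.name: n for n in nodes}
        self.location = {}            # file name -> node name

    def put(self, fname, node_name, data):
        self.nodes[node_name].files[fname] = data
        self.location[fname] = node_name

    def compute(self, func, src, dst):
        node = self.nodes[self.location[src]]   # find where X lives
        node.run(func, src, dst)                # dispatch F to that node
        self.location[dst] = node.name          # Y is left in place
        return node.name
```

Only the function and the index update travel over the network; neither X nor Y is copied.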

Tactical Storage Systems
Separate Abstractions from Resources
Components:
– Servers, catalogs, abstractions, adapters.
– Completely user level.
– Performance acceptable for real applications.
Independent but Cooperating Components
– Owners of file servers set policy.
– Users must work within policies.
– Within policies, users are free to build.

Acknowledgments
Science Collaborators:
– Jesus Izaguirre
– Sander Klous
– Peter Kunszt
– Erwin Laure
– John Poirier
– Igor Sfiligoi
– Aaron Striegel
CSE Graduate Students:
– Paul Brenner
– James Fitzgerald
– Jeff Hemmes
– Paul Madrid
– Chris Moretti
– Phil Snowberger
– Justin Wozniak

For more information...
Cooperative Computing Lab
Cooperative Computing Tools
Douglas Thain

Extra Slides

Performance – System Calls

Performance - Applications parrot only

Performance – I/O Calls

Performance – Bandwidth

Performance – DSFS