Enabling Data Intensive Science with Tactical Storage Systems Prof. Douglas Thain University of Notre Dame



The Cooperative Computing Lab
Our model of computer science research:
– Understand how users with complex, large-scale applications need to interact with computing systems.
– Design novel computing systems that can be applied by many different users == basic CS research.
– Deploy code in real systems with real users, suffer real bugs, and learn real lessons == applied CS.
Application Areas:
– Astronomy, Bioinformatics, Biometrics, Molecular Dynamics, Physics, Game Theory, ... ???
External Support: NSF, IBM, Sun

Two Talks in One Paper at Supercomputing Applications of Tactical

Abstract Users of distributed systems encounter many practical barriers between their jobs and the data they wish to access. Problem: Users have access to many resources (disks), but are stuck with the abstractions (cluster NFS) provided by administrators. Solution: Tactical Storage Systems allow any user to create, reconfigure, and tear down abstractions without bugging the administrator.

The Standard Model
[Diagram: a transparent distributed filesystem on top of a shared disk.]

[Diagram: two transparent distributed filesystems, each on its own shared disk, plus four private disks, connected by FTP, SCP, RSYNC, HTTP, ...]

Problems with the Standard Model
Users encounter partitions in the WAN.
– Easy to access data inside a cluster, hard outside.
– Must use different mechanisms on different links.
– Difficult to combine resources together.
Different access modes for different purposes.
– File transfer: preparing the system for its intended use.
– File system: access to data for running jobs.
Resources go unused.
– Disks on each node of a cluster.
– Unorganized resources in a department/lab.
A global file system can’t satisfy everyone!

What if... Users could easily access any storage? I could borrow an unused disk for NFS? An entire cluster can be used as storage? Multiple clusters could be combined? I could reconfigure structures without root? –(Or bugging the administrator daily.) Solution: Tactical Storage System (TSS)

Outline Problems with the Standard Model Tactical Storage Systems –File Servers, Catalogs, Abstractions, Adapters Applications: –Remote Database Access for BaBar Code –Remote Dynamic Linking for CDF Code –Logical Data Access for Bioinformatics Code –Expandable Database for MD Simulation Improving the OS for Grid Computing

Tactical Storage Systems (TSS) A TSS allows any node to serve as a file server or as a file system client. All components can be deployed without special privileges – but with security. Users can build up complex structures. –Filesystems, databases, caches,... Two Independent Concepts: –Resources – The raw storage to be used. –Abstractions – The organization of storage.

[Diagram: three applications reach storage through adapters: one uses a central filesystem, one a distributed database abstraction, one a distributed filesystem abstraction. Each abstraction is built from file servers that export ordinary UNIX file systems; arrows between file servers are labeled "file transfer" and "3PT". The cluster administrator controls policy on all storage in the cluster; UNIX workstation owners control policy on each machine.]

Components of a TSS: 1 – File Servers 2 – Catalogs 3 – Abstractions 4 – Adapters

1 – File Servers
Unix-Like Interface
– open/close/read/write
– getfile/putfile to stream whole files
– opendir/stat/rename/unlink
Complete Independence
– choose friends
– limit bandwidth/space
– evict users?
Trivial to Deploy
– run server + setacl
– no privilege required
– can be thrown into a grid system
Flexible Access Control
[Diagram: file server A and file server B each export a file system over the Chirp protocol; each is controlled by its own owner.]

Related Work Lots of file services for the Grid: –GridFTP, NeST, SRB, RFIO, SRM, IBP,... –(Adapter interfaces with many of these!) Why have another file server? –Reason 1: Must have precise Unix semantics! Apps distinguish ENOENT vs EACCES vs EISDIR. FTP always returns error 550, regardless of error. –Reason 2: TSS focused on easy deployment. No privilege required, no config files, no rebuilding, flexible access control,...
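To make Reason 1 concrete, here is a minimal Python sketch (not from the talk) of the distinction a Unix application relies on: a failed open() carries a specific errno, so a program can tell a missing file from a directory. It assumes a POSIX system; the helper names open_error_name and demo are hypothetical.

```python
import errno
import os
import tempfile


def open_error_name(path):
    """Return the symbolic errno for a failed open(), or None on success."""
    try:
        with open(path, "rb"):
            return None
    except OSError as exc:
        return errno.errorcode.get(exc.errno, "UNKNOWN")


def demo():
    """Probe a missing file, a directory, and a regular file (POSIX assumed)."""
    with tempfile.TemporaryDirectory() as tmp:
        regular = os.path.join(tmp, "data.txt")
        with open(regular, "w") as f:
            f.write("hello")
        return {
            "missing": open_error_name(os.path.join(tmp, "nope")),
            "directory": open_error_name(tmp),
            "regular": open_error_name(regular),
        }
```

A protocol that reports every failure with one code would collapse the first two cases, which is the slide's objection to FTP's error 550.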

Access Control in File Servers
Unix Security is not Sufficient
– No global user database possible/desirable.
– Mapping external credentials to Unix gets messy.
Instead, Make External Names First-Class
– Perform access control on remote, not local, names.
– Types: Globus, Kerberos, Unix, Hostname, Address
Each directory has an ACL:
  globus:/O=NotreDame/CN=DThain   RWLA
  [...]                           RWL
  hostname:*.cs.nd.edu            RL
  address: *                      RWLA
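A per-directory ACL of this kind can be sketched in a few lines of Python. This is an illustration only, not the Chirp implementation: it assumes that subjects are matched with shell-style wildcards and that the rights of all matching entries are combined, and the names rights_for and allowed are hypothetical.

```python
from fnmatch import fnmatchcase

# Two entries copied from the slide; each pairs a subject pattern with rights.
ACL = [
    ("globus:/O=NotreDame/CN=DThain", "RWLA"),
    ("hostname:*.cs.nd.edu", "RL"),
]


def rights_for(subject, acl):
    """Collect the rights of every entry whose pattern matches the subject.

    Assumption: rights from multiple matching entries are unioned.
    """
    granted = set()
    for pattern, rights in acl:
        if fnmatchcase(subject, pattern):
            granted |= set(rights)
    return granted


def allowed(subject, acl, needed):
    """True if the subject holds every right listed in `needed`."""
    return set(needed) <= rights_for(subject, acl)
```

For example, allowed("hostname:laptop.cs.nd.edu", ACL, "R") holds, while the same subject asking for "W" is refused. Note that the subject is an external name (a Globus or hostname identity), never a local Unix account.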

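Problem: Shared Namespace
[Diagram: a file server whose top-level ACL is globus:/O=NotreDame/* RWLAX; files from different users (a.out, test.c, test.dat, cms.exe) all sit in the same directory.]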

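Solution: Reservation (V) Right
[Diagram: the file server's top-level ACL is O=NotreDame/CN=* V(RWLA), which permits mkdir only. /O=NotreDame/CN=Monk calls mkdir and gets a directory with ACL /O=NotreDame/CN=Monk RWLA holding a.out and test.c; /O=NotreDame/CN=Ted calls mkdir and gets a separate directory with ACL /O=NotreDame/CN=Ted RWLA holding a.out and test.c.]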
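The reservation right can be sketched as follows. This is a toy model under stated assumptions, not the Chirp code: a V(...) entry is taken to allow only mkdir, and the new directory's ACL is taken to grant the rights in parentheses to the caller alone. The Directory class and reserve_mkdir function are hypothetical names.

```python
from fnmatch import fnmatchcase


class Directory:
    """Toy directory: an ACL (subject pattern -> rights) plus child directories."""

    def __init__(self, acl):
        self.acl = dict(acl)
        self.children = {}


def reserve_mkdir(parent, subject, name):
    """Create `name` under `parent` if `subject` holds a V(...) entry there.

    Assumption: the child starts with an ACL naming only the creator,
    with exactly the rights listed inside V(...).
    """
    if name in parent.children:
        raise FileExistsError(name)
    for pattern, rights in parent.acl.items():
        if (fnmatchcase(subject, pattern)
                and rights.startswith("V(") and rights.endswith(")")):
            child = Directory({subject: rights[2:-1]})
            parent.children[name] = child
            return child
    raise PermissionError(f"{subject} has no reservation right")
```

With a root ACL of {"/O=NotreDame/CN=*": "V(RWLA)"}, Monk and Ted each obtain a private directory, and neither appears in the other's ACL.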

2 - Catalogs
[Diagram: catalog servers receive periodic UDP updates and are queried over HTTP, returning XML, TXT, or ClassAds.]

3 - Abstractions An abstraction is an organizational layer built on top of one or more file servers. End Users choose what abstractions to employ. Working Examples: –CFS: Central File System –DSFS: Distributed Shared File System –DSDB: Distributed Shared Database Others Possible? –Distributed Backup System –Striped File System (RAID/Zebra)

CFS: Central File System
[Diagram: the application (appl) goes through the adapter to a single file server that holds the files.]

DSFS: Dist. Shared File System
[Diagram: the application (appl) goes through the adapter. One file server holds pointers (ptr), possibly to multiple copies; the adapter looks up the file location there, then accesses the data on the other file servers.]
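The two-step DSFS access (look up the file location, then access the data) can be sketched with plain dictionaries standing in for servers. This is an illustration only: the function name dsfs_read, the (host, path) pointer shape, and the fall-through to the next copy when a server is unavailable are all assumptions, not details from the talk.

```python
def dsfs_read(name, directory_server, data_servers):
    """Read a logical file by following its pointers.

    directory_server: logical name -> list of (host, path) pointers.
    data_servers: host -> {path: bytes} for the servers currently reachable.
    """
    pointers = directory_server.get(name)
    if not pointers:
        raise FileNotFoundError(name)
    for host, path in pointers:
        server = data_servers.get(host)
        if server is not None and path in server:
            return server[path]  # first reachable copy wins
    raise FileNotFoundError(f"no reachable copy of {name}")
```

Because the pointers live on one server and the data on others, adding a data server only means writing new pointers; the namespace itself does not move.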

DSDB: Dist. Shared Database
[Diagram: the application (appl) goes through the adapter. A database server holds the file index and handles query and insert; create file and direct access go to the file servers that hold the files.]

4 - Adapter
Adapter - Parrot
Like an OS Kernel
– Tracks processes, files, etc.
– Adds new capabilities.
– Enforces owner’s policies.
Delegated Syscalls
– Trapped via ptrace interface.
– Action taken by Parrot.
– Resources charged to Parrot.
User Chooses Abstraction
– Appears as a filesystem.
– Option: Timeout tolerance.
– Option: Consistency semantics.
– Option: Servers to use.
– Option: Auth mechanisms.
[Diagram: system calls from tcsh, cat, and vi are trapped via ptrace by Parrot, which keeps a process table and a file table and dispatches to the abstractions (CFS, DSFS, DSDB) or to HTTP, FTP, RFIO, NeST, SRB, gLite, ???]

[Diagram: three applications reach storage through adapters: one uses a central filesystem, one a distributed database abstraction, one a distributed filesystem abstraction. Each abstraction is built from file servers that export ordinary UNIX file systems, with file transfer between file servers. The cluster administrator controls policy on all storage in the cluster; UNIX workstation owners control policy on each machine.]

Performance Summary
Nothing comes for free!
– System calls: order of magnitude slower.
– Memory bandwidth overhead: extra copies.
However:
– TSS can take full advantage of bandwidth (unlike NFS).
– TSS can drive network/switch to limits.
– Typical slowdown on real apps: 5-10 percent.
– Allows one to harness resources that would go unused.
– Observation: Most users are constrained by functionality.

Outline Problems with the Standard Model Tactical Storage Systems –File Servers, Catalogs, Abstractions, Adapters Applications: –Remote Database Access for BaBar Code –Remote Dynamic Linking for CDF Code –Logical Data Access for Bioinformatics Code –Expandable Database for MD Simulation Improving the OS for Grid Computing

Remote Database Access
HEP Simulation Needs Direct DB Access
– App linked against Objectivity DB.
– Objectivity accesses filesystem directly.
– How to distribute application securely?
Solution: Remote Root Mount via TSS:
  parrot -M /=/chirp/fileserver/rootdir
DB code can read/write/lock files directly.
[Diagram: a script runs sim.exe with libdb.so under Parrot (CFS), which reaches a TSS file server and its file system holding the DB data across the WAN, with GSI authentication.]
Credit: Sander NIKHEF

Remote Application Loading
Modular Simulation Needs Many Libraries
– Developed on workstations, then ported to grid.
– Selection of library depends on analysis technique.
– Constraint: Must use HTTP for file access.
Solution: Dynamic Link with TSS+HTTP:
– /home/cdfsoft -> /http/dcaf.fnal.gov/cdfsoft
[Diagram: appl runs under Parrot and loads liba.so, libb.so, libc.so over HTTP, through a proxy, from an HTTP server's file system; it selects several MB from 60 GB of libraries.]
Credit: Igor Fermi National Lab

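Technical Problem
HTTP is not a filesystem! (No directories)
– Advantages: Firewalls, caches, admins.
[Diagram: Appl -> Parrot -> HTTP Module -> HTTP Server. The server's tree is root with etc, home, bin; home contains alice, cms, babar. The application calls opendir(/home), but the HTTP module can only send GET /home HTTP/1.0.]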

Technical Problem
Solution: Turn the directories into files.
– Can be cached in ordinary proxies!
[Diagram: Appl -> Parrot -> HTTP Module -> HTTP Server, same tree as before. "make httpfs" writes a .dir file into each directory; the one in /home lists alice, babar, cms. opendir(/home) becomes GET /home/.dir HTTP/1.0.]
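The directories-into-files idea can be sketched in Python. This is a guess at the mechanism, not the actual "make httpfs" tool: the one-name-per-line .dir format and the function names make_httpfs, listing_url, and parse_dir_listing are assumptions made for illustration.

```python
import os
import tempfile


def make_httpfs(root):
    """Write a .dir file into every directory under root, one entry per line."""
    for dirpath, dirnames, filenames in os.walk(root):
        names = sorted(n for n in dirnames + filenames if n != ".dir")
        with open(os.path.join(dirpath, ".dir"), "w") as f:
            f.writelines(name + "\n" for name in names)


def listing_url(path):
    """Map opendir(path) to the URL path of the listing file."""
    return path.rstrip("/") + "/.dir"


def parse_dir_listing(body):
    """Turn the body of a fetched .dir file back into directory entries."""
    return [line for line in body.splitlines() if line]


def demo():
    """Build the slide's /home tree in a temp dir and list it via its .dir file."""
    with tempfile.TemporaryDirectory() as root:
        for user in ("alice", "cms", "babar"):
            os.makedirs(os.path.join(root, "home", user))
        make_httpfs(root)
        with open(os.path.join(root, "home", ".dir")) as f:
            return parse_dir_listing(f.read())
```

Because the listing is an ordinary static file, any HTTP proxy can cache it like any other document, which is the point the slide makes.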

Logical Access to Bio Data
Many databases of biological data in different formats around the world:
– Archives: Swiss-Prot, TrEMBL, NCBI, etc...
– Replicas: Public, Shared, Private, ???
Users and applications want to refer to data objects by logical name, not location!
– Access the nearest copy of the non-redundant protein database, don’t care where it is.
Solution: EGEE data management system maps logical names (LFNs) to physical names (SFNs).
Credit: Christophe Blanchet, Bioinformatics Center of Lyon, CNRS IBCP, France

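Logical Access to Bio Data
[Diagram: BLAST runs under Parrot, whose modules (RFIO, gLite, HTTP, FTP) can reach a Chirp server, an FTP server, a gLite server, and the EGEE File Location Service. The FTP server holds nr.data.]
Run BLAST on LFN://ncbi.gov/nr.data
BLAST: open(LFN://ncbi.gov/nr.data)
Parrot to location service: Where is LFN://ncbi.gov/nr.data?
Location service: Find it at: SFN://ibcp.fr/nr.data
Parrot: open(SFN://ibcp.fr/nr.data)
FTP: RETR nr.data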
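The name-resolution step can be sketched as a small wrapper around open. This is an illustration under assumptions, not the EGEE or Parrot API: the location service is modeled as a dictionary, the first replica is chosen (the talk speaks of the nearest copy, and no distance metric is modeled here), and resolve and open_logical are hypothetical names.

```python
def resolve(lfn, location_service):
    """Map a logical file name to one physical name (first replica listed)."""
    replicas = location_service.get(lfn)
    if not replicas:
        raise FileNotFoundError(lfn)
    return replicas[0]


def open_logical(name, location_service, open_physical):
    """Open `name`, translating LFN:// names before calling the real opener.

    Physical names pass through untouched, so the application needs no changes.
    """
    if name.startswith("LFN://"):
        name = resolve(name, location_service)
    return open_physical(name)
```

With location_service = {"LFN://ncbi.gov/nr.data": ["SFN://ibcp.fr/nr.data"]}, a call to open_logical("LFN://ncbi.gov/nr.data", ...) hands SFN://ibcp.fr/nr.data to the physical opener, matching the exchange on the slide.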

Appl: Distributed MD Database
State of Molecular Dynamics Research:
– Easy to run lots of simulations!
– Difficult to understand the “big picture”.
– Hard to systematically share results and ask questions.
Desired Questions and Activities:
– “What parameters have I explored?”
– “How can I share results with friends?”
– “Replicate these items five times for safety.”
– “Recompute everything that relied on this machine.”
GEMS: Grid Enabled Molecular Sims
– Distributed database for MD simulations at Notre Dame.
– XML database for indexing, TSS for storage/policy.

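GEMS Distributed Database
[Diagram: a database server maps XML records to replica lists, e.g. XML -> host1:fileA, host7:fileB, host3:fileC and XML -> host6:fileX, host2:fileY, host5:fileZ. File servers hold A, C, B and Y, Z, X, alongside two catalog servers. A client inserts data plus XML, queries with e.g. Temp>300K Mol==CH4, receives host5:fileZ and host6:fileX, and reads them through the DSFS adapter.]
Credit: Jesus Izaguirre and Aaron Striegel, Notre Dame CSE Dept.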

Active Recovery in GEMS

GEMS and Tactical Storage Dynamic System Configuration –Add/remove servers, discovered via catalog Policy Control in File Servers –Groups can Collaborate within Constraints –Security Implemented within File Servers Direct Access via Adapters –Unmodified Simulations can use Database –Alternate Web/Viz Interfaces for Users.

Outline Problems with the Standard Model Tactical Storage Systems –File Servers, Catalogs, Abstractions, Adapters Applications: –Remote Database Access for BaBar Code –Remote Dynamic Linking for CDF Code –Logical Data Access for Bioinformatics Code –Expandable Database for MD Simulation Improving the OS for Grid Computing

OS Support for Grid Computing Distributed computing in general suffers because of limitations in the operating system. How can we improve the OS in the long term? Resource allocation: –Cannot reserve space -> jobs crash –Hard to clean up procs -> unreliable systems Security and permissions: –No ACLs -> hard to share data –Only root can setuid -> hard to secure services.

Allocation in the Filesystem
[Diagram: a directory tree with nodes root, jobs, logs, job23, input, output, ftp, ftp.log, coredump.]

Allocation in the Filesystem
[Diagram: a directory tree with nodes root, jobs, logs, job23, input, output, ftp, ftp.log; dalloc marks a 100 GB allocation and a 200 GB allocation on parts of the tree.]

[Diagram: a tree of identities with nodes root, kerberos, alice, bob, visitor (under both alice and bob), httpd, anon1, anon2, student.]
These two users are completely different:
  root:kerberos:alice:visitor
  root:kerberos:bob:visitor
The web server can create distinct anonymous accounts. No need for global nobody.
– kerberos given to the login server.
– alice created by krb5 login.
– student created at run-time.

Approach by Degrees
What can we do as an ordinary user?
– Simulate OS functionality within Parrot.
– Drawback: Performance / Assurance.
What can we do as root?
– Setuid toolkit to manage system on request.
– Drawback: Limitations in Policy / Expr.
What can we do by modifying the OS?
– Modify kernel/FS to support new features.
– Drawback: Deployment.

Tactical Storage Systems Separate Abstractions from Resources Components: –Servers, catalogs, abstractions, adapters. –Completely user level. –Performance acceptable for real applications. Independent but Cooperating Components –Owners of file servers set policy. –Users must work within policies. –Within policies, users are free to build.

Parting Thought Many users of the grid are constrained by functionality, not performance. TSS allows end users to build the structures that they need for the moment without involving an admin. Analogy: building blocks for distributed storage.

Acknowledgments
Science Collaborators:
– Christophe Blanchet
– Sander Klous
– Peter Kunszt
– Erwin Laure
– John Poirer
– Igor Sfiligoi
CS Collaborators:
– Jesus Izaguirre
– Aaron Striegel
CS Students:
– Paul Brenner
– James Fitzgerald
– Jeff Hemmes
– Paul Madrid
– Chris Moretti
– Phil Snowberger
– Justin Wozniak

For more information...
– Cooperative Computing Lab
– Cooperative Computing Tools
– Douglas Thain

Performance – System Calls

Performance - Applications parrot only

Performance – I/O Calls

Performance – Bandwidth

Performance – DSFS

SP5 Performance on EDG Testbed
Setup     Time to Init     Time/Event
Unix      446 +/ /- 4664s
LAN/NFS   / s
LAN/TSS   / s
WAN/TSS   / s