Overview of the SDSC Storage Resource Broker Wayne Schroeder (and other SRB team members) May, 2004 San Diego Supercomputer Center, University of California.

Slides:



Advertisements
Similar presentations
National University Community Research Institute (NUCRI) NU Community Research Institute (NUCRI) HASTAC (higher education)/HASS grid National School Board.
Advertisements

Building Shared Collections Using the Storage Resource Broker Storage Resource Broker Reagan W. Moore
San Diego Supercomputer Center & National Partnership for Advance Computational Infrastructure Storage Resource Broker Reagan W. Moore San Diego Supercomputer.
National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Data Grids for Collection Federation Reagan W. Moore University.
OGF-23 iRODS Metadata Grid File System Reagan Moore San Diego Supercomputer Center.
The Storage Resource Broker and.
The Storage Resource Broker and.
Peter Berrisford RAL – Data Management Group SRB Services.
Data Grid: Storage Resource Broker Mike Smorul. SRB Overview Developed at San Diego Supercomputing Center. Provides the abstraction mechanisms needed.
National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Data Grids Reagan W. Moore San Diego Supercomputer Center.
National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Data Grids, Digital Libraries and Persistent Archives Reagan.
NATIONAL PARTNERSHIP FOR ADVANCED COMPUTATIONAL INFRASTRUCTURE SAN DIEGO SUPERCOMPUTER CENTER Particle Physics Data Grid PPDG Data Handling System Reagan.
San Diego Supercomputer Center, University of California at San Diego Grid Physics Network (GriPhyN) University of Florida A Data Storage Language for.
San Diego Supercomputer Center NARA Research Prototype Persistent Archive Building Preservation Environments with Data Grid Technology (NARA Research Prototype.
San Diego Supercomputer CenterNational Partnership for Advanced Computational Infrastructure1 Grid Based Solutions for Distributed Data Management Reagan.
A Very Brief Introduction to iRODS
Federating Archives in the DELAMAN Network Reagan W. Moore San Diego Supercomputer Center Storage Resource.
Security Requirements for Shared Collections Storage Resource Broker Reagan W. Moore
Applying Data Grids to Support Distributed Data Management Storage Resource Broker Reagan W. Moore Ian Fisk Bing Zhu University of California, San Diego.
Modern Data Management Overview Storage Resource Broker Reagan W. Moore
On Developing Data Grid Workflows using Storage Resource Broker (SRB) and Kepler Tim H. Wong - UC Davis Efrat Frank - SDSC Dr. Bertram Ludäscher - UC Davis.
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
Data Grid Interactions with Firewalls Michael Wan Reagan Moore SDSC/UCSD/NPACI.
January, 23, 2006 Ilkay Altintas
By: Roman Olschanowsky An Introduction to the.
National Partnership for Advanced Computational Infrastructure Digital Library Architecture Reagan Moore Chaitan Baru Amarnath Gupta George Kremenek Bertram.
Data Management Kelly Clynes Caitlin Minteer. Agenda Globus Toolkit Basic Data Management Systems Overview of Data Management Data Movement Grid FTP Reliable.
Core SRB Technology for 2005 NCOIC Workshop By Michael Wan And Wayne Schroeder SDSC SDSC/UCSD/NPACI.
Jan Storage Resource Broker Managing Distributed Data in a Grid A discussion of a paper published by a group of researchers at the San Diego Supercomputer.
San Diego Supercomputer Center Grid Physics Network (GriPhyN) University of Florida Dataflows in SRB using SDSC Matrix Arun Jagatheesan Architect & Team.
Rule-Based Data Management Systems Reagan W. Moore Wayne Schroeder Mike Wan Arcot Rajasekar {moore, schroede, mwan, {moore, schroede, mwan,
GT Components. Globus Toolkit A “toolkit” of services and packages for creating the basic grid computing infrastructure Higher level tools added to this.
1 School of Computer, National University of Defense Technology A Profile on the Grid Data Engine (GridDaEn) Xiao Nong
San Diego Supercomputer Center SDSC Storage Resource Broker Data Grid Automation Arun Jagatheesan et al., San Diego Supercomputer Center University of.
San Diego Supercomputer Center National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center National Partnership for Advanced.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Data Grid Services/SRB/SRM & Practical Hai-Ning Wu Academia Sinica Grid Computing.
Production Data Grids SRB - iRODS Storage Resource Broker Reagan W. Moore
San Diego Supercomputer Center National Partnership for Advanced Computational Infrastructure SRB + Web Services = Datagrid Management System (DGMS) Arcot.
Rule-Based Preservation Systems Reagan W. Moore Wayne Schroeder Mike Wan Arcot Rajasekar Richard Marciano {moore, schroede, mwan, sekar,
National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Persistent Management of Distributed Data Reagan W. Moore.
National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Persistent Archive for the NSDL Reagan W. Moore Charlie Cowart.
San Diego Supercomputer Center Grid Physics Network (GriPhyN) University of Florida DGL: The Assembly Language for Grid Computing Arun swaran Jagatheesan.
Policy Based Data Management Data-Intensive Computing Distributed Collections Grid-Enabled Storage iRODS Reagan W. Moore 1.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
San Diego Supercomputer CenterNational Partnership for Advanced Computational Infrastructure1 Data Grids, Digital Libraries, and Persistent Archives Reagan.
The Global Land Cover Facility is sponsored by NASA and the University of Maryland.The GLCF is a founding member of the Federation of Earth Science Information.
SAN DIEGO SUPERCOMPUTER CENTER By: Roman Olschanowsky An Introduction to the.
Michael Doherty RAL UK e-Science AHM 2-4 September 2003 SRB in Action.
1 e-Science AHM st Aug – 3 rd Sept 2004 Nottingham Distributed Storage management using SRB on UK National Grid Service Manandhar A, Haines K,
Introduction to The Storage Resource.
Biomedical Informatics Research Network The Storage Resource Broker & Integration with NMI Middleware Arcot Rajasekar, BIRN-CC SDSC October 9th 2002 BIRN.
NATIONAL PARTNERSHIP FOR ADVANCED COMPUTATIONAL INFRASTRUCTURE SAN DIEGO SUPERCOMPUTER CENTER Interlib Technology Integration Reagan.
SDSC Storage Resource Broker & Meta-data Catalog SRB Archives HPSS, ADSM, UniTree, DMF Databases DB2, Oracle, Sybase File Systems Unix, NT, Mac OSX Application.
Partnerships in Innovation: Serving a Networked Nation Grid Technologies: Foundations for Preservation Environments Portals for managing user interactions.
National Archives and Records Administration1 Integrated Rules Ordered Data System (“IRODS”) Technology Research: Digital Preservation Technology in a.
Rights Management for Shared Collections Storage Resource Broker Reagan W. Moore
The SMB Archive System: Data Backup Across the Web Kenneth R. Sharp Stanford Synchrotron Radiation Laboratory.
SAN DIEGO SUPERCOMPUTER CENTER An Intelligent Rule-Oriented Data Management System DataGrid Wayne Schroeder San Diego Supercomputer Center, University.
Building Preservation Environments with Data Grid Technology Reagan W. Moore Presenter: Praveen Namburi.
Collection-Based Persistent Archives Arcot Rajasekar, Richard Marciano, Reagan Moore San Diego Supercomputer Center Presented by: Preetham A Gowda.
Preservation Data Services Persistent Archive Research Group Reagan W. Moore October 1, 2003.
1 eScience Grid Environments th May 2004 NESC - Edinburgh Deployment of Storage Resource Broker at CCLRC for E-science Projects Ananta Manandhar.
Building Preservation Environments from Federated Data Grids Reagan W. Moore San Diego Supercomputer Center Storage.
Data Grids, Digital Libraries and Persistent Archives: An Integrated Approach to Publishing, Sharing and Archiving Data. Written By: R. Moore, A. Rajasekar,
Collection Based Persistent Archives
Policy-Based Data Management integrated Rule Oriented Data System
Arcot Rajasekar Michael Wan Reagan Moore (sekar, mwan,
Interlib Technology Integration
VORB Virtual Object Ring Buffers
Presentation transcript:

Overview of the SDSC Storage Resource Broker Wayne Schroeder (and other SRB team members) May, 2004 San Diego Supercomputer Center, University of California San Diego SDSC/UCSD/NPACI

What is SRB? (1 of 3) The SDSC Storage Resource Broker (SRB) is client- server middleware that provides a uniform interface for connecting to heterogeneous data resources over a network and accessing unique or replicated data objects. SRB, in conjunction with the Metadata Catalog (MCAT), provides a way to access data sets and resources based on their logical names or attributes rather than their names and physical locations.

What is SRB? (2 of 3) The SDSC SRB system is a comprehensive distributed data management solution, with features to support the management, collaborative (and controlled) sharing, publication, and preservation of distributed data collections. The SRB also serves as middleware via a rich set of APIs available to higher-level applications and by providing a management layer on top of a wide variety of storage systems.

What is SRB? (3 of 3) The SRB is an integrated solution which includes: –a logical namespace, –interfaces to a wide variety of storage systems, –high performance data movement (including parallel I/O), –fault-tolerance and fail-over, –WAN-aware performance enhancements (bulk operations), –storage-system-aware performance enhancements ('containers' to aggregate files), –metadata ingestion and queries (a MetaData Catalog (MCAT)), –user accounts, groups, access control, audit trails, GUI administration tool –data management features, replication –user tools (including a Windows GUI tool (inQ), a set of SRB Unix commands, and Web (mySRB)), and APIs (including C, C++, Java, and Python). SRB Scales Well (many millions of files, terabytes) Supports Multiple Administrative Domains / MCATs (srbZones) And includes SDSC Matrix: SRB-based data grid workflow management system to create, access and manage workflow process pipelines.

SRB Projects Digital Libraries –UCB, Umich, UCSB, Stanford,CDL –NSF NSDL - UCAR / DLESE NASA Information Power Grid Astronomy –National Virtual Observatory –2MASS Project (2 Micron All Sky Survey) Particle Physics –Particle Physics Data Grid (DOE) –GriPhyN –SLAC Synchrotron Data Repository Medicine –Digital Embryo (NLM) Earth Systems Sciences –ESIPS –LTER Persistent Archives –NARA –LOC Neuro Science & Molecular Science –TeleScience/NCMIR, BIRN –SLAC, AfCS, … Over 90 Tera Bytes in 16 million files

SRB Scalability

Case Study: SRB in BIRN BIRN Toolkit Mediator Viewing/Visualization Queries/ResultsApplications Data Management File System MCAT HPSS Data Model Data Access Data Grid Computational Grid Collaboration NMI Grid Management Globus GridPort Scheduler Distributed Resources Database SRB Database

SRB History A DataGrid since SRB 1.0, Production 1997 SDSC Started by General Atomics, 1985 –GA/UCSD Staff –On UCSD Campus –SRB by GA Employees Today, SDSC no longer GA, all UCSD –All staff UCSD employees GA Commercial SRB Version (Nirvana) –Based on SRB (2001) –Nirvana and SDSC versions diverged –SDSC SRB free to academic organizations –License from Nirvana for commercial

SRB – A Data Grid Solution Storage Resource Broker –Collaborative client-server system that federates distributed heterogeneous resources using uniform interfaces and metadata –Provides a simple tool to integrate data and metadata handling – attribute-based access –Blends browsing and searching –Developed at SDSC -Operational for 5+ years; -Under continual development since 1997; -Customer-driven; - Brokering over 90 TeraBytes in over 16 million files at SDSC

Using a Data Grid - Details SRB MCAT DB SRB Data Grid has arbitrary number of servers Complexity is hidden from users

SDSC Storage Resource Broker & Meta-data Catalog SRB Archives HPSS, ADSM, UniTree, DMF Databases DB2, Oracle, Sybase File Systems Unix, NT, Mac OSX Application C, C++, Linux I/O Unix Shell Dublin Core Resource, User Defined Application Meta-data Remote Proxies DataCutter Third-party copy Java, NT Browsers Web Prolog Predicate MCAT HRM

SRB server SRB agent SRB server Federated SRB Operation MCAT Read Application in Boston SRB agent /6 Logical Name Or Attribute Condition 1.Logical-to-Physical mapping 2. Identification of Replicas 3.Access & Audit Control Peer-to-peer Brokering Server(s) Spawning Data Access Parallel Data Access R1 R2 San Diego Durham

inQ Windows GUI

Virtual Hierarchical Collection Management

Attributes SRB metadata –Location, protocol –Unix semantics –Authorization, authentication –Latency management –Container aggregation Administrative –Dublin core, provenance –Annotations, comments Discipline specific attributes –Collection –User defined

Authentication Management Grid Security Infrastructure (GSI) Encrypted Password GSS-API for Kerberos or DCE Collection-owned Data Collection ID installed at each storage system Users authenticate themselves to the SRB SRB authenticates to local server Or GSI Delegation (Ananta Manandhar, CCLRC)

Logical File Name One of the major functions of SRB is the mapping between a logical file name and its physical file. The mapped info of a logical filename includes: Location of name in collection hierarchy Physical file location: host name and path Protocol: for fetching local file Unix semantics for file manipulation Location in container Audit trail Access control list Locking status

Replica Management Files can be replicated into any valid physical storage resource registered in SRB. Each replica is managed by the same logical filename as the original one and a unique replication number. Each replica can have unique metadata. 1-to-many Replication: A logical resource can contain several physical storage resources. Multiple replicas can be made to the same storage resource Many Modes of Replication: Synchronous Replication; Sput to a logical resource Asynchronous Replication; Sput then later Sreplicate Out of Band Replication; Outside SRB, then register

Containers Physical Grouping of Objects Similar to tar but has significant differences Multiple Uses: –To take advantage of resource characteristics –To aid access patterns –Move data sets together –Tie together logically different files –Automatic Archiving/Caching Chaining of Containers Sharing of metadata Containers for Collections

Proxy Operation Proxy operation - –server performs operations on behalf of client –performs operations where the data are located –subset and filter operations – datacutter –Metadata extraction and ingestion checks –srbExecCommand() API and Spcommand utility - request a specific server to execute a specific command and stream the result to stdin used by the NVO(national virtual observatory) cutout service

SRB – More Features Client Support –Pure Java Client –Web Services - WSDL, Matrix workflow system –Web Support - MySRB Extensions –Pure Java Client & Browser –inQ Version 3.1 and more Windows Support Administrative Support –GUI-based Administration –More Features - Resource, User, Method Management –User-friendly Installation Procedures

Metadata Management Metadata Insertion Through User Interfaces Bulk Metadata Insertion Template Based Metadata Extraction Metadata Search system data user-defined metadata File Content Search: Key words are pre-extracted by a template and saved as user-defined metadata.

Storage Resource Broker SRB wears many hats: –It is a distributed but unified file system –It is a database access interface –It is a digital library –It is a semantic web –It is a data grid system –It is an advanced archival system

Criticisms of SRB Not completely open source –But semi-open and available to academics Not standards-based –But internal protocols need not be Monolithic –Integrated –And well partitioned

Some SRB Weaknesses (my view) Difficult to explain and understand –SRB does so much, people tend to learn subsets and are often unaware of useful features –Different groups are interested in different sets of features –An elevator speech is either vague or incomplete Not completely open source Collaborations difficult –Need to expand Limited Staff –Feature-focused projects (+/-): docs, error messages

Some SRB Strengths Integrated solution –High performance –Highly functional –Relatively easy to enhance Middle-ware and Complete-ware Customer driven Sound architecture Mature, but also being actively developed Growing user base Highly coordinated centralized team

TeamSRB, San Diego Reagan Moore (Program Director, DAKS) Arcot Rajasekar (Director) Michael Wan (Chief Architect) Wayne Schroeder George Kremenek Bing Zhu Sheau-Yen Chen Charles Cowart Arun Jagatheesan (GriPhyN) Lucas Gilbert Roman Olsachnowsky (BIRN) Tim Warnock (BIRN)

Contacts For Additional Information: Web: Mail: Mailing-list: