
Presentation transcript:

Wayne Schroeder, Paul Tooby
Data Intensive Cyber Environments Team (DICE)
DICE Center, University of North Carolina at Chapel Hill; Institute for Neural Computation (INC), University of California San Diego
irods.org, dice.unc.edu, diceresearch.org
iRODS: the Integrated Rule-Oriented Data Management System

Who Are We?
- Computer scientists and software engineers
- Started in 1997
- Grew out of high-performance computing
- Now broader, including digital libraries and preservation
- Doing applied research: digital preservation and data grids
- Develop and distribute the Integrated Rule-Oriented Data Management System (iRODS)
- Open source; PCs to high-performance computing

What Problems Are We Solving?
Researchers have perhaps millions of computer files
- Keep them safely stored and replicated (remotely)
- Distribute them across the network; remote access
- Automatic handling; rules, workflows
- Keep track of what they are (metadata)
- Be able to find the right ones quickly (queries)
- Share them, in a controlled manner (authentication, access control, audit trails)
- Preserve them; change storage transparently

What Does iRODS Do? (1 of 3)
- Remote High-Performance Data Access
  - get/put, read/write
  - Parallel threads for large transfers
- Unified View of Disparate Data
  - Separates physical from logical (logical name-space)
  - Keeps track of names and locations of files
- Storage Type Independent
  - Unix/Windows file systems
  - HPSS (archival storage)
  - Etc.
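The "separates physical from logical" point on this slide can be illustrated with a toy catalog. This is a minimal sketch, not the iRODS implementation or its API: the class names, resource names, and paths below are all hypothetical, chosen only to show how one logical name can map to several physical copies and survive a storage change.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    resource: str       # hypothetical storage resource name, e.g. "unix-disk"
    physical_path: str  # where the bytes live on that resource


@dataclass
class LogicalCatalog:
    """Toy catalog mapping logical names to physical replicas (illustration only)."""
    entries: dict = field(default_factory=dict)

    def register(self, logical_path, resource, physical_path):
        # One logical name may have several physical copies.
        self.entries.setdefault(logical_path, []).append(
            Replica(resource, physical_path)
        )

    def resolve(self, logical_path):
        # Users ask by logical name; the catalog answers with locations.
        return list(self.entries.get(logical_path, []))

    def migrate(self, logical_path, old_resource, new_resource, new_physical_path):
        # Storage changes underneath while the logical name stays the same.
        replicas = self.entries[logical_path]
        for i, rep in enumerate(replicas):
            if rep.resource == old_resource:
                replicas[i] = Replica(new_resource, new_physical_path)
                return
        raise KeyError(f"no replica of {logical_path} on {old_resource}")
```

In this sketch, a caller that only ever uses the logical path keeps working after `migrate` moves a copy to a different resource, which is the property the slide describes as a unified view of disparate data.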

What Does iRODS Do? (2 of 3)
- Replication/Backup
  - Physical
- Metadata (RDBMS)
  - System and user-defined
  - PostgreSQL, Oracle, MySQL
  - Queries/information discovery
- Controlled Access
  - Users/groups
  - Secure passwords, Grid Security Infrastructure (GSI), Kerberos, Shibboleth soon
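The idea of keeping user-defined metadata in a relational database and querying it to discover files can be sketched as follows. This is an illustration under stated assumptions: it uses Python's built-in SQLite as a stand-in for the PostgreSQL, Oracle, or MySQL back ends named on the slide, and the table layout, function names, and paths are hypothetical rather than the actual iRODS catalog schema.

```python
import sqlite3


def build_catalog():
    # In-memory SQLite database standing in for the metadata RDBMS.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metadata (logical_path TEXT, attribute TEXT, value TEXT)"
    )
    return conn


def add_metadata(conn, logical_path, attribute, value):
    # Attach one user-defined attribute/value pair to a logical file name.
    conn.execute(
        "INSERT INTO metadata VALUES (?, ?, ?)",
        (logical_path, attribute, value),
    )


def find_by_metadata(conn, attribute, value):
    # Information discovery: list files whose metadata matches the query.
    rows = conn.execute(
        "SELECT DISTINCT logical_path FROM metadata "
        "WHERE attribute = ? AND value = ? ORDER BY logical_path",
        (attribute, value),
    )
    return [row[0] for row in rows]
```

Because the query runs against the catalog rather than the storage systems, files can be found by what they are, without knowing where they are stored.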

What Does iRODS Do? (3 of 3)
- Rules/Micro-services
  - Highly configurable
- Workflows
  - Rules/Micro-services, possibly delayed
- Management of Large Collections
  - irsync
  - Audit trails
  - Metadata
  - Etc.
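The rules and micro-services idea on this slide (an event triggers a chain of small operations, some of which may run later) can be sketched with a toy engine. This is not the iRODS rule language or its micro-service API; `RuleEngine`, `replicate`, `checksum`, and the event name "put" are hypothetical names used only to show the pattern, including delayed execution and an audit trail.

```python
import hashlib
from collections import deque


class RuleEngine:
    """Toy event -> condition -> actions engine (illustration only)."""

    def __init__(self):
        self.rules = {}         # event name -> list of (condition, actions, delayed)
        self.delayed = deque()  # queued (actions, context) pairs to run later
        self.audit_log = []     # (action name, path) for every action executed

    def add_rule(self, event, condition, actions, delayed=False):
        self.rules.setdefault(event, []).append((condition, actions, delayed))

    def fire(self, event, context):
        for condition, actions, delayed in self.rules.get(event, []):
            if not condition(context):
                continue
            if delayed:
                self.delayed.append((actions, context))
            else:
                self._run(actions, context)

    def run_delayed(self):
        # Execute everything that was queued; return how many rule bodies ran.
        count = 0
        while self.delayed:
            actions, context = self.delayed.popleft()
            self._run(actions, context)
            count += 1
        return count

    def _run(self, actions, context):
        for action in actions:
            action(context)
            self.audit_log.append((action.__name__, context.get("path")))


def replicate(context):
    # Hypothetical micro-service: record that a remote copy was made.
    context.setdefault("replicas", []).append("remote-site")


def checksum(context):
    # Hypothetical micro-service: store a SHA-256 digest of the data.
    context["checksum"] = hashlib.sha256(context["data"]).hexdigest()
```

A site policy such as "replicate every new .dat file immediately and checksum it later" would then be two `add_rule` calls on the "put" event, the second with `delayed=True`, followed by a periodic call to `run_delayed`.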

Sharing Data in iRODS Data System
Scientists can use iRODS as a “data grid” to share multiple types of data, near and far. iRODS Rules also enforce and audit human subjects access restrictions.
[Diagram: Scientist A adds data to the shared collection; iRODS Data System with iRODS Metadata Catalog, Brain Data Server (CA), Audio Data Server (NJ), and Video Data Server (TN); Scientist B accesses and analyzes the shared data.]

DICE Technologies Helping UCSD Projects
- The National Center for Microscopy and Imaging Research (NCMIR) is using DICE SRB and testing iRODS in the Cell Centered Database project.
- DICE iRODS helps computational seismologists from the Southern California Earthquake Center (SCEC) manage large-scale earthquake simulation data at SDSC and other TeraGrid sites.
- The UCSD Libraries Digital Asset Management System (DAMS) uses DICE technologies, including SRB.
- DICE iRODS helps the Ocean Observatories Initiative (OOI), with Scripps and Calit2, manage large-scale diverse ocean data, including real-time streaming data.
- And others, including CineGrid, TDLC, etc.

Connecting Data Collections for New Science
- "Federating" isolated "silos" of data enables new collaborations
  - OOI ocean data flows in iRODS data grid to NOAA National Climatic Data Center (NCDC)
  - NCDC climate data is accessed through data grid for CUAHSI hydrology research on floods
  - CUAHSI hydrology data connects to Odum Institute for social science research on human impacts and response to floods
  - OOI climate data discovered and flows to iPlant Consortium for designing drought-resistant plants for climate change adaptation

Growing Use of iRODS Data System
- Astronomy: NOAO, NVO, Observatoire de Strasbourg, France; CADAC, etc.
- Geo: NOAA NCDC; OOI; SCEC, etc.
- HPC: TeraGrid sites, SDSC, TACC, NICS, etc.
- NASA NCCS
- Bio: TDLC, NCMIR, iPlant, etc.
- Preservation: NARA TPAP, French National Library; Texas Digital Library; Fedora Commons; DSpace, etc.
- Workflow: Kepler, Taverna, etc.
- International: EU SHAMAN; Australian ARCS; UK e-Science; KEK (Japan); Academia Sinica (Taiwan); CC-IN2P3 HEP, France; etc.
- Industry: IBM, Oracle/Sun, Atos Origin, Microsoft, DataDirect

DICE Team
- Data Intensive Cyber Environments Center, University of North Carolina at Chapel Hill (UNC)
  - UNC School of Information and Library Science (SILS)
  - Renaissance Computing Institute (RENCI)
  - Reagan Moore (Professor), Arcot Rajasekar (Professor), Richard Marciano (Professor)
  - Antoine de Torcy, Chien-Yi Hou, Mike Conway
- UC San Diego
  - Institute for Neural Computation (INC)
  - Mike Wan, Wayne Schroeder, Sheau-Yen Chen, Bing Zhu, Paul Tooby
iRODS development is supported by:
- NSF OCI "NARA Transcontinental Persistent Archives Prototype" ( )
- NSF SDCI "Data Grids for Community Driven Applications" ( )