Published by Shannon Pitts. Modified over 9 years ago.
1
Applied CyberInfrastructure Concepts
ISTA 420/520, Fall 2014
Nirav Merchant (nirav@email.arizona.edu), Bio Computing & iPlant Collaborative
Eric Lyons (ericlyons@email.arizona.edu), Plant Sciences & iPlant Collaborative
University of Arizona
http://goo.gl/p4j3m or https://sites.google.com/site/appliedciconcepts/
Will Computers Crash Genomics? Science Vol 331 Feb 2011
2
Topic Coverage
Lifecycle Issues (example from MIT)
Why DM (Data Management)
iRODS Introduction
Scaling the Infrastructure for Data Management (Chapter 3 from FiMDA)
Group homework
3
Reality of data “We are drowning in data, but starving of information” - Attribution unknown
4
Data Life Cycle http://www.data-archive.ac.uk/create-manage/life-cycle
5
iRODS Background and Evolution
integrated Rule-Oriented Data System (iRODS), http://www.irods.org
Originated at SDSC, developed by the DICE (Data Intensive Cyber Environments) group
Based on a decade of SRB (Storage Resource Broker) development experience in managing distributed data
Community-driven
Most of the group migrated to UNC Chapel Hill in 2008-2009
– The group is bi-coastal: DICE-UNC, DICE-UCSD
First release of iRODS in 2009
iRODS picked up where SRB left off
6
iRODS Background and Evolution
Modular, extensible, customizable
Open source (BSD license)
Supported at UNC with complementary activities by DICE and RENCI, a research unit of UNC Chapel Hill
https://github.com/irods/irods
7
iRODS
I. Data grid middleware
II. Data management infrastructure
III. A framework for procedural implementation of data management policy (policy-driven data management)
iRODS is all of these.
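To make "procedural implementation of data management policy" concrete, here is a minimal sketch of the general idea: policies are small procedures registered against events, and the system runs them automatically when the event occurs. This is a toy model and not the iRODS rule language or API; the `PolicyEngine` class, the `post_put` event name, and the resource names are all hypothetical.

```python
import hashlib
from collections import defaultdict


class PolicyEngine:
    """Toy rule engine: maps an event name to the procedures run for it."""

    def __init__(self):
        self._rules = defaultdict(list)

    def rule(self, event):
        """Decorator that registers a function as a policy for `event`."""
        def register(func):
            self._rules[event].append(func)
            return func
        return register

    def fire(self, event, context):
        """Run every policy registered for `event`, in registration order."""
        for func in self._rules[event]:
            func(context)
        return context


engine = PolicyEngine()


@engine.rule("post_put")  # hypothetical event name
def record_checksum(ctx):
    # Policy: every stored object gets a checksum recorded as metadata.
    ctx["checksum"] = hashlib.sha256(ctx["data"]).hexdigest()


@engine.rule("post_put")
def replicate_to_archive(ctx):
    # Policy: every stored object gets a second replica on an archive resource.
    ctx["replicas"].append("archive_resource")


def put(logical_path, data, resource):
    """Store an object, then let the policy engine enforce the rules."""
    ctx = {"path": logical_path, "data": data, "replicas": [resource]}
    return engine.fire("post_put", ctx)


result = put("/zone/home/alice/reads.fastq", b"ACGT", "disk_resource")
print(result["replicas"], result["checksum"][:12])
```

The point of the design is that the user only calls `put`; checksumming and replication happen because an administrator registered them as policy, not because each client remembered to do them.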
8
iRODS Unified Virtual Collection
9
iRODS as a Data Grid
Sharing data across:
– geographic and institutional boundaries
– heterogeneous resources (hardware/software)
Virtual (logical) collections of distributed data
Global name spaces
– data: files and collections
– users: single sign-on
– storage: virtual resources
Metadata catalogue (iCAT) manages mappings between logical and physical name spaces
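The mapping between logical and physical name spaces can be illustrated with a small in-memory catalogue. This is only a sketch of the concept, assuming a catalogue that stores, for each logical path, the list of replicas and where each one physically lives; the `Catalog` and `Replica` names, the paths, and the resource names are hypothetical and do not reflect the real iCAT schema.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    resource: str       # logical name of the storage resource
    physical_path: str  # where the bytes actually live on that resource


@dataclass
class Catalog:
    """Toy metadata catalogue: logical path -> list of physical replicas."""
    _objects: dict = field(default_factory=dict)

    def register(self, logical_path, replica):
        self._objects.setdefault(logical_path, []).append(replica)

    def resolve(self, logical_path):
        """Return all known replicas for a logical path (empty if unknown)."""
        return list(self._objects.get(logical_path, []))

    def list_collection(self, collection):
        """List logical paths under a collection, wherever the data is stored."""
        prefix = collection.rstrip("/") + "/"
        return sorted(p for p in self._objects if p.startswith(prefix))


icat = Catalog()
icat.register("/zoneA/home/alice/genome.fa",
              Replica("unc_disk", "/vault/unc/0001/genome.fa"))
icat.register("/zoneA/home/alice/genome.fa",
              Replica("sdsc_tape", "/tape/7/genome.fa"))
icat.register("/zoneA/home/alice/reads.fastq",
              Replica("unc_disk", "/vault/unc/0002/reads.fastq"))

resources = [r.resource for r in icat.resolve("/zoneA/home/alice/genome.fa")]
listing = icat.list_collection("/zoneA/home/alice")
print(resources)
print(listing)
```

Users see one collection, `/zoneA/home/alice`, even though its contents sit on different storage systems at different sites; only the catalogue knows the physical locations.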
10
A RENCI Data Grid
[Diagram labels: iRODS Server; Metadata Catalog (iCAT); iPlant; NCSU; UNC-A; Duke; UNC-CH; RENCI, Europa Center]
Client asks for data: the request goes to an iRODS server
Server contacts the iCAT-enabled server
Information (location, access rights, etc.) is retrieved from the iCAT
Server containing data is signaled to send data to authorized client
A complete data grid (zone) has one metadata catalogue (iCAT)
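The request flow on this slide (client contacts any server, that server consults the catalogue, the server holding the data delivers it) can be sketched as a small simulation. The classes below are hypothetical stand-ins, not iRODS components, and the sketch skips networking entirely: "signaling" the data server is modeled as a direct lookup in its storage.

```python
class CatalogServer:
    """Stands in for the single iCAT-enabled server of a zone."""

    def __init__(self):
        self.locations = {}  # logical path -> name of the server holding it
        self.acl = {}        # logical path -> set of users allowed to read

    def lookup(self, user, path):
        if user not in self.acl.get(path, set()):
            raise PermissionError(f"{user} may not read {path}")
        return self.locations[path]


class Server:
    """Stands in for an iRODS server that stores some of the zone's data."""

    def __init__(self, name, zone):
        self.name = name
        self.zone = zone
        self.storage = {}

    def store(self, path, data, readers):
        self.storage[path] = data
        self.zone.catalog.locations[path] = self.name
        self.zone.catalog.acl[path] = set(readers)

    def get(self, user, path):
        # Steps 2 and 3: ask the catalogue for location and access rights.
        holder = self.zone.catalog.lookup(user, path)
        # Step 4: the server that holds the data sends it to the client.
        return self.zone.servers[holder].storage[path]


class Zone:
    """A complete data grid: several servers, exactly one catalogue."""

    def __init__(self):
        self.catalog = CatalogServer()
        self.servers = {}

    def add_server(self, name):
        self.servers[name] = Server(name, self)
        return self.servers[name]


zone = Zone()
duke = zone.add_server("duke")
unc_ch = zone.add_server("unc_ch")
duke.store("/renci/home/alice/map.tif", b"raster-bytes", readers={"alice"})

# Step 1: the client may contact any server, not only the one with the data.
data = unc_ch.get("alice", "/renci/home/alice/map.tif")
print(data)
```

Because every server resolves requests through the same catalogue, the client never needs to know which site holds a given file.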
11
TUCASI Infrastructure Project (TIP) Federated Data Grids
Independent data grids (zones), each with its own iCAT, can be federated
18 September 2012
12
Federation of Data Grids
NASA
– Disparate data collections: satellite data, model data, remote sensing data
– Manage the collections separately (technically and administratively) with separate data grids
– Federate the data grids to give users an overall view onto NASA data
Collaboration between consortia
– DataNet Federation Consortium: 6 science domain partners, federating their data grids to share data, users
– Users authenticate to home data grid, access federated data grids
For geographically distributed replication, evolution in data life cycle
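Federation as described here (separately managed zones, users who authenticate at home and then reach data in other zones) can be sketched as follows. This is a simplified model under stated assumptions: each zone keeps its own objects and access lists, remote users are identified as `user#zone`, and the first path component names the zone. The `Zone` class and all zone, user, and file names are hypothetical.

```python
class Zone:
    """Toy data grid with its own users and its own catalogue of objects."""

    def __init__(self, name):
        self.name = name
        self.users = set()  # accounts that authenticate in this zone
        self.trusted = {}   # federated zones, keyed by zone name
        self.objects = {}   # logical path -> (data, set of "user#zone" readers)

    def federate(self, other):
        """Establish mutual trust; each zone keeps managing its own data."""
        self.trusted[other.name] = other
        other.trusted[self.name] = self

    def put(self, path, data, readers):
        self.objects[path] = (data, set(readers))

    def get(self, user, path):
        """Authenticate `user` at home, then route by the zone in the path."""
        if user not in self.users:
            raise PermissionError(f"{user} is not a user of zone {self.name}")
        target_name = path.split("/")[1]
        if target_name == self.name:
            target = self
        else:
            target = self.trusted.get(target_name)
        if target is None:
            raise LookupError(f"zone {target_name} is not federated with {self.name}")
        return target._serve(f"{user}#{self.name}", path)

    def _serve(self, qualified_user, path):
        # The owning zone applies its own access control to the remote identity.
        data, readers = self.objects[path]
        if qualified_user not in readers:
            raise PermissionError(f"{qualified_user} may not read {path}")
        return data


satellite = Zone("satellite")
model = Zone("model")
remote_sensing = Zone("remote_sensing")  # deliberately left unfederated
satellite.federate(model)

satellite.users.add("alice")
model.put("/model/run1.nc", b"grid-data", readers={"alice#satellite"})

fetched = satellite.get("alice", "/model/run1.nc")
print(fetched)
```

Each zone stays technically and administratively independent: the `model` zone decides who may read `/model/run1.nc`, while the `satellite` zone vouches only for the identity of its own users.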
13
iPlant Data Store
Free Your Data
Different Users, Different Access Needs: One Data Store