1
DataNet Federation Consortium
Reagan W. Moore (UNC-CH, PI)
Arcot Rajasekar (UNC-CH, co-PI)
Jonathan Goodall (USC, co-PI)
William Regli (Drexel, co-PI)
John Orcutt (UCSD, co-PI)
Stan Ahalt (RENCI)
Mary Whitton (UNC-CH, Project Manager)
Mike Wan (UCSD)
Wayne Schroeder (UCSD)
Sheau-Yen Chen (UCSD)
Lisa Stillwell (RENCI)
Helen Tibbo (UNC-CH)
Cal Lee (UNC-CH)
Jewel Ward (UNC-CH)
Ken Galluppi (ASU)
Isaac Simons (Drexel University)
Mirza Billah (University of South Carolina)
2
DataNet Federation Consortium Data Driven Science
Implement national data infrastructure
Federate existing discipline-specific data management systems to enable national research collaborations
Enable collaborative research on shared data collections
Manage collection life cycle as the user community broadens
Integrate “live” research data into education initiatives
Enable student research participation through control policies
Collection Life Cycle: Project, Shared Collection, Processing Pipeline, Digital Library, Reference Collection, Federation
Cyber-infrastructure Partners: Univ. of North Carolina, Chapel Hill; Univ. of California, San Diego; University of South Carolina; Drexel University; Arizona State University; Duke University; University of Arizona
Science and Engineering Initiatives: Ocean Observatories Initiative; Hydrology - CUAHSI, EarthCube; Engineering - CIBER-U digital library; the iPlant Collaborative; Odum Social Science Research Institute; Temporal Dynamics of Learning Center
Policy-based data management
National Science Foundation Cooperative Agreement: OCI
3
DFC Organizational Structure
Vice Chancellor of Research, UNC-CH: Barbara Entwisle
PI, Reagan Moore, and Executive Committee
External Advisory Board
Community of Practice Expertise Boards
Project Manager: Mary Whitton
Steering Committee
Facilities & Operations: Stan Ahalt, Lisa Stillwell, Sheau-Yen Chen
Institutions and Sustainability: Richard Marciano
Science and Engineering: William Regli
OOI: John Orcutt
CIBER-U: William Regli
Hydrology: Ken Galluppi
Technology and Research: Arcot Rajasekar, Wayne Schroeder, Mike Wan
Outreach & Education: Marilyn Lombardi, Julian Lombardi
TDLC: Andrea Chiba
iPlant: Sudha Ram
Odum: Jonathan Crabtree
Policies and Standards: Helen Tibbo, Cal Lee, Jewel Ward
4
Build National Infrastructure Through Federation
Ocean Observatories Initiative, National Climatic Data Center: Data grid for oceanography, sensor control, real-time data streams, archive
CUAHSI, UNC Institute for the Environment, National Climatic Data Center: Data grid for hydrology, watershed modeling workflow integration
CIBER-U (Engineering design, undergraduate education): Digital Library, OOI sensor documents
Years 3-5
the iPlant Collaborative: Data grid for plant biology, federation with existing biology resources
Odum Social Science Research Institute: DataVerse federation, data archive
Temporal Dynamics of Learning Center: Data grid for cognitive science
5
Enabling Tools
Data grid: Build shared name spaces for users, files, resources, metadata, rules, procedures
Soft links: Register data from external data management system, accessed through its protocol
Federated data grids: Cross-register users between data management systems
Workflow integration: Register workflows into data grid for storage side procedures; integrate data management workflows with external workflows
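The difference between grid-managed data and a soft link can be illustrated with a small sketch. This is not the iRODS API: the classes `Entry` and `LogicalNamespace`, the resource name `resc-a`, and the example paths and URL are all hypothetical. The sketch only shows the idea that one shared name space can hold names for data stored in the grid and names for data that stays in an external system and is reached through that system's own protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One name in the shared logical name space (hypothetical structure)."""
    logical_path: str
    kind: str        # "replica" for grid-managed data, "softlink" for external data
    location: str    # storage resource name, or URL in the external system
    protocol: str = "grid"


@dataclass
class LogicalNamespace:
    entries: dict = field(default_factory=dict)

    def register_replica(self, logical_path, resource):
        # Data stored on a grid resource and served by the grid itself.
        self.entries[logical_path] = Entry(logical_path, "replica", resource)

    def register_softlink(self, logical_path, url):
        # Data left in the external system; only its name is registered.
        protocol = url.split("://", 1)[0]
        self.entries[logical_path] = Entry(logical_path, "softlink", url, protocol)

    def resolve(self, logical_path):
        # Return the protocol to use and where to find the data.
        entry = self.entries[logical_path]
        return (entry.protocol, entry.location)


ns = LogicalNamespace()
ns.register_replica("/dfc/home/alice/run1.dat", "resc-a")
ns.register_softlink("/dfc/home/alice/gauge.csv", "https://example.org/data/gauge.csv")
print(ns.resolve("/dfc/home/alice/run1.dat"))
print(ns.resolve("/dfc/home/alice/gauge.csv"))
```

Users see both files under the same logical collection; only `resolve` knows that one of them is fetched from an external source.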
6
Policy-based Data Management
Diagram labels: Researchers - Client; Data Grid, iRODS controlled workflows (shown twice); Shared Collection; Storage (shown four times)
Consensus on Policies and Procedures controls the shared data within the federation
7
Extensibility Operations on Name Spaces
8
Community-Based Collection Life Cycle
Each life cycle stage re-purposes the original collection
Project Collection: Private, Local Policy
Data Grid: Shared, Distribution Policy
Data Processing Pipeline: Analyzed, Service Policy
Digital Library: Published, Description Policy
Reference Collection: Preserved, Representation Policy
Federation: Sustained, Re-purposing Policy
Stages correspond to addition of new policies to support a broader community
The evolution of policies quantifies how impact is broadened
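The stage list on this slide can be written down as data. The sketch below assumes that policies accumulate, so that a collection at a later stage is still governed by the policies of the earlier stages; the slide says stages "correspond to addition of new policies" but does not spell this out. The names `LIFE_CYCLE` and `policies_in_force` are hypothetical.

```python
# Stage, collection state, and the policy added at that stage, as listed on the slide.
LIFE_CYCLE = [
    ("Project Collection", "Private", "Local Policy"),
    ("Data Grid", "Shared", "Distribution Policy"),
    ("Data Processing Pipeline", "Analyzed", "Service Policy"),
    ("Digital Library", "Published", "Description Policy"),
    ("Reference Collection", "Preserved", "Representation Policy"),
    ("Federation", "Sustained", "Re-purposing Policy"),
]


def policies_in_force(stage_name):
    """Return the policies governing a collection at the given stage.

    Assumption: each stage adds one policy on top of all earlier ones.
    """
    policies = []
    for stage, _state, policy in LIFE_CYCLE:
        policies.append(policy)
        if stage == stage_name:
            return policies
    raise KeyError(stage_name)


print(policies_in_force("Digital Library"))
```

Under this reading, the number of policies in force is a simple measure of how far a collection has moved from a private project collection toward a federated one.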
9
Accomplishments
Installed three data grids: OOI, Drexel engineering, USC Hydrology
Installed Federation hub at RENCI
Based on version 3.1 of iRODS data grid
Federated with EUDAT, NCDC
Created engineering digital library
Integration of MediaWiki with iRODS
Automated hydrology workflows
Established collaborations with NCDC, NCCS, EarthCube
10
DataNet Federation Communication Ports
ooi Zone: icat.oceanobservatories.org, port 1247; 4 resources at ucsd_irods.oceanobservatories.org, icat.oceanobservatories.org, cg_east_whoi.oceanobservatories.org, ooi.coas.oregonstate.edu
hydrology Zone: iren.renci.org, port 2823; 2 resources at ce-broad.ce.sc.edu, iren.renci.org
dfcmain Zone: iren.renci.org, port 1237; federates with 4 zones; 2 resources at iren.renci.org, srbbrick15.ucsd.edu
renci Zone: iren.renci.org, port 1247; > 10 resources
engineering Zone: edge.cs.drexel, port 1247; 1 resource at edge.cs.drexel
Resource labels in the diagram: ooi-ucsdResc1, ooi-osuResc1, ooi-icatResc1, ooi-cgResc1, res-dfcmain, res-bk15, usc-resource, renci-vault19, renci-vault2, renci-vault1, loadingResc, europa-vault1, resource group edge
Connections between zones and resources are labeled with ports 1237, 1247, and 2823
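The zone endpoints on this slide can be collected in a small table. The sketch below uses only the zone names, hosts, and ports given on the slide; the helper names `zones_by_host` and `port_conflicts` are hypothetical. It illustrates why three different ports appear: three zones are served from iren.renci.org, so each needs its own port.

```python
from collections import defaultdict

# Zone name -> (catalog host, port), copied from the slide.
ZONES = {
    "ooi": ("icat.oceanobservatories.org", 1247),
    "hydrology": ("iren.renci.org", 2823),
    "dfcmain": ("iren.renci.org", 1237),
    "renci": ("iren.renci.org", 1247),
    "engineering": ("edge.cs.drexel", 1247),
}


def zones_by_host(zones):
    """Group zones by host so co-located zones are easy to spot."""
    grouped = defaultdict(dict)
    for zone, (host, port) in zones.items():
        grouped[host][zone] = port
    return dict(grouped)


def port_conflicts(zones):
    """Return (zone_a, zone_b, endpoint) for zones sharing a host and port."""
    seen = {}
    conflicts = []
    for zone, endpoint in zones.items():
        if endpoint in seen:
            conflicts.append((seen[endpoint], zone, endpoint))
        else:
            seen[endpoint] = zone
    return conflicts


print(zones_by_host(ZONES)["iren.renci.org"])
print(port_conflicts(ZONES))
```

A check like `port_conflicts` is the kind of test one might run before adding another zone to a host that already serves several.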
11
iRODS Integration in MediaWiki
Date: July 10th, 2012
12
New features – iRODS wikipage
Any MediaWiki page that is added or edited from now on is synchronized with iRODS (a copy of the page is stored on the iRODS server). You can tell whether a page is synchronized with iRODS by looking at the bottom of the page, under “Irods Report”.
13
iRODS File Details
14
Hydrology Use Cases
VIC model automation (USC)
RHESSys model automation (UNC-CH, EarthCube)
Sharing of workflows
NCDC archiving of data from OOI
SigClimate sustainability group (NCDC, NCCS)
15
Eco-Hydrology
RHESSys workflow to develop a nested watershed parameter file (worldfile) containing a nested ecogeomorphic object framework, and full, initial system state.
Workflow steps and data sources in the diagram: Choose gauge or outlet (HIS); Extract drainage area (NHDPlus); Digital Elevation Model (DEM), Slope, Aspect; Streams (NHD); Roads (DOT); Land Use NLCD (EPA); Leaf Area Index, Landsat TM; Phenology, MODIS; Soil Data, USDA
Nested watershed structure: Basin, Hillslope, Patch, Strata
Products: Soil and vegetation parameter files, Stream network, Flowtable, Worldfile, RHESSys
16
Workflow Management
Workflow file: eCWkflow.mss
Directory holding all input and output files associated with the workflow file (mounted collection that is linked to the workflow file): /earthCube/eCWkflow
Automatically generated run file for executing each input file: eCWkflow.run, eCWkflow2.run
Input parameter file, lists parameters and input and output file names: eCWkflow.mpf, eCWkflow2.mpf
Directory holding all output files generated for an invocation of eCWkflow.run (the version number is incremented on each invocation): /earthCube/eCWkflow/eCWkflow.runDir0
Output file created for eCWkflow.mpf: Outfile
Directory for eCWkflow2.run: /earthCube/eCWkflow/eCWkflow2.runDir0
Output file created for eCWkflow2.mpf: Newfile
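The naming convention for output directories can be sketched as follows. This is an illustration of the convention shown on the slide (run file name, then `Dir`, then a version number), not code from iRODS; the function `next_run_dir` is hypothetical, and starting the numbering at 0 is an assumption based on the `runDir0` examples.

```python
import re


def next_run_dir(collection, run_file, existing_names):
    """Return the path of the next output directory for a run file.

    existing_names: names already present in the mounted collection.
    Assumption: versions start at 0 and increase by one per invocation.
    """
    pattern = re.compile(re.escape(run_file) + r"Dir(\d+)")
    versions = []
    for name in existing_names:
        match = pattern.fullmatch(name)
        if match:
            versions.append(int(match.group(1)))
    next_version = max(versions) + 1 if versions else 0
    return f"{collection}/{run_file}Dir{next_version}"


first = next_run_dir("/earthCube/eCWkflow", "eCWkflow.run", [])
second = next_run_dir(
    "/earthCube/eCWkflow",
    "eCWkflow.run",
    ["eCWkflow.runDir0", "eCWkflow2.runDir0"],
)
print(first)
print(second)
```

Because each invocation writes into a fresh directory, earlier outputs are kept, and directories belonging to a different run file such as `eCWkflow2.run` do not affect the count.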
17
Workflow Re-execution & Sharing
Workflow file: eCWkflow.mss (imcoll, imcoll, ….)
/earthCube/eCWkflow: eCWkflow.run, eCWkflow.mpf
/earthCube/eCWkflow/eCWkflow.runDir0: Outfile
/earthCube/eCWkflow/eCWkflow.runDir1: Outfile
/hydrology/myWkflow: myWkflow.run, myWkflow.mpf
/hydrology/myWkflow/myWkflow.runDir0: Outfile
/hydrology/myWkflow/myWkflow.runDir1: Outfile
18
Re-use Architecture Components
After generating results within a collaboration environment:
Apply appropriate policies and procedures to publish the results as a digital library
Register a Community Resource that can be used by subsequent research initiatives
Diagram labels: Research Environment {9}; Portals, Applications {5}, Workflows {2}; Collaboration Environment – Data Grid {9}; Protocols {0}; Web Services {6}; Brokers {7}; Community Resource
19
Education
Develop policies and procedures to make “live” collections accessible by students
Support classification, categorization, feature detection algorithms
Integrate with student digital libraries
UNC-CH School of Information and Library Science LifeTime Library
Students build their own personal reference collection
20
Life-Time Library (SILS)
Student digital libraries
Enable students to build collections of: Photographs, MP3 audio files, Video, Class documents, Web site archive
Resources provided by School of Information and Library Science
Student collections range from 2 GBytes to 150 GBytes
Number of files from 2,000 to 12,000
21
LifeTime Library Policies
Integrity: Replication, Checksums, Versioning
Management: Strict access controls, Quotas, Metadata catalog replication, Installation environment archiving
Ingestion: Automated synchronization of student directory with LifeTime Library
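How replication and checksums work together as an integrity policy can be shown with a minimal sketch. It is not the LifeTime Library implementation: the functions `checksum`, `verify_replicas`, and `repair`, the resource names, and the choice of SHA-256 are all assumptions made for illustration, and replicas are modeled as in-memory byte strings.

```python
import hashlib


def checksum(data: bytes) -> str:
    # SHA-256 is an assumption; the slide does not name an algorithm.
    return hashlib.sha256(data).hexdigest()


def verify_replicas(registered, replicas):
    """Return names of replicas whose content no longer matches the registered checksum."""
    return sorted(name for name, data in replicas.items() if checksum(data) != registered)


def repair(registered, replicas):
    """Overwrite damaged replicas from an intact one and return the repaired names."""
    bad = verify_replicas(registered, replicas)
    good = [name for name in replicas if name not in bad]
    if not good:
        raise RuntimeError("no intact replica left")
    for name in bad:
        replicas[name] = replicas[good[0]]
    return bad


original = b"class notes, week 3"
registered = checksum(original)            # stored in the catalog at ingestion
replicas = {"resc-a": original, "resc-b": b"class notes, week 3 (damaged)"}

repaired = repair(registered, replicas)
print(repaired)
print(verify_replicas(registered, replicas))
```

The checksum detects the damaged copy and the second replica makes recovery possible, which is why the two policies are listed together.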