EarthCube Layered Architecture Concept Award Interoperability Mechanisms.



Layered Architecture Concept Award
– Reagan Moore (UNC-CH/DICE): Collaboration environments
– Ilkay Altintas (UCSD): Workflows
– David Arctur (OGC): Web services
– Lawrence Band (UNC-CH/IE): Eco-hydrology modeling
– Liping Di (GMU): Geospatial knowledge building
– Janet Fredericks (WHOI): Data quality
– Jeff Horsburgh (Utah State University): CUAHSI / DataONE
– Yong Liu (UIUC / NCSA): Workflows / Cyberintegrator
– Chris MacDermaid (Colorado State Univ.): Physics model frameworks
– Brian Miles (UNC-CH/IE): Eco-hydrology workflows
– Michael Schoffner (RENCI): Web service integration
– Antoine de Torcy (UNC-CH/DICE): Workflow integration
– Weiguo Han (GMU): GeoBrain

Loosely Coupled – Layered Architecture (EarthCube Infrastructure)
[Diagram: stacked layers, with adjacent layers connected by protocols and web services]
– Research Environment: applications, workflows
– Collaboration Environment: data grids, portals
– Brokers / Messaging / Structured Object Manipulation
– Community Resources
– Policies

Interoperability Mechanisms
For interactions with a collaboration environment (interaction: implementation, mechanism type):
– Register a remote file into the collaboration space: collection of links (soft link)
– Operations on a remote file: THREDDS, OPeNDAP, NetCDF, HDF5, FITS, … (POSIX I/O extensions)
– Access (get, put) a remote file: web service invoking a remote protocol (micro-services)
– Asynchronously post and read messages: message queue forwarding via AMQP (queuing)
– Operations on aggregations of files: operations associated with a collection (POSIX I/O extensions)
– Operations on aggregations of procedures: workflows and structured information exchange (rule exchange)
– Policy enforcement: policy-encoded objects (policy exchange)
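The first mechanism, registering a remote file as a soft link, can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: the class names `SoftLink` and `Collection`, the collection path, and the example URL are hypothetical and are not taken from the slides or from any data grid API. The point it shows is that registration records a pointer to the remote file and copies no data.

```python
from dataclasses import dataclass, field


@dataclass
class SoftLink:
    """A logical name in the collaboration space that points at a remote file."""
    logical_path: str  # hypothetical path inside the collection
    remote_url: str    # where the bytes actually live


@dataclass
class Collection:
    """Toy in-memory collection of links; registration copies no data."""
    name: str
    links: dict = field(default_factory=dict)

    def register(self, file_name: str, remote_url: str) -> SoftLink:
        logical_path = f"{self.name}/{file_name}"
        if logical_path in self.links:
            raise ValueError(f"{logical_path} is already registered")
        link = SoftLink(logical_path, remote_url)
        self.links[logical_path] = link
        return link

    def resolve(self, logical_path: str) -> str:
        """Return the remote location behind a logical path."""
        return self.links[logical_path].remote_url


# Hypothetical names, for illustration only.
collection = Collection("/earthCube/demo")
link = collection.register("dem_tile.img", "https://example.org/data/dem_tile.img")
print(link.logical_path, "->", collection.resolve(link.logical_path))
```

A real collaboration environment would persist the link in a catalog and apply access controls; the sketch keeps only the name-to-location mapping.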

Use Case Collaborations
– Register a remote file into the collaboration space: DataNet Federation Consortium (DFC) data grid
– Operations on a remote file: THREDDS, OPeNDAP, NetCDF, HDF5, storage drivers for OOI
– Access (get, put) a remote file: DataONE, CUAHSI, (OGC, Data Conservancy)
– Asynchronously post and read messages: SEAD - VIVO
– Operations on aggregations of files: OOI time series archive
– Operations on aggregations of procedures: Kepler (Gulf of Mexico hypoxia), NCSA Cyberintegrator (Texas drought)
– Policy enforcement: Research Data Alliance policy sharing

Use Cases
– Demonstrate reproducible science. A use case could include the registration, storage, sharing, and re-execution of a workflow. The hypoxia use case from the Cross-Domain and Brokering Concept groups could be used as an example.
– Automate data retrieval. A use case could demonstrate remote access to a data collection, retrieval of desired data sets, transformation, and use in an analysis workflow. An eco-hydrology example that automates access to digital elevation maps and land use coverage is being built.
– Integrate community resources with collaboration environments. An example would be use of the DAB protocol to identify and cache local copies of relevant data sets for local analysis.
– Integrate multiple community resources. A use case could be demonstration of invocation of multiple workflow systems within the same analysis. An example is the integration of Cyberintegrator workflow with collaboration environments to support drought prediction.

Eco-Hydrology
[Figure: RHESSys workflow to develop a nested watershed parameter file (worldfile) containing a nested ecogeomorphic object framework, and full, initial system state.]
Diagram labels:
– Steps: Choose gauge or outlet (HIS); Extract drainage area (NHDPlus)
– Terrain inputs: Digital Elevation Model (DEM); Slope; Aspect; Streams (NHD); Roads (DOT)
– Nested watershed structure: Strata; Hillslope; Patch; Basin; Stream network
– Land and soil inputs: Land Use; Leaf Area Index; Phenology; Soil Data
– Data sources: NLCD (EPA); Landsat TM; MODIS; USDA
– Products: Worldfile; Flowtable; Soil and vegetation parameter files; RHESSys

iRODS Rule for RHESSys
Modular workflow composed by chaining basic transformations:
– Define input variables
– Call functions to apply each transformation step
– Store results in shared collection

main {
    getExtentForGageReachcode(*gageReachcode, *extentInNHD_Vect_Coords);
    convertExtentToNHD_DEM(*extentInNHD_Vect_Coords, *extentInNHD_DEM_Coords);
    extractTileFromNHD_DEM(trimr(*extentInNHD_DEM_Coords, "\n"));
    importDEMTileIntoNewGRASSLocationAsUTM(*extentInNHD_Vect_Coords, *newLocPhysPath, *newLocObjPath);
    delineateWatershedForNHDGage(*nhdStreamGageID, *newLocPhysPath, *newLocObjPath);
}
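The chaining pattern in the rule above (each step consumes what the previous step produced) can be sketched outside iRODS as well. The Python below is a minimal illustration, not a port of the rule: the step functions are placeholders with invented return values, and the names `get_extent`, `convert_extent`, `extract_tile`, and `run_workflow` are hypothetical.

```python
from functools import reduce
from typing import Callable, Dict, List

State = Dict[str, str]
Step = Callable[[State], State]


def get_extent(state: State) -> State:
    # Placeholder: a real step would look the extent up from the gage reach code.
    return {**state, "extent_vect": f"extent-of-{state['gage_reachcode']}"}


def convert_extent(state: State) -> State:
    # Placeholder for a coordinate-system conversion.
    return {**state, "extent_dem": state["extent_vect"].upper()}


def extract_tile(state: State) -> State:
    # Placeholder for the tile extraction step.
    return {**state, "tile": f"SUBSET({state['extent_dem']})"}


def run_workflow(steps: List[Step], inputs: State) -> State:
    """Apply each step in order, threading the accumulated state through."""
    return reduce(lambda state, step: step(state), steps, dict(inputs))


result = run_workflow(
    [get_extent, convert_extent, extract_tile],
    {"gage_reachcode": "demo-reach"},
)
print(result["tile"])
```

Keeping each step a plain function of the accumulated state is one way to make steps swappable and individually testable, which is the modularity the slide describes.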

extractTileFromNHD_DEM(*extentCoords) {
    # Split path to object into collection and name
    msiSplitPath(*nhdDEMObjPath, *nhdDEMObjColl, *nhdDEMObjName);
    writeLine("serverLog", *nhdDEMObjColl);
    writeLine("serverLog", *nhdDEMObjName);
    # Build query to discover physical path
    msiAddSelectFieldToGenQuery("DATA_PATH", "null", *genQInp);
    msiAddConditionToGenQuery("DATA_NAME", "=", *nhdDEMObjName, *genQInp);
    msiAddConditionToGenQuery("COLL_NAME", "=", *nhdDEMObjColl, *genQInp);
    msiAddConditionToGenQuery("DATA_RESC_NAME", "=", *rescName, *genQInp);
    # Run query
    msiExecGenQuery(*genQInp, *genQOut);
    # Extract path from query result
    foreach (*genQOut) {
        msiGetValByKey(*genQOut, "DATA_PATH", *filePath);
    }
    writeLine("serverLog", *filePath);
    # Determine physical path of input directory
    msiSplitPath(*filePath, *inFileDir, *headerFileIgnore);
    # Generate physical path of output file
    msiSplitPath(*inFileDir, *inFileParentDir, *rasterDatasetName);
    *tileFileName = "SUBSET-"++*rasterDatasetName++".img";
    *tileFilePath = *inFileParentDir++"/"++*tileFileName;
    # Generate iRODS path of output
    msiSplitPath(*nhdDEMObjColl, *nhdDEMObjCollParent, *junk);
    *tileObjPath = *nhdDEMObjCollParent++"/"++*tileFileName;
    # Build gdal_translate arguments and run the command on the remote host
    *args = "-of HFA -projwin "++*extentCoords++" "++"'*inFileDir'"++" "++"'*tileFilePath'";
    writeLine("serverLog", *args);
    msiExecCmd("gdal_translate", *args, "iren.renci.org", "null", "null", *cmd_out);
    writeLine("serverLog", *cmd_out);
    # Register tile file with iRODS
    msiPhyPathReg(*tileObjPath, *rescName, *tileFilePath, "null", *status);
}
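The path arithmetic in this rule is easy to misread, so here is the same derivation as a small Python sketch. It only mirrors the string handling (splitting paths, naming the tile, assembling the `gdal_translate` arguments); it does not query a catalog, run GDAL, or register anything. The function name `build_gdal_translate_args` and all example paths and coordinates are hypothetical.

```python
import posixpath
import shlex


def build_gdal_translate_args(header_file_path, dem_coll, extent_coords):
    """Mirror the rule's path handling for one DEM raster.

    header_file_path: physical path returned by the catalog query (assumed)
    dem_coll:         logical collection holding the DEM object (assumed)
    extent_coords:    space-separated window coordinates for -projwin
    """
    # Physical directory of the raster, then its parent and dataset name
    in_file_dir = posixpath.dirname(header_file_path)
    in_file_parent_dir, raster_dataset_name = posixpath.split(in_file_dir)
    # Output tile: SUBSET-<dataset>.img next to the raster directory
    tile_file_name = f"SUBSET-{raster_dataset_name}.img"
    tile_file_path = posixpath.join(in_file_parent_dir, tile_file_name)
    # Logical path of the tile: same name, in the parent collection
    tile_obj_path = posixpath.join(posixpath.dirname(dem_coll), tile_file_name)
    args = ["-of", "HFA", "-projwin", *extent_coords.split(),
            in_file_dir, tile_file_path]
    return args, tile_file_path, tile_obj_path


# Hypothetical paths and coordinates, for illustration only.
args, tile_file_path, tile_obj_path = build_gdal_translate_args(
    "/vault/nhd/dem/elev_cm/hdr.adf",
    "/zone/home/nhd/dem/elev_cm",
    "100 200 300 50",
)
print("gdal_translate " + shlex.join(args))
print("register", tile_file_path, "as", tile_obj_path)
```

Passing the arguments as a list rather than one quoted string is a design choice of the sketch; it sidesteps the manual single-quoting that the rule performs around the two paths.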

Event-Driven Real-Time Drought Analysis/Prediction Workflow
[Diagram labels]
– Data Grid – Collaboration Environment
– NCSA Cyberintegrator
– RAPID (river routing model)
– Inputs: NASA NLDAS-2; other data sources
– Actions: Invoke; Monitor; Output; Store; Visualization

Management of Workflows
Workflow components:
– File containing input parameters, input file names, output file names
– Input files
– File containing workflow language
– Output files
Each invocation of the workflow generates a versioned instance:
– Compare results across input file versions
– Share workflows
– Re-execute workflows
Automatically associates input parameters with each workflow invocation and with resulting output files

Workflow Management
– eCWkflow.mss: Workflow file
– /earthCube/eCWkflow: Directory holding all input and output files associated with the workflow file (a mounted collection that is linked to the workflow file)
– eCWkflow.mpf: Input parameter file; lists parameters and input and output file names
– eCWkflow.run: Automatically generated run file for executing each input file
– /earthCube/eCWkflow/eCWkflow.runDir0: Directory holding all output files generated for an invocation of eCWkflow.run; the version number is incremented for each invocation
– Outfile: Output file created for eCWkflow.mpf
– eCWkflow2.mpf, eCWkflow2.run, /earthCube/eCWkflow/eCWkflow2.runDir0: Second parameter file with its run file and output directory
– Newfile: Output file created for eCWkflow2.mpf
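The versioned run directories (`runDir0`, then `runDir1` on the next invocation) follow a simple naming scheme that can be sketched in a few lines. This is an illustration of the naming convention only, assuming directories are named `<workflow dir>/<run name>.runDir<N>` as on the slide; the function `next_run_dir` is hypothetical and is not how iRODS itself allocates the directory.

```python
import re


def next_run_dir(workflow_dir, run_file, existing_dirs):
    """Return the output directory for the next invocation of run_file.

    Versions start at 0 and increase by one per invocation, so earlier
    outputs are never overwritten.
    """
    stem = run_file[:-len(".run")] if run_file.endswith(".run") else run_file
    prefix = f"{workflow_dir}/{stem}.runDir"
    pattern = re.compile(re.escape(prefix) + r"(\d+)$")
    versions = []
    for path in existing_dirs:
        match = pattern.match(path)
        if match:
            versions.append(int(match.group(1)))
    next_version = max(versions) + 1 if versions else 0
    return f"{prefix}{next_version}"


existing = [
    "/earthCube/eCWkflow/eCWkflow.runDir0",
    "/earthCube/eCWkflow/eCWkflow2.runDir0",
]
first = next_run_dir("/earthCube/eCWkflow", "eCWkflow.run", [])
second = next_run_dir("/earthCube/eCWkflow", "eCWkflow.run", existing)
print(first)
print(second)
```

Because each parameter file has its own run name, `eCWkflow2.runDir0` does not affect the numbering for `eCWkflow.run`.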

Workflow Re-execution & Sharing
[Diagram: directory layout for re-executing and sharing a workflow]
– eCWkflow.mss
– /earthCube/eCWkflow: eCWkflow.mpf, eCWkflow.run
  – /earthCube/eCWkflow/eCWkflow.runDir0: Outfile
  – /earthCube/eCWkflow/eCWkflow.runDir1: Outfile
– /hydrology/myWkflow: myWkflow.mpf, myWkflow.run
  – /hydrology/myWkflow/myWkflow.runDir0: Outfile
  – /hydrology/myWkflow/myWkflow.runDir1: Outfile
– Other labels: …, imcoll

DFC + DataONE Interoperability
Goal: support interoperability between a DFC data grid and DataONE
Task: retrieve a file from DataONE, load it into a DFC collaboration environment, and add metadata

How It Works
1. Query DataONE Coordinating Nodes with a SOLR query
2. Create an iRODS collection with the same name as the query
3. Get the list of identifiers for metadata files from the search
4. Download the metadata file for each identifier
5. Store the metadata file in the DFC data grid
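The five steps can be sketched as a small pipeline. This is a hedged outline, not the demo's code: the Coordinating Node URL and the query parameters are assumptions about the DataONE SOLR interface, and the `search`, `download`, and `store` callables are hypothetical stand-ins for the network and data grid calls, stubbed here so the sketch runs offline.

```python
from urllib.parse import urlencode

# Assumed endpoint; check the DataONE API documentation before relying on it.
CN_SOLR_ENDPOINT = "https://cn.dataone.org/cn/v2/query/solr/"


def build_query_url(term, rows=10):
    """Step 1: build a SOLR query URL asking only for identifiers."""
    params = {"q": term, "fl": "identifier", "rows": rows, "wt": "json"}
    return CN_SOLR_ENDPOINT + "?" + urlencode(params)


def harvest_metadata(term, search, download, store):
    """Run steps 1-5 using injected search, download, and store callables."""
    collection = term                              # step 2: collection named after the query
    identifiers = search(build_query_url(term))    # steps 1 and 3
    for identifier in identifiers:
        document = download(identifier)            # step 4
        store(collection, identifier, document)    # step 5
    return collection, identifiers


# Offline stubs with made-up identifiers, for illustration only.
grid = {}


def fake_search(url):
    return ["example-id-1", "example-id-2"]


def fake_download(identifier):
    return f"<metadata id='{identifier}'/>"


def fake_store(collection, identifier, document):
    grid.setdefault(collection, {})[identifier] = document


collection, identifiers = harvest_metadata("rain", fake_search, fake_download, fake_store)
print(collection, sorted(grid[collection]))
```

Injecting the three callables keeps the control flow testable without a network connection; a real client would replace the stubs with HTTP requests and data grid uploads.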

What the Demo Shows
[Diagram: data flow through REST APIs into the collection “rain”]
1. Query: “rain”
2. Matching identifier list
3. Get metadata file for each identifier
4. File goes into collection
Other labels: Mercury; Web portal; file

– EarthCube Layered Architecture: NSF EAGER
– DataNet Federation Consortium: NSF OCI
– iRODS Policy-based data management: NSF SDCI