The HydroShare domain-specific repository for archiving and active management of hydrologic data
David G Tarboton, Jerad Bales, Ray Idaszak, Jeffery S Horsburgh, Daniel P Ames, Jonathan L Goodall, Alva Couch, Lawrence E Band, Venkatesh Merwade, Richard P Hooper, David R Maidment, Pabitra K Dash, Michael Stealey, Hong Yi, Christopher Calloway, Tian Gan, Anthony M Castronova, Zhiyu Li, Mohamed M Morsy, Shawn Crawley, Maurier Ramirez, Jeffrey Sadler, Zhaokun Xue, Martyn Clark, Shaowen Wang, Bart Nijssen and Christina Bandaragoda
Access these slides in HydroShare by searching for RDA2018 or going to https://www.hydroshare.org/resource/eb68813d6dec4aa4b4325107b406bb1a/
HydroShare is operated by CUAHSI, with ongoing development through a collaborative project among Utah State University, Brigham Young University, the CyberGIS Center at the University of Illinois, Tufts, the University of Virginia, and RENCI at the University of North Carolina. http://www.hydroshare.org
Awards: OAC-1664061, OAC-1664018, OAC-1664119 (2017-2021); ACI-1148453, ACI-1148090 (2012-2017)
What is CUAHSI?
CUAHSI (Consortium of Universities for the Advancement of Hydrologic Science) is a 501(c)(3) non-profit consortium of about 130 U.S. academic institutions, non-profits, and international universities.
Its mission is to shape the future of water science by: strengthening interdisciplinary collaboration in the water-science community; empowering the community by providing critical infrastructure; and promoting education in the water sciences at all levels.
Key activities: community services, such as workshops, community meetings, and training; data and model services, including HydroShare and time-series services.
Key support comes from: Department of Homeland Security; Federal Emergency Management Agency; Johnson Family Foundation; National Aeronautics and Space Administration; National Oceanic and Atmospheric Administration; National Science Foundation; National Weather Service; William Penn Foundation.
Motivation: Hydrologic research is a team sport
Advancing hydrologic understanding requires integration of information from multiple sources, may be data- and computationally intensive, and requires collaboration and working as a team/community.
Grand challenge (NRC 2001): better hydrologic forecasting that quantifies the effects and consequences of land surface change on hydrologic processes and conditions (floods and droughts).
CyberInfrastructure Challenges (Data, Analysis, Models)
The data deluge: large datasets, data heterogeneity, inadequate metadata.
Data organization and model input preparation.
Reproducibility.
Software installation and configuration: platform dependencies, library dependencies, licensing.
Computational resources: memory, disk and processing.
Data and models used by hydrologists are diverse…
Time series; geographic rasters; geographic features; multidimensional space/time; model programs; model instances; …
But the data we use are diverse, and we don’t currently have a lot of options for sharing models or model instances. We’ve nailed time series down, but we still don’t have great cyberinfrastructure embraced by our community that supports the broad spectrum of data and models we use.
HydroShare can hold data in a wide variety of formats, and data in any format as “generic”.
http://www.usgs.gov http://www.unidata.ucar.edu http://www.esri.com
From Jeff Horsburgh
HydroShare is a platform for sharing Hydrologic Resources and Collaborating
The goal of HydroShare is to advance hydrologic science by enabling the scientific community to more easily and freely share products resulting from their research - not just the scientific publication summarizing a study, but also the data and models used to create the scientific publication.
File storage (Dropbox-ish functionality); metadata descriptions; data access API; web apps; social functions; DOI data publication; value-added functionality.
From Dan Ames
HydroShare provides
Data life cycle: data creation, data preparation, data description, data publication, data discovery, data analysis.
Share a dataset in HydroShare with trusted colleagues or with the open public.
Data versioning function for dataset curation and update as the dataset evolves.
Metadata functions for researchers to annotate the dataset.
Data publication function to make data citable and permanently published with a DOI.
Search and filter function for researchers to discover the dataset.
OPeNDAP service for researchers to reuse the dataset and derive new datasets.
A platform for data management to support mandates for open data and access to the data that supports research findings.
Integration of information from multiple sources to enhance research.
Re-use of data beyond the purpose for which it was originally collected, extending the value of measurement, monitoring and research investments.
A data life-cycle approach to capture metadata early and often.
Tools (web apps) that provide “what is in it for me” value for working with data in HydroShare, and serve as a gateway to high performance computing and computing in the cloud.
Enhanced trust in research findings and management decisions through transparency and support for reproducibility.
From Tian Gan
Resource data model based on the OAI-ORE standard
Collaborative data sharing Add content to HydroShare to share with your colleagues or formally publish to document result reproducibility
Resources (data and models) in HydroShare are objects of collaboration (social objects)
For each resource you can: manage who has access (to edit, to view); comment or rate; get a unique identifier; describe with metadata; organize into collections; formally publish with a DOI; version; open with a compatible web app.
Automatic and natural metadata gathering eases some of the pain of metadata entry
For geographic rasters, WGS 84 coverage information is automatically harvested from the GeoTIFF coordinate system information.
For multidimensional netCDF data with CF convention metadata, the HydroShare metadata can be fully and automatically completed.
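The raster coverage harvesting described on this slide can be illustrated with a minimal sketch. It assumes the raster is already in WGS 84 geographic coordinates and that its six-element affine geotransform (GDAL ordering) and pixel dimensions have been read from the GeoTIFF beforehand. The function name `coverage_from_geotransform` and the dictionary keys are hypothetical, chosen for illustration; this is not HydroShare's actual harvesting code.

```python
from typing import Dict, Sequence


def coverage_from_geotransform(
    geotransform: Sequence[float], width: int, height: int
) -> Dict[str, float]:
    """Return a bounding box for a raster from its affine geotransform.

    The geotransform follows the GDAL ordering:
    (x_origin, pixel_width, row_rotation, y_origin, column_rotation, pixel_height).
    Assumption: coordinates are already WGS 84 longitude/latitude, so no
    reprojection is performed here.
    """
    x0, dx, rx, y0, ry, dy = geotransform
    # Map the four pixel-space corners (column, row) to georeferenced coordinates.
    corners = [(0, 0), (width, 0), (0, height), (width, height)]
    xs = [x0 + col * dx + row * rx for col, row in corners]
    ys = [y0 + col * ry + row * dy for col, row in corners]
    return {
        "westlimit": min(xs),
        "eastlimit": max(xs),
        "southlimit": min(ys),
        "northlimit": max(ys),
    }


# Hypothetical north-up raster: 8 columns by 4 rows of 0.25-degree pixels.
bbox = coverage_from_geotransform((-112.0, 0.25, 0.0, 42.0, 0.0, -0.25), 8, 4)
```

Taking the minimum and maximum over all four corners, rather than only the origin and the opposite corner, keeps the box correct when the rotation terms are non-zero.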
Apps act on resources to support web-based visualization and analysis http://www.hydroshare.org/apps
Jupyter Python Notebook App Write and execute code acting on content of HydroShare resources and saving results back to HydroShare
Resource Organization Concepts
A composite resource can hold multiple aggregations, each being a different type of data, managed as one discoverable resource with: one set of access controls (owners, editors, etc.); one unique identifier; one set of resource-level metadata.
A collection can hold multiple resources. Each resource has its own unique identifier, its own access control (separate owners, editors, etc.), and separate resource-level metadata and landing page.
Collections and their members may each be discovered separately.
Composite resources may be members of collections.
Unique keyword tags form informal collections (e.g. “AGU2017”).
[Diagram labels: Composite; Multi-dimensional; Feature; Grid; Collection]
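The distinction between composite resources and collections can be sketched as a small data model. Everything below is illustrative: the class names, fields, and the `informal_collection` helper are hypothetical and only mirror the concepts on this slide, not HydroShare's internal schema.

```python
from dataclasses import dataclass, field
from typing import List, Set
from uuid import uuid4


@dataclass
class Aggregation:
    """One typed group of content inside a composite resource."""
    data_type: str  # e.g. "multidimensional", "feature", "grid"
    files: List[str]


@dataclass
class CompositeResource:
    """Many aggregations under one identifier, one access-control set,
    and one set of resource-level metadata."""
    title: str
    owners: Set[str]
    editors: Set[str] = field(default_factory=set)
    keywords: Set[str] = field(default_factory=set)
    aggregations: List[Aggregation] = field(default_factory=list)
    identifier: str = field(default_factory=lambda: uuid4().hex)

    def can_edit(self, user: str) -> bool:
        # A single permission check covers every aggregation in the resource.
        return user in self.owners or user in self.editors


@dataclass
class Collection:
    """Groups resources; each member keeps its own identifier and access control."""
    title: str
    owners: Set[str]
    members: List[CompositeResource] = field(default_factory=list)
    identifier: str = field(default_factory=lambda: uuid4().hex)


def informal_collection(resources: List[CompositeResource], tag: str) -> List[CompositeResource]:
    """Resources sharing a keyword tag form an informal collection."""
    return [r for r in resources if tag in r.keywords]


# Hypothetical content for illustration only.
snow = CompositeResource(
    "Snow model inputs",
    owners={"alice"},
    editors={"bob"},
    keywords={"AGU2017"},
    aggregations=[
        Aggregation("grid", ["dem.tif"]),
        Aggregation("multidimensional", ["forcing.nc"]),
    ],
)
flow = CompositeResource("Streamflow observations", owners={"carol"})
study = Collection("Watershed study", owners={"alice"}, members=[snow, flow])
```

In this sketch, editing rights on `snow` apply to both of its aggregations at once, while membership in `study` grants nothing on `flow`, whose access control stays separate.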
How HydroShare Works
Django website: organize and annotate your content; manage access; resource exploration; actions on resources.
HydroShare Apps: web software to operate on content you have access to, connected through the API with OAuth.
iRODS “Network File System”: distributed file storage across the HydroShare Data Store and Federated Data Stores.
Extensibility: anyone can set up a server/app platform (software service) to operate on HydroShare resources through iRODS and the API.
[Diagram labels: SWATShare (Hubzero); CyberGIS; Unidata - THREDDS, Hyrax; Landlab]
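As an illustration of how an external app might address a resource through the API, the sketch below extracts the resource identifier from a HydroShare landing-page URL (the example is the URL given on the title slide for these slides) and builds a REST request URL. The `/hsapi/resource/<id>/sysmeta/` path and all helper names are assumptions for illustration and should be checked against the current HydroShare API documentation. Authentication (OAuth in the diagram) is omitted, so the fetch helper would only apply to publicly readable resources.

```python
import json
import re
from urllib.request import urlopen

BASE_URL = "https://www.hydroshare.org"
# Resource identifiers appear as 32 hexadecimal characters in landing-page URLs.
_RESOURCE_RE = re.compile(r"/resource/([0-9a-f]{32})/?")


def resource_id_from_url(landing_url: str) -> str:
    """Extract the resource identifier from a landing-page URL."""
    match = _RESOURCE_RE.search(landing_url)
    if match is None:
        raise ValueError(f"no resource identifier found in {landing_url!r}")
    return match.group(1)


def system_metadata_url(resource_id: str) -> str:
    """Build the REST URL for a resource's system metadata.

    Assumption: the endpoint layout below is illustrative and unverified.
    """
    return f"{BASE_URL}/hsapi/resource/{resource_id}/sysmeta/"


def fetch_system_metadata(resource_id: str) -> dict:
    """Fetch system metadata as a dictionary (network call, not run here)."""
    with urlopen(system_metadata_url(resource_id), timeout=30) as response:
        return json.load(response)


SLIDES_URL = "https://www.hydroshare.org/resource/eb68813d6dec4aa4b4325107b406bb1a/"
slides_id = resource_id_from_url(SLIDES_URL)
```

Keeping URL construction separate from the network call lets the identifier handling be checked offline.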
Audience and Use Statistics as of 12/9/17
The primary audience is the US hydrologic research community (NSF funding), but HydroShare is open to international use and to use by water resource professionals, educators and citizen scientists.
[Chart: users by type. Categories: Other Specified; Grad Student; Undergrad Student; Post Doc; Government Official; University Professional; Faculty; Commercial Professional]
Only 9% of users have indicated type in their profile.
Summary
A web-based system for data and model sharing.
Access multiple types of hydrologic data using standards-compliant data formats and interfaces.
Flexible discovery functionality.
Model sharing and execution.
Facilitate and ease access to the use of high performance computing.
Social media and collaboration functionality.
Links to other data and modeling systems.
Enable more rapid advances in hydrologic understanding through collaborative data sharing, analysis and modeling.
Thanks to the HydroShare team!
HydroShare is operated by CUAHSI, with ongoing development through a collaborative project among Utah State University, RENCI at the University of North Carolina, the CyberGIS Center at the University of Illinois, Tufts, the University of Virginia, Brigham Young University, the National Center for Atmospheric Research and the University of Washington.
To learn more: Publications https://help.hydroshare.org/about-hydroshare/publish/all/ Online Help https://help.hydroshare.org/ http://www.hydroshare.org
The HydroShare project is part of a broad effort in CUAHSI in the area of Hydrologic Information Systems. We have a team of developers and domain scientists from eight universities working on HydroShare. This is part of the even broader focus in NSF on data management, cyberinfrastructure and sustainable software.