DSET Overview WAG Meeting, Aug. 10, 2017 Matt Mayernik mayernik@ucar.edu https://ncar.ucar.edu/data-stewardship-engineering-team-dset.

DSET Vision: Satisfy user needs by developing complete organization-wide data discovery and access for the scientific research community. data.ncar.edu, data.ucar.edu: a single front door to ALL community data.

Working Definition: Data. Digital assets intended for scientific community use, including files and metadata, publications, reports, images, software (visualization, analysis, model codes), and related data services. Why do we say “science”? Our best chance for success is to limit the scope and satisfy our “primary” users and their needs. That is not to say that development stops there. Better serving the public is important for NCAR/UCAR; this is a future objective. We are just trying to be practical and not overreach at the start. Services include: helping with NSF DMPs, EOL project planning, supporting scientific inquiry, consulting (“help me find this”), and resources like the “Climate Data Guide”.

www2.ucar.edu/research-resources/data-archive-services

Why is this important? Users and colleagues “in the know” are well served, but a new and diverse community of users is easily frustrated. Many services are sprinkled across the organization; they are not well coordinated, and there is no overarching consulting. Current websites are not comprehensive. We need to respond to new requirements from funding agencies and scientific journals.

DSET Membership
Lab Representatives: Rebecca Centeno Elliott (HAO, rce@ucar.edu); Linda Cully (EOL, cully@ucar.edu); Louisa Emmons (ACOM, emmons@ucar.edu); Abby Jaye (MMM, jaye@ucar.edu); Don Kolinski (HAO, kolinski@ucar.edu); Ryan May (Unidata/UCP, rmay@ucar.edu); Matt Mayernik (Lib/UCP, mayernik@ucar.edu); Tor Mohling (RAL, tor@ucar.edu); Eric Nienhouse (CISL, ejn@ucar.edu); David Schneider (CGD, dschneid@ucar.edu); Steven Worley (CISL, worley@ucar.edu); Dan Ziskin (ACOM, ziskin@ucar.edu)
Leadership Team: Liaison to NCAR Executive Committee: Bill Mahoney (RAL). DSET Chair: Steven Worley. Organizing Committee: Steven Worley, Linda Cully, Abby Jaye, Don Kolinski, Matt Mayernik, Eric Nienhouse

DSET Guiding Principles 1) Cross-organizational participation - include all NCAR laboratories in the process, and keep a close relationship with the NCAR executive leadership 2) Science & user-centric development - ensure that the committee includes strong representation from scientists and technical experts 3) Document our processes - keep records of activities, findings, and decisions, for our own benefit and to share with other organizations

Support for DSET: funding for DSET activities. Meeting attendance for members; Metadata Facilitator, Software Engineer (Don Stott, EOL); Scientific Data Management Development Team (CISL SAGE); one-time lab funds for metadata development; Data Stewardship Coordinator/Liaison (Sophie Hou, CISL).

DSET Accomplishments (FY15-FY17): organized the DSET team; inventory of digital assets; metadata evaluation and development; developed a cross-cutting Search and Discovery system (in beta now); developing requirements toward a new data repository.

Digital Asset Services Hub (DASH) - www2.cisl.ucar.edu/dash
Provided by the Data Stewardship Engineering Team (DSET) Initiative
DASH Metadata: ISO standard for NCAR dialect; NMDEdit metadata tool; bulk metadata ingest; lab WAF on GitHub; CKAN metadata harvesting; metadata validation
DASH Consulting: Data Management Plans; preparation help; samples; Digital Object IDs (DOIs); training seminars; in-person assistance
DASH Search: built on CKAN; driven by metadata; user interface (UI); cross-organization asset search and discovery; Application Programming Interface (API); external metadata sharing
DASH Repository: under development; trustworthy requirements; NCAR governance; operational procedures; technical features; user & provider functions
Dimmed text indicates future features

DASH Consulting https://www2.cisl.ucar.edu/dash

DASH Metadata: Define an NCAR metadata dialect. This is not a new local standard but a blending of two existing standards that serve two distinct purposes: DataCite, which enables DOI assignment, publication, and general discovery; and ISO 19115, which enables faceted (detailed) data discovery and is highly used in the geosciences. Challenge: Having standards doesn’t mean everybody interprets or implements the standards in the same way.

DASH Metadata Result: Two separate metadata categories. Minimum Required Metadata: basic elements to support simple discovery, and enough to register a DOI. Enhanced Metadata: will support more system features, e.g. more detailed searching and browsing.
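
The two metadata categories can be illustrated with a minimal sketch. The slides do not list the actual elements of the NCAR dialect, so the field names below are hypothetical, loosely modeled on the kind of properties DataCite requires for DOI registration; treat them as assumptions, not the real schema.

```python
# Hypothetical field names: the slides do not enumerate the NCAR dialect's elements.
MINIMUM_REQUIRED = ("title", "creator", "publisher", "publication_year", "resource_type")
ENHANCED = ("keywords", "spatial_extent", "temporal_extent")


def missing_fields(record, fields):
    """Return the names in `fields` that are absent or empty in `record`."""
    return [name for name in fields if not record.get(name)]


def metadata_tier(record):
    """Classify a record as 'incomplete', 'minimum', or 'enhanced'."""
    if missing_fields(record, MINIMUM_REQUIRED):
        return "incomplete"  # not enough for simple discovery or a DOI
    if missing_fields(record, ENHANCED):
        return "minimum"     # discoverable, but no detailed search facets
    return "enhanced"


example = {
    "title": "Example dataset",
    "creator": "Example Lab",
    "publisher": "UCAR/NCAR",
    "publication_year": 2017,
    "resource_type": "Dataset",
}
print(metadata_tier(example))
```

A record that passes only the first check would still be searchable by free text; the enhanced fields are what a faceted search would draw on.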

DASH Search (beta)

Developing the DASH Search: How do you establish what the system must do, i.e. its capabilities? Defining the system features; user stories that imply requirements; personas: data providers and data consumers; syncing user needs with system and software requirements.

Ranked System Features
1) Free text search
2) Search result ranking
3) Human consulting
4) Asset attribution
5) Faceted search
6) Use metrics
7) Data format translation
8) Visual data asset browse
9) General purpose storage
10) Real time data infrastructure
11) Natural term definitions
12) Long term preservation
13) Asset self archiving
14) Native metadata translation
15) Access control
16) Directed download workflows
17) Metadata sharing
18) Relationship display
This list evolves based on DSET discussions and engineering work.

DASH Search - Architecture
Metadata Entry: NMDEdit (Now); Web Tool (Future)
GitHub Metadata Repository (Old ISO)
NCAR Repos: OpenSky, EOL, RAL, CGD, etc.
GitHub Metadata Repository (New ISO): https://github.com/NCAR/dset-web-accessible-folder-iso19115-3-dev https://github.com/NCAR/dset-web-accessible-folder-iso19115-3-prod
Search and Discovery (CKAN)
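
Because the search and discovery layer is built on CKAN, a client could in principle query it through CKAN's standard Action API. The sketch below only builds a `package_search` request URL and reads titles from an already-parsed response. The host `https://ckan.example.org` is a placeholder: the slides do not give the DASH Search API address, and whether DASH exposes the stock CKAN endpoint unchanged is an assumption.

```python
from urllib.parse import urlencode

# Placeholder host: the slides do not state the DASH Search API address.
CKAN_BASE_URL = "https://ckan.example.org"


def build_search_url(query, rows=10, base_url=CKAN_BASE_URL):
    """Build a CKAN Action API package_search URL for a free-text query."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def dataset_titles(response):
    """Extract dataset titles from a parsed package_search JSON response."""
    if not response.get("success"):
        return []
    return [dataset.get("title", "") for dataset in response["result"]["results"]]


print(build_search_url("sea surface temperature", rows=5))
```

Keeping URL construction separate from response parsing lets both pieces be checked without network access; an HTTP client of your choice would sit between them.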

DASH Search Technology Considerations: metadata schema support & flexibility; usable by community of users (personas); contributor community for open source technology; active support organization and documentation; sustainable development cost; sustainable operational cost; peer organization use; technology permanence and longevity.

DASH Repository – Currently Developing Requirements Establishing a trusted repository: Governance Collections Scope Acquisition Workflows Lifecycle Storage Monitoring

WAG Participation: web technology best practices (portals, GitHub, analytics, etc.); usability testing / interface feedback; repository use cases.