The NOAA National Geophysical Data Center and Collocated World Data Service for Geophysics
Dan Kowal, Data Administrator, Information Services Division, NOAA / NESDIS / NGDC
GeoData Workshop 2014
Failure to Connect?
Technical issues of connecting geodata in and between governmental agencies.
Challenges and Accomplishments
● Metadata Publication
● Software Development
● Data Citation
Metadata Tools
Measurement of Completeness
Table columns: Records (Valid, Invalid); Rubric Scores (Count ≥ 20, Count ≥ 25, Mean, Min, Max)
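The rubric-score columns on this slide (Count ≥ 20, Count ≥ 25, Mean, Min, Max) can be reproduced from a list of per-record scores. The following is a minimal sketch, not the NGDC tooling itself: the function name `summarize_rubric_scores` and the sample scores are hypothetical, and only the two thresholds come from the slide.

```python
from statistics import mean


def summarize_rubric_scores(scores, thresholds=(20, 25)):
    """Summarize per-record rubric scores into the columns shown on the slide."""
    if not scores:
        raise ValueError("no scores to summarize")
    # One "count_ge_<t>" entry per threshold: records scoring at or above it.
    summary = {f"count_ge_{t}": sum(1 for s in scores if s >= t) for t in thresholds}
    summary["mean"] = mean(scores)
    summary["min"] = min(scores)
    summary["max"] = max(scores)
    return summary


# Hypothetical scores for five valid records (not real NGDC data).
example = summarize_rubric_scores([18, 22, 25, 30, 20])
print(example)
```

Keeping the thresholds as a parameter would let the same summary be rerun if the rubric targets change.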
Count of Broken URLs
Table columns: Components, Other Xlinks, Broken URLs, Broken Xlinks (each reported as Count and Reuse)
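A broken-link tally like the one on this slide could be produced along the following lines. This is a sketch under stated assumptions: the sample record and its element names are invented, "Count" is read as the number of distinct broken links and "Reuse" as the total number of times they occur (the slide does not define either), and the reachability check is passed in as a function so no network access is implied.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Clark-notation name of the xlink:href attribute.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def collect_links(xml_text):
    """Return (urls, xlinks) found in one metadata record."""
    root = ET.fromstring(xml_text)
    urls, xlinks = [], []
    for elem in root.iter():
        href = elem.get(XLINK_HREF)
        if href:
            xlinks.append(href)
        text = (elem.text or "").strip()
        if text.startswith(("http://", "https://")):
            urls.append(text)
    return urls, xlinks


def count_broken(links, is_reachable):
    """Assumed meaning: count = distinct broken links, reuse = total occurrences."""
    occurrences = Counter(links)
    broken = {link: n for link, n in occurrences.items() if not is_reachable(link)}
    return {"count": len(broken), "reuse": sum(broken.values())}


# Hypothetical record; element names are illustrative only.
SAMPLE = """<record xmlns:xlink="http://www.w3.org/1999/xlink">
  <contact xlink:href="https://example.org/components/contact-1"/>
  <contact xlink:href="https://example.org/components/contact-1"/>
  <URL>https://example.org/data/a</URL>
  <URL>https://example.org/data/missing</URL>
</record>"""

reachable = {"https://example.org/data/a"}  # stand-in for a real HTTP check
urls, xlinks = collect_links(SAMPLE)
broken_urls = count_broken(urls, lambda link: link in reachable)
broken_xlinks = count_broken(xlinks, lambda link: link in reachable)
print(broken_urls, broken_xlinks)
```

In practice the `is_reachable` argument would wrap an HTTP request with a timeout; it is left as a parameter here so the counting logic can be tested offline.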
Metadata Publication - Local
● NGDC Metadata Homepage – immediately available
● NGDC Geoportal – synchronized weekly or upon request
Software Challenges
● Wide variety of data types
● Diversity of data providers
● Decreasing staff and funds
● Increasing number of data sets (~600 to date)
● Legacy code bases
● Lack of communication
Engineering Objectives
● Common framework
o standardize on common technologies, shared knowledge, centralization supporting tracking / reporting
● Isolate dataset-specific components
o share things like file handling and messaging across disparate datasets
● Modular and extensible
o ease maintenance and facilitate testing, phase in new capabilities (incremental improvements), reduce the likelihood of system-wide impacts from errors or malfunctions
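One way to picture "isolate dataset-specific components" inside a common framework is a small registry: the shared workflow handles lookup, iteration, and reporting, while each dataset contributes only its own parser. This is an illustrative sketch, not NGDC's actual architecture; every name in it (`register_parser`, `ingest`, `IngestReport`, `example_magnetics`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class IngestReport:
    """Centralized record of what the shared workflow processed."""
    dataset: str
    files_processed: int


# Registry mapping a dataset name to its dataset-specific parser.
_PARSERS: Dict[str, Callable[[str], dict]] = {}


def register_parser(dataset: str):
    """Decorator that plugs a dataset-specific parser into the shared framework."""
    def decorator(func: Callable[[str], dict]) -> Callable[[str], dict]:
        _PARSERS[dataset] = func
        return func
    return decorator


def ingest(dataset: str, files: List[str]) -> IngestReport:
    """Shared workflow: look up the parser, apply it to each file, report the total."""
    parser = _PARSERS[dataset]
    records = [parser(name) for name in files]
    return IngestReport(dataset=dataset, files_processed=len(records))


@register_parser("example_magnetics")
def parse_magnetics(filename: str) -> dict:
    # Only this function knows anything about the (hypothetical) dataset format.
    return {"file": filename, "kind": "magnetics"}


report = ingest("example_magnetics", ["a.dat", "b.dat"])
print(report)
```

Adding a new dataset then means registering one more parser; the shared `ingest` path and its reporting stay untouched.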
Engineering Objectives - cont’d
● Industry-standard best practices and patterns
o develop in teams, automated builds, test coverage, leverage industry tools
● Resilient
o eliminate single points of failure, be able to restart processes following errors without data loss, secure
● Minimize custom code
o reduce software maintenance
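The "restart processes following errors without data loss" objective can be illustrated with a simple checkpoint file. This is a minimal sketch of one common approach, assuming items are processed in a fixed order; the function name `process_with_checkpoint` and the JSON checkpoint layout are hypothetical, not taken from the NGDC code base.

```python
import json
import os
import tempfile


def process_with_checkpoint(items, handler, checkpoint_path):
    """Process items in order, recording progress so a rerun resumes after a failure.

    Returns the number of items handled during this call.
    """
    done = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            done = json.load(fh)["done"]
    for index in range(done, len(items)):
        handler(items[index])  # may raise; the checkpoint stays at the last success
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as fh:
            json.dump({"done": index + 1}, fh)
        # Swap the new checkpoint into place in a single rename step.
        os.replace(tmp_path, checkpoint_path)
    return len(items) - done


# Demo: the second run finds the checkpoint and has nothing left to do.
with tempfile.TemporaryDirectory() as workdir:
    checkpoint = os.path.join(workdir, "progress.json")
    processed = []
    first_run = process_with_checkpoint(["a", "b", "c"], processed.append, checkpoint)
    second_run = process_with_checkpoint(["a", "b", "c"], processed.append, checkpoint)
    print(first_run, second_run, processed)
```

If the handler raises partway through, the item that failed is retried on the next run, so handlers in this sketch need to tolerate being called again for that item.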
New Access Interfaces at NGDC
DOI Landing Page
DOI Landing Page
DOI Readiness Assessment
Data Citation Summary
Data Linkage to Publications:
– Data Citation Index in Thomson Reuters’ Web of Knowledge
– Elsevier ScienceDirect – ongoing discussions
Procedural Directive for Data Citation in the works.
– Leverage ESIP Guidance
– NCAR’s Data Citation White Paper
DataCite
– ~50 datasets minted through EZID
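Once a dataset has a DOI, a citation string and landing-page link can be generated from a handful of fields. The sketch below is illustrative only: the field set and citation layout are assumptions rather than the text of the procedural directive or the ESIP guidance, the class name `DatasetCitation` is hypothetical, and the DOI shown is a placeholder, not a minted identifier. The `https://doi.org/` resolver prefix is the standard way to turn a DOI into a URL.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DatasetCitation:
    """Minimal set of fields assumed for a dataset citation."""
    authors: str
    year: int
    title: str
    publisher: str
    doi: str

    def landing_url(self) -> str:
        # The DOI resolver redirects this URL to the dataset's landing page.
        return f"https://doi.org/{self.doi}"

    def cite(self, accessed: date) -> str:
        # Assumed layout: Author (year): Title. Publisher. doi [access date].
        return (
            f"{self.authors} ({self.year}): {self.title}. {self.publisher}. "
            f"doi:{self.doi} [accessed {accessed.isoformat()}]."
        )


# Placeholder values; the DOI below is not a real identifier.
citation = DatasetCitation(
    authors="Example Author",
    year=2014,
    title="Example Gridded Dataset",
    publisher="NOAA National Geophysical Data Center",
    doi="10.0000/EXAMPLE",
)
print(citation.cite(date(2014, 6, 18)))
print(citation.landing_url())
```

Including the access date matters for datasets that continue to be updated after the DOI is minted.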
In Summary…
● Need to fix the catalog publishing disconnect.
● Enterprise approach to development paying dividends.
– Creating opportunities for reuse.
– Generic functionality shared across data sets.
– Going to take more resources to transition legacy data sets.
● Collaboration in Data Citation practices across Data Centers bodes well for future consolidation.
● Begin “Interoperability” discussion early when initiating a new Archive Project.