School of Information Studies, Syracuse University, Syracuse, NY, USA

Presentation transcript:

Metadata as the Underpinning of Sustainable and Effective Access to Scientific Data
Jian Qin
School of Information Studies, Syracuse University, Syracuse, NY, USA
jqin@syr.edu

Topics in this presentation
- Scientific data objects
- Problems in metadata for scientific data
- Who is addressing the problems
- Solutions to the problems
- Implications for CODATA
4/18/2019 CODATA2006 -- Beijing

Scientific data objects
- Spreadsheets
- Plant specimens
- Oceanographic data
- DNA sequence
- Field observations
- Databases
- Geospatial data
- XML
Scientific data refers to the raw data that have been collected or generated in many ways, such as by measurement or observation of the environment, by carrying out an analytical experiment, or by running a computer simulation.

Metadata for scientific data
“all the information, additional to the raw data itself, which a potential user of the data would need to know to be able to make full and accurate use of the data in a subsequent scientific analysis…”
Source: Sufi, S., & Mathews, B. (2004). CCLRC scientific metadata model: version 2. CCLRC Technical Report: DL-TR-2004-001. Retrieved July 27, 2006, from http://epubs.cclrc.ac.uk/bitstream/485/csmdm.version-2.pdf

Data discovery problems
- Data set archives are becoming increasingly distributed
- Institutions may be unable to serve their own data and need to send it to a publicly accessible repository
- Contributing groups need to maintain their own data remotely
- Security
- Data discovery across data servers

Data description problems (1)
- Insufficient metadata description of models, experiments, etc. that generate data, e.g., weather and climate modeling
- Dissimilarity among metadata from diverse sources, e.g., multiple metadata conventions exist for weather and climate models and experiments:
  - NetCDF (Network Common Data Form)
  - Climate and Forecast (CF)
  - COARDS (Cooperative Ocean/Atmosphere Research Data Service)

Notes: The current situation in metadata available for weather and climate model output data sets is somewhat disheartening but has been improving in recent years. Model documentation is, in general, uneven and difficult to find. One notable recent exception is the Intergovernmental Panel on Climate Change[1] (IPCC) data set, for which standards were adopted and enforced for model documentation[2]. The description of experimental designs is likewise in a somewhat primitive state, and the metadata describing the fields stored is often insufficient. A uniform representation of model output metadata, across models and experiments, exists only for a few well-coordinated multi-model projects (e.g., AMIP[3], PMIP[4], DEMETER[5] and IPCC). Data discovery tools are similarly available only for limited sets of model experiments.

[1] http://www.ipcc.ch/
[2] http://www-pcmdi.llnl.gov/ipcc/info_for_analysts.php
[3] Atmospheric Model Intercomparison Project - http://www-pcmdi.llnl.gov/projects/amip/index.php
[4] Paleoclimate Model Intercomparison Project - http://www-lsce.cea.fr/pmip2/
[5] http://www.ecmwf.int/research/demeter/

Source: Kinter III, J. L. & Taylor, K. E. (2005). Data Issues for WCRP Weather and Climate Modeling, a white paper based on presentations at the First Session of the WCRP Modeling Panel, Exeter, UK, October 6, 2005. http://copes.ipsl.jussieu.fr/Organization/COPESStructure/Reports/WMP1/WMP1_KinterTaylor.doc
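
The complaint that the metadata describing stored fields is often insufficient can be made concrete with a small check. The sketch below is illustrative only: it assumes each model output variable is represented as a plain dictionary of attributes, and it treats `units`, `standard_name`, and `long_name` (attribute names used by the CF convention) as the descriptive set to look for. That choice of set, the function name `missing_attributes`, and the sample variables are assumptions for the example, not the CF compliance rules.

```python
# Attribute names taken from the CF convention; treating all three as
# "expected" is an assumption made for this illustration.
DESCRIPTIVE_ATTRIBUTES = ("units", "standard_name", "long_name")


def missing_attributes(variables):
    """Return {variable name: [missing attribute names]} for under-described variables."""
    report = {}
    for name, attrs in variables.items():
        # An attribute counts as missing if it is absent or empty.
        missing = [a for a in DESCRIPTIVE_ATTRIBUTES if not attrs.get(a)]
        if missing:
            report[name] = missing
    return report


# Hypothetical model output: one well-described field, two poorly described ones.
model_output = {
    "tas": {
        "units": "K",
        "standard_name": "air_temperature",
        "long_name": "Near-surface air temperature",
    },
    "pr": {"units": "kg m-2 s-1"},
    "var42": {},
}

report = missing_attributes(model_output)
for variable, missing in report.items():
    print(f"{variable}: missing {', '.join(missing)}")
```

A report like this only flags absent descriptions; it says nothing about whether the values that are present are accurate.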

Data description problems (2)
- Metadata descriptions for scientific data:
  - operate under different paradigms
  - have different terminology for similar concepts
  - need to support comparison of data collected in different contexts with different technologies
- Different metadata needs for current and future users of primary data: expert/familiar users and other users
- Cost/benefit imbalance between metadata creators (who bear the burden) and users (who enjoy the benefits)
Source: Gritton, B. (1994). Metadata comments. http://www.llnl.gov/liv_comp/metadata/papers/comments-gritton.html

Who’s addressing the problems?
Metadata standards in scientific domains:
- ISO 19115:2003 Geographic Information – Metadata: http://www.isotc211.org/
- FGDC Metadata Standards: http://www.fgdc.gov/metadata/
  Extensions:
  - Biological Data Profile: http://www.nbii.gov/datainfo/metadata/standards/
  - Shoreline Metadata Profile of the Content Standards for Digital Geospatial Metadata: http://ioc.unesco.org/Oceanteacher/OceanTeacher2/02_InfTchSciCmm/02_Meta/06_MetaStds&Form/FGDC/sprofile.pdf
- NetCDF Climate and Forecast (CF) Metadata Convention: http://home.badc.rl.ac.uk/lawrence/blog/2005/11/09/the_future_of_cf
General metadata standards: Dublin Core, Data Documentation Initiative, METS, etc.

What are the solutions? (1)
Service-oriented metadata:
- Directory services of datasets, databases, repositories

What are the solutions? (2)
Object-oriented metadata:
- Description of scientific data objects
- Documentation of data origins, processing history, collection methodologies, measurement precision, etc.
- Content description of data objects
- Administration and maintenance of data objects
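
One way to picture an object-level record that covers these four areas is a small data structure. This is a minimal sketch, not a rendering of any published standard: the class names `Provenance` and `ObjectMetadata`, every field name, and the sample values are hypothetical and chosen only to mirror the bullet points above.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Provenance:
    """Data origins, processing history, collection methodology, precision."""
    origin: str
    collection_method: str
    processing_history: List[str] = field(default_factory=list)
    measurement_precision: str = ""


@dataclass
class ObjectMetadata:
    """Hypothetical object-level record for one scientific data object."""
    identifier: str          # description of the data object
    title: str
    subjects: List[str]      # content description
    provenance: Provenance   # documentation of origins and processing
    steward: str             # administration and maintenance
    last_reviewed: str


# Invented sample record, for illustration only.
record = ObjectMetadata(
    identifier="example:buoy-0001",
    title="Sea surface temperature, buoy 0001",
    subjects=["oceanography", "sea surface temperature"],
    provenance=Provenance(
        origin="moored buoy",
        collection_method="hourly thermistor reading",
        processing_history=["raw capture", "outlier removal"],
        measurement_precision="0.1 K",
    ),
    steward="data-curation team",
    last_reviewed="2006-07-27",
)

# A nested dictionary is a convenient form for serialization or exchange.
record_as_dict = asdict(record)
print(record_as_dict["provenance"]["processing_history"])
```

Keeping provenance in its own nested structure lets the same record serve both discovery (title, subjects) and reuse (how the data were collected and processed).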

How is metadata created?
- Manual creation
  - Collection level
  - Mainly for discovery purposes
- Automatic or semi-automatic generation
  - Object level
  - Mainly for identifying, locating, and selecting data resources
  - Many approaches exist

Approaches in automatic generation of object-level metadata
- Extraction: use natural language processing, machine learning, or other methods to extract metadata from data objects
- Assignment: assign metadata values to data objects against a controlled vocabulary or metadata scheme, based on the results of automatic analysis
- Combining automatic extraction/assignment with human intervention
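
The three approaches above can be sketched in a few lines. This is a deliberately simple illustration under stated assumptions: plain keyword matching stands in for the natural language processing or machine learning a real system would use, and the vocabulary `CONTROLLED_VOCABULARY`, the function names, and the review rule are all hypothetical.

```python
import re

# Hypothetical controlled vocabulary: subject term -> trigger words.
CONTROLLED_VOCABULARY = {
    "oceanography": {"salinity", "ocean", "tide"},
    "climatology": {"temperature", "precipitation", "climate"},
    "genomics": {"dna", "sequence", "genome"},
}


def extract_keywords(text):
    """Extraction: pull lower-cased word tokens from a data object's free text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def assign_terms(keywords):
    """Assignment: map extracted keywords onto controlled vocabulary terms."""
    return sorted(
        term
        for term, triggers in CONTROLLED_VOCABULARY.items()
        if keywords & triggers
    )


def generate_metadata(text):
    """Combine both steps and flag ambiguous results for human intervention."""
    terms = assign_terms(extract_keywords(text))
    # Assumed rule: zero or several matching terms means a person should check.
    return {"subjects": terms, "needs_review": len(terms) != 1}


print(generate_metadata("Sea surface temperature and salinity from ocean buoys"))
print(generate_metadata("DNA sequence of a plant specimen"))
```

The `needs_review` flag is where the third approach enters: automatic output is accepted when it is unambiguous and routed to a person otherwise.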

Implications for CODATA (1)
Service-oriented metadata:
- Vital for data discovery in distributed, remote access to data archives
- Vital for data repository administration and maintenance in a distributed environment
CODATA metadata directory service: a distributed and coordinated effort is needed to
- enable data discovery by geographic coverage
- enable data discovery by temporal coverage
- enable data discovery by disciplinary or cross-disciplinary domains
- enable data discovery by cross-sections of all the above
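
The four kinds of discovery listed above can be illustrated with a toy directory lookup. This is only a sketch under assumptions: the `DirectoryEntry` fields, the `search` function, and the catalog entries are invented for the example and do not describe an existing CODATA service. Bounding boxes are (west, south, east, north) in degrees and are assumed not to cross the antimeridian.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (west, south, east, north)


@dataclass
class DirectoryEntry:
    name: str
    bbox: BBox
    start: date
    end: date
    domains: List[str]


def boxes_overlap(a, b):
    """True if two bounding boxes intersect (no antimeridian handling)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(entries, bbox=None, period=None, domain=None):
    """Filter by any combination of geographic, temporal, and domain criteria."""
    results = []
    for entry in entries:
        if bbox is not None and not boxes_overlap(entry.bbox, bbox):
            continue
        if period is not None and not (entry.start <= period[1] and period[0] <= entry.end):
            continue
        if domain is not None and domain not in entry.domains:
            continue
        results.append(entry.name)
    return results


# Invented catalog entries, for illustration only.
CATALOG = [
    DirectoryEntry("Pacific buoy array", (-180, -30, -120, 30),
                   date(1995, 1, 1), date(2005, 12, 31), ["oceanography", "climatology"]),
    DirectoryEntry("Yangtze basin rainfall", (100, 25, 122, 35),
                   date(1980, 1, 1), date(2000, 12, 31), ["climatology", "hydrology"]),
    DirectoryEntry("Alpine plant specimens", (5, 44, 15, 48),
                   date(2001, 1, 1), date(2004, 12, 31), ["botany"]),
]

print(search(CATALOG, domain="climatology"))
print(search(CATALOG, bbox=(110, 20, 120, 30)))
print(search(CATALOG, domain="climatology",
             period=(date(2001, 1, 1), date(2003, 12, 31))))
```

Leaving each criterion optional is what makes the cross-section case fall out naturally: supplying several arguments at once narrows the result by all of them.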

Implications for CODATA (2)
Object-oriented metadata:
- Vital for identifying and using data objects
A CODATA inventory of domain metadata standards for scientific disciplinary and interdisciplinary fields is needed to:
- enable data interoperability
- enable data quality assessment
- enable long-term preservation and use

Questions?