iDigBio is funded by a grant from the National Science Foundation’s Advancing Digitization of Biodiversity Collections Program (Cooperative Agreement EF-1115210).

Presentation transcript:

Standards and sharing complex primary biodiversity data; and what is an extension anyway?
Example extensions to DwC: Audubon Media Description (AC), Identification History, and briefly, the Global Genome Biodiversity Network (GGBN) extensions
Deb Paul, Laura Russell, Derek Masaki, (David Shorthouse)
Data Sharing, Data Standards, and Demystifying the IPT Workshop – Day 1
iDigBio is funded by a grant from the National Science Foundation’s Advancing Digitization of Biodiversity Collections Program (Cooperative Agreement EF-1115210). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

2 Overview
– The data landscape (silos)
– Standards
– What are extensions, and why does Darwin Core need extensions?
– One-to-many relationships
– Identifiers are the key
– Audubon Media Description
– Darwin Core Identification History
– Global Genome Biodiversity Network (GGBN)

3 [Figure: US National Science Foundation Dimensions of Biodiversity Program. Interaction at the intersection of the taxonomic/phylogenetic, genetic, and functional domains. Labels: Molecular -> Ecosystem; Tree of Life, phylogenomics; Organisms -> species; Phenotypic expression; Bioactive compounds/chemistry; Trophic interactions]

4 Standards

5 1E: Theory: Complex primary biodiversity data
– DwC does not provide fields for every possible type of data. But you have lots of other types of data, right?
– Introducing the extension
– There are many! And (no doubt) more to come: 22 registered, 23 under development
– Examples
– Audubon Media Description (aka Audubon Core)
– Darwin Core Identification History
– Global Genome Biodiversity Network (GGBN) extensions
– iDigBio Uses

6 Overview, highlighted sections:
– What are extensions, and why does Darwin Core need extensions?
– One-to-many relationships
– Identifiers are the key

7 One specimen, so many kinds of data
– Determination 1, Determination 2, Determination 3
– One-to-many relationships
– Identifiers are the key
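The one-to-many idea on this slide can be sketched in a few lines of Python. This is only an illustration: the identifiers, catalog numbers, and names are made up, and the `id`/`coreid` column names are an assumption about how the two tables refer to each other.

```python
from collections import defaultdict

# Made-up core rows: one row per specimen, each with a unique identifier.
occurrences = [
    {"id": "spec-001", "catalogNumber": "12345"},
    {"id": "spec-002", "catalogNumber": "12346"},
]

# Made-up extension rows: several determinations may point at one specimen.
determinations = [
    {"coreid": "spec-001", "scientificName": "Rosa sp."},
    {"coreid": "spec-001", "scientificName": "Rosa canina"},
    {"coreid": "spec-001", "scientificName": "Rosa rubiginosa"},
    {"coreid": "spec-002", "scientificName": "Rosa gallica"},
]


def group_by_core_id(extension_rows):
    """Collect extension rows under the identifier of the core row they describe."""
    grouped = defaultdict(list)
    for row in extension_rows:
        grouped[row["coreid"]].append(row)
    return dict(grouped)


by_specimen = group_by_core_id(determinations)
for occurrence in occurrences:
    names = [d["scientificName"] for d in by_specimen.get(occurrence["id"], [])]
    print(occurrence["id"], names)
```

The identifier is the only thing connecting the two tables, which is why a stable, unique value per specimen matters so much.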

8 Using the IPT software
Inside the DwC-A you are creating…
– sampleoccurrence.txt (core)
– samplemultimedia.txt (extension): extends the core
– sampledeterminations.txt (extension): Determination 1, Determination 2, … Determination n; extends the core
– meta.xml: XML that describes the data files
– eml.xml: XML metadata about the dataset, collection and contact information
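To make the archive layout more concrete, here is a rough Python sketch that reads a hand-written meta.xml and lists the core file, the extension files, and the metadata file. The XML below is an assumption written for illustration, not output copied from the IPT; a real meta.xml carries more attributes and fields. The file names come from the slide, and the `rowType` and `term` URIs are believed to be the usual ones but should be checked against the registered definitions.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified descriptor for illustration only.
META_XML = """<archive xmlns="http://rs.tdwg.org/dwc/text/" metadata="eml.xml">
  <core rowType="http://rs.tdwg.org/dwc/terms/Occurrence" ignoreHeaderLines="1">
    <files><location>sampleoccurrence.txt</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/catalogNumber"/>
  </core>
  <extension rowType="http://rs.tdwg.org/dwc/terms/Identification" ignoreHeaderLines="1">
    <files><location>sampledeterminations.txt</location></files>
    <coreid index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
  </extension>
  <extension rowType="http://rs.tdwg.org/ac/terms/Multimedia" ignoreHeaderLines="1">
    <files><location>samplemultimedia.txt</location></files>
    <coreid index="0"/>
    <field index="1" term="http://rs.tdwg.org/ac/terms/accessURI"/>
  </extension>
</archive>"""

# Namespace prefix used in the lookups below.
NS = {"dwca": "http://rs.tdwg.org/dwc/text/"}


def data_file(element):
    """Return the data file name declared inside a <core> or <extension> element."""
    return element.find("dwca:files/dwca:location", NS).text


root = ET.fromstring(META_XML)
layout = {
    "metadata": root.get("metadata"),
    "core": data_file(root.find("dwca:core", NS)),
    "extensions": [data_file(e) for e in root.findall("dwca:extension", NS)],
}
print(layout)
```

Note how the core declares an `id` column while each extension declares a `coreid` column: that pairing is what lets extension rows extend core rows.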

9 Overview, highlighted section:
– Audubon Media Description

10 Audubon Media Description
Sharing media
– What’s in the image, recording, video?
– Who took the photo, made the recording, created the SEM, CT scan?
– Is the media under copyright? Or is it public domain?
– Where can more / different formats of the media resource be found?

11

12 Audubon Media Vocabularies
– Management
– Attribution
– Agents
– Content Coverage
– Geography
– Temporal Coverage
– Taxonomic Coverage
– Resource Creation
– Related Resources
– Service Access Point

13 Audubon Media Description, aka the Audubon Core
– Vocabularies to represent metadata for biodiversity multimedia resources and collections.
– Purpose is to represent information that will help to determine whether a particular resource or collection is fit for some particular biodiversity science application before acquiring the media.
– Vocabularies address such concerns as the management of the media and collections, descriptions of their content, their taxonomic, geographic, and temporal coverage, and the appropriate ways to retrieve, attribute and reproduce them.
Link:

14 Audubon Media Description An example of an extension inside the IPT Got media for your specimens?

15 Overview, highlighted section:
– Darwin Core Identification History
What’s in a name? That which we call a rose / By any other word would smell as sweet. (Romeo and Juliet, Act 2, Scene 2)

16 Darwin Core Identification History
Identification histories: sharing the names applied to a given specimen through time
– Who applied the name? (identifiedBy)
– When? (dateIdentified)
– With what evidence, resource, or comments? (identificationRemarks, identificationReferences)
– Doubt expressed? (identificationQualifier: cf., near, ?)
– Exact name applied (scientificName)

17

18 Darwin Core Identification History
Support for multiple identifications/determinations of species occurrences such as specimens. All identifications, including the current one, should be listed, while the current one should also be repeated in the occurrence core for simple access.
– Identification: identificationID, identifiedBy, dateIdentified, identificationReferences, identificationRemarks, identificationQualifier, identificationVerificationStatus, typeStatus
– Taxon: taxonID, taxonConceptID, scientificName, scientificNameID, namePublishedIn, namePublishedInYear, …, higherClassification, kingdom, …, vernacularName, taxonRemarks, …
– Determination 1, Determination 2, … Determination n
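The rule that the current identification is repeated in the occurrence core can be sketched as follows. The rows are made up, and treating "current" as "most recent dateIdentified" is an assumption made for this example; a collection may decide which determination is current by other means.

```python
# Made-up identification history for one specimen, using Darwin Core term names.
determinations = [
    {"identificationID": "det-1", "scientificName": "Rosa sp.",
     "identifiedBy": "A. Collector", "dateIdentified": "1950-06-01"},
    {"identificationID": "det-2", "scientificName": "Rosa canina",
     "identifiedBy": "B. Curator", "dateIdentified": "1987-03-15"},
    {"identificationID": "det-3", "scientificName": "Rosa rubiginosa",
     "identifiedBy": "C. Specialist", "dateIdentified": "2012-09-30"},
]

# Terms copied from the current identification into the core row.
IDENTIFICATION_TERMS = ("scientificName", "identifiedBy", "dateIdentified")


def current_identification(rows):
    """Pick the latest determination (assumption: latest date means current).

    ISO 8601 dates (YYYY-MM-DD) sort chronologically as plain strings.
    """
    return max(rows, key=lambda row: row["dateIdentified"])


# The extension keeps all three rows; the core row repeats only the current one.
core_row = {"occurrenceID": "spec-001", "catalogNumber": "12345"}
current = current_identification(determinations)
core_row.update({term: current[term] for term in IDENTIFICATION_TERMS})
print(core_row)
```

Anyone who only reads the core file still sees the current name, while the full history stays available in the extension.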

19 Darwin Core Identification History

20 Overview, highlighted section:
– Global Genome Biodiversity Network (GGBN)

21 Global Genome Biodiversity Network (GGBN) extensions: require a Material Sample Core
Material Sample Core: the category of information pertaining to the physical results of a sampling (or subsampling) event. In biological collections, the material sample is typically collected, and either preserved or destructively processed. Created 2 Apr 2014 with all Simple Darwin Core ratified terms.
– GGBN Amplification Extension
– GGBN DNA Cloning Extension
– GGBN Gel Image Extension
– GGBN Loan Extension
– GGBN Material Sample Extension
– GGBN Permit Extension
– GGBN Preparation Extension
– GGBN Preservation Extension

22 Now, back to the DwC-A you are creating…
– sampleoccurrence.txt (core)
– samplemultimedia.txt (extension): extends the core
– sampledeterminations.txt (extension): Determination 1, Determination 2, … Determination n; extends the core
– meta.xml: XML that describes the data files
– eml.xml: XML metadata about the dataset, collection and contact information

23 Global Standards Why standards?

24 Why Darwin Core / Why Standards? My database Your database map to a standard!
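"Map to a standard" can be pictured as a simple column-renaming step. In the sketch below the local column names (`cat_no`, `species`, and so on) are hypothetical stand-ins for whatever "my database" happens to use; the target names are Darwin Core terms.

```python
# Hypothetical local column names mapped to Darwin Core terms.
LOCAL_TO_DWC = {
    "cat_no": "catalogNumber",
    "species": "scientificName",
    "collector": "recordedBy",
    "date_coll": "eventDate",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
}


def to_darwin_core(record, mapping):
    """Rename mapped columns; report the ones that have no Darwin Core term yet."""
    mapped = {}
    unmapped = []
    for column, value in record.items():
        if column in mapping:
            mapped[mapping[column]] = value
        else:
            unmapped.append(column)
    return mapped, unmapped


# A made-up record from a local database.
my_record = {
    "cat_no": "12345",
    "species": "Rosa canina",
    "collector": "A. Collector",
    "date_coll": "1950-06-01",
    "shelf": "B-17",
}
dwc_record, leftovers = to_darwin_core(my_record, LOCAL_TO_DWC)
print(dwc_record)
print("No mapping yet for:", leftovers)
```

Once two databases map to the same terms, their records can be compared and combined without either side knowing the other's internal column names.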

25 1E: Theory: DwC Extensions
– Defined and registered with the GBIF registry
– Allow extension while retaining compatibility
– Extensions are optional data files linked to the core
– A row in an extension file always references the core id corresponding to a taxon or taxon occurrence
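The last point, that every extension row references a core id, is easy to check before publishing. The following is a minimal sketch with made-up rows; the `id` and `coreid` field names are assumptions about how the files are laid out.

```python
def orphan_rows(core_rows, extension_rows, id_field="id", coreid_field="coreid"):
    """Return extension rows whose core id does not match any row in the core."""
    core_ids = {row[id_field] for row in core_rows}
    return [row for row in extension_rows if row[coreid_field] not in core_ids]


# Made-up core and extension rows.
core = [{"id": "spec-001"}, {"id": "spec-002"}]
media = [
    {"coreid": "spec-001", "title": "Sheet scan"},
    {"coreid": "spec-999", "title": "Label photo"},  # points at a missing specimen
]

orphans = orphan_rows(core, media)
print(orphans)
```

An empty result means every extension row can be linked back to its core record.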