Published by Damian Hodge. Modified over 9 years ago.
1
Introduction to the new ways of data publishing
Training course on biodiversity data publishing and fitness-for-use in the GBIF Network, 2011 edition
Buenos Aires (Argentina), 28 September 2011
Laura Russell (larussell@vertnet.org), VertNet
Meherzad Romer (mromer@natureserve.ca), NatureServe Canada
John Wieczorek (tuco@berkeley.edu), Museum of Vertebrate Zoology, UC Berkeley
2
Data Publishing Options
3
Terminology
- Data publisher, provider
- Data resource, data set
- Data resource type (e.g., Metadata, Occurrence, Taxon)
- Data record
- Data record element, term, field, column, property, attribute, concept (e.g., basisOfRecord, scientificName)
- Data value
- Standards, vocabularies
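To make the terminology concrete, here is a minimal Python sketch of how a data resource, its records, their terms, and their values relate. The dictionary layout, the helper names `terms_used` and `values_for`, and the two sample records are assumptions made for illustration; they are not taken from the slides. Only the term names basisOfRecord and scientificName come from the source.

```python
# Illustrative only: the layout and the sample records are invented for this sketch.
occurrence_resource = {
    "type": "Occurrence",  # data resource type
    "records": [           # data records
        {"basisOfRecord": "PreservedSpecimen", "scientificName": "Puma concolor"},
        {"basisOfRecord": "HumanObservation", "scientificName": "Rhea americana"},
    ],
}


def terms_used(resource):
    """Return the sorted terms (fields, columns) that appear in any record."""
    return sorted({term for record in resource["records"] for term in record})


def values_for(resource, term):
    """Return the data values stored under one term, in record order."""
    return [record.get(term) for record in resource["records"]]


if __name__ == "__main__":
    print(terms_used(occurrence_resource))
    print(values_for(occurrence_resource, "scientificName"))
```

Each dictionary key plays the role of a term (or field, column, property), and each dictionary entry holds a data value.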
4
Data Publishers
- Institutions with multiple organisational units, each with multiple data resources.
- Institutions, groups, or individuals with multiple data resources.
- Institutions or individuals with a single data resource.
5
Data Resource Types
- Primary biodiversity data (specimens and observations, ecological data). The core data type is an Occurrence of an organism.
- Taxonomic catalogues* and annotated species checklists. The core data type is a Taxon.
- Enriched resource metadata, primarily focused on Occurrence and Taxon data sets.
* To distinguish our efforts from COL: GBIF provides the means, not the ends.
6
Data Records (Taxon resource type, Occurrence resource type)
7
Data Fields (Taxon resource type, Occurrence resource type)
8
Data Values (Taxon resource type, Occurrence resource type)
9
Data Standards
Primary Biodiversity Data and Taxonomic Data: Darwin Core
- 172 terms
- Ratified in 2009
- Text files
- Extensible
Metadata: Ecological Metadata Language (EML)
- Rich data set descriptions
- GBIF Profile
10
Data Publishing Options
12
TAPIR Example
Suppose TAPIR allows 1000 records per request. For a data set of 260 000 records:
- 260 data exchanges / 500 MB total data transfer
- 2 hours to harvest
- Only 32 MB of the transferred data are "used" for the GBIF network
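The arithmetic behind these figures can be checked with a short Python sketch. The function names `requests_needed` and `used_fraction` are hypothetical helpers introduced here; the inputs (1000 records per request, 260 000 records, 32 MB used of 500 MB transferred) are the numbers quoted on the slide.

```python
def requests_needed(total_records, page_size):
    """Number of paged requests needed to fetch every record (ceiling division)."""
    return (total_records + page_size - 1) // page_size


def used_fraction(used_mb, transferred_mb):
    """Share of the transferred volume that ends up being used."""
    return used_mb / transferred_mb


if __name__ == "__main__":
    exchanges = requests_needed(260_000, 1000)
    share = used_fraction(32, 500)
    print(f"{exchanges} data exchanges")
    print(f"{share:.1%} of the transferred data is used")
```

With the slide's numbers this gives 260 exchanges, and about 6.4% of the transferred volume is actually used by the network.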
13
Data Publishing Options
14
Darwin Core Archive Example
For a data set of 260 000 records:
- 1 data exchange / 3 MB total data transfer
- Seconds to harvest
15
Darwin Core Archive Example
For a data set of 260 000 records:
- 1 data exchange / 3 MB total data transfer
- Seconds to harvest
Compare to TAPIR/DiGIR/BioCASE:
- 260 data exchanges / 500 MB total data transfer
- 2 hours to harvest
16
Darwin Core Archive: Benefits
- Simple format (text files)
- Efficient storage (compressed)
- Efficient harvesting (single file)
- Easy access (no special software required)
- Extensible (related files in one archive)
Preferred format for publishing data in the GBIF network
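As a rough illustration of "text files in one compressed archive", the following Python sketch builds a tiny archive in memory and reads it back using only the standard library. It is a simplified sketch, not a validated Darwin Core Archive: the file names `occurrence.txt` and `meta.xml`, the descriptor layout, and the sample row are assumptions based on common Darwin Core text conventions and should be checked against the standard. A real archive would normally also carry a metadata document such as EML.

```python
import csv
import io
import zipfile

DWC = "http://rs.tdwg.org/dwc/terms/"

# Assumed minimal descriptor mapping the columns of occurrence.txt to terms.
META_XML = f"""<?xml version="1.0" encoding="UTF-8"?>
<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core encoding="UTF-8" fieldsTerminatedBy="\\t" linesTerminatedBy="\\n"
        ignoreHeaderLines="1" rowType="{DWC}Occurrence">
    <files><location>occurrence.txt</location></files>
    <id index="0"/>
    <field index="1" term="{DWC}basisOfRecord"/>
    <field index="2" term="{DWC}scientificName"/>
  </core>
</archive>
"""


def build_archive(rows):
    """Return the bytes of a zip file holding occurrence.txt and meta.xml."""
    table = io.StringIO()
    writer = csv.writer(table, delimiter="\t", lineterminator="\n")
    writer.writerow(["id", "basisOfRecord", "scientificName"])
    writer.writerows(rows)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("occurrence.txt", table.getvalue())
        archive.writestr("meta.xml", META_XML)
    return buffer.getvalue()


def read_core(archive_bytes):
    """Read the core data file back as a list of dictionaries keyed by column name."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        text = archive.read("occurrence.txt").decode("utf-8")
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


if __name__ == "__main__":
    # The sample row is invented for illustration.
    data = build_archive([["1", "PreservedSpecimen", "Puma concolor"]])
    for record in read_core(data):
        print(record)
```

Because the whole data set travels as one compressed file, a harvester needs a single download plus ordinary zip and text parsing, which is the point the benefits list makes.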
17
Data Discovery
18
GBIF Registry
19
GBIF Data Portal
20
Data Publishing Documentation
GBIF Online Resource Centre (http://www.gbif.org/orc/)
21
References
- IPT v2 User Manual: http://code.google.com/p/gbif-providertoolkit/wiki/IPT2ManualNotes
- Publishing Using Dropbox: http://www.youtube.com/user/gbiffrance