Published by Emmeline Donna McDonald. Modified over 6 years ago.
1
Training course on biodiversity data publishing and fitness-for-use in the GBIF Network, 2011 edition
How Darwin Core Archives have changed the landscape of biodiversity data publishing
Presenter ( ) Role Organization
Buenos Aires (Argentina) 28 September 2011
2
Background: Data Exchange
ABCD (TDWG Standard)
  > 1200 concepts
  XML
  Shared via BioCase, Tapir
Darwin Core (pre-standard v. 1.2, 47 versions)
  48 concepts, specimens
  Shared via DiGIR
Darwin Core (pre-standard v. 1.4)
  46 concepts (plus extensions), specimens
  Shared via Tapir
Darwin Core (TDWG Standard)
  172 concepts (156 in Simple Darwin Core), biodiversity data
  CSV, XML, RDF, JSON, …
  Shared via text files, Tapir, Darwin Core Archive…
Speaker note: Reminder about the existing standards
3
Darwin Core Archive
Primary Biodiversity Data
Taxonomic Data
Metadata
4
Darwin Core Archive Complete Package
Standard Darwin Core terms in a single, self-contained dataset
Taxon records or Occurrence records
Data set metadata in EML
Speaker note: This is the diagram. Explain the 3 components of an archive: core data file plus optional extension files, metafile, descriptor file.
5
Darwin Core Archive: Benefits
Simple format (text files)
Efficient harvesting (single file)
Efficient storage (compressed)
Easy access (no special software required)
Extensible (related files in one archive)
Simpler data transfer: what takes 500 MB of data transfer in Tapir takes 3 MB in DwC-A.
Extensible format: a more flexible way to map data
Preferred format for publishing data in the GBIF network
6
Darwin Core Archive: Anatomy
Archives always have a metadata file as EML
Speaker note: This is the diagram. Explain the 3 components of an archive: core data file plus optional extension files, metafile, descriptor file.
7
Ecological Metadata Language (EML)
For describing data sets – even unpublished ones
Title and Abstract
Citation and Attribution
Contact and Authors
Geographic Scope
Sampling Methods
Bibliography
and more…
8
Darwin Core Archive: Anatomy
Archives always have a core data file as text
Speaker note: This is the diagram. Explain the 3 components of an archive: core data file plus optional extension files, metafile, descriptor file.
9
Core data file types
Records based on taxa – one species per row
OR
Records based on species occurrences – one per row
10
Darwin Core Archive: Anatomy
Archives always have a core data file as text
Speaker note: This is the diagram. Explain the 3 components of an archive: core data file plus optional extension files, metafile, descriptor file.
11
Darwin Core Archive: Anatomy
Core contains a “core ID” column, unique for every record in the file
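The uniqueness requirement on the core ID column can be checked before an archive is assembled. The following is a minimal sketch, assuming a tab-delimited core file with one header row and the core ID in the first column; the function name and the sample rows are hypothetical and are not part of the course material.

```python
import csv
import io
from collections import Counter


def duplicate_core_ids(text, id_column=0, delimiter="\t"):
    """Return the core ID values that appear on more than one data row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    next(reader, None)  # skip the header row
    counts = Counter(row[id_column] for row in reader if row)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical species.txt content, for illustration only.
sample = (
    "taxonID\tscientificName\n"
    "1\tPuma concolor\n"
    "2\tPanthera onca\n"
    "2\tLeopardus pardalis\n"
)

print(duplicate_core_ids(sample))  # the ID "2" is reported as duplicated
```

An empty result means every record in the core file has its own ID, which is what the extensions rely on when they point back to the core.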
12
Darwin Core Archive: Anatomy
Columns are matched to Darwin Core terms
13
Darwin Core Archive: Anatomy
Columns that do not match a Darwin Core term may be included, but are ignored
“Wingspan” is not a Darwin Core term
14
Darwin Core Archive: Anatomy
Two ways to match columns to Darwin Core terms
1) Rename columns in text file
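Option 1 amounts to rewriting the header row of the text file. The following is a minimal sketch, assuming a tab-delimited file with a single header row; the local column names and the mapping are hypothetical. Columns with no entry in the mapping (such as "wingspan") keep their original name.

```python
import csv
import io

# Hypothetical mapping from local column names to Darwin Core term names.
COLUMN_TO_TERM = {
    "id": "taxonID",
    "species": "scientificName",
    "family_name": "family",
}


def rename_header(text, mapping, delimiter="\t"):
    """Return the file content with header names replaced via `mapping`."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return text
    # Only the first row (the header) is changed; data rows pass through.
    rows[0] = [mapping.get(name, name) for name in rows[0]]
    out = io.StringIO()
    csv.writer(out, delimiter=delimiter, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Hypothetical input, for illustration only.
sample = "id\tspecies\twingspan\n1\tDanaus plexippus\t10\n"
print(rename_header(sample, COLUMN_TO_TERM))
```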
15
Darwin Core Archive: Anatomy
Two ways to match columns to Darwin Core terms
2) Match columns to terms in a separate meta.xml file
16
Darwin Core Archive: Anatomy
meta.xml matches the columns in the core data file (species.txt)
More on how to make the meta.xml file later…
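As a rough illustration of what such a meta.xml can contain, the sketch below builds one with Python's standard library. It assumes a tab-delimited taxon core file with one header row and the core ID in column 0; the function name, the chosen columns, and the `eml.xml` metadata file name are assumptions made for this example, not something the slides prescribe.

```python
import xml.etree.ElementTree as ET

DWC_TEXT_NS = "http://rs.tdwg.org/dwc/text/"   # namespace of meta.xml elements
DWC_TERMS = "http://rs.tdwg.org/dwc/terms/"    # namespace of Darwin Core terms


def build_meta_xml(core_file, id_index, term_by_index):
    """Return meta.xml text that maps column positions to Darwin Core terms."""
    ET.register_namespace("", DWC_TEXT_NS)

    def tag(name):
        return f"{{{DWC_TEXT_NS}}}{name}"

    # Assumption: the archive's metadata document is named eml.xml.
    archive = ET.Element(tag("archive"), {"metadata": "eml.xml"})
    core = ET.SubElement(archive, tag("core"), {
        "rowType": DWC_TERMS + "Taxon",
        "encoding": "UTF-8",
        "fieldsTerminatedBy": "\\t",   # written literally as \t in the XML
        "linesTerminatedBy": "\\n",
        "ignoreHeaderLines": "1",
    })
    files = ET.SubElement(core, tag("files"))
    ET.SubElement(files, tag("location")).text = core_file
    # The <id> element names the column that holds the core ID.
    ET.SubElement(core, tag("id"), {"index": str(id_index)})
    for index, term in sorted(term_by_index.items()):
        ET.SubElement(core, tag("field"),
                      {"index": str(index), "term": DWC_TERMS + term})
    return ET.tostring(archive, encoding="unicode")


print(build_meta_xml("species.txt", 0, {1: "scientificName", 2: "family"}))
```

Columns that are left out of the mapping simply get no `field` element, which matches the earlier point that non-Darwin Core columns are ignored.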
17
Darwin Core Archive: Anatomy
Archives can include extension files
Core file: Species.txt
Extension file: Common_names.txt
Extensions link to the core through the core ID.
Extensions allow multiple records to be linked to a core record.
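The link between an extension and the core is a one-to-many join on the core ID. The following is a minimal sketch, assuming both files are tab-delimited, share a `taxonID` column, and use the headers shown; the function name and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def names_per_species(core_text, extension_text, delimiter="\t"):
    """Group extension rows under the core record they point to."""
    names_by_id = defaultdict(list)
    for row in csv.DictReader(io.StringIO(extension_text), delimiter=delimiter):
        names_by_id[row["taxonID"]].append(row["vernacularName"])
    core_rows = csv.DictReader(io.StringIO(core_text), delimiter=delimiter)
    # A core record with no extension rows gets an empty list.
    return {
        row["scientificName"]: names_by_id.get(row["taxonID"], [])
        for row in core_rows
    }


# Hypothetical Species.txt and Common_names.txt content.
species = "taxonID\tscientificName\n1\tPuma concolor\n2\tPanthera onca\n"
common_names = (
    "taxonID\tvernacularName\n"
    "1\tcougar\n"
    "1\tmountain lion\n"
    "2\tjaguar\n"
)

print(names_per_species(species, common_names))
```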
18
GBIF hosts extension definitions
19
Darwin Core Archive: Anatomy
Multiple extension files can be linked to the core
20
Darwin Core Archive: Anatomy
All files are stored in a single folder
21
Darwin Core Archive: Anatomy
The folder is zipped.
Data files
Column matching file
Data set documentation
This is a Darwin Core Archive
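The packaging step can be sketched with Python's `zipfile` module. This is only an illustration: the file names and contents are placeholders, and the sketch assumes the data files, meta.xml, and the EML document sit together at the top level of the zip.

```python
import io
import zipfile


def build_archive(files):
    """Zip a {file name: text content} mapping and return the zip bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# Placeholder contents; a real archive would hold the actual files.
archive_bytes = build_archive({
    "species.txt": "taxonID\tscientificName\n1\tPuma concolor\n",  # data file
    "meta.xml": "<archive/>",                                      # column matching file
    "eml.xml": "<eml/>",                                           # data set documentation
})

with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    print(archive.namelist())
```

Writing the returned bytes to a file such as `my_data.zip` gives the single file that the slides describe.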
22
Darwin Core Archive: Publishing
/my_data.zip
Archives on a web server can be accessed by a URL.
Share this URL to “publish” your data!
23
Darwin Core Archive: Publishing Options
24
GBIF Spreadsheet Templates
25
Integrated Publishing Toolkit
26
Data Hosting Centers
27
Darwin Core Mapping Assistant
Metafile
28
Darwin Core Mapping Assistant
29
Darwin Core Archive: Publishing Options
GBIF Darwin Core Archive Spreadsheet Templates:
  data already in a spreadsheet
  simple archive authoring
IPT:
  creating/managing archives for multiple data sets
  managing archives for multiple organisations
  metadata as GBIF Metadata Profile of EML
Make Your Own:
  automating archive generation
  customisation
Hosting center:
  economy of scale
  infrastructure and support
Combinations…
Speaker note: Explain the 3 components of an archive: core data file plus optional extension files, metafile, descriptor file.