Training course on biodiversity data publishing and fitness-for-use in the GBIF Network, 2011 edition How Darwin Core Archives have changed the landscape.

1 Training course on biodiversity data publishing and fitness-for-use in the GBIF Network, 2011 edition
How Darwin Core Archives have changed the landscape of biodiversity data publishing
Presenter ( ) Role Organization
Buenos Aires (Argentina) 28 September 2011

2 Background: Data Exchange
ABCD (TDWG Standard): > 1200 concepts; XML; shared via BioCase, Tapir
Darwin Core (pre-standard v. 1.2, 47 versions): 48 concepts, specimens; shared via DiGIR
Darwin Core (pre-standard v. 1.4): 46 concepts (plus extensions), specimens; shared via Tapir
Darwin Core (TDWG Standard): 172 concepts (156 in Simple Darwin Core), biodiversity data; CSV, XML, RDF, JSON, …; shared via text files, Tapir, Darwin Core Archive…
Notes: Reminder about the existing standards

3 Darwin Core Archive Primary Biodiversity Data Taxonomic Data Metadata

4 Darwin Core Archive Complete Package
Standard Darwin Core terms in a single, self-contained dataset
Taxon records or occurrence records
Data set metadata in EML
Notes: This is the diagram. Explain the 3 components of an archive: core data file + optional extension files, metafile, descriptor file

5 Darwin Core Archive: Benefits
Simple format (text files)
Efficient harvesting (single file)
Efficient storage (compressed)
Easy access (no special software required)
Extensible (related files in one archive)
Simpler data transfer: what takes 500 MB of data transfer in Tapir takes 3 MB in DwC-A
Extensible format: more flexible way to map data
Preferred format for publishing data in the GBIF network

6 Darwin Core Archive: Anatomy
Archives always have a metadata file as EML

7 Ecological Metadata Language (EML)
For describing data sets – even unpublished ones
Title and Abstract
Citation and Attribution
Contact and Authors
Geographic Scope
Sampling Methods
Bibliography
and more…

8 Darwin Core Archive: Anatomy
Archives always have a core data file as text

9 Core data file types
Records based on taxa – one species per row
OR
Records based on species occurrences – one per row

10 Darwin Core Archive: Anatomy
Archives always have a core data file as text

11 Darwin Core Archive: Anatomy
Core contains a “core ID” column, unique for every record in the file
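
The uniqueness requirement on the core ID column can be checked before packaging. The following is a minimal sketch, assuming a tab-separated core file whose ID column is named `taxonID`; the function name and the sample rows are hypothetical and only serve as illustration.

```python
import csv
import io
from collections import Counter


def duplicate_core_ids(text, id_column="taxonID", delimiter="\t"):
    """Return the core ID values that occur more than once, sorted."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    counts = Counter(row[id_column] for row in reader)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical core data file content: one taxon per row, with a repeated ID.
sample = (
    "taxonID\tscientificName\n"
    "1\tPuma concolor\n"
    "2\tPanthera onca\n"
    "2\tLeopardus pardalis\n"
)

duplicates = duplicate_core_ids(sample)
print(duplicates)
```

An empty result means every record has its own core ID; any value listed needs to be fixed before extensions can link to the core reliably.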

12 Darwin Core Archive: Anatomy
Columns are matched to Darwin Core terms

13 Darwin Core Archive: Anatomy
Columns that do not match a Darwin Core term may be included, but are ignored
“Wingspan” is not a Darwin Core term
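
As an illustration of how columns split into matched and ignored ones, here is a minimal sketch. The `KNOWN_TERMS` set is an assumption: it holds only a handful of Darwin Core term names, whereas the standard defines many more, and the function name is hypothetical.

```python
# Illustrative subset of Darwin Core term names (the full standard is much larger).
KNOWN_TERMS = {
    "taxonID",
    "occurrenceID",
    "scientificName",
    "kingdom",
    "family",
    "genus",
    "country",
    "eventDate",
    "decimalLatitude",
    "decimalLongitude",
}


def split_columns(header):
    """Separate column names that match a known term from those that do not."""
    matched = [name for name in header if name in KNOWN_TERMS]
    ignored = [name for name in header if name not in KNOWN_TERMS]
    return matched, ignored


# Hypothetical header row; "Wingspan" has no matching term in the subset above.
matched, ignored = split_columns(["taxonID", "scientificName", "Wingspan"])
print("matched:", matched)
print("ignored:", ignored)
```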

14 Darwin Core Archive: Anatomy
Two ways to match columns to Darwin Core terms
1) Rename columns in text file
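
Option 1 can be sketched as a header rewrite. The original column names and the mapping in `COLUMN_TO_TERM` below are hypothetical; columns without a mapping (such as `Wingspan`) are left unchanged.

```python
import csv
import io

# Hypothetical mapping from a data holder's own column names to Darwin Core terms.
COLUMN_TO_TERM = {
    "id": "taxonID",
    "species": "scientificName",
    "fam": "family",
}


def rename_header(text, mapping, delimiter="\t"):
    """Rewrite only the first (header) row, replacing names found in the mapping."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return text
    rows[0] = [mapping.get(name, name) for name in rows[0]]
    out = io.StringIO()
    csv.writer(out, delimiter=delimiter, lineterminator="\n").writerows(rows)
    return out.getvalue()


original = "id\tspecies\tWingspan\n1\tPuma concolor\t\n"
renamed = rename_header(original, COLUMN_TO_TERM)
print(renamed)
```

The data rows are untouched; only the header changes, which keeps the file readable in any spreadsheet or text editor.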

15 Darwin Core Archive: Anatomy
Two ways to match columns to Darwin Core terms
2) Match columns to terms in a separate meta.xml file

16 Darwin Core Archive: Anatomy
meta.xml matches the columns in the core data file (species.txt)
More on how to make the meta.xml file later…
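
To give a rough idea of what such a column-matching file contains, the sketch below generates a small meta.xml for a taxon core file. It follows the general layout of the Darwin Core text guidelines as far as the author of this sketch recalls them (an `archive` root, a `core` element, an `id` index, and one `field` per matched column); the function name, the chosen attribute values, and the `eml.xml` metadata file name are assumptions, and the output is not validated against the official schema.

```python
import xml.etree.ElementTree as ET

DWC_TERMS = "http://rs.tdwg.org/dwc/terms/"
DWC_TEXT_NS = "http://rs.tdwg.org/dwc/text/"


def build_meta_xml(filename, columns, row_type=DWC_TERMS + "Taxon"):
    """Build a meta.xml string; `columns` lists a term name (or None) per column."""
    archive = ET.Element("archive", {"xmlns": DWC_TEXT_NS, "metadata": "eml.xml"})
    core = ET.SubElement(
        archive,
        "core",
        {
            "encoding": "UTF-8",
            "fieldsTerminatedBy": "\\t",  # written literally as \t in the XML
            "linesTerminatedBy": "\\n",
            "ignoreHeaderLines": "1",
            "rowType": row_type,
        },
    )
    files = ET.SubElement(core, "files")
    ET.SubElement(files, "location").text = filename
    # Assumption: the first column holds the core ID.
    ET.SubElement(core, "id", {"index": "0"})
    for index, term in enumerate(columns):
        if term is None:
            continue  # unmatched columns are simply not declared
        ET.SubElement(core, "field", {"index": str(index), "term": DWC_TERMS + term})
    return ET.tostring(archive, encoding="unicode")


# Third column (e.g. "Wingspan") has no matching term, so it is passed as None.
meta = build_meta_xml("species.txt", ["taxonID", "scientificName", None])
print(meta)
```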

17 Darwin Core Archive: Anatomy
Archives can include extension files
Core file: Species.txt
Extension file: Common_names.txt
Extensions link to the core through the core ID
Extensions allow multiple records to be linked to a core record.
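
The one-to-many link between a core record and its extension records can be sketched as a simple lookup on the core ID. The file contents, the `taxonID` column name, and the function names below are hypothetical; a real archive declares the linking column in its column-matching file.

```python
import csv
import io
from collections import defaultdict


def read_rows(text, delimiter="\t"):
    """Parse delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def attach_extension(core_rows, ext_rows, core_id="taxonID", ext_core_id="taxonID"):
    """Map each core ID to the list of extension rows that point at it."""
    by_id = defaultdict(list)
    for row in ext_rows:
        by_id[row[ext_core_id]].append(row)
    return {row[core_id]: by_id.get(row[core_id], []) for row in core_rows}


# Hypothetical contents of Species.txt (core) and Common_names.txt (extension).
species = read_rows("taxonID\tscientificName\n1\tPuma concolor\n2\tPanthera onca\n")
common_names = read_rows(
    "taxonID\tvernacularName\n1\tpuma\n1\tcougar\n2\tjaguar\n"
)

linked = attach_extension(species, common_names)
for taxon_id, names in linked.items():
    print(taxon_id, [n["vernacularName"] for n in names])
```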

18 GBIF hosts extension definitions

19 Darwin Core Archive: Anatomy
Multiple extension files can be linked to the core

20 Darwin Core Archive: Anatomy
All files are stored in a single folder

21 Darwin Core Archive: Anatomy
The folder is zipped.
Data files
Column matching file
Data set documentation
This is a Darwin Core Archive
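
Putting the pieces together, zipping the files can be sketched with Python's standard `zipfile` module. The file names and the placeholder contents below are assumptions for illustration only; a real archive would contain the actual data, column-matching, and metadata files.

```python
import io
import zipfile


def build_archive(files):
    """Zip a mapping of file name -> text content and return the bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# Placeholder contents; names follow the roles described on the slide.
archive_bytes = build_archive(
    {
        "species.txt": "taxonID\tscientificName\n1\tPuma concolor\n",  # data file
        "meta.xml": "<archive/>",  # column matching file (placeholder)
        "eml.xml": "<eml/>",  # data set documentation (placeholder)
    }
)

# Re-open the result to confirm which files it contains.
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    names = sorted(archive.namelist())
print(names)
```

Writing to an in-memory buffer keeps the sketch free of side effects; passing a file path to `zipfile.ZipFile` instead would write the archive to disk.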

22 Darwin Core Archive: Publishing
/my_data.zip
Archives on a web server can be accessed by a URL.
Share this URL to “publish” your data!

23 Darwin Core Archive: Publishing Options

24 GBIF Spreadsheet Templates

25 Integrated Publishing Toolkit

26 Data Hosting Centers

27 Darwin Core Mapping Assistant
Metafile

28 Darwin Core Mapping Assistant

29 Darwin Core Archive: Publishing Options
GBIF Darwin Core Archive Spreadsheet Templates:
  data already in a spreadsheet
  simple archive authoring
IPT:
  creating/managing archives for multiple data sets
  managing archives for multiple organisations
  metadata as GBIF Metadata Profile of EML
Make Your Own:
  automating archive generation
  customisation
Hosting center:
  economy of scale
  infrastructure and support
Combinations…

30 Training course on biodiversity data publishing and fitness-for-use in the GBIF Network, 2011 edition
How Darwin Core Archives have changed the landscape of biodiversity data publishing
Presenter ( ) Role Organization
Buenos Aires (Argentina) 28 September 2011

