Published by Joy Richard. Modified over 9 years ago.
1
Darwin Core Archive (DwC-A) validation: A New Collaborative Effort
Christian Gendreau, Université de Montréal / Canadensys
David P. Shorthouse, Université de Montréal / Canadensys
Marie-Élise Lecoq, GBIF France
Tim Robertson, GBIF
2
Darwin Core Archive (DwC-A) The DarwinCore standard does not impose strong rules on the content associated with any DarwinCore term.
3
Current GBIF DwC-A Validator Original goal “… test Darwin Core Archives as specified in the Darwin Core Text Guide.” http://tools.gbif.org/dwca-validator/
4
Current GBIF DwC-A Validator Original target DwC-A are simple and can be created using simple custom scripts. “… make sure GBIF and others can read the information as expected.”
5
Current GBIF DwC-A Validator Validates archive structure Offers a web presence – Report viewer – API
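As a rough illustration of what a structure-level check involves, the sketch below opens an archive as a zip file and reports whether a `meta.xml` descriptor is present and well-formed. It assumes the archive is expected to carry a descriptor; the function name `check_structure` and the message strings are made up for this example and are not taken from the GBIF validator.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def check_structure(archive_bytes):
    """Return a list of structural problems found in a zipped archive."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(archive_bytes))
    except zipfile.BadZipFile:
        return ["not a zip file"]
    problems = []
    with archive:
        if "meta.xml" not in archive.namelist():
            problems.append("meta.xml not found")
        else:
            try:
                # Only well-formedness is checked here, not the schema.
                ET.fromstring(archive.read("meta.xml"))
            except ET.ParseError:
                problems.append("meta.xml is not well-formed XML")
    return problems
```

Nothing in this check looks at the records themselves, which is the gap the following slides address.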
6
Next GBIF DwC-A Validator? New goal Extend validation to the content of the archive https://github.com/gbif/dwca-validator
7
Current content validators Atlas of Living Australia sandbox VertNet – Spatial quality GBIF Spain – Darwin Test Encyclopedia of Life – dwc-validator Scratchpads – dwca-validator GlobalNames – dwc-archive ruby gem … and many more See Appendix 1 for links
8
What do we need? Accommodate different scopes Configuration/customizations – Use more knowledge when available Web access (page and API)
9
Scopes Data entry Desktop software – Scientific Work Flow – Statistical software Integrated Publishing Toolkit (IPT) National nodes Aggregators
10
Configuration/Customization Where will the validator be used? Can we provide more information? – e.g. I know all the dates in my file should be ISO
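The "all my dates should be ISO" case could look something like the sketch below. It only accepts plain calendar dates written as YYYY-MM-DD, which is a simplifying assumption: ISO 8601 also covers times and intervals. The function name `is_iso_date` is hypothetical.

```python
import re
from datetime import date

ISO_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def is_iso_date(value):
    """True if value is a real calendar date written as YYYY-MM-DD."""
    match = ISO_DATE.fullmatch(value)
    if match is None:
        return False
    year, month, day = map(int, match.groups())
    try:
        date(year, month, day)  # rejects impossible dates such as Feb 30
    except ValueError:
        return False
    return True


rows = [{"eventDate": "2014-10-27"}, {"eventDate": "27/10/2014"}]
invalid = [row for row in rows if not is_iso_date(row["eventDate"])]
```

A publisher who knows their export always writes dates this way can turn the rule on; an aggregator receiving mixed formats would likely leave it off.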
11
Components Library Web Extension Support
12
Library Define structure for validation process Provide a validation framework enabling sharing Close to DarwinCore specification
13
Web Web page to submit archive or URL Report viewer API
14
Extension Support Include domain knowledge Propose interpreted data
15
Internals Validation types – Structure Metadata – Records : Rows Fields data (e.g. date, coordinates) – Records : Columns ID uniqueness
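The row/column distinction matters because a column-level rule such as ID uniqueness cannot be decided one record at a time; it needs to see every value of the field. A minimal sketch, assuming records are dictionaries and the identifier column is called `id` (both assumptions for illustration):

```python
from collections import Counter


def duplicate_ids(rows, id_field="id"):
    """Return the identifier values that occur more than once, sorted."""
    counts = Counter(row[id_field] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


rows = [{"id": "occ-1"}, {"id": "occ-2"}, {"id": "occ-1"}]
duplicates = duplicate_ids(rows)
```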
16
Internals – Record level Validation chain – Composed of chain elements – Possible parallelism
17
Internals – Record level Immutable Chain element – Self contained Never relies on another chain element – Ordering independent Same behaviour wherever the element is used in the chain But what if I really need ordering?
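One way to picture these properties is the sketch below, which is not the project's code: each element is a frozen dataclass (immutable), looks only at the record it is given (self contained), and therefore reports the same thing wherever it sits in the chain. The class names and the `run_chain` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequiredTerm:
    """Chain element: the term must be present and non-blank."""
    term: str

    def check(self, record):
        if not (record.get(self.term) or "").strip():
            return f"{self.term} is missing"
        return None


@dataclass(frozen=True)
class NumericTerm:
    """Chain element: a non-blank value must parse as a number."""
    term: str

    def check(self, record):
        value = (record.get(self.term) or "").strip()
        if not value:
            return None  # blanks are another element's concern
        try:
            float(value)
        except ValueError:
            return f"{self.term} is not numeric"
        return None


def run_chain(chain, record):
    """Apply every element to the record and collect the messages."""
    results = (element.check(record) for element in chain)
    return [message for message in results if message is not None]
```

Because no element keeps state or depends on a neighbour, reversing the chain changes only the order of the messages, and records could in principle be spread across workers.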
18
Internals - Composition Composed chain element Exposed as one chain element
19
Composition example Mandatory Latitude/Longitude – Check record completion on lat/long – Check decimal lat/long value
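The mandatory latitude/longitude case might be sketched as follows. The two steps do depend on each other, so they are wrapped in one composed element that fixes their order internally and still looks like a single element from outside. All names here are hypothetical; the coordinate limits are the usual ±90 and ±180 degrees.

```python
def check_completion(record):
    """Both coordinates must be filled in."""
    terms = ("decimalLatitude", "decimalLongitude")
    missing = [t for t in terms if not (record.get(t) or "").strip()]
    return [f"{t} is missing" for t in missing]


def check_decimal_value(record):
    """Both coordinates must be numbers within the valid degree range."""
    problems = []
    limits = (("decimalLatitude", 90.0), ("decimalLongitude", 180.0))
    for term, limit in limits:
        try:
            value = float(record[term])
        except ValueError:
            problems.append(f"{term} is not a number")
            continue
        if not -limit <= value <= limit:
            problems.append(f"{term} is out of range")
    return problems


class MandatoryLatLong:
    """Composed element: runs its two steps in a fixed internal order."""

    def check(self, record):
        # The value check only runs once both fields are present.
        return check_completion(record) or check_decimal_value(record)
```

The ordering lives inside the composed element, so the chain that uses it stays ordering independent.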
20
Configuration example Select mandatory DarwinCore terms – scientificName must be provided Restrict bounding box – decimalLatitude and decimalLongitude must be between
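A configuration of this kind could be expressed as plain data and interpreted by a generic function, roughly as below. The configuration keys, the `validate` function and the bounding-box numbers are all invented for the example; the slide does not give actual bounds.

```python
CONFIG = {
    "mandatory_terms": ["scientificName"],
    # Hypothetical bounds, chosen only for this example.
    "bounding_box": {"min_lat": 40.0, "max_lat": 85.0,
                     "min_lon": -141.0, "max_lon": -52.0},
}


def validate(record, config):
    """Check one record against the mandatory terms and bounding box."""
    problems = []
    for term in config["mandatory_terms"]:
        if not (record.get(term) or "").strip():
            problems.append(f"{term} must be provided")
    box = config.get("bounding_box")
    if box is not None:
        try:
            lat = float(record["decimalLatitude"])
            lon = float(record["decimalLongitude"])
        except (KeyError, TypeError, ValueError):
            problems.append("coordinates are missing or not numeric")
        else:
            inside = (box["min_lat"] <= lat <= box["max_lat"]
                      and box["min_lon"] <= lon <= box["max_lon"])
            if not inside:
                problems.append("coordinates fall outside the bounding box")
    return problems
```

Keeping the configuration as data is what would make it shareable between institutions without sharing code.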
21
Customization example Apply your own controlled vocabulary – Use your own dictionary for a term – ControlledVocabularyEvaluationRule
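The idea of plugging in your own dictionary can be sketched as a rule built from a term name and a set of accepted values. This is an illustration only and does not reproduce the `ControlledVocabularyEvaluationRule` class named on the slide; the helper `make_vocabulary_rule` and the case-insensitive matching are assumptions.

```python
def make_vocabulary_rule(term, allowed):
    """Build a rule that accepts only values from the given vocabulary."""
    allowed_lower = frozenset(value.lower() for value in allowed)

    def rule(record):
        value = (record.get(term) or "").strip()
        # Blank values are left to a separate completeness rule.
        if value and value.lower() not in allowed_lower:
            return f"{term}: '{value}' is not in the controlled vocabulary"
        return None

    return rule


basis_rule = make_vocabulary_rule(
    "basisOfRecord", ["PreservedSpecimen", "HumanObservation"])
message = basis_rule({"basisOfRecord": "Herbarium sheet"})
```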
22
Extension Example Suggester, link to narwhal-processor – Suède –> ISO 3166-2:SE – URI –> http://sws.geonames.org/2661886
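A suggester differs from a validation rule in that it proposes an interpreted value instead of only flagging a problem. The toy sketch below maps a few spellings of one country name to its two-letter code; the lookup table and function names are invented for the example and are unrelated to the actual narwhal-processor API.

```python
import unicodedata

# Tiny hypothetical dictionary; a real one would be far larger.
COUNTRY_CODES = {"suede": "SE", "sweden": "SE", "sverige": "SE"}


def normalise(text):
    """Lower-case the text and strip accents and surrounding spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.strip().lower()


def suggest_country_code(raw):
    """Return a suggested country code, or None when nothing matches."""
    return COUNTRY_CODES.get(normalise(raw))
```

Returning `None` for unknown input leaves the decision to the user, which fits the "propose" wording on the Extension Support slide.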
23
Collaborative Share configuration Share customization (dictionary) Implement new reusable component – e.g. validation on specific DwC-A extension
24
Collaboration Where to go? – https://github.com/gbif/dwca-validator Who can contribute? – Everyone What is needed? – Ideas, constructive comments – Code review, feedback
25
Project status Not yet released Command line interface available Follow the project on GitHub
26
Acknowledgments
27
Special thanks SiB Colombia SiB Brazil Peter Desmet John Wieczorek Dag Endresen …
28
Appendix 1 DwC Content validators
Atlas of Living Australia sandbox: http://sandbox.ala.org.au/datacheck/
VertNet – Spatial quality: displayed on occurrence pages at http://portal.vertnet.org/search
GBIF Spain – Darwin Test: http://www.gbif.es/darwin_test/Darwin_Test_in.php
Encyclopedia of Life – dwc-validator: http://services.eol.org/dwc_validator/
29
Appendix 1 (continued)
Scratchpads – dwca-validator: https://github.com/edwbaker/dwca_validator/
GlobalNames – dwc-archive ruby gem: https://github.com/GlobalNamesArchitecture/dwc-archive