Global Biodiversity Information Facility (GBIF)
Presentation: Wouter Addink – ETI. Most slides made by: Giorgos Ksouris.


What is DiGIR? What is ABCD?
Presentation: Wouter Addink – ETI
Most slides made by: Giorgos Ksouris - GBIF Secretariat
Utrecht, 14 January

“Primary Biodiversity Data” Network
- GBIF is concerned with primary biodiversity data:
  - Specimens
  - Observations
  - Names
  - Species
  - Literature
  - Metadata on the above
- How will the data be contributed to the GBIF Network?

GBIF “Data Providers (Nodes)”
- Responsible for providing, through standard web exchange interfaces, metadata describing themselves and the data services they offer, and free access to biodiversity data.
- Should use a common data exchange format with a fixed structure that clearly defines how the information is to be shared.
- Data should be exchanged in a way that makes it as simple as possible to compare and merge information from different resources.
- GBIF therefore needs a simple model that allows institutions to share their data using structured formats, regardless of what formats they use in their own databases.

Data exchange standards
Models that allow data on individual specimens or observations to be structured and shared as XML documents that can be transmitted across the Internet:
- Darwin Core V2 (limited set of core data elements)
- ABCD V1.2 (complete set of all possible data elements in specimen and observation data)

Data exchange protocols
A data exchange protocol defines request and response message formats for standardized communication between provider and portal.
- DiGIR protocol
  - Uses open protocols and standards, such as HTTP, XML, and UDDI
  - De-couples protocol, software and semantics
  - Automates the establishment of a new data provider as much as possible
  - In use with Darwin Core in a few projects like MaNIS, but cannot be used with a complex XML Schema like ABCD
- BioCASE protocol
  - Based on DiGIR, but with a few improvements, like the capability to use ABCD
  - Not compatible with DiGIR
  - Still under development
  - De-couples protocol, software and semantics better than DiGIR, but establishing a new data provider is more complex
- SOAP
  - Generic protocol using HTTP, XML and UDDI, not focussed on specimen and observation data exchange

Data exchange format: ABCD
- Complex XML schema
- Covers the complete specimen and observation data domain
- Schema is used in the BioCASE project
- Hundreds of concepts (data elements)
- Schema includes:
  - Meta-data: Information about the source, from the name of the holding institution to copyright statements for the whole dataset.
  - Unit-data: Information regarding the records: specific copyrights, date of last modification, facts that don't fit in any other place, etc.
  - Gathering site: Information about the gathering site: gathering place, altitude, responsible person, etc.
  - Taxon identification: Possible identifications for this unit. Includes the taxon part and data on the identification event, like who identified the unit.
  - Taxon name: Data about the name of the taxon. It is split into different parts for the different biological disciplines (botany, zoology, etc.), each with its own nomenclatural code. Includes data on the scientific name, higher taxon, etc.

Data exchange format: Darwin Core2
- XML schema
- In use for some time already (MaNIS project)
- Suitable for collections and observations data
- 48 concepts (data elements):

DateLastModified *     | InstitutionCode *  | CollectionCode *     | CatalogNumber *
ScientificName *       | BasisOfRecord      | Kingdom              | Phylum
Class                  | Order              | Family               | Genus
Species                | Subspecies         | ScientificNameAuthor | IdentifiedBy
YearIdentified         | MonthIdentified    | DayIdentified        | TypeStatus
CollectorNumber        | FieldNumber        | Collector            | YearCollected
MonthCollected         | DayCollected       | JulianDay            | TimeOfDay
ContinentOcean         | Country            | StateProvince        | County
Locality               | Longitude          | Latitude             | CoordinatePrecision
BoundingBox            | MinimumElevation   | MaximumElevation     | MinimumDepth
MaximumDepth           | Sex                | PreparationType      | IndividualCount
PreviousCatalogNumber  | RelationshipType   | RelatedCatalogItem   | Notes

* = required element
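A provider may want to check records against the required elements before publishing them. The following is a minimal sketch, assuming records are plain Python dictionaries keyed by Darwin Core2 element names; the function name `missing_required` and the sample record are hypothetical, and the required list is simply the five elements marked with an asterisk above.

```python
# Elements marked with "*" in the Darwin Core2 element table of this talk.
REQUIRED_ELEMENTS = (
    "DateLastModified",
    "InstitutionCode",
    "CollectionCode",
    "CatalogNumber",
    "ScientificName",
)


def missing_required(record: dict) -> list:
    """Return the required element names that are absent or empty in `record`."""
    return [
        name
        for name in REQUIRED_ELEMENTS
        if str(record.get(name) or "").strip() == ""
    ]


# Hypothetical record for illustration only.
record = {
    "InstitutionCode": "bioshare.com",
    "CollectionCode": "pyy",
    "CatalogNumber": "4",
    "ScientificName": "Diarsia mendica",
}

print(missing_required(record))  # ['DateLastModified']
```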

Software for GBIF “Data Providers (Nodes)”
GBIF has chosen to use DiGIR software and Darwin Core2 because:
- The provider software is stable
- It is easy to install and easy to use
- It is already used in the MaNIS network and some other projects
- Collection database models are rather easy to map against Darwin Core2 (but data providers will often miss data elements that are important for their database)
However, BioCASE software and ABCD will also be supported in the near future because:
- They will be in use in the BioCASE network
- BioCASE software has some improvements compared with DiGIR (but is still less easy to install and use)
- ABCD has more potential for the future than Darwin Core2

Data Provider within GBIF Architecture
[Architecture diagram. Labels: Portal, Request Manager, Query Engine, Index, Cache, Accounting, Available providers; UDDI Registry, Institutions, Services (Providers), AccessPoints; Data provider, Provider Services, Resource Metadata; Name provider, Provider Services, Resource Metadata, Synonyms, GUIDs. Flows: Publish availability, Provider query, Metadata and name query, Metadata response, Data query, Data response, Metadata and logs. Protocols: SOAP, DiGIR, HTTP.]

Web exchange interface: DiGIR
- Distributed Generic Information Retrieval (DiGIR) is a client/server protocol for retrieving information from distributed resources.
- Uses HTTP as the transport mechanism and XML for encoding messages sent between client and server.
- Three types of messages:
  - Metadata: get metadata about the provider and the resource(s) it serves.
  - Search: find specimen and observation records based on search criteria, for example the name of a species and/or a rectangle defining an area on the earth’s surface.
  - Inventory: get the set of distinct values associated with a single concept, for instance Species.
- Maps database models of collections to Darwin Core2 (suitable for exchange of specimen and observation data).
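The meaning of the Search and Inventory messages can be illustrated locally, without any network traffic. The sketch below is not the DiGIR wire format or provider code; it only mimics what the two requests ask for, using hypothetical names (`Record`, `search`, `inventory`) and made-up coordinates.

```python
from dataclasses import dataclass


@dataclass
class Record:
    species: str      # stands in for a name concept such as ScientificName
    latitude: float   # decimal degrees
    longitude: float  # decimal degrees


def search(records, name=None, box=None):
    """Return records matching an optional name and an optional rectangle.

    `box` is (min_lat, min_lon, max_lat, max_lon). Rectangles crossing the
    180 degree meridian are not handled in this sketch.
    """
    hits = []
    for r in records:
        if name is not None and r.species != name:
            continue
        if box is not None:
            min_lat, min_lon, max_lat, max_lon = box
            inside = (min_lat <= r.latitude <= max_lat
                      and min_lon <= r.longitude <= max_lon)
            if not inside:
                continue
        hits.append(r)
    return hits


def inventory(records, concept):
    """Return the sorted distinct values of one concept (attribute name)."""
    return sorted({getattr(r, concept) for r in records})


# Made-up sample data for illustration only.
records = [
    Record("Diarsia mendica", 60.2, 24.9),
    Record("Lycia lapponaria", 68.4, 27.4),
    Record("Diarsia mendica", 52.1, 5.1),
]

print(len(search(records, name="Diarsia mendica")))  # 2
print(inventory(records, "species"))
```

Combining both criteria, as in `search(records, name="Diarsia mendica", box=(50, 0, 55, 10))`, corresponds to the "and/or" wording of the Search message description.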

DiGIR: Advantages
- Provides a single point of access to one to many distributed information resources.
  - Resource: a collection of data objects that conform to a common schema.
- Enables search and retrieval of structured data.
- Makes the location and technical characteristics of the native resource transparent to the user.
- Not the only available software (BioCASE with the ABCD schema is another candidate), but stable enough to be launched.

DiGIR Provider: How it Works
[Diagram. Labels: Resource, web server with DiGIR software, Provider Metadata, Resource Metadata, HTTP, XML, Metadata message, Search/Inventory message.]

GBIF’s DiGIR Provider Package
- Encompasses the DiGIR Provider software, the Apache2 web server and PHP libraries.
- Requires only basic knowledge of the operating system from the user.
- Two available releases:
  - Linux (RedHat 7.3, 8, 9)
  - MS Windows (2000, XP)
- Supported databases:
  - MySQL
  - PostgreSQL
  - MS SQL Server
  - MS Access (only the MS Windows package)
- Offers automatic registration with the GBIF UDDI Registry
- Other features:
  - Caching (cleanup from the startup script)
  - Rotation of log files (web server, DiGIR provider)

DiGIR Provider Installation
Completed in 4 steps:
1. Installation of GBIF’s DiGIR Provider package.
2. Definition of the provider’s metadata. (For a unique RecordIdentifier in the GBIF network, use the format ParticipantCode:InstitutionCode:CollectionCode.)
3. Definition of resource(s).
4. Registration with the GBIF UDDI registry.
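The identifier format from step 2 is easy to build programmatically. This is a minimal sketch; the function name and the sample codes are hypothetical, and rejecting empty parts or parts containing a colon is an assumption made here so the result can be split again unambiguously, not a documented GBIF rule.

```python
def record_identifier(participant_code: str,
                      institution_code: str,
                      collection_code: str) -> str:
    """Join the three codes as ParticipantCode:InstitutionCode:CollectionCode."""
    parts = (participant_code, institution_code, collection_code)
    for part in parts:
        # Assumption: codes must be non-empty and must not contain the separator.
        if not part or ":" in part:
            raise ValueError(f"invalid code: {part!r}")
    return ":".join(parts)


# Hypothetical codes for illustration only.
print(record_identifier("NL", "ZMA", "LEP"))  # NL:ZMA:LEP
```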

Becoming a GBIF Data provider in the Netherlands (1)
- Determine which data sets you can provide in structured electronic form (like a database) and whether these data sets contain specimen data, observation data, species data or other biodiversity data. The data also needs to be maintained.
- Determine which data may be available for public use. GBIF has decided to make all data in the network publicly available (this may change in the future). To avoid extra complexity, there will be no user restrictions such as password protection for data. Data that should not be available for public use should not be provided. For example, do not provide exact locations of endangered species that could be of use to hunters or illegal traders.
- Define an IPR (Intellectual Property Rights) policy for each data set.
- Information about the data sets (metadata) should be sent to NLBIF and will be kept in a central metadatabase. This information will also be available in the BioCASE metadata network.

Becoming a GBIF Data provider in the Netherlands (2)
Required metadata (the minimum metadata needed is the required metadata for DiGIR and for the BioCASE NoDIT database):
- A name, address, description and unique code for your organisation (see the GBIF website for codes already taken)
- A name, description and unique code for each dataset in your organisation
- The unique identifier used to identify a specimen
- A last modified date for the dataset
- At least one contact name, address and phone number

Becoming a GBIF Data provider in the Netherlands (3)
- Check if you can make your data available in one of the following database formats:
  - MySQL
  - PostgreSQL
  - MS SQL Server
  - MS Access (only the MS Windows package)
- Check if you have a computer with internet access available, running:
  - Linux (RedHat 7.3, 8, 9), or
  - MS Windows (2000, XP)
If this is the case, congratulations: you can maintain your own data node using the standard GBIF DiGIR provider software. In all other cases, please contact NLBIF. NLBIF can also provide data storage space for your datasets.
With DiGIR you might also be able to use DB2, Interbase, Frontbase, Informix, Visual FoxPro, Sybase, or another ODBC-compliant database. However, this is currently not supported by GBIF.

NLBIF Assistance
- The complete distribution of the DiGIR provider supplied by GBIF (including PHP, the Apache web server and automatic GBIF UDDI registration) is recommended. The GBIF helpdesk or NLBIF (ETI) can help you with technical installation problems.
- To use your data source with DiGIR, you need to map the data fields you want to publish 1:1 to Darwin Core V2 Schema elements (the software does not contain translator functions). For this you will probably need to create a view for some of the fields (if your database supports views) or a separate database with the needed fields. Contact NLBIF if you need assistance with conversions.
- Because the GBIF network uses caching, it will take a few hours before your data is visible in the network.
- If you want a custom search interface on your dataset, please also contact NLBIF. NLBIF is developing several web modules for this purpose that will be used for collections like those from ZMA.
- You may use BioCASE and ABCD instead of DiGIR, for instance if you want to provide data that does not fit in Darwin Core, but it is recommended to start with the DiGIR provider.
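The 1:1 mapping through a database view can be sketched as follows. This is an illustration only: it uses Python's built-in `sqlite3` as a stand-in for one of the supported databases, and the table name, column names, and the constant codes `INST` and `COLL` are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical in-house table with local column names.
CREATE TABLE specimens (
    id             INTEGER PRIMARY KEY,
    sci_name       TEXT,
    modified       TEXT,
    collector_name TEXT
);
INSERT INTO specimens
VALUES (4, 'Diarsia mendica', '2003-11-20T12:00:00Z', 'A. Collector');

-- The view exposes each published field under its Darwin Core2 element name.
CREATE VIEW darwin_core AS
SELECT
    modified         AS DateLastModified,
    'INST'           AS InstitutionCode,
    'COLL'           AS CollectionCode,
    CAST(id AS TEXT) AS CatalogNumber,
    sci_name         AS ScientificName,
    collector_name   AS Collector
FROM specimens;
""")

conn.row_factory = sqlite3.Row
row = conn.execute("SELECT * FROM darwin_core").fetchone()
mapped = dict(row)
print(mapped["ScientificName"], mapped["CatalogNumber"])
```

Because the renaming happens inside the database, the provider software only sees columns that already carry Darwin Core2 names, which is what a mapping without translator functions requires.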

GBIF network growth
- The global network started at the end of November 2003.
- About 28 data providers worldwide are already connected, with about 8.5 million specimen and observation records.
- The Netherlands is currently connected with 7 collections containing records.
- With your help this can be … million records next year?!!

Darwin Core2 Elements (1)
- DateLastModified: ISO 8601 compliant stamp indicating the date and time in UTC (GMT) when the record was last modified. Example: the instant "November 5, 1994, 8:15:30 am, US Eastern Standard Time" would be represented as "1994-11-05T13:15:30Z".
- InstitutionCode: A "standard" code that identifies the institution to which the collection belongs. No global registry exists for assigning institutional codes. Use the code that is "standard" in your discipline.
- CollectionCode: A unique alphanumeric value which identifies the collection within the institution.
- CatalogNumber: A unique alphanumeric value which identifies an individual record within the collection. It is recommended that this value provides a key by which the actual specimen can be identified. If the specimen has several items, such as various types of preparation, this value should identify the individual component of the specimen.
- ScientificName: The full name of the lowest level taxon the Catalogued Item can be identified as a member of; includes genus name, specific epithet, and subspecific epithet (zool.) or infraspecific rank abbreviation and infraspecific epithet (bot.). Use the name of a suprageneric taxon (e.g., family name) if the Catalogued Item cannot be identified to genus, species, or infraspecific taxon.
- BasisOfRecord: An abbreviation indicating whether the record represents an observation (O), living organism (L), specimen (S), germplasm/seed (G), etc.
- Kingdom: The kingdom to which the organism belongs.
- Phylum: The phylum (or division) to which the organism belongs.
- Class: The class name of the organism.
- Order: The order name of the organism.
- Family: The family name of the organism.
- Genus: The genus name of the organism.
- Species: The specific epithet of the organism.
- Subspecies: The sub-specific epithet of the organism.
- ScientificNameAuthor: The author of a scientific name. Author string as applied to the accepted name. Can be more than one author (concatenated string). Should be formatted according to the conventions of the applicable taxonomic discipline.
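The DateLastModified example can be reproduced with the standard library. A minimal sketch, assuming the local time is given with an explicit UTC offset; the helper name `date_last_modified` and the fixed `EST` offset object are introduced here for illustration.

```python
from datetime import datetime, timedelta, timezone

# US Eastern Standard Time as a fixed offset of UTC-5 (no daylight saving logic).
EST = timezone(timedelta(hours=-5), "EST")


def date_last_modified(local_dt: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 UTC stamp ending in 'Z'."""
    if local_dt.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    return local_dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


stamp = date_last_modified(datetime(1994, 11, 5, 8, 15, 30, tzinfo=EST))
print(stamp)  # 1994-11-05T13:15:30Z
```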

Darwin Core2 Elements (2)
- IdentifiedBy: The name(s) of the person(s) who applied the currently accepted Scientific Name to the Catalogued Item.
- YearIdentified: The year portion of the date when the Collection Item was identified; as four digits, e.g., 1906.
- MonthIdentified: The month portion of the date when the Collection Item was identified; as two digits [01..12].
- DayIdentified: The day portion of the date when the Collection Item was identified; as two digits [01..31].
- TypeStatus: Indicates the kind of nomenclatural type that a specimen represents. In particular, the type status may not apply to the name listed in the scientific name, i.e. the current identification. In rare cases, a single specimen may be the type of more than one name.
- CollectorNumber: An identifying "number" (really a string) applied to specimens (in some disciplines) at the time of collection. Establishes links between different parts/preparations of a single specimen and between field notes and the specimen.
- FieldNumber: A "number" (really a string) created at collection time to identify all material that resulted from a collecting event.
- Collector: The name(s) of the collector(s) responsible for collecting the specimen or taking the observation.
- YearCollected: The year (expressed as an integer) in which the specimen was collected. The full year should be expressed (e.g., "1972", not "72").
- MonthCollected: The month of the year in which the specimen was collected from the field. Possible values range from 1 to 12 inclusive.
- DayCollected: The day of the month on which the specimen was collected from the field. Possible values range from 1 to 31 inclusive.
- JulianDay: The ordinal day of the year; i.e., the number of days since January 1 of the same year. (January 1 is Julian Day 1.)
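JulianDay can be derived from YearCollected, MonthCollected and DayCollected rather than stored separately. A small sketch using the standard library; the function name `julian_day` is introduced here for illustration.

```python
from datetime import date


def julian_day(year: int, month: int, day: int) -> int:
    """Return the ordinal day of the year, with January 1 counted as day 1."""
    return date(year, month, day).timetuple().tm_yday


print(julian_day(1972, 1, 1))    # 1
print(julian_day(1972, 12, 31))  # 366 (1972 was a leap year)
```

Constructing the `date` also validates the three parts, so an impossible combination such as month 13 raises `ValueError`.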

Darwin Core2 Elements (3)
- TimeOfDay: The time of day a specimen was collected, expressed as decimal hours from midnight local time (e.g., 12.0 = midday, 13.5 = 1:30pm).
- ContinentOcean: The continent or ocean from which a specimen was collected.
- Country: The country or major political unit from which the specimen was collected. ISO values should be used. Full country names are currently in use. A future recommendation is to use ISO two letter codes or the full name when searching.
- StateProvince: The state, province or region (i.e. next political region smaller than Country) from which the specimen was collected.
- County: The county (or shire, or next political region smaller than State/Province) from which the specimen was collected.
- Locality: The locality description (place name plus optionally a displacement from the place name) from which the specimen was collected. Where a displacement from a location is provided, it should be in un-projected units of measurement.
- Longitude: The longitude of the location from which the specimen was collected. This value should be expressed in decimal degrees with a datum such as WGS-84.
- Latitude: The latitude of the location from which the specimen was collected. This value should be expressed in decimal degrees with a datum such as WGS-84.
- CoordinatePrecision: An estimate of how tightly the collecting locality was specified; expressed as a distance, in meters, that corresponds to a radius around the latitude-longitude coordinates. Use NULL where precision is unknown, cannot be estimated, or is not applicable.
- BoundingBox: This access point provides a mechanism for performing searches using a bounding box. A Bounding Box element is not typically present in the database, but rather is derived from the Latitude and Longitude columns by the data provider.
- MinimumElevation: The minimum distance in meters above (positive) or below sea level of the collecting locality.
- MaximumElevation: The maximum distance in meters above (positive) or below sea level of the collecting locality.
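The decimal-hours encoding of TimeOfDay is a simple conversion. A minimal sketch; the function name `time_of_day` is introduced here for illustration.

```python
from datetime import time


def time_of_day(t: time) -> float:
    """Convert a local clock time to decimal hours since midnight."""
    return t.hour + t.minute / 60 + t.second / 3600


print(time_of_day(time(13, 30)))  # 13.5
print(time_of_day(time(12, 0)))   # 12.0
```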

Darwin Core2 Elements (4)
- MinimumDepth: The minimum distance in meters below the surface of the water at which the collection was made; all material collected was at least this deep. Positive below the surface, negative above (e.g. collecting above sea level in tidal areas).
- MaximumDepth: The maximum distance in meters below the surface of the water at which the collection was made; all material collected was at most this deep. Positive below the surface, negative above (e.g. collecting above sea level in tidal areas).
- Sex: The sex of a specimen. The domain should be a controlled set of terms (codes) based on community consensus. Proposed values: M=Male; F=Female; H=Hermaphrodite; I=Indeterminate (examined but could not be determined); U=Unknown (not examined); T=Transitional (between sexes; useful for sequential hermaphrodites).
- PreparationType: The type of preparation (skin, slide, etc.). Probably best to add this as a record element rather than an access point. Should be a list of preparations for a single collection record.
- IndividualCount: The number of individuals present in the lot or container. Not an estimate of abundance or density at the collecting locality.
- PreviousCatalogNumber: The previous (fully qualified) catalogue number of the Catalogued Item if the item was earlier identified by another Catalogue Number, either in the current catalogue or in another institution or catalogue. A fully qualified Catalogue Number is preceded by Institution Code and Collection Code, with a space separating each subelement. Referencing a previous Catalogue Number does not imply that a record for the referenced item is or is not present in the corresponding catalogue, or even that the referenced catalogue still exists. This access point is intended to provide a way to retrieve this record by a previously used identifier, which may be used in the literature. In future versions of this schema this attribute should be set-valued.
- RelationshipType: A named or coded value that identifies the kind of relationship between this Collection Item and the referenced Collection Item. Named values include: "parasite of", "epiphyte on", "progeny of", etc. In future versions of this schema this attribute should be set-valued.
- RelatedCatalogItem: The fully qualified identifier of a related Catalogue Item (a reference to another specimen): Institution Code, Collection Code, and Catalogue Number of the related Catalogued Item, where a space separates the three subelements.
- Notes: Free text notes attached to the specimen record.
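The space-separated, fully qualified identifier used by PreviousCatalogNumber and RelatedCatalogItem can be built and split as sketched below. The names `CatalogRef`, `format_ref` and `parse_ref` are hypothetical, and the sketch assumes that none of the three subelements itself contains whitespace, which the element definitions do not guarantee.

```python
from typing import NamedTuple


class CatalogRef(NamedTuple):
    institution_code: str
    collection_code: str
    catalog_number: str


def format_ref(ref: CatalogRef) -> str:
    """Join the three subelements with single spaces."""
    return " ".join(ref)


def parse_ref(text: str) -> CatalogRef:
    """Split a fully qualified identifier into its three subelements."""
    parts = text.split()
    if len(parts) != 3:
        # Assumption: subelements contain no whitespace of their own.
        raise ValueError(f"expected three space-separated parts, got {len(parts)}")
    return CatalogRef(*parts)


ref = parse_ref("bioshare.com pyy 4")
print(ref.catalog_number)  # 4
print(format_ref(ref))     # bioshare.com pyy 4
```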

DiGIR & Darwin Core2: An Example
$Revision: 1.10 $ :33:
<content xmlns:darwin=' xmlns:xsd=' xmlns:xsi='
T225000Z bioshare.com pyy 4 Diarsia mendica
T220000Z bioshare.com pyy 6 Lycia lapponaria
T220000Z bioshare.com pyy 7 Plutella maculipennis
false

Management of Resources – A Training DB
- Getting familiar with the training MS Access database:
  - Biotella: one of many available observation and specimen database tools
  - ”Open source” Microsoft Access Basic application
  - Can export ABCD and DwC formats to the GBIF Data Repository Tool (in an upcoming version)
  - Can act as a resource for a DiGIR Provider
  - Training database populated with sample Lepidoptera data

Biotella Observation Database Schema: Main Tables

Mapping the Database against Darwin Core2
- Alternatives
  - Mapping within the database (faster queries with indexing, conversion of value domains, available in Biotella)
  - Mapping at the DiGIR Provider (no database work needed)
- Conversion of value domains
  - Big issue; let’s leave it as is for the time being

Registration with GBIF UDDI Registry
- Universal Description, Discovery & Integration (UDDI) is a special directory that provides methods for publishing and finding business and service information and specifications.
- UDDI is based on existing standards, such as XML and SOAP.
- Four primary data types:
  - businessEntity: represents basic business information, e.g. contact information, categorization, descriptions, etc.
  - businessService: describes a service provided by the business.
  - bindingTemplate: contains an optional description of the service, the URL of its access point, and a reference to one or more tModels.
  - tModel: abstract description of a particular specification or behaviour to which the Web service conforms.
[Diagram labels: businessEntity, businessService (x2), bindingTemplate (x2), tModel (x2).]
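The containment between the four UDDI data types can be pictured as nested data structures. The sketch below is a simplified in-memory model, not the UDDI XML schema or any registry client API; all class and field names, the helper `access_points`, and the sample URL are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TModel:
    name: str               # identifies a specification, e.g. a protocol
    description: str = ""


@dataclass
class BindingTemplate:
    access_point: str       # URL where the service can be reached
    tmodels: List[TModel]   # one or more specifications the service conforms to
    description: str = ""


@dataclass
class BusinessService:
    name: str
    bindings: List[BindingTemplate] = field(default_factory=list)


@dataclass
class BusinessEntity:
    name: str
    description: str = ""
    services: List[BusinessService] = field(default_factory=list)


def access_points(entity: BusinessEntity, tmodel_name: str) -> List[str]:
    """Return the access points of all services that reference a given tModel."""
    return [
        binding.access_point
        for service in entity.services
        for binding in service.bindings
        if any(t.name == tmodel_name for t in binding.tmodels)
    ]


# Hypothetical registration of one institution with one DiGIR provider.
digir = TModel("DiGIR")
institution = BusinessEntity(
    name="Example Institution",
    services=[
        BusinessService(
            name="your.server.name",
            bindings=[BindingTemplate("http://your.server.name/provider", [digir])],
        )
    ],
)

print(access_points(institution, "DiGIR"))
```

A portal that knows only the tModel name can walk this structure to discover where each provider can be queried, which is the role the registry plays in the architecture described in this talk.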

Registration with GBIF UDDI Registry (2)
Several steps make data useful in a UDDI registry:
- Companies, organisations and standards bodies define tModels relevant to an industry, business or science, and register them in UDDI (→ DiGIR tModel).
- Companies and organisations (→ business entities) register descriptions of themselves (→ Data Node) and define the services (→ DiGIR provider) they offer.
- UDDI taxonomies are used for describing business entities (→ connection between GBIF Participant Node and Data Nodes).
- Marketplaces, search engines, and business applications (→ GBIF portal, GBIF Participant Node portals) query the registry to discover services of interest at other companies.

Registration with GBIF UDDI Registry (3)
- Automatic registration with the GBIF UDDI registry.
- Uses the values of the elements defined as metadata of the provider (plus some extra information).
  - Business Entity
    - business name: {the of the institution}
    - description: {the location (URL) pointing to institution }
  - Business Service
    - service name: {the common of the provider} (your.server.name)
    - description: {the information of }
  - Binding Template
    - access point:
    - description: Access point of { }
- Demonstration
  - Registration of trainees’ DiGIR Providers

Exploration of GBIF UDDI Registry

Exploration of GBIF UDDI Registry (2)
Find all business entities corresponding to Data Nodes under a Participant Node:
1. Access the URL
2. Click on the Browse link under the Taxonomies subtree.
3. Click on the gbif:nodes link.
4. Click on the Sweden link in the Categories box.
5. Press the Find business button.

Use of a Search Portal

Use of a Search Portal (1)
Find all records of a database resource where the Darwin Core2 concept Genus contains the word Colias:
1. Access the URL and press the Build query button.
2. Click on one of the available resources in the Select data providers section.
3. Select Genus from the Select a concept selection list in the Select query conditions section. Select like from the Select a comparator selection list and type Colias in the adjacent text box.
4. Press the Submit query button.
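The effect of the query built in these steps can be mimicked on local data. The sketch below is not the portal's implementation: it assumes that the like comparator means case-insensitive containment, and the function names and sample records are hypothetical.

```python
def like(value, pattern: str) -> bool:
    """Assumed semantics of 'like': case-insensitive substring match."""
    return pattern.lower() in (value or "").lower()


def query(records, concept: str, pattern: str):
    """Return the records whose `concept` value matches `pattern`."""
    return [r for r in records if like(r.get(concept), pattern)]


# Made-up records keyed by Darwin Core2 concept names.
records = [
    {"Genus": "Colias", "Species": "palaeno"},
    {"Genus": "Diarsia", "Species": "mendica"},
    {"Genus": "Lycia", "Species": "lapponaria"},
]

hits = query(records, "Genus", "Colias")
print([r["Species"] for r in hits])  # ['palaeno']
```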