Published by Beatrix Mills. Modified over 8 years ago.
1
TAPIR 1.0 Renato De Giovanni, Markus Döring, Javier de la Torre October 2006
2
Presentation Plan Definition Background Current Scope Basic Notions Overview Current status Future Plans
3
Definition TAPIR = TDWG Access Protocol for Information Retrieval Short Definition: Web Service protocol to perform queries across distributed and heterogeneous data sources. Complete Definition: Stateless, HTTP transmittable, request and response protocol for accessing structured data that may be stored on any number of distributed databases of varied physical and logical structure, returning customizable XML representations of data.
4
Background The initial motivation was to address interoperability issues between the DiGIR and BioCASe networks. Unification of DiGIR and BioCASe was considered a priority during the GBIF DADI sub-committee meeting in Oaxaca, 2004. GBIF commissioned a study that resulted in an integration proposal presented during the TDWG 2004 meeting in Christchurch. A data provider reference implementation was developed at the beginning of 2005 as a proof of concept. Work continued with a further meeting promoted by TDWG to refine and revise the protocol (Madrid, 2005).
5
Background A “feature freeze” was declared at the beginning of 2006, but a few changes were proposed later. Documentation of the protocol, contracted by TDWG, began in May 2006. PyWrapper (the reference implementation) was updated to work with the new version of TAPIR with funding from IPGRI. TDWG contracted a second data provider implementation in September 2006.
6
Current Scope TAPIR evolved from a specific protocol integration effort into a potential candidate for exchanging data in other TDWG standards. ABCD and DarwinCore are compatible with TAPIR. TCS and NCD are likely compatible. SDD would require more work on TAPIR. New data standards being proposed still need to be analysed.
7
Conceptual Schemas & Concepts Conceptual Schemas: Provide a formal definition of concepts. In TAPIR, concepts are used for mapping and querying. Example: Darwin Core. Conceptual schemas are external to TAPIR, so networks are free to create their own or use existing ones. Multiple Conceptual Schemas can be used. Concepts: Concepts can potentially represent classes, relationships or properties, although this version of TAPIR limits their use to properties (content elements). Examples: scientific name, observation date, locality name, catalogue number, etc.
8
Output Models TAPIR documents defining a specific XML response structure (using a subset of XML Schema) and mapping content nodes in the structure to concepts from conceptual schemas. Output models also indicate an indexing element by pointing to a node in the structure that should be used as a reference for record counting and paging. Output models define what kinds of things should be returned and how they should be structured in XML. Examples of different output models that could be produced from the same concepts: ABCD, RSS, KML, GML, RDF (encoded in XML), etc.
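To illustrate why the indexing element matters, here is a minimal sketch that counts and pages the same XML response in two ways. The element names (`records`, `specimen`, `identification`) and the function `count_and_page` are hypothetical and not taken from any TAPIR output model; in a real network the provider would apply counting and paging on its side.

```python
import xml.etree.ElementTree as ET

# Hypothetical response structure: two specimens, three identifications.
RESPONSE = """<records>
  <specimen>
    <identification>Puma concolor</identification>
    <identification>Felis concolor</identification>
  </specimen>
  <specimen>
    <identification>Panthera onca</identification>
  </specimen>
</records>"""


def count_and_page(xml_text, indexing_element, start=0, limit=10):
    """Count records and return one page, where a "record" is every
    occurrence of the chosen indexing element (in document order)."""
    root = ET.fromstring(xml_text)
    records = list(root.iter(indexing_element))
    return len(records), records[start:start + limit]


if __name__ == "__main__":
    for element in ("specimen", "identification"):
        total, _ = count_and_page(RESPONSE, element)
        print(element, total)
```

The same document holds a different number of records depending on which node is chosen as the indexing element, which is why an output model has to state it explicitly.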
9
Query Templates TAPIR documents representing specific inventory or search queries, usually including parameterized filters, and sometimes additional parameters like nodes to be returned from the response structure and order by conditions (only for search). There can be multiple query templates based on the same output model. Examples: An RSS output model with a parameterized filter based on family name, an inventory template to return a list of specimens (scientific names) according to a parameterized filter based on the country name, etc.
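The idea of a parameterized filter can be sketched as follows. This is only an illustration: the dictionary layout, the filter wording and the function `instantiate` are assumptions made for this example, whereas real query templates are TAPIR XML documents.

```python
from string import Template

# Hypothetical, simplified stand-in for a query template.
RSS_BY_FAMILY = {
    "operation": "search",
    "output_model": "rss",                           # illustrative model name
    "filter": Template("Family equals '$family'"),   # parameterized filter
    "required": {"family"},
}


def instantiate(template, **params):
    """Fill in the template's parameters and return a concrete query."""
    missing = template["required"] - set(params)
    if missing:
        raise ValueError(f"missing parameters: {sorted(missing)}")
    return {
        "operation": template["operation"],
        "output_model": template["output_model"],
        "filter": template["filter"].substitute(params),
    }


if __name__ == "__main__":
    print(instantiate(RSS_BY_FAMILY, family="Felidae"))
```

Several such templates could point to the same output model and differ only in their filters.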
10
Different levels of provider implementation Providers can advertise that they only know specific query templates. – In this case, they don't necessarily need to be able to parse the template definition, as long as responses are valid. Providers can advertise that they only know specific output models, and then accept arbitrary queries that are based on those output models. – In this case, providers don't necessarily need to be able to parse the output model definition, as long as responses are valid. Providers can advertise only the concepts that they mapped, and then accept arbitrary output models and query templates based on them. – In this case, providers need to dynamically parse output models and query templates.
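The three levels can be read as a decision rule for whether a provider should accept a request. The sketch below is one possible interpretation; the class `Capabilities`, its fields and the function `can_serve` are hypothetical and do not reflect the actual TAPIR capabilities response.

```python
from dataclasses import dataclass, field


@dataclass
class Capabilities:
    """What a hypothetical provider advertises."""
    templates: set = field(default_factory=set)   # known query templates
    models: set = field(default_factory=set)      # known output models
    concepts: set = field(default_factory=set)    # mapped concepts
    parses_dynamically: bool = False              # accepts arbitrary models/templates


def can_serve(caps, template=None, model=None, concepts=()):
    """Decide whether a request fits what the provider advertises."""
    used = set(concepts)
    # Level 1: the request names a template the provider already knows.
    if template is not None and template in caps.templates:
        return True
    # Level 2: arbitrary query on a known output model, using mapped concepts.
    if model is not None and model in caps.models:
        return used <= caps.concepts
    # Level 3: anything goes, as long as every concept used is mapped.
    if caps.parses_dynamically:
        return bool(used) and used <= caps.concepts
    return False
```

Only the third case requires the provider to parse output models and query templates at request time.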
11
So how do things work? Data providers map their local databases to one or more conceptual schemas defined by a network/community. Output models define the desired XML response structures, which are mapped against concepts from the same conceptual schemas. Query templates can be defined on top of the output models. => Requests can then be formulated using the query templates, output models and mapped concepts, depending on the design of the network.
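The chain from local database to XML response can be sketched in a few lines. Everything named here is hypothetical: the column names, the concept identifiers (written as `dwc:...` for readability, not real schema URIs), the output model layout and the function `render` are assumptions for illustration, not part of TAPIR.

```python
import xml.etree.ElementTree as ET

# A hypothetical local record, as a provider's database might return it.
LOCAL_ROWS = [{"sci_name": "Puma concolor", "cat_no": "MZ-1042", "ctry": "Brazil"}]

# Step 1: the provider maps concepts of a conceptual schema to local columns.
CONCEPT_MAPPING = {
    "dwc:ScientificName": "sci_name",
    "dwc:CatalogNumber": "cat_no",
    "dwc:Country": "ctry",
}

# Step 2: an output model maps nodes of the response structure to concepts.
OUTPUT_MODEL = {
    "record_element": "specimen",
    "nodes": {"name": "dwc:ScientificName", "catalogNumber": "dwc:CatalogNumber"},
}


def render(rows, mapping, model):
    """Build the XML response: node -> concept -> local column -> value."""
    root = ET.Element("records")
    for row in rows:
        record = ET.SubElement(root, model["record_element"])
        for node_name, concept in model["nodes"].items():
            column = mapping.get(concept)
            if column is None:
                continue  # concept not mapped by this provider
            ET.SubElement(record, node_name).text = str(row[column])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render(LOCAL_ROWS, CONCEPT_MAPPING, OUTPUT_MODEL))
```

Because the output model refers only to concepts, the same model can be served by any provider that mapped those concepts, whatever its local database looks like.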
12
TAPIR Operations and Message Encodings Metadata: Default operation to retrieve basic information about the service. Capabilities: Used to retrieve the essential settings to properly interact with the service. Inventory: Used to retrieve distinct values of one or more concepts. Search: Main operation to search and retrieve data. Ping: Used for monitoring purposes to check service availability. All requests can be formulated with XML or simple Key-Value Pair (URL-based) parameters. Responses are always in XML.
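As a rough sketch of the Key-Value Pair encoding, the snippet below builds request URLs for the five operations. The endpoint URL is invented, and the parameter names (`op`, `concept`) are assumptions for this example; the protocol specification is the authority on the actual parameter names.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint; not taken from the presentation.
ENDPOINT = "http://example.org/tapir/provider"

OPERATIONS = ("metadata", "capabilities", "inventory", "search", "ping")


def build_kvp_request(operation, **params):
    """Return a GET URL for one of the five operations.

    The parameter name "op" and any extra parameters are assumptions
    made for this sketch.
    """
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")
    query = {"op": operation, **params}
    return f"{ENDPOINT}?{urlencode(query)}"


if __name__ == "__main__":
    print(build_kvp_request("ping"))
    print(build_kvp_request("inventory", concept="ScientificName"))
```

Whichever request encoding is used, the response would still be XML, so a client needs an XML parser even if it never builds an XML request.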
13
Current Status A working draft of the protocol specification, written by Charles Copp, is available (check the TAPIR page on the TDWG website). The first fully functional TAPIR data provider software is available (PyWrapper) and can easily migrate BioCASe configurations. A second TAPIR data provider software (based on the DiGIR PHP provider) should be ready by the end of this year. It will also include migration facilities for DiGIR configurations. The first TAPIR network should start to be deployed by the end of this year (Plant Genetic Resources Community – CGIAR – Generation Challenge Programme). TAPIR clients are being developed.
14
Resources Using the new TDWG infrastructure (Wiki still separate). XML Schema and other documents are stored in a Subversion repository. Public mailing list: tdwg-tapir@lists.tdwg.org
15
Future Plans Start migrating DiGIR / BioCASe networks (synchronize migration with DarwinCore / ABCD versions). Prepare more documentation (TAPIR Network Designers and Users Guide). Develop TAPIR test suites for data provider implementations. Become an official TDWG Interest/Task Group? Obtain final blessing as a new TDWG standard.
16
Special Thanks TDWG & GBIF & IPGRI Collaborators: Anton Güntsch Charles Copp Dave Vieglais Donald Hobern John Wieczorek Robert Gales Stan Blum Steven Perry