The challenge of biodiversity: Plot, organism and taxonomic databases Robert K. Peet University of North Carolina The National Plots Database Committee.

Presentation transcript:

The challenge of biodiversity: Plot, organism and taxonomic databases Robert K. Peet University of North Carolina The National Plots Database Committee John Harris NCEAS

A case study: The US National Plots Database. Project supported by: National Center for Ecological Analysis & Synthesis; U.S. National Science Foundation; USGS-BRD Gap Analysis Program; ABI / The Nature Conservancy. Project organized and directed by: Robert K. Peet, University of North Carolina; Marilyn Walker, USDA Forest Service & U. Alaska; Dennis Grossman, The Nature Conservancy / ABI; Michael Jennings, USGS-BRD & UCSB.

Biodiversity data structure. Entities: Observation/Collection Event; Object or specimen; Taxon; Locality. Database types: Taxonomic databases; Plot databases; Specimen databases.

Taxonomic database challenge. The problem: integration of data potentially representing different times, places, investigators and taxonomic standards. The traditional solution: a standard list of kinds of organisms.

Current standards. Biological organisms are named following international rules of nomenclature. Database standards are being developed by TDWG, GBIF, IOPI, etc. Metadata standards have been developed. For example, the Darwin Core is a profile describing the minimum set of standards for search and retrieval of natural history collections and observation databases.

Index to organism names. There exist numerous compilations of organism names. For example: Species 2000 http:// (composed of 18 participant databases); All Species http://; ITIS http:// (the US government standard list).

Taxon-specific standard lists are available. Representative examples for higher plants include: North America: USDA Plants http://plants.usda.gov/; ITIS http://; ABI http://. World: IPNI (International Plant Names Index); IOPI Global Plant Checklist.

Most standardized plant lists fail to allow effective integration of datasets. The reasons include: the user cannot reconstruct the database as viewed at an arbitrary time in the past; taxonomic concepts are often not defined; multiple party perspectives on taxonomic concepts and names cannot be supported or reconciled.

Three concepts of shagbark hickory. Sec. Gleason 1952: Carya ovata (Miller) K. Koch. Sec. Radford et al.: Carya ovata (Miller) K. Koch and Carya carolinae-sept. (Ashe) Engler & Graebner. Splitting one species into two illustrates the ambiguity often associated with scientific names. If you encounter the name “Carya ovata (Miller) K. Koch” in a database, you cannot be sure which of two meanings applies.

Multiple concepts of Rhynchospora plumosa s.l. [Figure: the concepts recognized by Elliot 1816, Gray 1834, Chapman 1860, Kral 1998 and Peet. Names shown: R. plumosa; R. plumosa v. plumosa; R. plumosa v. intermedia; R. plumosa v. interrupta; R. plumosa v. pineticola; R. intermedia; R. pineticola; R. sp. 1.]

Name, Reference, Assertion. An assertion represents a unique combination of a name and a reference. An assertion is equivalent to a “potential taxon” and a “taxonomic concept”.
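The definition above can be sketched in a few lines of Python. This is only an illustration, not the project's schema: the class name `Assertion`, its fields and the `label` method are assumptions made for the example, and the year 1968 for Radford et al. is taken from the party perspective slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """A taxon concept: one scientific name as used in one reference."""
    name: str
    reference: str

    def label(self) -> str:
        # Conventional "name sec. reference" notation.
        return f"{self.name} sec. {self.reference}"


gleason = Assertion("Carya ovata (Miller) K. Koch", "Gleason 1952")
radford = Assertion("Carya ovata (Miller) K. Koch", "Radford et al. 1968")

# Same name string, different references: two distinct assertions.
print(gleason == radford)
print(gleason.label())
```

Because the pair is frozen and hashable, it can serve directly as a key when linking records to concepts.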

Five shagbark hickory assertions (possible taxonomic synonyms are listed together). Names: Carya ovata; Carya carolinae-septentrionalis; Carya ovata var. australis. References: Gleason 1952 (Britton & Brown); Radford et al. (Flora Carolinas); Stone 1997 (Flora North America). Assertions: One shagbark: C. ovata sec Gleason ‘52. Southern shagbark: C. carolinae-s. sec Radford ‘68; C. ovata australis sec FNA ‘97. Northern shagbark: C. ovata sec Radford ‘68; C. ovata sec FNA ‘97.

Name, Taxon, Usage. A usage represents a unique combination of a taxon and a name. Usages can be used to track nomenclatural synonyms.

Usage. Published names: 1. Carya ovata; 2. C. carolinae-septentrionalis; 3. C. ovata var. australis. Species concepts: A. One shagbark; B. Southern shagbark; C. Northern shagbark. Usages: 1-A, 1-C, 2-B, 3-B. An example of a nomenclatural synonym is the linkage of the assertion “Carya ovata var. australis sec. FNA 1997” with the name “Carya carolinae-septentrionalis” by both ITIS and ABI.
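The usage pairs on this slide can be treated as a simple many-to-many link table. The sketch below is illustrative only; the variable and function names are assumptions, while the name numbers, concept letters and the four pairs come from the slide.

```python
# Published names and species concepts, keyed as on the slide.
NAMES = {
    1: "Carya ovata",
    2: "Carya carolinae-septentrionalis",
    3: "Carya ovata var. australis",
}
CONCEPTS = {
    "A": "One shagbark",
    "B": "Southern shagbark",
    "C": "Northern shagbark",
}

# Each usage links one name to one concept.
USAGES = [(1, "A"), (1, "C"), (2, "B"), (3, "B")]


def names_for_concept(concept_id):
    """All names applied to a concept (candidate synonyms)."""
    return [NAMES[n] for n, c in USAGES if c == concept_id]


def concepts_for_name(name_id):
    """All concepts a name has been applied to (ambiguity of the name)."""
    return [CONCEPTS[c] for n, c in USAGES if n == name_id]


print(names_for_concept("B"))
print(concepts_for_name(1))
```

Reading the table in one direction gives the names attached to a concept; reading it in the other shows that the name Carya ovata alone does not identify a single concept.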

Name, Reference, Assertion, Usage. A usage (name assignment) and an assertion (taxon concept) can be combined in a single model.

Party Perspective. The Party Perspective on an Assertion includes: Status (standard, nonstandard, undetermined); Correlation with other assertions (Equal, Greater, Lesser, Overlap, Undetermined); Lineage (Predecessor and Successor assertions); Start & Stop dates.
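A minimal sketch of a party perspective record follows, assuming Python dataclasses and enums. The field names, the half-open date interval and the example dates are assumptions for illustration; only the status and correlation vocabularies are taken from the slide. Lineage is omitted to keep the example short.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Status(Enum):
    STANDARD = "standard"
    NONSTANDARD = "nonstandard"
    UNDETERMINED = "undetermined"


class Correlation(Enum):
    EQUAL = "equal"
    GREATER = "greater"
    LESSER = "lesser"
    OVERLAP = "overlap"
    UNDETERMINED = "undetermined"


@dataclass
class PartyPerspective:
    party: str
    assertion: str
    status: Status
    start: date
    stop: Optional[date] = None  # None means the perspective is still held

    def in_force(self, on: date) -> bool:
        """True if the party held this perspective on the given day."""
        return self.start <= on and (self.stop is None or on < self.stop)


# Illustrative values only; the exact dates are hypothetical.
view = PartyPerspective(
    party="ITIS",
    assertion="Carya ovata sec. Gleason 1952",
    status=Status.STANDARD,
    start=date(1996, 1, 1),
    stop=date(2000, 1, 1),
)
print(view.in_force(date(1998, 6, 1)))
print(view.in_force(date(2000, 1, 1)))
```

Keeping the start and stop dates on the perspective, rather than overwriting the status, is what lets several parties disagree and change their minds without losing history.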

Parties: ITIS; FNA Committee; ABI. Assertions: Carya ovata sec Gleason 1952; Carya ovata sec Radford 1968; Carya carolinae sec Radford 1968; Carya ovata sec FNA 1997; Carya ovata australis sec FNA 1997. Party-assertion records:
Party   Assertion          Status   Start   Name
ITIS    ovata – G52        S        1996
ITIS    ovata – R68        A        1996    ovata
ITIS    carolinae – R68    A        1996    carolinae
ITIS    carolinae – R68    S        2000
ITIS    ovata aust – FNA   A        2000    carolinae

Concept-based taxonomy is coming soon. All organisms in databases should be identified by linkage to an assertion = name and reference! Various standards are being developed by FGDC, TDWG, IOPI, GBIF, etc. Most major databases are working toward inclusion of assertions (e.g. ITIS, IOPI, ABI). Until standard assertion lists are available, databases that track organisms should include couplets containing both a scientific name and a reference.

National Taxonomic Database? Concept-based. Party-neutral. Synonymy and lineage tracking. Perfectly archived. An upgrade for ITIS & Species 2000?

Specimen/object databases. Information on specimens/objects should be tracked by reference to: Place (place or collection); Unique identifier (accession number); Time. A museum is a place.

Database systems for tracking specimens. The following are a few of the many available: BioLink; Specify; Biota; Taxis. TDWG maintains links to multiple software systems.

Core elements of the National Plots Database: Project; Plot; Observation; Taxon Observation; Taxon Interpretation; Plot Interpretation.

Support multiple interpretations of which concept applies to an organism or community. Various observers will associate different taxonomic concepts with records in a database. Provision must be made for inclusion of these taxonomic interpretations. Minimal attributes include: Concept applied; Date applied; Who made the interpretation; Links to supporting information.
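The minimal attributes listed above could be carried by a record like the one sketched here. The class and field names are assumptions for illustration, and the interpreters and dates in the example are hypothetical. The point is that interpretations are appended to an observation rather than overwriting it.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TaxonInterpretation:
    concept: str                      # concept applied (name sec. reference)
    date_applied: date                # date applied
    interpreter: str                  # who made the interpretation
    supporting_links: Tuple[str, ...] = ()  # links to supporting information


@dataclass
class TaxonObservation:
    recorded_name: str
    interpretations: List[TaxonInterpretation] = field(default_factory=list)

    def annotate(self, interpretation: TaxonInterpretation) -> None:
        """Add an interpretation without discarding earlier ones."""
        self.interpretations.append(interpretation)

    def latest(self) -> Optional[TaxonInterpretation]:
        """Most recently applied interpretation, if any."""
        if not self.interpretations:
            return None
        return max(self.interpretations, key=lambda i: i.date_applied)


# Hypothetical example data.
obs = TaxonObservation(recorded_name="Carya ovata")
obs.annotate(TaxonInterpretation(
    "Carya ovata sec. Gleason 1952", date(1985, 7, 1), "original observer"))
obs.annotate(TaxonInterpretation(
    "Carya carolinae-septentrionalis sec. Radford et al. 1968",
    date(2001, 3, 15), "later reviewer"))

print(len(obs.interpretations))
print(obs.latest().concept)
```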

Interface tools. Desktop version for data preparation and local use: loaders for legacy data; data export; tools for linking taxonomic concepts; standard query, flexible query, SQL query; flexible export; local data refresh. Easy web access with consistent interface.

Conclusions for database designers. 1. Records of organisms should always contain (or point to) couplets consisting of a scientific name and a reference where the name was used. 2. Design for future annotation of organism concepts. 3. Track specimens/objects by location, unique identifier & time. 4. Design for reobservation! Separate permanent from transient attributes. 5. Archival databases should provide time-specific views.
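Conclusions 4 and 5 can be sketched together. In this illustration the permanent attributes live on a plot record and the transient ones on dated observations, so a plot can be resampled and viewed as of any date. Which attributes count as permanent or transient, and all names and values below, are assumptions made for the example rather than the National Plots Database design.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class PlotObservation:
    # Transient attributes: they change each time the plot is resampled.
    observed_on: date
    cover_percent: Dict[str, float]  # taxon concept label -> percent cover


@dataclass
class Plot:
    # Permanent attributes: fixed for the life of the plot.
    plot_id: str
    latitude: float
    longitude: float
    observations: List[PlotObservation] = field(default_factory=list)

    def observe(self, obs: PlotObservation) -> None:
        """Record a new (re)observation, keeping them in date order."""
        self.observations.append(obs)
        self.observations.sort(key=lambda o: o.observed_on)

    def as_of(self, when: date) -> Optional[PlotObservation]:
        """Time-specific view: the latest observation on or before `when`."""
        earlier = [o for o in self.observations if o.observed_on <= when]
        return earlier[-1] if earlier else None


# Hypothetical plot and values.
plot = Plot("PLOT-001", 35.9, -79.0)
plot.observe(PlotObservation(date(1990, 6, 1),
                             {"Carya ovata sec. Radford et al. 1968": 12.0}))
plot.observe(PlotObservation(date(2000, 6, 1),
                             {"Carya ovata sec. Radford et al. 1968": 18.0}))

print(plot.as_of(date(1995, 1, 1)).observed_on)
print(plot.as_of(date(1980, 1, 1)))
```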