Data Integration Issues in Biodiversity Research Jessie Kennedy Shawn Bowers, Matthew Jones, Josh Madin, Robert Peet, Deana Pennington, Mark Schildhauer,

Presentation transcript:

Data Integration Issues in Biodiversity Research Jessie Kennedy Shawn Bowers, Matthew Jones, Josh Madin, Robert Peet, Deana Pennington, Mark Schildhauer, Aimee Stewart

Visual Tools for Managing Taxonomic Concepts
SEEK
- Science Environment for Ecological Knowledge
- Research and develop information technology to radically improve the type and scale of ecological science that can be addressed

Science and Scientific Data are Complex
Biochemistry, Climatology, Taxonomy, Meteorology, Nomenclature, Paleontology, Genomics, Proteomics, Hydrology, Morphology, Geology, Oceanography, Geography, Ecology

[Figure: the disciplines Biochemistry, Climatology, Taxonomy, Meteorology, Nomenclature, Paleontology, Genomics, Proteomics, Hydrology, Morphology, Geology, Oceanography, Ecology and Geography, shown with the data items: Organism Name Taxon concept Gene sequence Pathway Protein Location Temperature Depth]

Scientific Community: complex
Individual Scientist, Small Scientific Community, Large Scientific Community, Scientific Laboratory

[Figure: four copies of the diagram of disciplines (Biochemistry, Climatology, Taxonomy, Meteorology, Nomenclature, Paleontology, Genomics, Proteomics, Hydrology, Morphology, Geology, Oceanography, Ecology, Geography) and data items (Organism Name Taxon concept Gene sequence Pathway Protein Location Temperature Depth)]

Science & Scientific Data are Continually Changing
- Conclusions become foundations for new hypotheses
- New experiments invalidate existing knowledge
- Knowledge is open to interpretation
  - Different opinions
- Need to build this into our technological solutions
[Diagram: observation, experiment, hypothesis, conclusion]

Exploiting Scientific Data
- To support scientists in
  - Discovery
  - Access
  - Sharing
  - Integration/Linking
  - Analysis
- Scientists can then improve their potential for new scientific discovery

Data Integration/Linking: approaches
- Metadata
  - to describe the data sets and know how to interpret the data sets
- Ontologies
  - to define the terminology used and know how data might be related and to aid automatic transformation of the data
- Standardisation of formats
  - for exchange of data + to ease integration
- LSIDs
  - to uniquely identify things; know when 2 things are the same
- Workflows
  - to enable specification, refinement and repetition of integration/analysis
- Provenance of data
  - to record where the data has come from and what has happened to it en route.

Projects in most sciences: ESG

Ecological Science - Analysis
- Ecological niche modeling of species distributions
  - Where do species occur now?
  - Where will they occur in the future?
Image from

Ecological Niche Modeling
- Known Species Locations
- Environmental Characteristics from gridded GIS layers
  - Temperature layer
  - Many other layers
- Develop Model
  - Multidimensional Ecological Space: D1 = Temperature, D2, ..., Dn
- Native Distribution Prediction
  - Environmental Characteristics Of Surrounding Geographic Area
- Invasion Area Prediction
  - Environmental Characteristics Of Different Geographic Area
- Environmental Change Prediction
  - Future Scenarios Of Environmental Characteristics
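One simple way to read the "Develop Model" step is as an environmental envelope: record the range of each dimension (D1 = Temperature, D2, ..., Dn) over the known species locations, then test whether the environmental characteristics of another cell fall inside that range. The sketch below is a minimal illustration under that assumption. The slides do not say which modeling algorithm is used, and the function names, the second dimension and all values are hypothetical.

```python
from typing import Dict, List, Tuple

Env = Dict[str, float]  # environmental characteristics of one grid cell


def fit_envelope(known_locations: List[Env]) -> Dict[str, Tuple[float, float]]:
    """Record the (min, max) of every dimension over the known species locations."""
    dims = known_locations[0].keys()
    return {
        d: (min(loc[d] for loc in known_locations),
            max(loc[d] for loc in known_locations))
        for d in dims
    }


def predict(envelope: Dict[str, Tuple[float, float]], cell: Env) -> bool:
    """A cell is predicted suitable if every dimension lies inside the envelope."""
    return all(lo <= cell[d] <= hi for d, (lo, hi) in envelope.items())


# Hypothetical values sampled from gridded layers at known species locations.
known = [
    {"temperature": 4.0, "precipitation": 900.0},
    {"temperature": 9.0, "precipitation": 1400.0},
    {"temperature": 6.5, "precipitation": 1100.0},
]
envelope = fit_envelope(known)

# The same cell under current conditions and under a hypothetical future scenario.
current_cell = {"temperature": 7.0, "precipitation": 1000.0}
future_cell = {"temperature": 11.0, "precipitation": 1000.0}
print(predict(envelope, current_cell))
print(predict(envelope, future_cell))
```

The same `predict` call covers the three prediction cases on the slide; only the source of the cell values changes (surrounding area, different area, or future scenario).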

Sources of Scientific Data
- Data are massively dispersed
  - Ecological field stations and research centers (100's)
  - Natural history museums and biocollection facilities (100's)
  - Agency data collections (10's to 100's)
  - Individual scientists (1000's)
- Data are heterogeneous
  - Syntax (format)
  - Schema (model)
  - Semantics (meaning)

Challenge: Data Integration

SEEK Components

Semantic Annotation – SEEK ontologies
- Integration/merge
  - Concept mapping
  - Units conversion
  - Spatial & temporal scaling
- Data discovery
  - Finding relevant data sets
  - Understanding data set content

Smart (Data) Integration: Merge
- Discover data of interest
- … connect to merge actor
- … "compute merge"

Smart Merge …
- Semantic type annotations and ontology definitions used to find mappings between sources
- Executing the merge actor results in an integrated data product (via "outer union")
[Figure: two source tables, one with columns a1 to a4 and one with columns a5 to a8, are passed to Merge. Annotations (Biomass, Site) map a1 to a8 and a3 to a6; a4 is carried over unmapped. The Merge Result has the columns a1, a3 and a4.]
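The merge by "outer union" can be sketched as follows: rename each source column to its semantic type where an annotation exists, stack all rows, and fill columns a source lacks with a missing value. This is an illustration only, not the SEEK/Kepler merge actor. The function name, the assignment of Site and Biomass to particular columns, and the cell values are assumptions loosely based on the figure.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[object]]


def smart_merge(sources: List[List[Row]],
                annotations: List[Dict[str, str]]) -> List[Row]:
    """Outer union of several tables after mapping columns to semantic types."""
    renamed: List[Row] = []
    for rows, annotation in zip(sources, annotations):
        for row in rows:
            # Annotated columns take the semantic type as name; others keep theirs.
            renamed.append({annotation.get(col, col): val for col, val in row.items()})

    # Collect every column seen in any source, in first-seen order.
    columns: List[str] = []
    for row in renamed:
        for col in row:
            if col not in columns:
                columns.append(col)

    # Missing cells become None, which is what makes this an outer union.
    return [{col: row.get(col) for col in columns} for row in renamed]


# Hypothetical source tables and annotations.
source1 = [{"a1": "a", "a3": 5, "a4": 10}, {"a1": "b", "a3": 6, "a4": 11}]
source2 = [{"a6": 0.1, "a8": "a"}, {"a6": 0.2, "a8": "c"}, {"a6": 0.3, "a8": "d"}]
annotation1 = {"a1": "Site", "a3": "Biomass"}   # assumed mapping
annotation2 = {"a8": "Site", "a6": "Biomass"}   # assumed mapping

merged = smart_merge([source1, source2], [annotation1, annotation2])
for row in merged:
    print(row)
```

A real merge would also need the units conversion and scaling steps listed under semantic annotation before values from the two Biomass columns are comparable; this sketch leaves the values untouched.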

Challenges of Taxonomic Data
- Scientific names change in meaning over time and across geographical regions
- This affects conclusions drawn from analysis of data integrated on names.

What is Abies lasiocarpa? (SubAlpine Fir)
- Flora North America: Abies lasiocarpa, Abies bifolia
- USDA Plants & ITIS: Abies lasiocarpa var. arizonica, Abies lasiocarpa var. lasiocarpa

Taxonomic history of imaginary genus Aus L. 1758
Pyle
Revisions of Aus
- Linnaeus 1758: Aus L.1758; Aus aus L.1758
- Archer 1965: Aus L.1758; Aus aus L.1758; Aus bea Archer 1965
- Fry 1989: Aus L.1758; Aus aus L.1758; Aus bea Archer 1965; Aus cea BFry 1989
- Tucker 1991: Aus L.1758; Aus aus L.1758; Aus cea BFry 1989
- Pargiter 2003 (vi): Aus L.1758; Aus aus L; Aus ceus BFry 1989; Xus Pargiter 2003; Xus beus (Archer) Pargiter
Aus bea and Aus cea noted as invalid names and replaced with Aus beus and Aus ceus.
1 name spelling change
Changes in meaning of names

Pyle
Names: 2 genus, 6 species
[Figure: the revisions of Aus by Linnaeus 1758, Archer 1965, Fry 1989, Tucker 1991 and Pargiter 2003, with the names used in each revision]
Aus bea and Aus cea noted as invalid names and replaced with Aus beus and Aus ceus.
Changes in meaning of names

17 Concepts for the names N0 to N7
Each name has many concepts or meanings
- N0 - Aus L.1758
  - C0.1 - Aus L.1758 sec. Linnaeus 1758
  - C0.2 - Aus L.1758 sec. Archer 1965
  - C0.3 - Aus L.1758 sec. Fry 1989
  - C0.4 - Aus L.1758 sec. Tucker 1991
  - C0.5 - Aus L.1758 sec. Pargiter 2003
- N1 - Aus aus L.1758
  - C1.1 - Aus aus L.1758 sec. Linnaeus 1758
  - C1.2 - Aus aus L.1758 sec. Archer 1965
  - C1.3 - Aus aus L.1758 sec. Fry 1989
  - C1.4 - Aus aus L.1758 sec. Tucker 1991
  - C1.5 - Aus aus L.1758 sec. Pargiter 2003
- N2 - Aus bea Archer 1965
  - C2.2 - Aus bea Archer 1965 sec. Archer 1965
  - C2.3 - Aus bea Archer 1965 sec. Fry 1989
- N3 - Aus cea Fry 1989
  - C3.3 - Aus cea Fry 1989 sec. Fry 1989
  - C3.4 - Aus cea Fry 1989 sec. Tucker 1991
- N4 - Aus beus Archer 1965
- N5 - Aus ceus Fry 1989
  - C5.5 - Aus ceus Fry 1989 sec. Fry 1989
- N6 - Xus beus Pargiter 2003
  - C6.5 - Xus beus Pargiter 2003 sec. Pargiter 2003
- N7 - Xus Pargiter 2003
  - C7.5 - Xus Pargiter 2003 sec. Pargiter

Find data sets containing Aus aus
- Many possible interpretations of Aus aus (N1)
  - Original concept: C1.1
  - Most recent concept: C1.5
  - Preferred Authority (e.g. Fry 1989): C1.3
  - Everything ever named N1: Union(C1.1,C1.2,C1.3,C1.4,C1.5)
  - Best fit according to some matching algorithm: Best(C1.1,C1.2,C1.3,C1.4,C1.5)
  - New concept containing only those features common to all concepts with the name N1: Intersection(C1.1,C1.2,C1.3,C1.4,C1.5)
- Is it appropriate to link or merge data sets returned on the scientific names?
  - Depends on the user's purpose
  - Level of precision required
[Figure: N1 - Aus aus L.1758 linked to concepts C1.1, C1.2, C1.3, C1.4 and C1.5]
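The interpretations listed for N1 can be made concrete by treating each concept as a set. The sketch below assumes each concept is circumscribed by a set of specimen identifiers; the identifiers, the contents of each set, and the use of Jaccard similarity for "Best" are all assumptions made for illustration, since the slides do not specify a matching algorithm.

```python
from typing import Dict, Set

# Hypothetical circumscriptions of the five concepts named N1 (Aus aus),
# in revision order. Specimen ids are invented for illustration.
concepts: Dict[str, Set[str]] = {
    "C1.1": {"s1", "s2"},
    "C1.2": {"s1", "s2", "s3"},
    "C1.3": {"s1", "s2", "s3", "s4"},
    "C1.4": {"s1", "s2"},
    "C1.5": {"s1", "s2", "s5"},
}

original = concepts["C1.1"]      # original concept
most_recent = concepts["C1.5"]   # most recent concept
preferred = concepts["C1.3"]     # preferred authority, e.g. Fry 1989

# Everything ever named N1.
everything = set().union(*concepts.values())

# Only what is common to all concepts with the name N1.
common = set.intersection(*concepts.values())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two sets as |a & b| / |a | b| (an assumed matching measure)."""
    return len(a & b) / len(a | b)


def best(query: Set[str]) -> str:
    """Return the id of the concept that best fits the query set."""
    return max(concepts, key=lambda cid: jaccard(concepts[cid], query))


print(sorted(everything))
print(sorted(common))
print(best({"s1", "s2", "s3"}))
```

Which of these to use depends, as the slide says, on the user's purpose: the union favours recall, the intersection favours precision.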

[Figure: graph of the concepts C0.1 to C7.5 and the names N0 to N7]
- Parent child relationships in 5 revisions
- Names for each of the concepts
- Information from literature on synonymy
  - Taxonomists record which names their concepts are synonymous with and any name changes

Find data sets with Aus aus (N1)
[Figure: concept graph with N1 and its concepts C1.1, C1.2, C1.3, C1.4, C1.5 highlighted]

Find data sets with Aus aus (N1)
[Figure: as before, with N2, C2.2 and C2.3 added to the highlighted set]

Find data sets with Aus aus (N1)
[Figure: as before, with N3, C3.3 and C3.4 added to the highlighted set]

Find data sets with Aus aus (N1)
[Figure: as before, with N4, N6 and C6.5 added to the highlighted set]

Find data sets with Aus aus (N1)
[Figure: as before, with N5 and C5.5 added to the highlighted set]
Results in everything returned for Aus aus by traversing the synonymy and name links
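Traversing the synonymy and name links amounts to a graph search starting at N1. The sketch below uses a breadth-first search over an undirected graph. The individual links are assumptions chosen so that the result matches the set highlighted on the last slide of the sequence; the actual links in the figure cannot be recovered from the transcript.

```python
from collections import defaultdict, deque
from typing import Dict, List, Set, Tuple

Link = Tuple[str, str]


def build_graph(links: List[Link]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from a list of links."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def reachable(graph: Dict[str, Set[str]], start: str) -> Set[str]:
    """Breadth-first search: every name and concept reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


# Name links: which name each concept carries (from the names/concepts slide).
name_links = [("N1", f"C1.{i}") for i in range(1, 6)] + [
    ("N2", "C2.2"), ("N2", "C2.3"), ("N3", "C3.3"), ("N3", "C3.4"),
    ("N5", "C5.5"), ("N6", "C6.5"), ("N7", "C7.5"),
]
# Synonymy links between concepts (assumed for illustration).
synonymy_links = [("C1.1", "C2.2"), ("C2.2", "C3.3"),
                  ("C2.3", "C6.5"), ("C3.4", "C5.5")]
# Name changes between names (assumed for illustration).
name_changes = [("N2", "N4"), ("N4", "N6"), ("N3", "N5")]

found = reachable(build_graph(name_links + synonymy_links + name_changes), "N1")
print(sorted(n for n in found if n.startswith("N")))
print(sorted(c for c in found if c.startswith("C")))
```

Under these assumed links, a query on one name returns data sets tied to six names and eleven concepts, which illustrates how imprecise a purely link-based expansion can be.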

Information to improve data sets returned
[Figure: graph of the concepts C0.1 to C7.5 and the names N0 to N7, with set relationships such as == marked between concepts]
- Minimally what we need are set relationships from concepts in any taxonomy to earlier concepts, and name changes related to earlier names
- We can build systems to return data fit for purpose
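A set relationship between two concepts can be derived mechanically when both are described as sets. The sketch below assumes each concept is a set of specimen identifiers and uses five relationship labels (congruent, includes, included in, overlaps, excludes). The labels, the specimen ids and the function name are assumptions for illustration, not a statement of what the slides or any schema prescribe.

```python
from typing import Set


def relationship(a: Set[str], b: Set[str]) -> str:
    """Describe how concept a relates to concept b, treating both as sets."""
    if a == b:
        return "is congruent to"   # the == marked in the figure
    if a > b:
        return "includes"          # a is a proper superset of b
    if a < b:
        return "is included in"    # a is a proper subset of b
    if a & b:
        return "overlaps"          # some, but not all, members shared
    return "excludes"              # nothing in common


# Hypothetical circumscriptions: a later revision splits an earlier concept.
c1_1 = {"s1", "s2", "s3"}   # Aus aus sec. Linnaeus 1758
c1_2 = {"s1", "s2"}         # Aus aus sec. Archer 1965
c2_2 = {"s3"}               # Aus bea sec. Archer 1965
c1_4 = {"s1", "s2"}         # Aus aus sec. Tucker 1991

print(relationship(c1_1, c1_2))
print(relationship(c2_2, c1_1))
print(relationship(c1_2, c1_4))
print(relationship(c1_2, c2_2))
```

With such relationships recorded, a query can be restricted to concepts congruent with the one the user means, or widened to overlapping ones, depending on the level of precision required.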

Real Biological Taxonomies
- Larger and change more frequently than the Aus example
- German mosses
  - 14 classifications in 73 years
  - covering 1548 taxa
  - only 35% thought to be stable concepts
  - 65% of names used in legacy data sets are ambiguous
- Taxonomic Revisions of genus Alteromonas
  - 34 years: from 1972 to 2006
  - At the species level
    - 18 "emendations"
    - 19 species reassigned to 4 genera
    - 3 new combinations
    - 6 synonyms
    - 2 species to subspecies
    - 2 subspecies to species
    - 21 new species

SEEK Taxon Approach
- Use Taxon Concepts for referring to organisms
  - Aus aus L sec. Tucker 1991
  - Abies lasiocarpa (Hook) Nutt. sec. FNA 1997
- Taxon Concept/Name Resolution
  - International data exchange schema
    - TCS (Taxonomic Concept Schema)
  - Concept Repository and Resolution web service
    - Linked to Kepler workflow system
  - Globally unique identifiers (LSIDs)
- Visualization software for comparing Taxonomies and Asserting Concept Relationships
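A taxon concept pairs a name with the authority it is used "according to" (sec.), and an LSID gives it a globally unique identifier. The sketch below models both. The LSID layout `urn:lsid:<authority>:<namespace>:<object>[:<revision>]` follows the general LSID syntax; the class names, the example authority `example.org` and the identifier itself are hypothetical and are not SEEK's actual identifiers or data model.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """A Life Science Identifier split into its parts."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None

    @classmethod
    def parse(cls, text: str) -> "LSID":
        parts = text.split(":")
        if len(parts) not in (5, 6):
            raise ValueError(f"not an LSID: {text!r}")
        if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
            raise ValueError(f"not an LSID: {text!r}")
        revision = parts[5] if len(parts) == 6 else None
        return cls(parts[2], parts[3], parts[4], revision)

    def __str__(self) -> str:
        base = f"urn:lsid:{self.authority}:{self.namespace}:{self.object_id}"
        return base if self.revision is None else f"{base}:{self.revision}"


class TaxonConcept(NamedTuple):
    """A name as used by one authority, with its identifier."""
    name: str          # scientific name with author, e.g. "Aus aus L.1758"
    according_to: str  # the "sec." reference, e.g. "Tucker 1991"
    lsid: LSID

    def label(self) -> str:
        return f"{self.name} sec. {self.according_to}"


# Hypothetical identifier for the concept labelled C1.4 in the Aus example.
concept = TaxonConcept(
    name="Aus aus L.1758",
    according_to="Tucker 1991",
    lsid=LSID.parse("urn:lsid:example.org:concepts:C1.4"),
)
print(concept.label())
print(concept.lsid)
```

Two data sets that both cite the same LSID refer to the same concept, which is the "know when 2 things are the same" role the slides give to LSIDs; two data sets that share only the name do not.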

Taxon Object Server
[Diagram components: Taxonomic Data Providers; Mammal Species of the World; Taxonomic Literature; Database to TCS Mapping Tool; Concept Extraction Tool; TCS; TOS; SEEK Cache; Concept Mapper]

Taxonomic Object Service: SEEK
[Diagram components: Concept Mapper; TCS; Find All Concepts; Get Synonymous Concepts; Get Best Concept; TOS; SEEK Cache; LSID Authority; Morpho; Data Analysis; EML Datasets; Identify species; EML(TCS); Mark up datasets]

Recap…
- Re-emphasised the problems with Taxonomic Names
  - not good identifiers for organisms
  - problem extends to most areas
    - characters, countries, habitats, vegetation types, genes…
- Shown that Taxonomic concepts are better for referring to organisms, specimens, observations…
- but
  - Need better systems for resolving taxonomic names/concepts
  - Which require better information

Provide better tools for users
- To help taxonomists create better quality data
  - Better access to reference/legacy data
  - Explore differences/similarities in existing taxonomies
  - To create relationships between concepts
  - Improved data can be made available to the general biology community for incorporating into bio-referenced databases.
- To help end users understand and use the data
  - and its limitations
  - Biologists can use tools to understand the impact of using particular data on their analysis

Conclusion
- Science is complex (and therefore split into specialisms)
  - Identify the overlaps/linkages in the different domains
  - Need useful approximations of things to simplify linked domain
  - Need to understand the approximations or linking points well
  - Support re-composition, linking or building on the components
- Science is inherently changing
  - Science is full of legacy data
  - Today's scientific research is tomorrow's legacy data
  - Track the changes in the data
    - know when components or links have changed
- Provide long-term persistent storage
  - Any published scientific discovery should store the data as evidence
  - Data needs to be accurately annotated
    - Sufficient to repeat analyses to test hypotheses

Acknowledgements
- Colleagues on the SEEK project
- NSF and EPSRC funding
- e-Science Centre funding
- Colleagues in TDWG

Thank You Questions…