Growing challenges for biodiversity informatics Utility of observational data models Multiple communities within the earth and biological sciences are.

Slides:



Advertisements
Similar presentations
The Dryad Data Repository Ryan Scherle 1, Hilmar Lapp 1, Amol Bapat 2, Sarah Carrier 2, Jane Greenberg 2, Peggy Schaeffer 1, Todd Vision 1,3, Hollie White.
Advertisements

Maines Sustainability Solutions Initiative (SSI) Focuses on research of the coupled dynamics of social- ecological systems (SES) and the translation of.
Semantic annotation on the SONet and Semtools projects: Challenges for broad multidisciplinary exchange of observational data Mark Schildhauer, NCEAS/UCSB.
Dan Bunker TraitNet RCN: Foster the curation, discovery, and sharing of ecological trait data.
SONet (Scientific Observations Network) and OBOE (Extensible Observation Ontology): Mark Schildhauer, Director of Computing National Center for Ecological.
SONet: A Community-Driven Scientific Observations Network to achieve Semantic Interoperability of Environmental and Ecological Data Mark Schildhauer 1,
ODM2: Developing a Community Information Model and Supporting Software to Extend Interoperability of Sensor and Sample Based Earth Observations Jeffery.
Vegetation databases Lessons from VegBank, SEEK, TDWG, IAVS, & NCEAS Robert Peet University of North Carolina.
Jennifer A. Dunne Santa Fe Institute Pacific Ecoinformatics & Computational Ecology Lab Rich William, Neo Martinez, et al. Challenges.
Ontology Notes are from:
Improving Data Discovery in Metadata Repositories through Semantic Search Chad Berkley 1, Shawn Bowers 2, Matt Jones 1, Mark Schildhauer 1, Josh Madin.
Informatics and Computational Challenges for Satellite Monitoring of Global Biodiversity Mark SchildhauerRyan Pavlick NCEAS, UCSBNASA/JPL NASA/NCEAS Workshop,
Crosscutting Concepts and Disciplinary Core Ideas February24, 2012 Heidi Schweingruber Deputy Director, Board on Science Education, NRC/NAS.
ToolMatch: Discovering What Tools can be used to Access, Manipulate, Transform, and Visualize Data Patrick West 1 Nancy Hoebelheinrich.
Linking Disparate Datasets of the Earth Sciences with the SemantEco Annotator Session: Managing Ecological Data for Effective Use and Reuse Patrice Seyed.
Using observational data models to enhance data interoperability for integrative biodiversity and ecological research Mark Schildhauer*, Luis Bermudez,
Enriching the Ontology for Biomedical Investigations (OBI) to Improve Its Suitability for Web Service Annotations Chaitanya Guttula, Alok Dhamanaskar,
SC32 WG2 Metadata Standards Tutorial Metadata Registries and Big Data WG2 N1945 June 9, 2014 Beijing, China.
Data Integration, Analysis, and Synthesis Matthew B. Jones National Center for Ecological Analysis and Synthesis University of California Santa Barbara.
A Proposal for a Distributed Earth Observation Data Network Matthew B Jones UC Santa Barbara National Center for Ecological Analysis and Synthesis (NCEAS)
Observations and Ontologies Achieving semantic interoperability of environmental and ecological data Mark Schildhauer 1, Shawn Bowers 2, Josh Madin 3,
Advancing an Information Model for Environmental Observations Jeffery S. Horsburgh Anthony Aufdenkampe, Richard P. Hooper, Kerstin Lehnert, Kim Schreuders,
SONet: Scientific Observations Network Semtools: Semantic Enhancements for Ecological Data Management Mark Schildhauer, Matt Jones, Shawn Bowers, Huiping.
Cyberinfrastructure Overview Core Cyberinfrastructure Team Matthew B. Jones National Center for Ecological Analysis and Synthesis (NCEAS) University of.
Patterns and Conventions for Defining OBOE-Compatible Ontologies … Based on OBOE 1.0, June, 2010.
Catalog/ ID Selected Logical Constraints (disjointness, inverse, …) Terms/ glossary Thesauri “narrower term” relation Formal is-a Frames (properties) Informal.
Opportunities for earth science data interoperability through coordinated semantic development, using a shared model for observations and measurements.
Standards and tools for publishing biodiversity data Yu-Huang Wang June 25, 2012.
References: [1] Branch, B.D., Fosmire, M., The role of interdisciplinary GIS and data curation librarians in enhancing authentic scientific research.
Directions in observational data organization: from schemas to ontologies Matthew B. Jones 1 Chad Berkley 1 Shawn Bowers 2 Joshua Madin 3 Mark Schildhauer.
IPlant cyberifrastructure to support ecological modeling Presented at the Species Distribution Modeling Group at the American Museum of Natural History.
What is Information Modelling (and why do we need it in NEII…)? Dominic Lowe, Bureau of Meteorology, 29 October 2013.
Semantic Mediation in SEEK/Kepler: Exploiting Semantic Annotation for Discovery, Analysis, and Integration of Scientific Data and Workflows Bertram Ludäscher.
Discovering accessibility, display, and manipulation of data in a data portal Nancy Hoebelheinrich Patrick West 2
Extensible Markup Language (XML) Extensible Markup Language (XML) is a simple, very flexible text format derived from SGML (ISO 8879).ISO 8879 XML is a.
Data, Metadata, and Ontology in Ecology Matthew B. Jones National Center for Ecological Analysis and Synthesis (NCEAS) University of California Santa Barbara.
Chad Berkley NCEAS National Center for Ecological Analysis and Synthesis (NCEAS), University of California Santa Barbara Long Term Ecological Research.
Metadata. Generally speaking, metadata are data and information that describe and model data and information For example, a database schema is the metadata.
Local global disambiguation of terms and concepts The BCO-DMO metadata database uses controlled vocabularies to record many of the important pieces of.
Subgroup 1 Collect interoperability requirements Define common, unified data model Engage tool & data providers, data consumers Subgroup 2 Identify and.
What is an Ontology? An ontology is a specification of a conceptualization that is designed for reuse across multiple applications and implementations.
©Ferenc Vajda 1 Semantic Grid Ferenc Vajda Computer and Automation Research Institute Hungarian Academy of Sciences.
A Provisional Observational Data Standard to Facilitate Data Sharing and Aggregation Lynn Kutner, Bruce Stein, and Donna Reynolds TDWG Annual Meeting,
1 Advanced Semantic Technologies Prof. Deborah McGuinness and Dr. Patrice Seyed CSCI CSCI ITWS ITWS TA: Justin.
Proof of concept study of the Socio-Ecological Research and Observation oNTOlogy (SERONTO) for integrating multiple ecological databases. Introduction.
The Plant Ontology: Development of a Reference Ontology for all Plants Plant Ontology Consortium Members and Curators*: Laurel D.
SKOS. Ontologies Metadata –Resources marked-up with descriptions of their content. No good unless everyone speaks the same language; Terminologies –Provide.
Ontologies Working Group Agenda MGED3 1.Goals for working group. 2.Primer on ontologies 3.Working group progress 4.Example sample descriptions from different.
OGC ® ® OGC HY_Features model - progress report, next steps - 5 th, WMO/OGC Hydrology DWG New York, CCNY, August 11 – 15, 2014 Irina Dornblut, GRDC of.
Scientific Workflow systems: Summary and Opportunities for SEEK and e-Science.
Computational Tools for Population Biology Tanya Berger-Wolf, Computer Science, UIC; Daniel Rubenstein, Ecology and Evolutionary Biology, Princeton; Jared.
SEEK Science Environment for Ecological Knowledge l EcoGrid l Ecological, biodiversity and environmental data l Computational access l Standardized, open.
OBOE Model Changes SONet Meeting June 7-9, Motivation for Changes Remove redundancy in the model –Mainly in Dimension (characteristics) Make it.
 Key integrating concepts  Groups  Formal Community Groups  Ad-hoc special purpose/ interest groups  Fine-grained access control and membership 
Trait ontology approach Marie-Angélique LAPORTE NCEAS June 7 th 2010.
Context Observation Measurement Relationship Entity Characteristic Value Standard hasContextRelationship ofEntity hasValue ofCharacteristic usesStandard.
Human-Aware Sensor Network Ontology (HASNetO): Semantic Support for Empirical Data Collection Paulo Pinheiro 1, Deborah McGuinness 1, Henrique Santos 1,2.
OBOE v.s. OGC O&M SONet June 8,2010. OBOE Entity Context Characteristic Measurement Observation Standard hasCharacteristic hasMeasurement ofEntity hasContext.
February 2010 OBO Foundry Meeting Hilmar Lapp Nescent Comparative Data Analysis Ontology.
Enable Semantic Interoperability for Decision Support and Risk Management Presented by Dr. David Li Key Contributors: Dr. Ruixin Yang and Dr. John Qu.
CIMA and Semantic Interoperability for Networked Instruments and Sensors Donald F. (Rick) McMullen Pervasive Technology Labs at Indiana University
Harmonizing Measurements for Marine Biodiversity Observation Networks
Improving Data Discovery Through Semantic Search
SONet: A Community-Driven Scientific Observations Network to achieve Semantic Interoperability of Environmental and Ecological Data Mark Schildhauer1,
An ecosystem of contributions
Existing Designs and Prototypes at RPI
Annotation Examples (12/18/2009)
COMPASS: A Geospatial Knowledge Infrastructure Managed with Ontologies
Measurement Semantics: “MEASEM”
Presentation transcript:

Growing challenges for biodiversity informatics Utility of observational data models Multiple communities within the earth and biological sciences are converging on the use of observational data models (e.g., ecology, evolution, oceanography, geosciences) to enable cross-disciplinary data discovery, interpretation and integration. Observational data models provide a powerful, general, high level abstraction or “template” for describing a broad range of scientific data Controlled vocabularies can be linked to data through observational data models via semantic annotation, allowing for enhanced cross-disciplinary interpretation of scientific terminologies. Reasoning capabilities such as hierarchy traversal, consistency checks, and equivalence determinations are enabled via semantic formats such as OWL (Web Ontology Language) Semantic interoperability can be facilitated if observational data models and their controlled vocabularies are developed using compatible semantic and syntactic approaches Collaborative development of observational data models is progressing through the “open” SONet effort (AKN, CUAHSI, OGC, SEEK/Semtools, SERONTO, TraitNet, VSTO), and the newly constituted “Joint Working Group on Observational Data Models and Semantics” (including SONet, Data Conservancy, Data ONE, and Phenoscape projects). Prototype Architecture for Applying Semantic Annotation ID# IN51B-1046 Observational data models as a unifying framework for biodiversity information M. Schildhauer* 1, S. Bowers 2, M. Jones 1, Steve Kelling 3, Hilmar Lapp 4 1. NCEAS, Santa Barbara, CA, USA2. Gonzaga University, Spokane, WA, USA 3. Cornell University, NY, USA 4. NESCent, Durham, NC, USA * Contact: References: [1] Shawn Bowers, Joshua S. Madin and Mark P. Schildhauer, A Conceptual Modeling Framework for Expressing Observational Data Semantics. In ER 2008, [2] OpenGIS observations and measurements encoding standard (O&M): SiteSpeciesIndWt… GCE6Picea Rubens175.13… GCE6Picea Rubens … GCE7Picea Rubens … …………… observation “o1” entity “Point_Location” measurement “m1” key yes Characteristic “GCE_Local-Code” Standard “Nominal” observation “o2” entity “Tree” measurement “m2” key yes Characteristic “SpeciesName” Standard “TaxonomicName” measurement “m3” key yes Characteristic “Local-ID” Standard “Nominal” measurement “m4” Characteristic “Mass” Standard “Ratio” context identifying yes “o1” map “Site” to “m1” map “Species” to “m2” map “Ind” to “m3” Map “Wt” to “m4” observation “o1” entity “Point_Location” measurement “m1” key yes Characteristic “GCE_Local-Code” Standard “Nominal” observation “o2” entity “Tree” measurement “m2” key yes Characteristic “SpeciesName” Standard “TaxonomicName” measurement “m3” key yes Characteristic “Local-ID” Standard “Nominal” measurement “m4” Characteristic “Mass” Standard “Ratio” context identifying yes “o1” map “Site” to “m1” map “Species” to “m2” map “Ind” to “m3” Map “Wt” to “m4” (b) Semantic annotation to dataset (a) (a) Dataset Acknowledgements: NSF OCI INTEROP Prototype Architecture ProductivityMassMass Unit Kilogram Biomass usesStandard Gram is-a Bio.Entity Tree Leaf LitterTree LeafWet WeightDry Weight Observation Measurement SiteSpeciesIndWt GCE6Picea Rubens GCE6Picea Rubens GCE7Picea Rubens ………… PlaceTreatPlotLL sthC sthC sthN ………… Data Structural Metadata (e.g. EML) Wt LL hasMeasurement Semantic Annotation (e.g. OBOE) Domain-Specific Ontology is-a has-part hasCharac- teristic is-a part-of is-ahas-multiplier usesBase Standard ofEntity ofCharacteristic usesStandard ofCharacteristic Similarities among Observational Data Models Entity Context (other Observation) Characteristic Observation Standard hasCharacteristichasMeasurement ofEntity hasContext usesStandard Protocol usesProtocol FeatureOfInterest ObservationContext ObservedProperty OM_Observation Result carrierOfCharacteristic forProperty relatedContext Observation hasResult OM_Process usesProcedure PrecisionValue hasPrecision ofCharacteristic hasValue SEEK/Semtools Extensible Observation Ontology (OBOE) [1] OGC’s Observations and Measurements (O&M) [2] EntityFeatureOfInterest CharacteristicObservedProperty MeasurementOM_Observation Protocol OM_Process Result Standard Value Precision Context ObservationContext Measurement ofFeature Investigations in biodiversity science often require integrating information from multiple scientific disciplines. While representing and organizing taxonomic names and concepts constitutes a significant challenge, there is also a critical need to integrate biodiversity information with relevant measurements and observations from other earth and life science domains. Observational data models show great promise for facilitating this integration. Biodiversity use case Investigator wants to explore relationship between biodiversity and ecosystem functioning (e.g. primary productivity) in forest trees. Data needs might include: Vegetation plot information from repositories – Vegbank, Salvias, NVS --provide in situ association information, taxonomic, spatiotemporal, and other metadata ( *VegX schema; references *EML and *Darwin Core/ABCD) Vouchered specimen information from plots and supplementary collections (*Darwin Core/ABCD) enrich understanding of local associations and variation in forest composition Functional trait information from e.g. LEDA/Salvias/TRY -- associated with taxonomic identities in plot data (*CNRS/TraitNet ontologies) Phenotypic data from e.g. GenBank annotations (*GO, *EQ/PATO, *TO, *PO) Phylogenetic relationships among taxa drawn from Treebase (*CDAO) Climatic, geospatial, sensor data and in situ human observations from e.g. NCAR, NASA, misc independent researchers/citizens (* O&M, *SWE, *SWEET/VSTO, * EML) * Represent de facto and emerging formalizations, as ontologies and other controlled vocabularies, of relevant concepts for interpreting how data are defined and inter-related