Linking Disparate Datasets of the Earth Sciences with the SemantEco Annotator Session: Managing Ecological Data for Effective Use and Reuse Patrice Seyed.

Presentation transcript:

Linking Disparate Datasets of the Earth Sciences with the SemantEco Annotator
Session: Managing Ecological Data for Effective Use and Reuse
Patrice Seyed 1,2, Katherine Chastain 1, and Deborah McGuinness 1
1 Tetherless World Constellation, Rensselaer Polytechnic Institute, th Street, Troy, NY
2 DataONE, University of New Mexico, 1 University Boulevard N.E., Albuquerque, NM 87131

Overview
Introduction
Semantics and Linked Data
Use Case: SemantEco
SemantEco Annotator
– Concept
– Getting started
– Overview
Ontologies
Capabilities
Integration with Semantic Applications
Future Work
Quick Look Video
Summary

Introduction
How can we take datasets from different sources and make them
– Easy to search and to discover?
– Easy to use and to re-use?
– Easy to integrate with each other for visualization and other applications?

Semantics and Linked Data
We need a way to describe the relationships between tabular data columns…
Linked-data formats such as the Resource Description Framework (RDF) capture such relationships in subject-predicate-object triples.
… and we need a method of description that is both standardized and machine-readable.
Communities can develop, use, and reuse common vocabulary with ontologies, expressed in a computer-readable format: the Web Ontology Language (OWL).
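As a minimal sketch of the subject-predicate-object idea, the Python below models triples as plain tuples and writes them out as N-Triples lines. The `http://example.org/` namespace, the `WaterMeasurement` class, and the `lakeName` property are hypothetical names chosen for illustration; they are not taken from SemantEco or its ontologies.

```python
# Minimal sketch: RDF-style triples as plain tuples, serialized as N-Triples.
# The example.org namespace and every name built on it are hypothetical.

EX = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = [
    # "sample1 is a water measurement"
    (iri(EX + "sample1"), iri(RDF_TYPE), iri(EX + "WaterMeasurement")),
    # "sample1 was taken at the lake named Big Moose Lake"
    (iri(EX + "sample1"), iri(EX + "lakeName"), literal("Big Moose Lake")),
]

ntriples_text = to_ntriples(triples)
print(ntriples_text)
```

Each tuple is one statement; because subjects and predicates are IRIs rather than column positions, statements from different tables can be merged without renaming anything.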

Semantics and Linked Data
Linked format aids interoperability, making data easier to share.
Use existing URIs to refer to well-defined entities and concepts:
– How do you make sure that everyone using your data understands that the string “NY” refers to the US state of New York?
– What more can you learn if you can easily discover other datasets that also refer to the US state of New York?

Use Case: SemantEco
SemantEco is a data visualization environment that allows a user to explore ecological data through a map-based interface.
Data comes from a variety of sources:
– Federal, such as the USGS, EPA.
– Local, such as the Darrin Freshwater Institute of Upstate New York.
– … each with different notations and best practices for gathering and recording.

Conceptually...
Represent data independent of the schema by which it was recorded.
This enables comparisons across data from different sources.
In SemantEco, we look at Measurements:
– Water quality
– Air quality
– Birds
– Fish
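One way to picture schema-independent representation is a small normalization step that maps each source's column names onto one shared record type. The sketch below assumes two made-up source schemas (`source_a`, `source_b`), a hypothetical `Measurement` record, and invented sample values; it is not SemantEco's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Hypothetical schema-independent record for one observation."""
    site: str
    characteristic: str
    value: float
    unit: str


# Hypothetical per-source mappings: common field -> that source's column name.
SOURCE_SCHEMAS = {
    "source_a": {"site": "SiteName", "characteristic": "Param",
                 "value": "Result", "unit": "Units"},
    "source_b": {"site": "lake", "characteristic": "analyte",
                 "value": "conc", "unit": "uom"},
}


def normalize(source, row):
    """Translate one source-specific row into the shared Measurement form."""
    schema = SOURCE_SCHEMAS[source]
    return Measurement(
        site=row[schema["site"]],
        characteristic=row[schema["characteristic"]],
        value=float(row[schema["value"]]),
        unit=row[schema["unit"]],
    )


# Invented sample rows, recorded under two different schemas.
a = normalize("source_a", {"SiteName": "Big Moose Lake", "Param": "Ammonium",
                           "Result": "0.02", "Units": "mg/L"})
b = normalize("source_b", {"lake": "Big Moose Lake", "analyte": "Ammonium",
                           "conc": "0.05", "uom": "mg/L"})

# Once normalized, records from different sources can be compared directly.
comparable = (a.site, a.characteristic, a.unit) == (b.site, b.characteristic, b.unit)
print(comparable)
```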

SemantEco Annotator
Allows a user to:
Translate data into linked-data formats such as RDF:
– Linked data triples describe how columns in a data table relate to each other, and to the data in that column.
– OWL ontologies provide standard vocabularies for describing these relationships.
– Resulting enriched RDF data can be used immediately within RDF stores / hosted as LD.
Or use semantics to annotate data:
– Column headers correspond to OWL properties.
– Data cell values can correspond to OWL classes or datatypes.
– Organizational best practices and terminology can be defined in the data files themselves.
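The column-to-property idea above can be sketched in a few lines of Python: each annotated column header is assigned a property IRI and a datatype, and every cell in that column becomes a typed literal. The `http://example.org/` namespace, the property names, the sample CSV, and the `rows_to_triples` helper are all hypothetical; this is an illustration of the mapping, not the Annotator's implementation.

```python
import csv
import io

EX = "http://example.org/"  # hypothetical namespace
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical annotations: column header -> (property IRI, XSD datatype IRI).
COLUMN_ANNOTATIONS = {
    "Lake Name": (EX + "lakeName", XSD + "string"),
    "NH4+": (EX + "ammoniumConcentration", XSD + "decimal"),
}

# Invented sample table.
CSV_TEXT = "Lake Name,NH4+\nBig Moose Lake,0.02\n"


def rows_to_triples(csv_text, annotations):
    """Turn each annotated cell into a (subject, predicate, object) triple."""
    triples = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for index, row in enumerate(reader, start=1):
        subject = f"<{EX}row{index}>"  # one subject per table row
        for header, cell in row.items():
            if header not in annotations:
                continue  # unannotated columns are skipped in this sketch
            prop, datatype = annotations[header]
            # Cells are assumed to contain no quotes; a real tool must escape them.
            triples.append((subject, f"<{prop}>", f'"{cell}"^^<{datatype}>'))
    return triples


row_triples = rows_to_triples(CSV_TEXT, COLUMN_ANNOTATIONS)
for triple in row_triples:
    print(" ".join(triple), ".")
```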

SemantEco Annotator: Getting Started

Provenance and Metadata
The Annotator asks the user to provide metadata about the dataset. This also becomes part of the final RDF, facilitating the dataset’s discoverability.

SemantEco Annotator Tabular data view

SemantEco Annotator Ontology loader -- Ontology facets

SemantEco Annotator Global settings

SemantEco Annotator Drag-and-drop to make assignments -- Work directly on tabular data

Ontologies
Load one or more ontologies from the dropdown menu, or import from a URI. Annotator also maintains a list of recent imports for re-use.

Capabilities
– Provide a definition for “Accession Code”.
– Specify which standard was used to record the Date.
– Group “Lake Name”, “Z Max” and “Sample Z” together as a single entity: the location where the sample was taken.
– Make explicit that “NH4+” is the same thing as “Ammonium”, and that the units (mg/L) apply to each number in that column.
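Two of the capabilities listed above, grouping columns into one location entity and attaching a preferred term and unit to a column, can be sketched as plain Python data plus one function. The settings dictionaries, the `annotate_row` helper, and the sample row values are hypothetical; only the column names, the term “Ammonium”, and the unit mg/L come from the slide.

```python
# Hypothetical annotation settings mirroring the capabilities listed above.
SAME_AS = {"NH4+": "Ammonium"}   # column label -> preferred term
UNITS = {"NH4+": "mg/L"}         # unit applied to every value in the column
LOCATION_COLUMNS = ("Lake Name", "Z Max", "Sample Z")  # grouped as one entity


def annotate_row(row):
    """Return one annotated measurement per unit-bearing column in the row."""
    # The three location columns are bundled into a single location entity.
    location = {name: row[name] for name in LOCATION_COLUMNS}
    measurements = []
    for column, unit in UNITS.items():
        measurements.append({
            "characteristic": SAME_AS.get(column, column),
            "value": float(row[column]),
            "unit": unit,
            "location": location,
        })
    return measurements


# Invented sample row; the numeric values are illustrative only.
row = {"Lake Name": "Big Moose Lake", "Z Max": "21", "Sample Z": "1.5", "NH4+": "0.02"}
annotated = annotate_row(row)
print(annotated[0])
```

Keeping the annotations in data rather than in code is a design choice that lets the same settings be reapplied to another file with the same columns.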

Integration with Semantic Applications
Identify the application’s requirements: e.g., a piece of data with lat-long coordinates can be plotted on a map.
We brought in data from the Darrin Freshwater Institute containing water quality data for lakes in Upstate New York, augmenting existing data from the U.S. Geological Survey.
“Big Moose Lake”
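An application requirement such as "must have lat-long coordinates to be plotted" can be expressed as a small predicate over a record. The field names `latitude` and `longitude`, the `can_plot_on_map` helper, and the sample coordinates are assumptions made for this sketch and do not describe SemantEco's interface.

```python
def can_plot_on_map(record):
    """Return True if the record carries usable latitude/longitude values."""
    try:
        lat = float(record["latitude"])
        lon = float(record["longitude"])
    except (KeyError, TypeError, ValueError):
        # Missing fields or non-numeric values fail the requirement.
        return False
    # Reject values outside the valid coordinate ranges (NaN also fails here).
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


# Illustrative records with made-up coordinates.
records = [
    {"site": "Example Lake", "latitude": "43.8", "longitude": "-74.9"},
    {"site": "No Coordinates"},
]
plottable = [r["site"] for r in records if can_plot_on_map(r)]
print(plottable)
```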

Integration with Semantic Applications
Linking data to well-defined entities and concepts by URI enhances searchability.
[Figure: the strings “New York”, “New York State”, and “NY”, shown with the URIs dbpedia:New_York and dbpedia:New_York_City]
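The benefit of linking strings to URIs can be sketched with a lookup table: different spellings resolve to one identifier, so a search for one spelling finds data recorded under another. The table below is an assumption (that all three strings refer to the state), the expansion of the `dbpedia:` prefix to `http://dbpedia.org/resource/` is assumed, and the `resolve` and `same_entity` helpers are hypothetical.

```python
DBPEDIA = "http://dbpedia.org/resource/"  # assumed expansion of "dbpedia:"

# Assumed label-to-entity table; the "new york city" label is added for contrast.
LABEL_TO_URI = {
    "new york": DBPEDIA + "New_York",
    "new york state": DBPEDIA + "New_York",
    "ny": DBPEDIA + "New_York",
    "new york city": DBPEDIA + "New_York_City",
}


def resolve(label):
    """Map a free-text label to a URI, ignoring case and extra whitespace."""
    key = " ".join(label.split()).lower()
    return LABEL_TO_URI.get(key)


def same_entity(label_a, label_b):
    """True only when both labels resolve to the same known URI."""
    uri_a, uri_b = resolve(label_a), resolve(label_b)
    return uri_a is not None and uri_a == uri_b


print(same_entity("NY", "New York State"))
print(same_entity("NY", "New York City"))
```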

Future Work
– Automatic mappings directed to a particular graph closed under a predicate/object pair, use of OWL domain and range restriction axioms to guide the user in vocabulary selection decisions.
– Use of OWL class definitions to enable a top-down approach for modeling data.
– Ability to load enhancement files, both to facilitate translation of multiple similar datasets, and to make corrections easier.
– Construction of a platform for better management of linked data, within which the Annotator plays a vital role.
– Use of application requirements to create “templates” for new data sources to be integrated more easily.

Summary
– “SemantEco Annotator” component for ease of translation into RDF.
– Multi-purposed for translation, annotation, and generalized mapping.
– A part of a future “suite” that couples annotation and search.

SemantEco Annotator Project Page
Want more info? Interested in collaborating? See Evan Patton or Deborah McGuinness.
We also have a project page with screenshots and demonstration videos:

Acknowledgements
– Rensselaer Polytechnic Institute
– Tetherless World Constellation at RPI
– DataONE

SemantEco: More Info
For additional information about SemantEco: “Addressing the Challenges of Multi-Domain Data Integration with the SemantEco Framework”, 10:35am, IN52B-02. E.W. Patton; P. Seyed; D.L. McGuinness