

Next Generation Environmental Informatics as exemplified by the Tetherless World Semantic Water Quality Portal

Ping Wang 1, Jin Guang Zheng 1, Linyun Fu 1, Evan W. Patton 1, Timothy Lebo 1, Li Ding 1, Joanne S. Luciano 1, and Deborah L. McGuinness 1
( 1 Rensselaer Polytechnic Institute th St., Troy, NY, United States)

Poster: IN31B-1438

Glossary:
EPA – U.S. Environmental Protection Agency
MPN – Most Probable Number
PML 2 – Proof Markup Language (PML) version 2
RPI – Rensselaer Polytechnic Institute
TWC – Tetherless World Constellation at Rensselaer Polytechnic Institute
USGS – United States Geological Survey

Motivation

In late 2009 in Bristol County, RI, E. coli contaminated the public water supply, causing illnesses in the population, particularly among young children. Residents requested information about when the contamination began, how it happened, and what measures were being taken to monitor and prevent future occurrences. That event reflected the increasing demand for direct and transparent access to ecological and environmental information, and inspired the Semantic Water Quality Portal (SemantAqua) project.

Next Generation Environmental Informatics

Starting with the domain of water quality, we are investigating a general framework called SemantEco that can support dynamic environmental informatics portals via semantically-enabled approaches, including:
- capture of the semantics of domain knowledge using a family of simple, modular OWL2 ontologies
- integration of environmental monitoring and regulation data from multiple sources following Linked Data principles
- preservation of provenance metadata using the Proof Markup Language (PML) version 2
- inference of environmental pollution events using OWL2 reasoning

Combined with distributed sensor networks and incremental OWL2 classification, this work could provide a scaffold for deploying near real-time reporting of pollution events in communities.
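To make the idea of inferring pollution events more concrete, the sketch below flags a site when any measurement exceeds a regulatory limit. It is only a plain-Python stand-in: the portal itself performs this step with OWL2 reasoning over regulation ontologies, which is not reproduced here. The names `Measurement`, `LIMITS`, `is_violation`, and `polluted_sites` are hypothetical, and the limit values are invented placeholders rather than real regulatory thresholds.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Measurement:
    site_id: str          # hypothetical site identifier
    characteristic: str   # what was measured, e.g. a contaminant name
    value: float
    unit: str


# Placeholder limits keyed by (characteristic, unit). These numbers are
# invented for the example and are not real regulatory thresholds.
LIMITS: Dict[Tuple[str, str], float] = {
    ("example contaminant", "mg/L"): 10.0,
}


def is_violation(m: Measurement, limits: Dict[Tuple[str, str], float]) -> bool:
    """True when a limit exists for this characteristic and unit and the value exceeds it."""
    limit = limits.get((m.characteristic, m.unit))
    return limit is not None and m.value > limit


def polluted_sites(measurements: Iterable[Measurement],
                   limits: Dict[Tuple[str, str], float]) -> List[str]:
    """Return the sorted ids of sites with at least one violating measurement."""
    return sorted({m.site_id for m in measurements if is_violation(m, limits)})


readings = [
    Measurement("site-A", "example contaminant", 12.5, "mg/L"),
    Measurement("site-B", "example contaminant", 3.0, "mg/L"),
    Measurement("site-C", "unregulated characteristic", 99.0, "mg/L"),
]
print(polluted_sites(readings, LIMITS))  # prints ['site-A']
```

A measurement with no matching limit is never flagged, which mirrors the idea that a site is only classified as polluted relative to some regulation.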
SemantAqua Workflow

Location-based Information Retrieval

Users input a ZIP Code™ to identify the area for their search. SemantAqua uses Geonames to look up additional information, e.g. city and state, and generates a location-based query over the USGS and EPA datasets. The mobile interface also takes advantage of the W3C Geolocation API to find polluted sites near the user.

Enabling Context-Sensitive Actions

To help users take an active role in monitoring water quality where they live, SemantAqua attempts to identify useful links where users can report problems with their local water supplies. Currently, the portal supports reporting to the EPA and to some state departments concerned with environmental preservation and protection (e.g. the California Department of Fish and Game). Work to identify the appropriate links to external authorities that accept reports within their jurisdictions is ongoing.

Provenance-based Query

SemantAqua captures provenance information during the data integration stages and encodes it in the Proof Markup Language (PML) version 2 Provenance Interlingua. The provenance information is used to support provenance-based queries. For example, the system allows users to select and inspect data source information, so they can choose to rely only on data from sources they trust. This will be particularly important as the portal expands to include other, more varied sources of data (see Future Work).

Using Ontologies as Facets

Regulations are encoded as ontologies, thus making an ontology a potential view of the world. Users can select from a number of different regulation ontologies to classify the data, allowing them to see differences between state regulations and the federal regulations set forth by the EPA. In addition, type information from the water ontology, which describes the different types of measurement sites and their polluted counterparts, gives the user some control over what information is displayed on the map.
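The interplay between provenance-based queries and regulation facets can be illustrated with a small sketch: the same records are classified under different regulations, and only records from sources the user trusts are considered. This is an in-memory approximation under stated assumptions, not the portal's PML or OWL2 machinery. `Record`, `REGULATIONS`, `flagged_sites`, the regulation names, and all limit values are hypothetical.

```python
from typing import Dict, Iterable, NamedTuple, Set


class Record(NamedTuple):
    site_id: str
    characteristic: str
    value: float
    source: str  # provenance: the dataset this record came from


# Hypothetical regulation facets; the limits are invented for illustration.
REGULATIONS: Dict[str, Dict[str, float]] = {
    "federal": {"example contaminant": 10.0},
    "state-x": {"example contaminant": 5.0},
}


def flagged_sites(records: Iterable[Record],
                  regulation: str,
                  trusted_sources: Set[str]) -> Set[str]:
    """Sites that violate the chosen regulation, using trusted sources only."""
    limits = REGULATIONS[regulation]
    flagged: Set[str] = set()
    for r in records:
        if r.source not in trusted_sources:
            continue  # provenance filter: skip data the user does not trust
        limit = limits.get(r.characteristic)
        if limit is not None and r.value > limit:
            flagged.add(r.site_id)
    return flagged


records = [
    Record("site-A", "example contaminant", 12.0, "EPA"),
    Record("site-B", "example contaminant", 7.0, "USGS"),
    Record("site-C", "example contaminant", 20.0, "other"),
]

trusted = {"EPA", "USGS"}
federal = flagged_sites(records, "federal", trusted)
state = flagged_sites(records, "state-x", trusted)
print(sorted(state - federal))  # sites flagged only under the stricter facet
```

Switching the `regulation` argument plays the role of selecting a different regulation ontology, and changing `trusted_sources` plays the role of the provenance-based source selection.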
More Customized Queries

The Characteristic, Health Concern and Time Frame facets enable users to further customize their queries and to issue the queries most relevant to their interests:
- What sites/facilities in this area are polluted with these specific contaminants, e.g. fecal coliform, lead?
- What polluted sites/facilities are contaminated with pollutants that could cause the following symptoms or health problems, e.g. diarrhea?
- What sites/facilities were polluted in the past two years?

Data Presentation

Different icons are used to differentiate polluted sites from clean sites. Clicking on a polluted site displays a popup window that provides more details about the pollution events: names of contaminants, measured values, limit values, time of measurement, and health effects.

[Workflow diagram; labels: Archive, CSV2RDF4LOD Direct, CSV2RDF4LOD Enhance, Publish, Reason, Visualize, derive, integrate, archive]

Connecting to Health Issues

To help citizens investigate the health impacts of water pollution, SemantAqua links water quality data to some known health considerations. We have generated an initial ontology describing potential health impacts of overexposure to contaminants. Initial content came from the EPA. For example, exposure to E. coli can cause abdominal cramping and diarrhea and, if left untreated, can result in high blood pressure and kidney damage. This health information is presented to the user together with the pollution details (see Data Presentation) and is also used to customize information retrieval (see More Customized Queries).

Time Series Visualization

The time series visualization retrieves water quality data related to a selected water site or facility by querying the triple store and displays the data as a time series. The user selects a particular permit for a facility, the characteristic of the water, and the test type (if any) associated with that characteristic.
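The selection step for the time series can be sketched as a filter over measurement records followed by a sort on the measurement date. The portal retrieves these data by querying its triple store; the in-memory list below merely stands in for that query. `Reading`, `select_series`, the permit identifiers, and the sample values are hypothetical.

```python
from datetime import date
from typing import Iterable, List, NamedTuple, Optional, Tuple


class Reading(NamedTuple):
    permit: str               # hypothetical permit identifier
    characteristic: str       # e.g. "enterococci"
    test_type: Optional[str]  # None when the characteristic has no test type
    measured_on: date
    value: float


def select_series(readings: Iterable[Reading],
                  permit: str,
                  characteristic: str,
                  test_type: Optional[str] = None) -> List[Tuple[date, float]]:
    """Return (date, value) pairs for one permit/characteristic/test type, oldest first."""
    series = [
        (r.measured_on, r.value)
        for r in readings
        if r.permit == permit
        and r.characteristic == characteristic
        and r.test_type == test_type
    ]
    return sorted(series)


# Invented sample data for illustration only.
readings = [
    Reading("PERMIT-1", "enterococci", "Concentration Max", date(2009, 3, 10), 120.0),
    Reading("PERMIT-1", "enterococci", "Concentration Max", date(2009, 1, 15), 40.0),
    Reading("PERMIT-1", "enterococci", "Concentration Average", date(2009, 1, 15), 25.0),
    Reading("PERMIT-2", "enterococci", "Concentration Max", date(2009, 2, 1), 60.0),
]

for day, value in select_series(readings, "PERMIT-1", "enterococci", "Concentration Max"):
    print(day.isoformat(), value)
```

The resulting list is already in plotting order, so a charting layer would only need to draw the values against the dates and overlay the applicable limit.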
For the EPA data there are up to five different test types, which take measurements in different ways and compute the limits differently: Quantity Average, Quantity Max, Concentration Min, Concentration Average, and Concentration Max. The visualization on the right shows the quality of the water released by the Southeast Water Pollution Control Plant in San Francisco. The plot shows the enterococci measurements in green and the regulation-defined limit in blue. Three severe violations (in red) occurred during 2009 and [...]. Access to such information can help citizens be more informed and make requests to the state administrator to improve the handling of the water at the local facilities.

Future Work

Currently, data for twenty-seven of the fifty states have been encoded in RDF using the SemantEco and SemantAqua ontologies, and work continues on converting the remaining states. The current portal contains the regulatory information of four of the fifty states. An effort is underway to encode additional regulatory information from different states and to identify which states simply defer to the EPA on particular pollutants, since the EPA regulations have already been encoded. In addition, linking contaminants to external resources such as DBpedia, and to symptom and health effect information from sources such as WebMD, will provide the data needed to answer the more interesting questions regarding the health impacts of pollution. We have also initiated work on linking to reporting systems at the federal and state levels so that users can report potential issues in their neighborhoods, making this portal a helpful tool for enacting environmental change. Lastly, we plan to augment the portal to generate data reports of users' query results, which could contain the query specification, the identified pollution events, the relevant converted and source data, and the provenance of these data.
These data reports can be useful when users report their findings to authorities or environmental organizations.

Sponsors:
Visit our project page at:
Try it out: