Supporting Complex Thematic, Spatial and Temporal Queries over Semantic Web Data Matthew Perry 1, Amit P. Sheth 1, Farshad Hakimpour 2, Prateek Jain 1.

Slides:



Advertisements
Similar presentations
Ontology-Based Computing Kenneth Baclawski Northeastern University and Jarg.
Advertisements

The Integration of Biological Data Using Semantic Web Technologies Susie Stephens Principal Product Manager, Life Sciences Oracle
Schema Matching and Query Rewriting in Ontology-based Data Integration Zdeňka Linková ICS AS CR Advisor: Július Štuller.
CH-4 Ontologies, Querying and Data Integration. Introduction to RDF(S) RDF stands for Resource Description Framework. RDF is a standard for describing.
Applied Temporal RDF: Efficient Temporal Querying using SPARQL Jonas Tappolet and Abraham Bernstein ESWC 2009.
Lei Zou 1, Jinghui Mo 1, Lei Chen 2, M. Tamer Özsu 3, Dongyan Zhao 1 1 gStore: Answering SPARQL Queries Via Subgraph Matching 1 Peking University, 2 Hong.
1. 2 Semantic Sensor Web Talk at: Semantic Interoperability Community of Practice (SICoP) Sensor Standards Harmonization WG January 15, 2008 Cory Henson.
© 2006 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice RDF and SOA David Booth, Ph.D. HP.
Dr. Jim Bowring Computer Science Department College of Charleston CSIS 690 (633) May Evening 2009 Semantic Web Principles and Practice Class 5: 27 May.
Linked Sensor Data Harshal Patni, Cory Henson, Amit P. Sheth Ohio Center of Excellence in Knowledge enabled Computing (Kno.e.sis) Wright State University,
1. 2 Semantic Sensor Web Invited Talk Advancing Digital Watersheds and Virtual Environmental Observatories II AGU Fall Meeting, San Franscisco, December.
Dr. Alexandra I. Cristea RDF.
Comparing path-based and vertically-partitioned RDF databases Preetha Lakshmi & Chris Mueller 12/10/2007 CSCI 8715 Shashi Shekhar.
More RDF CS 431 – Carl Lagoze – Cornell University Acknowledgements: Eric Miller Dieter Fensel.
Comparing path-based and vertically-partitioned RDF databases Preetha Lakshmi & Chris Mueller 12/10/2007 CSCI 8715 Shashi Shekhar.
RDF: Building Block for the Semantic Web Jim Ellenberger UCCS CS5260 Spring 2011.
JOSH FLECK Semantic Web. What is Semantic Web? Movement led by W3C that promotes common formats for data on the web Describes things in a way that computer.
VLDB 2005 An Efficient SQL-based RDF Querying Scheme Eugene Inseok Chong Souripriya Das George Eadon Jagannathan Srinivasan New England Development Center.
OMAP: An Implemented Framework for Automatically Aligning OWL Ontologies SWAP, December, 2005 Raphaël Troncy, Umberto Straccia ISTI-CNR
Managing Large RDF Graphs (Infinite Graph) Vaibhav Khadilkar Department of Computer Science, The University of Texas at Dallas FEARLESS engineering.
Managing & Integrating Enterprise Data with Semantic Technologies Susie Stephens Principal Product Manager, Oracle
Implemented Systems Presenter: Manos Karpathiotakis Extended Semantic Web Conference 2012.
Spatial Database Souhad Daraghma.
Ontology Development Kenneth Baclawski Northeastern University Harvard Medical School.
Chapter 6 Understanding Each Other CSE 431 – Intelligent Agents.
Practical RDF Chapter 1. RDF: An Introduction
Context Tailoring the DBMS –To support particular applications Beyond alphanumerical data Beyond retrieve + process –To support particular hardware New.
Knowledge Enabled Information and Services Science Extending SPARQL to Support Spatially and Temporally Related Information Prateek Jain, Amit Sheth Peter.
IST 210 Introduction to Spatial Databases. IST 210 Evolution of acronym “GIS” Fig 1.1 Geographic Information Systems (1980s) Geographic Information Science.
Logics for Data and Knowledge Representation
The Semantic Web Web Science Systems Development Spring 2015.
Analyzing Theme, Space and Time: An Ontology-based Approach Matthew Perry, Farshad Hakimpour, Amit Sheth 14 th International Symposium on Advances in Geographic.
Database Support for Semantic Web Masoud Taghinezhad Omran Sharif University of Technology Computer Engineering Department Fall.
Chapter 6 Understanding Each Other CSE 431 – Intelligent Agents.
Data Intensive Query Processing for Large RDF Graphs Using Cloud Computing Tools Mohammad Farhan Husain, Latifur Khan, Murat Kantarcioglu and Bhavani Thuraisingham.
Matthew Perry Ph.D. Dissertation Defense
Storage and Retrieval of Large RDF Graph Using Hadoop and MapReduce Mohammad Farhan Husain, Pankil Doshi, Latifur Khan, Bhavani Thuraisingham University.
OWL 2 in use. OWL 2 OWL 2 is a knowledge representation language, designed to formulate, exchange and reason with knowledge about a domain of interest.
Querying Structured Text in an XML Database By Xuemei Luo.
SemSearch: A Search Engine for the Semantic Web Yuangui Lei, Victoria Uren, Enrico Motta Knowledge Media Institute The Open University EKAW 2006 Presented.
Leveraging Semantic Web techniques to gain situational awareness Can Semantic Web techniques empower perception and comprehension in Cyber Situational.
Metadata. Generally speaking, metadata are data and information that describe and model data and information For example, a database schema is the metadata.
1. 2 Sensor Data Management 3 1.Motivating Scenario 2.Sensor Web Enablement 3.Sensor data evolution hierarchy 4.Semantic Analysis Presentation Outline.
Database Systems Design, Implementation, and Management Coronel | Morris 11e ©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or.
Oracle Database 11g Semantics Overview Xavier Lopez, Ph.D., Dir. Of Product Mgt., Spatial & Semantic Technologies Souripriya Das, Ph.D., Consultant Member.
Efficient RDF Storage and Retrieval in Jena2 Written by: Kevin Wilkinson, Craig Sayers, Harumi Kuno, Dave Reynolds Presented by: Umer Fareed 파리드.
Ontology-Based Computing Kenneth Baclawski Northeastern University and Jarg.
Creating and Maintaining Geographic Databases. Outline Definitions Characteristics of DBMS Types of database Relational model SQL Spatial databases.
Of 35 lecture 5: rdf schema. of 35 RDF and RDF Schema basic ideas ece 627, winter ‘132 RDF is about graphs – it creates a graph structure to represent.
Scalable Distributed Reasoning Using MapReduce Jacopo Urbani, Spyros Kotoulas, Eyal Oren, and Frank van Harmelen Department of Computer Science, Vrije.
Introduction to the Semantic Web and Linked Data Module 1 - Unit 2 The Semantic Web and Linked Data Concepts 1-1 Library of Congress BIBFRAME Pilot Training.
Ontology Engineering Lab #2 – September 9,
Semantic Publishing Benchmark Task Force Fourth TUC Meeting, Amsterdam, 03 April 2014.
Ch 7: RDF schema 현근수, 김영욱, 백상윤, 이용현 Team C. Introduction Semantic web modeling In RDF: simply creates graph structure to represent data In RDFS: about.
ELIS – Multimedia Lab PREMIS OWL Sam Coppens Multimedia Lab Department of Electronics and Information Systems Faculty of Engineering Ghent University.
Conclusions Presenter: Manolis Koubarakis Extended Semantic Web Conference 2012.
ONTOLOGY ENGINEERING Lab #2 – September 8,
Lecture 10 Creating and Maintaining Geographic Databases Longley et al., Ch. 10, through section 10.4.
Of 38 lecture 6: rdf – axiomatic semantics and query.
An Optimization Technique for RDFS Inference using the Application Order of RDFS Entailment Rules Kisung Kim, Taewhi Lee
Knowledge Enabled Information and Services Science Spatiotemporal and Thematic Semantic Analytics Matthew Perry PhD. Student Kno.e.sis Center, Wright State.
Linked Open Data for European Earth Observation Products Carlo Matteo Scalzo CTO, Epistematica epistematica.
Semantic Sensor Web Amit Sheth LexisNexis Ohio Eminent Scholar
Introduction to the Semantic Web (tutorial) 2009 Semantic Technology Conference San Jose, California, USA June 15, 2009 Ivan Herman, W3C
Tutorial on Semantic Web
Query Rewriting Framework for Spatial Data
Joining Interval Data in Relational Databases
Chapter 8 Advanced SQL.
Semantic Web Basics (cont.)
Semantic-Web, Triple-Strores, and SPARQL
Presentation transcript:

Supporting Complex Thematic, Spatial and Temporal Queries over Semantic Web Data Matthew Perry 1, Amit P. Sheth 1, Farshad Hakimpour 2, Prateek Jain 1 2 nd International Conference on Geospatial Semantics Mexico City, MX, November , Kno.e.sis Center, CSE Dept., Wright State University, Dayton, OH, USA 2.LSDIS Lab, CS Dept., The University of Georgia, Athens, GA, USA

Outline Background and Motivation Contributions Modeling Approach Querying Approach Evaluation Conclusions

Background

From ….. Finding things To ….. Finding out about things Relationships! Semantic Discovery * *

Semantic Associations ρ-path association ρ-iso association

E1:Reviewer E2:PaperE5:Person E4:Person E3:PaperE7:Submission E6:Person author_of friend_of Aggregated RDF Instance Base Ontology Schemas XML HTML RDBMS TEXT How is entity1 (Reviewer) related to entity7 (Submission) ? Semantic Analytics Searching, Exploring and Visualizing semantically meaningful relationships SemDis Project

Motivation

In many analytical applications spatial and temporal data is critical –national security, regulatory compliance, etc. Current semantic analytics research has focused on thematic relationships GIS and Spatial Databases –Traditionally model thematic aspects as directly attached attributes of geospatial objects –Thematic entities/events and relationships should be represented as first-class objects to allow semantic analytics

Motivating Example Query specifies (1)a relationship between a soldier, a chemical agent and a battle location (2)a relationship between members of an enemy organization and their known locations (3)a spatial filtering condition based on the proximity of the soldier and the enemy group in this context Scenario (Biochemical Threat Detection): Analysts must examine soldiers’ symptoms to detect possible biochemical attack Find all soldiers with symptoms indicative of exposure to chemical X that fought in battles within 2 miles of sightings of members of enemy group Y

Contributions Storage and indexing scheme for spatial and temporal SW data (i.e. RDF) Efficient treatment of temporal inferencing Definition and implementation of 4 spatial and temporal query operators Performance study using large dataset

Modeling Approach

Representing ontologies and instance data W3C standards –Resource Description Framework (RDF) Language for representing information about resources Resources are identified by Uniform Resource Identifiers (URIs) – globally-unique Common framework for expressing information allows exchange and reuse without loss of meaning Graph-based data model Relationships are first class objects

rdfs:Class lsdis:Politician lsdis:Speech rdf:Property rdfs:Literal lsdis:give s lsdis:nam e rdfs:domain rdfs:range lsdis:Politician_12 3 lsdis:Speech_456 “Hillary Clinton” lsdis:gives name lsdis:Person rdf:type rdfs:subClassOf statement Defining Properties (domain and range):. Subject Predicate Object Statement (triple):. Subject Predicate Object Defining Class/Property Hierarchies:. Subject Predicate Object Directed Labeled Graph Statement (triple): “Hillary Clinton”. Subject Predicate Object Defining Classes:. Subject Predicate Object Defining Properties:. Subject Predicate Object RDF Model

Incorporation of Temporal Information Use Temporal RDF Graphs defined by Gutiérrez, et al 1 Considers time as a discrete, linearly-ordered domain Associate time intervals with statements which represent the valid-time of the statement –Essentially a quad instead of a triple 1. Claudio Gutiérrez, Carlos A. Hurtado, Alejandro A. Vaisman. “Temporal RDF”. ESWC 2005:

Example Temporal Graph: Platoon Membership E1:Soldier E3:Platoon E5:Soldier E2:Platoon E4:Soldier assigned_to [1, 10] assigned_to [11, 20] assigned_to [5, 15]

Temporal RDF Serialization C A B XMLSchema Datetime temporal CalendarClockInterval Instant inCalendarClockDataType object predicate subject Statement begins ends RDF Reification OWLTime NEW

Example Domain Ontology 1.Matthew Perry, Farshad Hakimpour, Amit Sheth. "Analyzing Theme, Space and Time: An Ontology-based Approach", 14th Intl Symp. on Advances in Geographic Information Systems (ACM-GIS '06), Arlington, VA, November , 2006

E1:Soldier E2:Soldier E3:Soldier E5:Battle E4:Address E6:Address E7:Battle occurred_at located_at lives_at assigned_to E8:Military_Unit assigned_to participates_in Georeferenced Coordinate Space (Spatial Regions) Dynamic EntitiesSpatial OccurrentsNamed Places Thematic Contexts Linking Non-Spatial Entities to Spatial Entities Residency Battle Participation E1:Soldier

Querying Approach

20 Graph Patterns

Query Operators Two spatial operators –spatial_extent() Find all soldiers participating in military events that take place within an input bounding box –spatial_eval() Find all soldiers with symptoms indicative of exposure to chemical X that fought in battles within 2 miles of sightings of enemy group Y Two temporal operators –temporal_extent() Return all soldiers exhibiting a given symptom during a specific time period –temporal_eval() Find all soldiers who exhibited symptoms after participating in a given military event

Implementation SQL-based Approach –Created user-defined functions in Oracle DBMS –Create procedures for spatial and temporal indexing Existing Oracle Technologies –Semantic Technology Component Storage Structures for RDF(S) Data RDFS Inference Procedures SQL-based Querying –Spatial Component Spatial Types – SDO_GEOMETRY –Implementation of Spatial_Region Spatial Indexing (R-Tree) Spatial Operators

SQL-based Querying Approach SQL Table Functions SELECT X, Y FROM TABLE (Table_Func(…)); X Y Z a b c d e f … … …

Spatial Operators (spatial_extent) spatial_extent (graphPattern VARCHAR, spatialVar VARCHAR, ontology RDFModels,, ) returns AnyDataSet; SELECT x FROM TABLE (spatial_extent( '(?x ?y)(?y ?z) (?z ?l)', 'l', SDO_RDF_Models('military'), SDO_GEOMETRY(2001, 8265, SDO_POINT_TYPE( , , NULL), NULL, NULL), 'GEO_DISTANCE(distance=45 unit=mile)'))); Select all soldiers on the crew of a vehicle used in a military event that occurred within 45 miles of a given point

Spatial Operators (spatial_eval) spatial_eval (graphPattern VARCHAR, spatialVar VARCHAR, graphPattern2 VARCHAR, spatialVar2 VARCHAR, spatialRelation VARCHAR, ontology RDFModels) return AnyDataSet; SELECT b FROM TABLE (spatial_eval( '( ?z) (?z ?l)', 'l', '(?b ?c) (?c ?d)', 'd', 'GEO_RELATE(mask=overlap)', SDO_RDF_Models('military'))); Select all platoons with a training area that overlaps the training area of Platoon_12996

Temporal Properties of Context Instances Platoon#456Soldier#789Soldier#123 assigned_to:[3, 12] assigned_to:[6, 20] Temporal Intersection [6, 12] Temporal Range [3, 20]

Temporal Operators (temporal_extent) temporal_extent (graphPattern VARCHAR, intervalType VARCHAR, ontology RDFModels,,, ) return AnyDataSet; SELECT x, a FROM TABLE (temporal_extent( '(?x ?y) (?y ?z) (?x ?a)', 'INTERSECT' SDO_RDF_Models('military'), to_date(' :26:01', 'yyyy-mm-dd hh24:mi:ss'), to_date(' :22:00', 'yyyy-mm-dd hh24:mi:ss'), 'DURING')); Select all soldiers (and their platoons) on the crew of a vehicle actively used during the time period [10/04/1942, 9/21/1944]

Temporal Operators (temporal_eval) temporal_eval (graphPattern VARCHAR, intervalType VARCHAR, graphPattern2 VARCHAR, intervalType2 VARCHAR, temporalRel VARCHAR, ontology RDFModels) return AnyDataSet; SELECT x, a FROM TABLE (temporal_eval( '(?x ?y) (?y ?z) (?z )', 'INTERSECT', '(?a ?b) (?b ?c) (?c )', 'INTERSECT', 'ANYINTERACT', SDO_RDF_Models('military'))); Find all pairs of soldiers such that one soldier was leader of a platoon in Division_2186 and the other soldier was leader of a platoon in Division_2191 during the same time period

Storage Scheme Spatial Indexing Procedure Temporal Indexing Procedure Load RDF Data with Oracle

Temporal Inferencing RDFS Inferencing Rules 1.(x, rdf:type, y) AND (y, rdfs:subClassOf, z)  (x, rdf:type, z) 2.(x, p, y) AND (p, rdfs:domain, a), (p, rdfs:range, b)  (x, rdf:type, a), (y, rdf:type, b) 3.(x, p, y) AND (p, rdfs:subPropertyOf, z)  (x, z, y) Example: (x, participates_in, e):[2, 5] (y, participates_in, e):[3, 7] (z, participates_in, e):[6, 9] By rule 2: (e, rdf:type, event):[2, 9]

Temporal Inferencing Algorithm 1.create table asserted_temporal_triples (subj, prop, obj, start_date, end_date) 2.perform schema-level inferencing 3.perform instance-level inferencing 4.sort redundant_triples by (subj_id, prop_id, obj_id, start) 5.make a single pass and merge overlapping intervals for same statement 6.insert updated triples and intervals into final temporal_triples table asserted_temporal_triples (x, participates_in, e):[2, 5] (y, participates_in, e):[3, 7] (z, participates_in, e):[6, 9] redundant_triples (e, rdf:type, event):[2, 5] (e, rdf:type, event):[3, 7] (e, rdf:type, event):[6, 9] temporal_triples (e, rdf:type, event):[2, 9]

Temporal Inferencing Algorithm 1.create table asserted_temporal_triples (subj, prop, obj, start_date, end_date) 2.perform schema-level inferencing 3.perform instance-level inferencing 4.sort redundant_triples by (subj_id, prop_id, obj_id, start) 5.make a single pass and merge overlapping intervals for same statement 6.insert updated triples and intervals into final temporal_triples table

Function Implementation Oracle Extensibility Framework ODCITable Interface Start() Fetch() Close()

Spatial Function Implementation 1.Start() 1.Translate graph pattern into multi-way join over temporal_triples table 2.Add sdo_relate or sdo_within_distance predicate to reflect spatial relation parameter 2.Fetch() 1.Retrieve rows until all are returned 3.Close() 1.Close cursor for query For spatial_eval: Nested Loop Join Strategy outer spatial_extent() inner filtered spatial_extent()

Temporal Function Implementation 1.Start() 1.Translate graph pattern into multi-way join over temporal_triples table 2.Push temporal filtering down as much as possible e.g., RANGE during [x, y]  all statements start >= x AND end <= y 2.Fetch() 1.Retrieve an intermediate result row 2.Construct INT/RANGE interval for result row 3.Apply temporal filtering condition and return matching rows 3.Close() 1.Close cursor for query For temporal_eval: Nested Loop Join Strategy outer temporal_extent() inner filtered temporal_extent()

Evaluation

Environment –Oracle 10g R2 on RedHat EL –Dual Xeon 3.0 GHz, 2GB RAM –512 MB buffer cache Objective –Test scalability w.r.t (1) dataset size, (2) query complexity Dataset –Thematic: Synthetic RDF Graph (Military Schema) Generated with TOntoGen 1 100k triples1.6 M triples15 M triples –Spatial: US Census Block Group boundary polygons 8739,35283,236 –Temporal: start and end times selected with uniform probability from 2 overlapping date ranges 1.Matthew Perry "TOntoGen: A Synthetic Data Set Generator for Semantic Web Applications", AIS SIGSEMIS Bulletin Volume 2 Issue 2 (April - June) 2005, pp

Scalability w.r.t Dataset Size All times reported are average of 15 trials using a warm cache

Scalability w.r.t. Graph Pattern Size Times reported for 15 million triple dataset

Conclusions and Future Work Conclusions –Approach for realizing spatial and temporal queries over SW data –Motivated by lack of support in current semantic analytics tools –Good scalability for large synthetic dataset Future Work –SPARQL query language extensions