Streaming Semantic Data COMP6215 Semantic Web Technologies Dr Nicholas Gibbins – 2014-2015.

Slides:



Advertisements
Similar presentations
Dr. Leo Obrst MITRE Information Semantics Information Discovery & Understanding Command & Control Center February 6, 2014February 6, 2014February 6, 2014.
Advertisements

1 11. Streaming Data Management Chapter 18 Current Issues: Streaming Data and Cloud Computing The 3rd edition of the textbook.
Kien A. Hua Division of Computer Science University of Central Florida.
Maintaining Variance over Data Stream Windows Brian Babcock, Mayur Datar, Rajeev Motwani, Liadan O ’ Callaghan, Stanford University ACM Symp. on Principles.
Data Stream Computation Lecture Notes in COMP 9314 modified from those by Nikos Koudas (Toronto U), Divesh Srivastava (AT & T), and S. Muthukrishnan (Rutgers)
A Virtual Organisation for e-Learning Nicola Capuano, Pierre Carrolaggi, Jerome Combaz, Fabio Crestani, Matteo Gaeta, Erich Herber, Enver Sangineto, Krassen.
1 Continuous Queries over Data Streams Vitaly Kroivets, Lyan Marina Presentation for The Seminar on Database and Internet The Hebrew University of Jerusalem,
FIN 685: Risk Management Topic 5: Simulation Larry Schrenk, Instructor.
CSCI 572 Project Presentation Mohsen Taheriyan Semantic Search on FOAF profiles.
Incremental Materialization of RDF Graph Closures for Stream Reasoning Alexandre Mello Ferreira (PhD student) 22/11/2010.
Progress Report on Continuous Data Stream Management  Mining Frequent Itemsets over Data Streams  Music Virtual Channel Presented by: Dr. Yi-Hung Wu.
An Abstract Semantics and Concrete Language for Continuous Queries over Streams and Relations Presenter: Liyan Zhang Presentation of ICS
1 Murali Mani Topics projects in databases and web applications and XML Database Systems Research Lab @cs.wpi.eduWebpages:
Aurora Proponent Team Wei, Mingrui Liu, Mo Rebuttal Team Joshua M Lee Raghavan, Venkatesh.
1 Elke A. Rundensteiner Topics projects in database and Information systems, such as, web information systems, distributed databases, Etc. Database Systems.
1 Stream-based Data Management IS698 Min Song 2 Characteristics of Data Streams  Data Streams Data streams — continuous, ordered, changing, fast, huge.
Shared Ontology for Knowledge Management Atanas Kiryakov, Borislav Popov, Ilian Kitchukov, and Krasimir Angelov Meher Shaikh.
CS538: Advanced Topics in Information Systems. 2 Secure Location transparency Consistent Real-Time Available Black Box: Distributed Storage [GMM] ? Data.
Dunja Mladenić Marko Grobelnik Jožef Stefan Institute, Slovenia.
1 Murali Mani Topics projects in databases and web applications and XML Database Systems Research Lab @cs.wpi.eduWebpages:
1 PODS 2002 Motivation. 2 PODS 2002 Data Streams data sets Traditional DBMS – data stored in finite, persistent data sets data streams New Applications.
Knowledge Representation Reading: Chapter
Probabilistic Databases Amol Deshpande, University of Maryland.
Triple Stores.
Managing Large RDF Graphs (Infinite Graph) Vaibhav Khadilkar Department of Computer Science, The University of Texas at Dallas FEARLESS engineering.
1 Performance Evaluation of Computer Networks: Part II Objectives r Simulation Modeling r Classification of Simulation Modeling r Discrete-Event Simulation.
Event-Condition-Action Rule Languages over Semistructured Data George Papamarkos.
Knowledge based Learning Experience Management on the Semantic Web Feng (Barry) TAO, Hugh Davis Learning Society Lab University of Southampton.
PLATFORM INDEPENDENT SOFTWARE DEVELOPMENT MONITORING Mária Bieliková, Karol Rástočný, Eduard Kuric, et. al.
Moving the RFID Value Chain Value Proposition Cost and Complexity What is it? (passive RFID) Where is it? (active RFID) How is it? (Sensors) Adapt to it.
Database Support for Semantic Web Masoud Taghinezhad Omran Sharif University of Technology Computer Engineering Department Fall.
System Specification Specify system goals Develop scenarios Define functionalities Describe interface between the agent system and the environment.
Semantic Web Applications GoodRelations BBC Artists BBC World Cup 2010 Website Emma Nherera.
Data Stream Systems Reynold Cheng 12 th July, 2002 Based on slides by B. Babcock et.al, “Models and Issues in Data Stream Systems”, PODS’02.
John Plummer Technical Specialist Data Platform Microsoft Ltd StreamInsight Complex Event Processing (CEP) Platform.
Requirements Documentation CSCI 5801: Software Engineering.
Ontology Summit 2015 Track C Report-back Summit Synthesis Session 1, 19 Feb 2015.
Metadata. Generally speaking, metadata are data and information that describe and model data and information For example, a database schema is the metadata.
Secure Sensor Data/Information Management and Mining Bhavani Thuraisingham The University of Texas at Dallas October 2005.
Page 1 Alliver™ Page 2 Scenario Users Contents Properties Contexts Tags Users Context Listener Set of contents Service Reasoner GPS Navigator.
1 Approximating Quantiles over Sliding Windows Srimathi Harinarayanan CMPS 565.
Chapter 1 Introduction to Databases. 1-2 Chapter Outline   Common uses of database systems   Meaning of basic terms   Database Applications  
MyActivity: A Cloud-Hosted Ontology-Based Framework for Human Activity Querying Amin BakhshandehAbkear Supervisor:
Oracle Database 11g Semantics Overview Xavier Lopez, Ph.D., Dir. Of Product Mgt., Spatial & Semantic Technologies Souripriya Das, Ph.D., Consultant Member.
Adaptive Query Processing in Data Stream Systems Paper written by Shivnath Babu Kamesh Munagala, Rajeev Motwani, Jennifer Widom stanfordstreamdatamanager.
Data Stream Management Systems
Aum Sai Ram Security for Stream Data Modified from slides created by Sujan Pakala.
Ontology-driven complex event processing for real time algal bloom detection AOW Dec 2011 Jonathan Yu Kerry Taylor and Brad Sherman.
Website: Answering Continuous Queries Using Views Over Data Streams Alasdair J G Gray Werner.
MyGrid/Taverna Provenance Daniele Turi University of Manchester OMII f2f Meeting, London, 19-20/4/06.
A Data Stream Publish/Subscribe Architecture with Self-adapting Queries Alasdair J G Gray and Werner Nutt School of Mathematical and Computer Sciences,
Data Mining: Concepts and Techniques Mining data streams
Indexing Correlated Probabilistic Databases Bhargav Kanagal, Amol Deshpande University of Maryland, College Park, USA SIGMOD Presented.
BMQ-Index: Shared and Incremental Processing of Border Monitoring Queries over Data Streams Jinwon Lee Y. Lee, S. Kang, S. Lee, H. Jin, B. Kim and J. Song.
Chapter 2 Database Environment.
Chapter 9: Web Services and Databases Title: NiagaraCQ: A Scalable Continuous Query System for Internet Databases Authors: Jianjun Chen, David J. DeWitt,
IBM Research: Software Technology © 2005 IBM Corporation Programming Technologies 1 Temporal Rules Vijay Saraswat IBM TJ Watson July 27, 2012.
Versatile Information Systems, Inc International Semantic Web Conference An Application of Semantic Web Technologies to Situation.
WP3: Data Provenance and Access Control Irini Fundulaki, FORTH December 11-12, 2012, Luxembourg.
Stream Reasoning with Linked Data Open Data Open Day 2013 Sina Samangooei, Nick Gibbins 26 June 2013.
Chapter 8: Web Analytics, Web Mining, and Social Analytics
Data Streams COMP3017 Advanced Databases Dr Nicholas Gibbins –
Mining Data Streams (Part 1)
Big Data Infrastructure
COMP3211 Advanced Databases
Dieter Gawlick, Oracle October, 2005 (GGF15 in Boston)
Semantic Event-based Service Oriented Architecture
Online Analytical Processing Stream Data: Is It Feasible?
An Analysis of Stream Processing Languages
A framework for ontology Learning FROM Big Data
Presentation transcript:

Streaming Semantic Data COMP6215 Semantic Web Technologies Dr Nicholas Gibbins –

From Knowledge Bases to Semantic Streams 2 Common Semantic Web assumptions: –persistent data storage –relatively static data –complex one-off queries –expressive reasoning

From Knowledge Bases to Semantic Streams 3 Some applications have very different requirements: –data arrives in real-time –too much data to store! –data never stops coming –ongoing lightweight analysis of rapidly changing data

Example Application: MIDAS 4

Application Domains 5 Network monitoring and traffic engineering Sensor networks, RFID tags Telecommunications call records Financial applications Web logs and click-streams Manufacturing processes

Data Streams Relatively recent development in the database community A (potentially unbounded) sequence of tuples (triples) Transactional data streams: log interactions between entities – Credit card: purchases by consumers from merchants – Telecommunications: phone calls by callers to dialed parties Measurement data streams: monitor evolution of entity states – Sensor networks: physical phenomena, road traffic – IP network: traffic at router interfaces 6

One-Time versus Continuous Queries One-time queries Run once to completion over the current data set Continuous queries Issued once and then continuously evaluated over a data stream –“Notify me when the temperature drops below X” –“Tell me when prices of stock Y > 300” 7

Conventional Triplestore 8 SPARQL Engine stored triples query results

Semantic Stream Management System 9 query processor continuous query stream of results stream of triples

Stream Reasoning 10 Continuous queries generate new results as new data is added Continuous reasoning generates new entailments as new data is added –What can we infer from the new data? –What is no longer true?

Conventional Reasoner 11 Reasoner stored triples triples entailed triples

Stream Reasoner 12 query processor ontologies stream of triples stream of entailed triples

Stream Reasoning Systems 13 C-SPARQL SPARQL stream CQELS INSTANS ETALIS Sparkwave...

Query Processing 14

Example: C-SPARQL 15 Queries produce/refer to relations and streams Based on SPARQL, with the addition of: –Streams as new data type (c.f. graphs) –Windows on streams –Registration of streams

Windows 16 Mechanism for extracting a finite set of triples from an infinite stream Various approaches: –Windows based on ordering attribute (e.g. last 5 minutes of tuples) –Windows based on tuple counts (e.g. last 1000 tuples)

Window Behaviour 17 data stream time t1t1 t2t2 t3t3 t4t4 t -4 windows t0t0 t -1 t -2 t -3

Example C-SPARQL Query 18 SELECT DISTINCT ?topic FROM STREAM [RANGE 15m STEP 1m] WHERE { ?user sd:accesses ?document. ?user foaf:knows ?john. ?john foaf:name "John". ?document t:describes ?topic. ?topic skos:subject yago:Movies. }

Example C-SPARQL Query 19 REGISTER STREAM MoviesJohnsFriendsLike COMPUTED EVERY 5m AS CONSTRUCT { ?user sd:likes ?document } FROM STREAM [RANGE 30m STEP 5m] WHERE { ?user sd:likes ?document. ?user foaf:knows ?john. ?john foat:name "John". ?document sd:describes ?topic. ?topic skos:subject yago:Movies. }

Example C-SPARQL Query 20 REGISTER QUERY GlobalCountOfInteractions COMPUTED EVERY 5m AS SELECT ?user COUNT (?document) as ?numberOfMovies FROM STREAM [RANGE 30m STEP 5m] WHERE { ?user sd:likes ?document } GROUP BY { ?user }

Scalability and Completeness 21 SPARQL deals with finite graphs –query evaluation should produce all results for a given query C-SPARQL deals with unbounded data streams –may not be possible to return all results for a given query –trade-off between resource use and completeness of result set –size of buffers used for windows is one example of a parameter that affects resource use and completeness –can further reduce resource use by randomly sampling from streams