Knowledge Streams: Stream Processing of Semantic Web Content
Mike Dean, Principal Engineer, Raytheon BBN Technologies



Assumptions
- Technology - Intermediate
  - Familiarity with RDF and OWL
- Interest in
  - Stream processing
  - Scalability

Presenter Background
- Principal Engineer at Raytheon BBN Technologies (1984-present)
- Principal Investigator for DARPA Agent Markup Language (DAML) Integration and Transition ( )
  - Chaired the Joint US/EU Committee that developed DAML+OIL and SWRL
- Developer and/or Principal Investigator for many Semantic Web tools, datasets, and applications (2000-present)
- Member of the W3C RDF Core, Web Ontology, and Rule Interchange Format Working Groups
  - Co-editor of the W3C OWL Reference
- Local co-chair for ISWC2009
- Other SemTech presentations
  - Semantic Query: Solving the Needs of a Net-Centric Data Sharing Environment (2007, w/ Matt Fisher)
  - Semantic Queries and Mediation in a RESTful Architecture (2008, w/ John Gilman and Matt Fisher)
  - Use of SWRL for Ontology Translation (2008)
  - Semantic BBN: Application to the Digital Whitewater Challenge (2009, w/ John Hebeler)
  - How is the Semantic Web Being Used? An Analysis of the Billion Triples Challenge Corpus (2009)
  - Finding a Good Ontology: The Open Ontology Repository Initiative (2010, w/ Peter Yim and Todd Schneider)

Outline
- Motivation
- Vision
- Building Blocks
- Demonstration

Motivations
- Timeliness
- Performance

Timeliness
- Streaming minimizes latency
  - Processing elements see events as they occur
  - Resources are expended only when an event occurs
- This is in contrast to polling
  - Latency averages half the polling interval
  - Resources are expended on every poll
  - Popular web syndication mechanisms such as RSS and Atom involve polling
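
The "half the polling interval" figure can be checked with a small simulation. This is a minimal sketch, assuming events occur at uniformly random times and polls happen at exact multiples of the interval; the function name and parameters are illustrative and not from the talk.

```python
import random


def mean_polling_latency(poll_interval: float,
                         n_events: int = 100_000,
                         seed: int = 0) -> float:
    """Estimate the average delay between an event and the next poll."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_events):
        # Assumption: each event occurs at a uniformly random time.
        t = rng.uniform(0.0, 1000.0 * poll_interval)
        # The event is first seen at the next multiple of the interval.
        next_poll = (t // poll_interval + 1) * poll_interval
        total += next_poll - t
    return total / n_events


if __name__ == "__main__":
    # With a 60-second interval the estimate should be close to 30 seconds.
    print(mean_polling_latency(60.0))
```

With streaming, the corresponding delay is only the delivery and processing time of the event itself.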

Performance
- Many Semantic Web tools provide streaming parsers rather than, or in addition to, model access
  - Analogous to XML SAX vs. DOM
- For suitable applications, this can be 10x faster than loading all statements into memory or a KB
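
To illustrate the streaming style of access, here is a minimal sketch that counts predicates while reading N-Triples one line at a time, without building an in-memory model. The regular expression is a deliberately simplified stand-in for a real parser (it does not validate terms), and all names and sample data are hypothetical.

```python
import io
import re
from collections import Counter
from typing import Iterable, Iterator, Tuple

# Simplified N-Triples pattern: subject, predicate, then everything up to
# the final dot as the object. Not a full or validating parser.
TRIPLE_RE = re.compile(r'^\s*(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(.+?)\s*\.\s*$')


def stream_triples(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield one (subject, predicate, object) tuple per input line."""
    for line in lines:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE_RE.match(line)
        if match:
            yield match.group(1), match.group(2), match.group(3)


def count_predicates(lines: Iterable[str]) -> Counter:
    """Aggregate over the stream; only the counters are kept in memory."""
    counts: Counter = Counter()
    for _s, p, _o in stream_triples(lines):
        counts[p] += 1
    return counts


SAMPLE = '''<http://example.org/a> <http://example.org/knows> <http://example.org/b> .
<http://example.org/a> <http://example.org/name> "Alice" .
<http://example.org/b> <http://example.org/knows> <http://example.org/a> .
'''

if __name__ == "__main__":
    for predicate, n in count_predicates(io.StringIO(SAMPLE)).items():
        print(predicate, n)
```

Because `stream_triples` is a generator, the same function works on a file object of any size.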

2 Streaming Stories
- dumpont of OpenCyc (circa 2003)
  - HTML-based ontology visualization tool periodically bogged down the daml.org server
  - Reimplementation using the event-based Jena ARP parser yielded 10x performance and scalability improvements
- Billion Triples Challenge 2009
  - Streaming analysis of the 2009 corpus was performed at an overall rate of 103K statements/sec on a Mac laptop with a portable external disk
  - Compare to loading 10-20K statements/second on a server

Stream Processing Examples
- Unix pipes
- Dataflow architectures
- Streambase
- IBM System S/InfoSphere Streams

Vision: Knowledge Streams
[Figure: data sources feed a network of distribution and processing elements that delivers to users.
 Data sources: CEP, NLP, sensor network, imagery, RSS, IM, gazetteer, sensor, Semantic Web, database.
 Distribution and processing elements: aggregation, persistent queries, augmentation, context filter, alerts, correlation, translation, inference, distribution, archive.
 Users: User 1, User 2, User 3, Community of Interest 1, Community of Interest 2.]
- Persistent pipelines
  - Streams of statements comprising object subgraphs
  - URI naming allows drill-down
  - Provenance, timestamps
- Processing elements
  - Consume and produce subgraphs
  - Multiple functions may be combined
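
The idea of processing elements that consume and produce subgraphs can be sketched as composable generators. This is only an illustration under stated assumptions: the `Subgraph` class, the `context_filter`, `augment`, and `pipeline` helpers, and the `ex:` identifiers are all hypothetical and are not part of the talk or of any existing framework.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, Iterator, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Subgraph:
    """A small set of statements plus provenance and a timestamp."""
    triples: FrozenSet[Triple]
    source: str       # provenance: URI of the producing element
    timestamp: float  # seconds since the epoch

Element = Callable[[Iterable[Subgraph]], Iterator[Subgraph]]


def context_filter(predicate: str) -> Element:
    """Pass only subgraphs that contain a statement with this predicate."""
    def run(stream: Iterable[Subgraph]) -> Iterator[Subgraph]:
        for g in stream:
            if any(p == predicate for _s, p, _o in g.triples):
                yield g
    return run


def augment(extra: Callable[[Subgraph], Iterable[Triple]]) -> Element:
    """Add statements to each subgraph, keeping provenance and timestamp."""
    def run(stream: Iterable[Subgraph]) -> Iterator[Subgraph]:
        for g in stream:
            yield Subgraph(g.triples | frozenset(extra(g)), g.source, g.timestamp)
    return run


def pipeline(*elements: Element) -> Element:
    """Combine several elements into one, applied left to right."""
    def run(stream: Iterable[Subgraph]) -> Iterator[Subgraph]:
        for element in elements:
            stream = element(stream)
        return iter(stream)
    return run


EVENTS = [
    Subgraph(frozenset({("ex:img1", "ex:covers", "ex:Boston")}), "ex:sensorA", 1.0),
    Subgraph(frozenset({("ex:msg1", "ex:mentions", "ex:Paris")}), "ex:feedB", 2.0),
]


def tag(g: Subgraph) -> Iterable[Triple]:
    # Hypothetical augmentation: mark every subject as seen by this pipeline.
    return [(s, "ex:seenBy", "ex:demoPipeline") for s, _p, _o in g.triples]


if __name__ == "__main__":
    for result in pipeline(context_filter("ex:covers"), augment(tag))(EVENTS):
        print(result.source, sorted(result.triples))
```

Each element sees one subgraph at a time, so nothing in the chain needs the whole stream in memory.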

Goals
- Web-scale
  - Decentralized among multiple sites
  - Heterogeneous implementations
- Long-lived, persistent connections
  - User accountability
- Introspection over the processing network for control and optimization
  - E.g. aggregating subscriptions
  - Balance with security, privacy, and autonomy concerns

Building Blocks
- RDF Content
- Existing stream processing frameworks
- Workflow systems
- Publish/subscribe message oriented middleware

RDF Payloads
- Malleable data
  - Standards-based graph structure
  - Can easily add, remove, and transform statements
- Self-describing
  - Unique naming via URIs
  - References to vocabularies and ontologies
- Potential for inference
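
As a small illustration of why a graph of statements is malleable, the sketch below treats a payload as a plain Python set of (subject, predicate, object) tuples. The `ex:` names and the `rename_predicate` helper are hypothetical; a real system would use an RDF library rather than bare tuples.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def rename_predicate(graph: Set[Triple], old: str, new: str) -> Set[Triple]:
    """Transform statements by replacing one predicate with another."""
    return {(s, new if p == old else p, o) for s, p, o in graph}


graph: Set[Triple] = {
    ("ex:img1", "rdf:type", "ex:SatelliteImage"),
    ("ex:img1", "ex:takenAt", "2011-06-07T12:00:00Z"),
}

# Add a statement.
graph.add(("ex:img1", "ex:covers", "ex:Boston"))
# Remove a statement.
graph.discard(("ex:img1", "ex:takenAt", "2011-06-07T12:00:00Z"))
# Transform statements: map one vocabulary term to another.
graph = rename_predicate(graph, "ex:covers", "ex:depicts")

if __name__ == "__main__":
    for triple in sorted(graph):
        print(triple)
```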

Workflow Systems
- Graphical environments for developing processing pipelines
  - Yahoo Pipes, DERI Pipes, SPARQLMotion
  - Nice user interfaces for development and execution

Semantic Complex Event Processing
- Complex Event Processing
  - One of the leading edges of rules technology
  - Formal specification of higher-level events in terms of lower-level events
    - E.g. alert if the moving average increases 15% within a 10 minute window
  - Engine can be compiled/optimized for a specific rule set
  - High-volume deployments in finance and other industries
  - Most implementations focus on self-contained tuples
- Semantic Complex Event Processing
  - Enrich CEP using Semantic Web technology
  - Emerging topic at recent conferences
- Early implementations
  - Wrappers around open source CEP engines
  - Native implementation
- Provides a powerful set of operators and engines for Knowledge Streams
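
The example rule (alert if the moving average increases 15% within a 10 minute window) could be hand-coded roughly as follows. This is a sketch under assumptions the slide does not specify: the moving average is taken over the last `avg_size` readings, and "increase" is measured against the lowest moving average seen in the preceding window. A CEP engine would express this declaratively instead; the function and parameter names are hypothetical.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple


def rising_average_alerts(events: Iterable[Tuple[float, float]],
                          avg_size: int = 3,
                          window_seconds: float = 600.0,
                          threshold: float = 0.15) -> Iterator[Tuple[float, float]]:
    """Yield (timestamp, moving_average) for each event that triggers an alert.

    events: (timestamp_in_seconds, value) pairs in time order.
    """
    recent: Deque[float] = deque(maxlen=avg_size)   # values in the average
    averages: Deque[Tuple[float, float]] = deque()  # (timestamp, average) history
    for ts, value in events:
        recent.append(value)
        avg = sum(recent) / len(recent)
        # Forget averages that fall outside the time window.
        while averages and ts - averages[0][0] > window_seconds:
            averages.popleft()
        # Assumption: compare with the lowest average still in the window.
        baseline = min((a for _t, a in averages), default=None)
        if baseline is not None and baseline > 0 and avg >= baseline * (1 + threshold):
            yield ts, avg
        averages.append((ts, avg))


if __name__ == "__main__":
    readings = [(0, 100), (60, 100), (120, 100), (180, 130), (240, 160)]
    for alert in rising_average_alerts(readings):
        print(alert)
```

The low-level events here are self-contained tuples, which is the common CEP case noted above; a semantic variant would carry RDF subgraphs instead.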

Implementation Approach
- Well-defined APIs for implementing operators
- Operator execution containers
  - Could encapsulate existing engines
- Start with manual processing network configuration, then automate

Use Cases
- Dissemination of metadata for new satellite imagery
- Social network changes
- Alerting of friends’ new publications
- …

Demo
- Processing using DERI Pipes with new operators
  - Ingest of #SemTechBiz tweets using Twitter Streaming API
  - Conversion of JSON to RDF
  - Mapping to SIOC vocabulary using SWRL rules
  - Enrich by matching with contacts
  - Persistent buffering using Java Message Service
  - Monitoring
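
The JSON-to-RDF and SIOC mapping steps can be illustrated with a short sketch. Note that the demo performed the mapping with SWRL rules inside DERI Pipes; the code below instead hard-codes a mapping in Python purely for illustration. The tweet and account URI patterns, the choice of SIOC and Dublin Core terms, and the sample tweet are assumptions, not taken from the demo.

```python
import json
from typing import Iterator, Tuple

SIOC = "http://rdfs.org/sioc/ns#"
DCTERMS = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(text: str) -> str:
    """Quote a string as a plain literal with minimal escaping."""
    escaped = (text.replace("\\", "\\\\").replace('"', '\\"')
                   .replace("\n", "\\n").replace("\r", "\\r"))
    return f'"{escaped}"'


def tweet_to_triples(raw: str) -> Iterator[Tuple[str, str, str]]:
    """Map one tweet (JSON text) to SIOC statements in N-Triples term syntax."""
    tweet = json.loads(raw)
    user = tweet["user"]["screen_name"]
    # Assumption: these URI patterns are illustrative only.
    account = f"<http://twitter.com/{user}>"
    post = f"<http://twitter.com/{user}/status/{tweet['id_str']}>"
    yield post, f"<{RDF_TYPE}>", f"<{SIOC}Post>"
    yield post, f"<{SIOC}content>", literal(tweet["text"])
    yield post, f"<{SIOC}has_creator>", account
    yield post, f"<{DCTERMS}created>", literal(tweet["created_at"])
    yield account, f"<{RDF_TYPE}>", f"<{SIOC}UserAccount>"


# Made-up sample payload with a few fields of a tweet object.
SAMPLE_TWEET = json.dumps({
    "id_str": "12345",
    "text": 'Listening to "Knowledge Streams" #SemTechBiz',
    "created_at": "Tue Jun 07 18:00:00 +0000 2011",
    "user": {"screen_name": "example_user"},
})

if __name__ == "__main__":
    for s, p, o in tweet_to_triples(SAMPLE_TWEET):
        print(s, p, o, ".")
```

The resulting statements could then be enriched (for example by matching the account against known contacts) and handed to a message queue for persistent buffering.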