Information Integration Intelligence with Semantic Technology Ontolog Forum 2008-01-24 Holger Knublauch


Information Integration Intelligence with Semantic Technology Ontolog Forum Holger Knublauch

2 About Myself Computer Scientist (PhD, 2002) Post-Doc at Stanford –Lead developer of Protégé-OWL 2006-now TopQuadrant, Inc. –VP, Product Development –Lead developer of TopBraid Suite

3 About TopQuadrant Headquarters: Alexandria, VA Office: Mountain View, CA Also: TopQuadrant Korea

4 TopBraid Suite

5 Information Integration Intelligence Heterogeneous data and schemas –Databases –Spreadsheet files –XML files –Newsfeeds –Online resources (HTML, GRDDL, RDFa) –Web services and data endpoints How to get integrated views to support business intelligence?

6 Why Semantic Technology Class and property definitions (RDFS) Open architecture (URIs, triples, etc) Designed for linking (sameAs etc) Schema reuse (subClassOf etc) Explicit definitions of “semantics” (DL) Self-describing data (generic tools, discovery, schema evolution) Cross-schema querying (SPARQL)

7 Semantic Technology Examples (1) Major retailer with an established name in Housewares, Lawn and Garden, Automotive and other products. Can we give our shoppers an integrated way to deal with warranties, service records, proofs of purchase, etc. for all our product lines? An "Orbitz of Housewares". But they have hundreds of product lines, and new ones every day. How can they do this at this scale? TopBraid / Semantic Technologies provides seamless integration of many and varied product lines. Customers come to this retailer instead of a competitor to get integrated support of new appliances with old.

8 Semantic Technology Examples (2) Consumer Electronics. Marketing and distributing information about products. Consumer electronics is notorious for new product categories with new features (game boxes? entertainment centers? HDTV? DVR?) and compatibility dependencies. How do we present our customer base with a seamless integrated picture of all possible products and how they combine, in the face of such a large set of product lines, with changing requirements? TopBraid / Semantic Technologies provides a flexible, extensible way to manage multiple products seamlessly.

9 Semantic Technology Examples (3) Health care solution built by CTG. Health care providers as well as patients require a seamless, integrated view of all health care information and services: –tests –available drugs –insurance information –clinic availability, etc. Information is available for these things, but cannot be managed in a single seamless way. CTG is using TopBraid to create a seamless health care dashboard.

10 Semantic Technology Examples (4) The NASA Constellation project requires integration of information from an astonishingly wide variety of sources and disciplines (hydraulics, electronics, mechanics, avionics, aerodynamics...). In the design stage, any particular simulation (testing or evaluating design alternatives for a space system) will require a seamless view of a component from any number of perspectives. Even within a single discipline, different groups have information that contributes to a decision. Considering the operations and longevity requirements: the Constellation project creates data that will be used 30 years into the future. Think about the form of data 30 years ago (a lesson learned with the Space Shuttle, in which line drawings for designs had to be consulted 25 years later). The information architecture has to be flexible enough to withstand the passage of all those years. NASA is using TopBraid / Semantic Technologies to make flexible, future-proof data systems to take a person to Mars.

11 Structure of this Talk Import (Spreadsheets, DBs, XML) Processing (Editing, querying, transforming) Export (Converting, browsing, visualizing) SPARQLMotion (Scripting Language)

12 TopBraid Import Features

13 Spreadsheet Import in TopBraid In practice a lot (!) of useful data resides in spreadsheets Excel Spreadsheets can be quite sophisticated (programs on their own) TopBraid has two importing options –Excel files, each cell becomes an instance –Text files, each row becomes an instance

14 Excel Import in TopBraid Sometimes, spreadsheets are not just single tables Each cell may have a distinct meaning Information about cell position must be preserved

15 TopBraid Spreadsheet Ontology Classes Example Instance

16 Spreadsheet Import Input: Tab-separated text files Table is interpreted as class Columns can be mapped into properties Rows become instances Import wizard can be used to fine tune
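
The row-based mapping on this slide can be sketched in a few lines of Python. This is only an illustration of the idea (table as class, columns as properties, rows as instances), not TopBraid's importer: the `BASE` namespace, the `tsv_to_triples` helper and the sample data are all assumptions made for the example, and triples are modeled as plain tuples rather than a real RDF graph.

```python
import csv
import io

# Hypothetical namespace; a real import would use the project's own base URI.
BASE = "http://example.org/products#"


def tsv_to_triples(text, class_name):
    """Map a tab-separated table to (subject, predicate, object) tuples.

    The table becomes a class, each column a property, each row an instance.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    triples = []
    for row_number, row in enumerate(reader, start=1):
        subject = f"{BASE}{class_name}_{row_number}"
        triples.append((subject, "rdf:type", BASE + class_name))
        for column, value in row.items():
            if value:  # empty cells produce no triple
                triples.append((subject, BASE + column, value))
    return triples


# Made-up sample table: a header row plus two data rows.
SAMPLE = "name\tprice\nToaster\t29.99\nKettle\t\n"
triples = tsv_to_triples(SAMPLE, "Product")
for triple in triples:
    print(triple)
```

An import wizard would add what this sketch leaves out: choosing datatypes for the values, renaming columns, and picking a key column for the instance URIs.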

17 TopBraid Spreadsheet Import (1)

18 TopBraid Spreadsheet Import (2)

19 Relational Database Import Much enterprise data resides (and needs to stay) in relational databases Relational database importer (D2RQ) built into TopBraid Static import of schema –Tables become classes –Columns become properties –Link tables become object properties Dynamic import of actual data –Rows become instances –On the fly, i.e. data can stay where it is
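
The split between a static schema import and a dynamic, on-the-fly data import can be illustrated with Python's built-in `sqlite3` module. This is a rough sketch of the idea only and does not use D2RQ or its mapping language: the `BASE` namespace, the helper names, the property naming scheme (`table_column`) and the sample table are assumptions, and link tables are not handled.

```python
import sqlite3

BASE = "http://example.org/db#"  # hypothetical namespace


def schema_triples(conn):
    """Static part: tables become classes, columns become properties."""
    triples = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
    for table in tables:
        triples.append((BASE + table, "rdf:type", "owl:Class"))
        # table_info rows are (cid, name, type, notnull, dflt_value, pk)
        for column in conn.execute(f"PRAGMA table_info({table})"):
            prop = f"{BASE}{table}_{column[1]}"
            triples.append((prop, "rdfs:domain", BASE + table))
    return triples


def instance_triples(conn, table, key):
    """Dynamic part: rows are read and turned into triples only when iterated."""
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}_{record[key]}"
        yield (subject, "rdf:type", BASE + table)
        for column, value in record.items():
            if value is not None:
                yield (subject, f"{BASE}{table}_{column}", value)


# Made-up in-memory database standing in for an enterprise source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
schema = schema_triples(conn)
instances = list(instance_triples(conn, "customer", "id"))
```

Because `instance_triples` is a generator over a live cursor, nothing is copied out of the database until a consumer asks for it, which is the sense in which the data "can stay where it is".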

20 Database Import in TopBraid (1)

21 Database Import in TopBraid (2) TopBraid automatically generates
1. Schema
2. Instances placeholder file (.d2rq)
3. Mapping file (table-to-class mapping)
4. Test file that imports 1. and 2. based on 3.

22 Database Import in TopBraid (3)

23 Database Import in TopBraid (4)

24 Database Import in TopBraid (5) D2RQ Mapping Ontology

25 Database Import in TopBraid (6) Relational databases imported by D2RQ become triple sources like any other, but the original data can stay where it is. The resulting mapping can be fine-tuned. The full range of generic RDF/OWL tools can be executed: –Inferencing –Merging –Mapping –Querying. Not all of these perform equally well.

26 XML Import/Export XML is the favourite syntax in many areas, e.g. data exchange between tools, web services TopBraid supports two approaches –XML Schema import to ontology –Semantic XML

27 XML Schema Import/Export

28 Semantic XML Converts arbitrary XML to OWL Keeps reverse-engineering info in the resulting ontology, using annotation properties Can create XML files from OWL Lossless round-tripping of XML Mapping ontologies can be edited

29 Semantic XML Example Each element name becomes a class Each attribute becomes datatype property Nesting is mapped into a dedicated object property (composite:child)
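
A minimal Python sketch of the three mapping rules on this slide follows. It is an illustration under stated assumptions, not TopBraid's Semantic XML converter: element names are used directly as class names, attribute names as property names, instances get made-up blank-node labels (`_:n1`, `_:n2`, ...), and the reverse-engineering annotations, text content and child ordering needed for lossless round-tripping are omitted. Only `composite:child` is taken from the slide.

```python
import xml.etree.ElementTree as ET


def xml_to_triples(xml_text):
    """Turn an XML document into (subject, predicate, object) tuples."""
    root = ET.fromstring(xml_text)
    triples = []
    counter = 0

    def visit(element):
        nonlocal counter
        counter += 1
        node = f"_:n{counter}"  # hypothetical instance label
        # Each element name becomes a class; the element is an instance of it.
        triples.append((node, "rdf:type", element.tag))
        # Each attribute becomes a datatype property value.
        for name, value in element.attrib.items():
            triples.append((node, name, value))
        # Nesting is mapped to the dedicated object property.
        for child in element:
            triples.append((node, "composite:child", visit(child)))
        return node

    visit(root)
    return triples


# Made-up sample document.
SAMPLE = '<order id="7"><item sku="A1"/><item sku="B2"/></order>'
triples = xml_to_triples(SAMPLE)
for triple in triples:
    print(triple)
```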

30 Semantic XML Classes Each generated class contains an annotation that points back to the XML element where it came from Similarly for the properties

31 Semantic XML Instances

32 Semantic XML Profiles The Semantic XML class models can be edited and fine-tuned TopBraid provides a couple of standard profiles –XHTML to open .html files (including tidy) –XSD to open XML Schemas More profiles are planned/prepared –X3D –SVG

33 Semantic XML Profile for HTML

34 Semantic XML Profile for XSD

35 Semantic XML Summary Load, query and generate arbitrary XML files (even without XSD) Generated schema can then be fine tuned and reused for other XML files of the same kind SPARQL, rules and inferencers can be used to extract or convert the XML

36 Other Importers UML Class Diagrams Direct Triple Sources –Files (RDF/XML, N3/Turtle, N-Triples) –RSS/Atom Feeds –GRDDL –RDFa –SPARQL Endpoints –RDF databases (Oracle 11g, Jena, AllegroGraph, Sesame)

37 Data Processing So far: data physically converted to a uniform language (RDF/OWL) Semantic integration –Ontology editing –Mapping by built-in inferences –Mapping by constructing new triples

38 Ontology Editing TopBraid Composer is the most sophisticated professional editor for OWL and RDF on the market Modular ontologies Refactoring Form-based & visual editing Customizable and extensible Driven by requirements from real-world projects (NASA etc) Several hundred users (and counting)

39 TopBraid Composer

40 Ontology Mapping via RDFS/OWL rdfs:subClassOf/owl:equivalentClass rdfs:subPropertyOf Then run inferencing Only suitable for trivial cases Limited expressivity
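
To make "declare the mapping, then run inferencing" concrete, here is a small Python sketch of a forward-chaining loop over tuples. It covers only the two RDFS rules that matter for this slide (type propagation along `rdfs:subClassOf`, and property propagation along `rdfs:subPropertyOf`) and treats `owl:equivalentClass` as a pair of subclass statements. It is not a full RDFS or OWL reasoner, and the `shopA:`, `shopB:` and `common:` names are invented for the example.

```python
def rdfs_closure(triples):
    """Apply subclass and subproperty rules until no new triples appear."""
    inferred = set(triples)
    # owl:equivalentClass implies rdfs:subClassOf in both directions.
    for s, p, o in list(inferred):
        if p == "owl:equivalentClass":
            inferred.add((s, "rdfs:subClassOf", o))
            inferred.add((o, "rdfs:subClassOf", s))
    changed = True
    while changed:
        sub_class = {(s, o) for s, p, o in inferred if p == "rdfs:subClassOf"}
        sub_prop = {(s, o) for s, p, o in inferred if p == "rdfs:subPropertyOf"}
        new = set()
        for s, p, o in inferred:
            if p == "rdf:type":
                # An instance of a class is also an instance of its superclasses.
                new.update((s, "rdf:type", sup) for sub, sup in sub_class if sub == o)
            # A statement with a property also holds for its superproperties.
            new.update((s, sup, o) for sub, sup in sub_prop if sub == p)
        changed = not new <= inferred
        inferred |= new
    return inferred


# Two made-up source schemas mapped onto a shared vocabulary.
data = {
    ("shopA:Toaster", "rdfs:subClassOf", "common:Appliance"),
    ("shopB:Device", "owl:equivalentClass", "common:Appliance"),
    ("shopA:soldBy", "rdfs:subPropertyOf", "common:vendor"),
    ("shopA:t1", "rdf:type", "shopA:Toaster"),
    ("shopA:t1", "shopA:soldBy", "shopA:store9"),
}
closure = rdfs_closure(data)
```

The sketch also shows the limitation named on the slide: such mappings can only rename or generalize classes and properties. Anything that restructures data, such as splitting one value into two or computing a new one, needs a more expressive mechanism.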

41 Ontology Mapping with SPARQL

42 TopBraid Export Features Triples (databases, files) HTML documentation Semantic Java Server Pages Google Maps, Calendars Spreadsheets, Matrix Business Intelligence Reports Browsing and querying (TopBraid Live)

43 Export/Merge/Convert Triples

44 HTML Export

45 Semantic Java Server Pages (1)

46 Semantic Java Server Pages (2) Content driven by SPARQL queries. Layout defined by JSP template.

47 Semantic Java Server Pages (3) Used extensively to generate all kinds of documents, deliverables

48 Google Maps Resources with geo:lat geo:long values

49 Calendars

50 BIRT Reports Input: Tabular data from SPARQL queries

51 TopBraid Ensemble (1) Rich Internet Application for browsing and editing RDF/OWL

52 TopBraid Ensemble (2)

53 TopBraid Ensemble (3)

54 Structure of this Talk Import (Spreadsheets, DBs, XML) Processing (Editing, querying, transforming) Export (Converting, browsing, visualizing) SPARQLMotion (Scripting Language)

55 SPARQLMotion A visual scripting language for Semantic Web Technology Import – Process – Export Use case: repeatable data processing and information integration tasks SPARQLMotion itself is defined as an OWL ontology Instance scripts can be edited with any OWL editing tool Has an extensible architecture

56 SPARQLMotion Example

57 SPARQLMotion Language Scripts consist of modules Modules have a type (e.g. ApplyPellet) The output of one module is the input to its successors (RDF, XML and/or variable bindings) Branching (if-else), Iterations (while) and merging supported
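
The data flow described here, where each module's output feeds its successors, can be sketched as a simple linear pipeline in Python. This is not SPARQLMotion's engine, ontology or API: the `Module` class, `run_script`, the three example modules and the `ex:` data are all hypothetical, and branching, iteration, merging and XML payloads are left out. The module type strings borrow the category names from the module-type overview purely as labels.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Tuple

Triple = Tuple[str, str, str]
State = Tuple[FrozenSet[Triple], Dict[str, str]]


@dataclass
class Module:
    """One step of a script: a name, a type label, and the work it performs."""
    name: str
    module_type: str
    run: Callable[[FrozenSet[Triple], Dict[str, str]], State]


def run_script(modules, bindings=None):
    """Pass triples and variable bindings from each module to the next."""
    triples: FrozenSet[Triple] = frozenset()
    bindings = dict(bindings or {})
    for module in modules:
        triples, bindings = module.run(triples, bindings)
    return triples, bindings


def load_sample(triples, bindings):
    # Import step: stands in for reading a file, feed or database.
    return triples | {("ex:t1", "ex:name", "Toaster"),
                      ("ex:t1", "ex:price", "29.99")}, bindings


def keep_predicate(triples, bindings):
    # Processing step: keep only triples whose predicate matches a binding.
    wanted = bindings["predicate"]
    return frozenset(t for t in triples if t[1] == wanted), bindings


def export_lines(triples, bindings):
    # Export step: serialize the triples into a text variable.
    text = "\n".join(" ".join(t) for t in sorted(triples))
    return triples, {**bindings, "output": text}


script = [
    Module("LoadSample", "Something-to-RDF", load_sample),
    Module("KeepPredicate", "RDF-to-RDF", keep_predicate),
    Module("ExportLines", "RDF-to-Output", export_lines),
]
result_triples, result_bindings = run_script(script, {"predicate": "ex:name"})
print(result_bindings["output"])
```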

58 SPARQLMotion Module Types Input –Something-to-RDF –Something-to-XML –User Input Processing –RDF-to-RDF –XML-to-RDF –RDF-to-XML Output –RDF-to-Output –XML-to-Output

59 SPARQLMotion Module Library (1) (Diagram: modules, each labeled "Java Implementation") OWL representation is backed by Java classes in the execution engine. Module implementations are plug-ins to the engine.

60 SPARQLMotion Module Library (2)

61 SPARQLMotion Module Library (3)

62 SPARQLMotion Module Library (4)

63 SPARQLMotion Module Library (5)

64 Complex SPARQLMotion Example

65 SPARQLMotion Use Cases Convert files to databases Combine multiple RSS feeds Create spreadsheets and charts Run periodic background checks Create XML input for other tools Control web pages Create maps and calendars Run inferences periodically …

66 Summary Semantic Web languages are an attractive foundation for data integration tasks Generic methods and tools can be used, exploiting ontological metadata The TopBraid Suite product family is a comprehensive solution covering import, processing and export.

67 Extra Slides

68 TopBraid Suite