This Briefing is: UNCLASSIFIED Aha! Analytics 2278 Baldwin Drive Phone: (937) 477-2983, FAX: (866) 450-3812 1 Dave Lush, Senior SME Aha! Analytics Semantic.

Presentation transcript:

This Briefing is: UNCLASSIFIED Aha! Analytics 2278 Baldwin Drive Phone: (937) , FAX: (866) Dave Lush, Senior SME Aha! Analytics Semantic Integration Layer for The As-Is Enterprise Data Warehouse

UNCLASSIFIED 2 Purpose(s)
• Communicate Some Observations About the General Data Integration Problem
• Cite and Discuss the Semantic Technologies
• Propose a Semantic Data Integration Layer for the General Data Warehouse Architecture
• Discuss a LexisNexis SSI (LNSSI) Data Analytics Supercomputer (DAS) Based Solution
• Present Initial Thoughts on the Plan

UNCLASSIFIED 3 Topics
• Purpose
• Background
• Givens/Problems/Tasks
• Approaches to Data/Info Integration
• Semantic Technologies
• General Solution Architecture
• LNSSI DAS Based Solution Architecture
• Thoughts On the Plan

UNCLASSIFIED 4 Background
• Data Integration Problems
• Application and Enterprise Model Based Approaches
• Data Integration Problems Persist
• Not Adequately Leveraging Available Metadata
• Need for Improved Discovery and Semantic Integration
• Emergence of Semantic Technologies
• Emergence of LNSSI DAS Capability

UNCLASSIFIED 5 The Primary Givens/Problem/Task
• Givens:
  • A Collection of Disparate Legacy Databases, Perhaps Already Migrated to an Enterprise Data Warehouse, Each with Its Own Independently Developed Logical Data Model and Query Interface
  • The Requirement To Pose Single Unified Queries Across The Collection Of Legacy Databases And Achieve Semantically Consistent (Coherent) Results
• The Problem:
  • Difficulties in Achieving Useful Results Because of Unresolved Semantic Disconnects in the Disparate Logical Models
  • Note: The Problem Is Not Primarily One of Discovery of Relevant Already Existing Product Objects But Rather One of Discovering and Semantically Integrating Requisite Product Content From Multiple Sources
• The Task at Hand:
  • Define, Design, and Implement a Capability for the Semantic Integration and Unified Query of the Collection of Disparate Legacy Databases to Achieve Semantically Coherent Results

UNCLASSIFIED 6 Basic Data/Info Integration Approaches
• Application Centric Approach
  • Do It All in the Application Layer Via Ad Hoc Hand Coding
  • This Is Very Expensive And Difficult!
• Enterprise Information Model and Data Warehouse Approach
  • Do It Via EDW ETL Methods/Tools in Context of Strict Conformance with Overarching Enterprise Info Model
  • This Is Also Very Expensive And Difficult And Requires Great Discipline!
• Enterprise Information Integration (EII) Approach
  • Establish Common Single View of Disparate Legacy Sources
  • Process/Parse Common Domain-wide Queries into Individual Legacy Source Queries and Execute Source Queries
  • Integrate Source Query Results into Unified Response to the Domain-wide Query
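The three EII steps (common view, per-source queries, integrated response) can be illustrated with a minimal Python sketch. It is not taken from the briefing: the two in-memory SQLite databases, their table and column names, and the `unified_query` helper are all hypothetical stand-ins for legacy sources with independently developed schemas.

```python
import sqlite3

def make_source(ddl, insert_sql, rows):
    """Build one hypothetical legacy source as an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn

# Two sources describing the same kind of item with different schemas (assumed data).
parts_db = make_source(
    "CREATE TABLE parts (part_no TEXT, descr TEXT, qty INTEGER)",
    "INSERT INTO parts VALUES (?, ?, ?)",
    [("A-100", "fuel pump", 4), ("B-200", "actuator", 0)],
)
supply_db = make_source(
    "CREATE TABLE inventory (nsn TEXT, item_name TEXT, on_hand INTEGER)",
    "INSERT INTO inventory VALUES (?, ?, ?)",
    [("A-100", "fuel pump", 7), ("C-300", "gasket", 12)],
)

# Common view: (item_id, name, quantity). Each entry is the source-specific
# rewrite of the single domain-wide query "items with quantity >= N".
SOURCE_QUERIES = {
    "parts_db": (parts_db, "SELECT part_no, descr, qty FROM parts WHERE qty >= ?"),
    "supply_db": (supply_db, "SELECT nsn, item_name, on_hand FROM inventory WHERE on_hand >= ?"),
}

def unified_query(min_quantity):
    """Run the per-source queries and merge the rows into one response."""
    merged = {}
    for source_name, (conn, sql) in SOURCE_QUERIES.items():
        for item_id, name, quantity in conn.execute(sql, (min_quantity,)):
            entry = merged.setdefault(item_id, {"name": name, "quantity": 0, "sources": []})
            entry["quantity"] += quantity
            entry["sources"].append(source_name)
    return merged

result = unified_query(1)
```

Here the mapping from the common view to each source is hand-written SQL, which is exactly the part that becomes costly as sources multiply; the merge rule (summing quantities per item) is likewise an assumption chosen only for illustration.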

UNCLASSIFIED 7 The Basic Data Integration Challenge
[Figure labels: Application; Data Interface; Data Interface; Legacy Databases]
The application must process the unified query, formulate and submit associated queries against the disparate databases, and properly integrate the results into a unified response. This requires that the application handle disparate data interfaces, and that it contain the necessary semantics regarding the problem domain, the relationships/mappings between the problem domain and the legacy data models, and the code that accomplishes the mappings.
Figure note: Logical models for these databases were generally developed independently of each other.

UNCLASSIFIED 8 The Enterprise Data Warehouse Approach
[Figure labels: Application; Data Warehouse Services Layer; Enterprise Data Warehouse; target data; Extract Transform Load (ETL) Services; Common Enterprise Model & Meta-data; Meta-Data Mgt Tool; source data]
The legacy databases are migrated to a data warehouse in the context of an overarching enterprise data model so that the logical data models for the individual databases are semantically consistent with the overall model. The application still must process the unified query, formulate and submit associated queries against the disparate warehouse databases, and properly integrate the results into a unified response. In theory, however, this process should not suffer serious semantic inconsistency problems, because the individual logical databases in the warehouse are supposed to have logical models that are consistent with an overarching enterprise information model.
Figure note: Logical models for these databases are consistent with the overarching domain model.

UNCLASSIFIED 9 Problems Ensue
• The Imperative to Abide by a Standard Global Data Model Does Not Prevail
• Stove Piped DBs Abound
• Semantics of the Stove Pipes Are Inconsistent
• Federated Queries Yield Semantically Inconsistent Results
• Cannot Replace/Re-engineer Legacy DBs Housed in the EDW
• Cannot Replace the EDW Platform (e.g., Teradata) In Use Today

UNCLASSIFIED 10 New Imperative
• Must Have Some Effective Way to Semantically Integrate the Information Acquired from the Multiplicity of Databases

UNCLASSIFIED 11 Semantic Integration
• Use Semantic Technologies in Context of the EII Approach (cited previously)
• Unified Ontology of Current Situation/View Is Developed and Expressed in OWL or Appropriate Successor Language
• Semantic Relationships Between Legacy Data and Rules for Transformation From Legacy to Current View Are Specified and Captured Via OWL or Appropriate Successor Language
• Queries in Terms of Current Unified View Are Parsed and Transformed Into Queries of Legacy Sources by a Semantic Query Engine
• Individual Legacy Source Queries Are Executed
• Results Are Transformed and Processed Into a Unified Response by the Semantic Mash-up Engine
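The difference from hand-coded integration is that the legacy-to-unified-view relationships and transformation rules are captured as data rather than buried in application code. A rough Python sketch of that idea follows. The briefing calls for OWL (or a successor language) to hold these rules; the plain dictionaries here are only a stand-in, and every source name, field name, and conversion below is hypothetical.

```python
# Hypothetical linking rules: unified-view property -> (legacy field, conversion).
# In the approach described above these would be expressed in OWL/rules, not Python.
LINKING_RULES = {
    "source_a": {
        "item_id": ("PARTNO", str),
        "weight_kg": ("WT_LBS", lambda lbs: round(lbs * 0.45359237, 3)),
    },
    "source_b": {
        "item_id": ("id", str),
        "weight_kg": ("mass_g", lambda grams: grams / 1000.0),
    },
}

# Stand-ins for the legacy sources (assumed data).
LEGACY_DATA = {
    "source_a": [{"PARTNO": "A-100", "WT_LBS": 10.0}],
    "source_b": [{"id": "C-300", "mass_g": 2500}],
}

def rewrite_query(properties, source):
    """Query engine step: translate unified-view properties into legacy field names."""
    return [LINKING_RULES[source][prop][0] for prop in properties]

def execute_source_query(source, fields):
    """Execute the rewritten query against one legacy source."""
    return [{field: row[field] for field in fields} for row in LEGACY_DATA[source]]

def mash_up(properties):
    """Mash-up step: convert each source's rows into unified-view records."""
    unified = []
    for source in LINKING_RULES:
        fields = rewrite_query(properties, source)
        for row in execute_source_query(source, fields):
            record = {}
            for prop in properties:
                field, convert = LINKING_RULES[source][prop]
                record[prop] = convert(row[field])
            unified.append(record)
    return unified

records = mash_up(["item_id", "weight_kg"])
```

Adding a third source in this sketch means adding one more entry to `LINKING_RULES`, with no change to the query or mash-up functions.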

UNCLASSIFIED 12 Semantic Technologies
• Rapidly Maturing with Very Noteworthy Applications
  • Enhanced Knowledge Discovery
  • Data/Knowledge Integration
• Foundational Semantic Technology Constructs
  • Ontology: Machine Readable Specification of the Essence of a Given Domain
  • Machine Readable Knowledge/Facts
  • Machine Readable Rules
• Standard Language(s) for Expressing the Above
  • XML, RDF, RDFS, OWL, RuleML
• RDF Triple Store Capabilities for Storing the Above
• Standard Query Languages for Searching the Above
  • SPARQL
• Open Source Semantic Application Frameworks
• Commercial Capabilities
  • Oracle Semantic Technologies
  • TopQuadrant
  • Metatomix
  • Ontoprise
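To make the triple store and query constructs concrete, here is a toy Python sketch of facts held as subject/predicate/object triples and queried by pattern matching, in the spirit of a SPARQL basic graph pattern. It is not a real triple store or SPARQL engine; the `ex:` identifiers and the `match` and `query` helpers are invented for illustration.

```python
# Machine-readable facts as (subject, predicate, object) triples (assumed data).
TRIPLES = {
    ("ex:Pump42", "rdf:type", "ex:FuelPump"),
    ("ex:Pump42", "ex:installedOn", "ex:Aircraft7"),
    ("ex:Valve9", "rdf:type", "ex:Valve"),
    ("ex:Valve9", "ex:installedOn", "ex:Aircraft7"),
}

def match(pattern, triple, bindings):
    """Try to extend variable bindings so that the pattern equals the triple."""
    extended = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended

def query(patterns, triples=TRIPLES):
    """Return every binding set that satisfies all patterns together."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions

# Roughly: SELECT ?part WHERE { ?part ex:installedOn ex:Aircraft7 . ?part rdf:type ex:FuelPump }
rows = query([
    ("?part", "ex:installedOn", "ex:Aircraft7"),
    ("?part", "rdf:type", "ex:FuelPump"),
])
```

The shared variable `?part` is what joins the two patterns, which is the same mechanism a SPARQL query uses to combine facts about one resource.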

UNCLASSIFIED 13 The General Solution Architecture
• Semantic Layer Between Apps and Data
  • Unifying Domain Ontology
  • Linkage Ontology
  • OWL/RDF Data Management
  • Semantic Query Engine
  • Semantic Mash-up
• Semantic Tech Architecture and Building Blocks
  • RDF(S), OWL, RuleML
  • Jena
  • Oracle Semantic Technologies
  • Semantic Application Development Environment (e.g., TopQuadrant)

UNCLASSIFIED 14 The Semantic Approach
[Figure labels: Application; Semantic Layer; Semantic Query Engine; Semantic Query Results Mash-Up; Domain Ontologies & Rules; Linking Ontologies & Rules; Domain, Legacy, & Derived Facts; OWL/RDF DBMS; Ontology & Rule Authoring Tools; Data Warehouse Services Layer; Enterprise Data Warehouse]
The legacy databases have been migrated to the data warehouse independently, each with its own logical model. The overall domain has a robust domain ontology. Linking ontologies and rules relate and map the domain ontology to the underlying logical models. The application captures the unified query and submits it to the semantic engine. The semantic engine converts the query to a standard semantic form and then applies the ontologies and rules to formulate the requisite queries against the individual databases. The individual queries are submitted, and the individual responses are received by the semantic mash-up service, which uses the available semantic data, including the query, to create an integrated, semantically consistent result for the original query.
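The "Derived Facts" label suggests facts produced by applying ontologies and rules to domain and legacy facts. The briefing does not say which rules are used, so the following Python sketch assumes the two standard RDFS subclass rules (subclass transitivity and type propagation) and applies them by simple forward chaining. The `legacy:` and `domain:` identifiers are hypothetical.

```python
def infer(facts):
    """Forward-chain two RDFS rules until no new triples appear."""
    derived = set(facts)
    while True:
        new = set()
        for (s, p, o) in derived:
            for (s2, p2, o2) in derived:
                if p2 != "rdfs:subClassOf" or o != s2:
                    continue
                if p == "rdf:type":
                    # Type propagation: an instance of a class is an instance of its superclass.
                    new.add((s, "rdf:type", o2))
                elif p == "rdfs:subClassOf":
                    # Transitivity of the subclass relation.
                    new.add((s, "rdfs:subClassOf", o2))
        if new <= derived:
            return derived
        derived |= new

# A legacy fact plus linking and domain ontology statements (assumed data).
facts = {
    ("legacy:row17", "rdf:type", "legacy:FUEL_PMP"),          # legacy fact
    ("legacy:FUEL_PMP", "rdfs:subClassOf", "domain:Pump"),    # linking ontology
    ("domain:Pump", "rdfs:subClassOf", "domain:Component"),   # domain ontology
}

closure = infer(facts)
derived_only = closure - facts
```

With these inputs, a query phrased in domain terms (for example, all instances of `domain:Component`) can now reach `legacy:row17` even though the legacy source never mentions that class. The nested loop is quadratic per pass, which hints at why inference over large ontologies and data sets becomes computationally expensive.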

UNCLASSIFIED 15 Typical EDW Architecture
• Does Not Have a Data Integration Layer
• This Is a Problem If Total Discipline in Conforming to Enterprise Model Is Not Exercised
• And It Has Not Been Exercised
• Legacy Databases Independently Migrated
• Accomplishing Data Integration in the Application Layer Is Difficult and Expensive

UNCLASSIFIED 16 Current EDW Architecture
Figure 4: Layered EDW Architecture
[Figure labels: Users; Power Users; Presentation Tier; AF Portal / AF COP (Presentation Containers = RIA, iFrame, HTML, WSRP Portlets, other); AJAX; Presentation Transformation Tier; Web Reports & Charts (Output = HTML, XML, PDF, XLS, DOC, other); Pre-Generated Cache; Application Tier; Business Intelligence Tools; Cognos; BOBJ; Other (Siebel, MS); Data Access Tier (ODBC/JDBC); EDW Data Tier]
Key Observations: The as-is architecture does not include an explicit knowledge management or semantics layer. This is a problem because the effectiveness of the as-is EDW depends on its ability to resolve semantic disconnects between the databases that must be queried. Building these capabilities into the application layer code is very difficult and costly, and is not responsive to highly dynamic situations.

UNCLASSIFIED 17 EDW Architecture with Semantic Layer
Figure 5: EDW Architecture with Semantic Layer
[Figure labels: Users; Power Users; Presentation Tier; AF Portal / AF COP (Presentation Containers = RIA, iFrame, HTML, WSRP Portlets, other); AJAX; Presentation Transformation Tier; Web Reports & Charts (Output = HTML, XML, PDF, XLS, DOC, other); Pre-Generated Cache; Application Tier; Business Intelligence Tools; Cognos; BOBJ; Other (Siebel, MS); Semantic Tier; Semantic Tools; Semantic Layer; Semantic Query Engine; Semantic Query Results Mash-Up; Domain Ontologies & Rules; Linking Ontologies & Rules; Domain, Legacy, & Derived Facts; OWL/RDF DBMS; Ontology & Rules Authoring Tools; Data Access Tier (ODBC/JDBC); EDW Data Tier]
Key Observations: The to-be architecture includes a semantics layer which mediates between the domain query and the data layer. The semantic layer includes the domain ontology and linkage ontologies, which drive the processing of domain queries and the semantic mash-up of individual query results coming from the source databases.

UNCLASSIFIED 18 A Major Obstacle: Computational Complexity
• Many Operations of the Semantic Layer Are Computationally Intensive
• Complex Queries Across Multiple Large Data Sources Are Computationally Intensive
• Some Kind of Specialized Solution to Execution of the Semantic Operations and Multiple Source Queries Is Required

UNCLASSIFIED 19 Hypothesis
• The LNSSI DAS Capability Can Be Brought to Bear on Execution of the Semantic Operations and, Of Course, the Source Queries As Well, with Significant Benefits
• So the Big Question Is:
  • Can the LNSSI DAS Be Applied to Large Ontologies, Rule Sets, and Large Data to Provide a Very High Performance Semantic Query Engine?

UNCLASSIFIED 20 The LNSSI DAS Based Solution Architecture
• Semantic Layer
  • Federated Domain Ontologies
  • Transformation Rules
  • Semantic Engines
• LNSSI DAS Used in Two Contexts:
  • Semantic Ops in the Semantic Layer
  • Source Queries at the Data Services Level

UNCLASSIFIED 21 The LNSSI DAS Based Solution Architecture
[Figure labels: Query Application; Semantic Layer Services; Semantic Update/Query Engine; Semantic Query Results Mash-Up; Domain Ontologies & Rules; Linking Ontologies & Rules; Domain, Legacy, & Derived Facts; LNSSI DAS Based OWL/RDF Query; RDBMS Based OWL/RDF DBMS; Ontology & Rule Authoring Tools; LNSSI DAS Services Layer; Enterprise Data Warehouse]
The legacy databases have been migrated to the data warehouse independently, each with its own logical model. The overall domain has a robust domain ontology. Linking ontologies and rules relate and map the domain ontology to the underlying logical models. The application captures the unified query and submits it to the semantic engine. The semantic engine converts the query to a standard semantic form and then applies the ontologies and rules to formulate the requisite queries against the individual databases. The individual queries are submitted, and the individual responses are received by the semantic mash-up service, which uses the available semantic data, including the query, to create an integrated, semantically consistent result for the original query.

UNCLASSIFIED 22 What To Do?
• Let's Execute a Prototype Project
  • To Test the Hypothesis That the LNSSI DAS Can Be Successfully Brought to Bear On Large Data Integration Problems Requiring Semantic Integration

UNCLASSIFIED 23 General Approach
• Find a Sponsor with Requisite $
• Form Appropriate Team and Agreements
• Initiate a Prototype Project
• Apply Semantic Technologies
  • Ontology, RDF(S), OWL
  • RDF Triple Store, SPARQL
  • Inference Engine(s)
  • Jena
• Leverage LNSSI DAS Architecture and Capability
• Create an LNSSI Semantic Integration Platform
• Find a Benchmark Problem for Which There Is Already Data, Associated Semantics, and Existing Query Performance Data

UNCLASSIFIED 24 Major Activities
• Project Initiation
• Initial Analysis, Technology Research, and Knowledge Engineering
• CONOPS and System Requirements Development/Specification
• Detailed Program/Project Management
• Knowledge Engineering
• Domain Ontology Acquisition/Development
• Ontology Legacy Data Relationship, Mapping, and Rule Acquisition/Development
• Acquisition/Development of the Underlying Data
• Architecture Development and System Design
• Detailed Apps Requirements/Design
• Inclusion of RDF/OWL Data Mgt
• Inclusion of Semantic Query Engine
• Development of Semantic Mashup Capabilities
• Implementation Planning
• Implementation
• Test and Eval
