Modular Ontology architecture for using human defined sets of concepts Presentation by OntologyStream Inc Paul Stephen Prueitt, PhD Ontology Tutorial 5,

Presentation transcript:

Modular Ontology architecture for using human-defined sets of concepts
Presentation by OntologyStream Inc, Paul Stephen Prueitt, PhD
Ontology Tutorial 5, copyright Paul S Prueitt 2005

The best example of an ontology is the set of positive integers
[Diagram labels: Set of positive integers; Instances in the world where the concepts of a counting number are essential; Mathematical models of natural systems; Arrow of time; Geographical positions; Accounting; Quantitative measurement]

Set of positive integers
[Diagram label: Instances in the world where the concepts of a counting number are essential]
The concept of an integer is used without any specific use affecting the definition of the concept, of “two-ness” for example.
The existence of this set of concepts allows a great diversity of human activities.
The “ontology standard” is enforced by the correctness of the concepts and by the ease with which new applications can be found.
The standard is ultra-stable and resilient because the concepts are correct.
The standard is not owned by anyone.

Modular ontology is used to measure the properties of events with sets of concepts.
[Diagram labels: processes; { e(i) }; { w(i) }; { s(i) }; Semantic extraction; Discrete analysis]
Notation: e(i) = w(i)/s(i). The measurement of an event has a weakly structured part and a structured part.
Events occur in the real world as part of complex processes. Largely because events are seen as having patterns and structure, software engineers can build relational databases or XML repositories to help us understand and interact with situation-specific information.
With ontology, human communities will be able to reveal a set of concepts and define regular relationships between those concepts. We call this “Ontology mediation of information flow”. The formal representations of the concepts are used to organize data and to move data from one place to another. This has to be demonstrated.
We will illustrate Ontology mediation of information flow, as an example, in the development and use of Harmonized Tariff Schedule (HTS) Administrative Rulings. An HTS Administrative Ruling is a short public document that ties a commodity to the code used to determine duties on imports or exports.
A second example is suggested, in which Selectivity and Targeting reports are seen as measurements of selectivity and targeting events by Customs and Border Protection.

Ontology Framework
[Diagram labels: processes; Semantic extraction]
A framework holds a higher-level abstraction representing an analysis of how things follow each other.
Example: the event-Structure Ontology Framework (e-SOF) has 18 cells, developed from the cross product of three dimensions: {past, present, future}; {people, places, things}; {how, why}. That is, 3 x 3 x 2 = 18.
Example: the risk/gains Ontology Framework (rg-OF) has 40 cells, developed from the cross product of three dimensions: {Risk, Gain}; {Anomaly, Trend}; {measurement, assessment, name, group, event, context, rule, policy, component, function/behavior}. The third dimension has ten concepts, which fall into five pairs, so 2 x 2 x 10 = 40.

Explicit ontology such as OWL DL
[Diagram labels: processes; Ontology Framework]
By aligning the internal, implicit set of concepts in a semantic extraction computation with the explicit form of concept representation provided by the OWL DL standard, one is able to:
1) organize information expressed as concepts in free-form text;
2) use look-up tables, lists, controlled vocabularies and taxonomies to expand the statement of these conceptual expressions so that the expression is as clear, complete and consistent as possible;
3) move the information from a single event into a computational space where specific structure is available to bring relevant information to the report development process;
4) create, after the fact, a better report about an event, such as an administrative ruling or a selectivity and targeting action;
5) develop long-term trending and analogy detection using specific information about how things are related to each other in the real world.

A modular ontology management infrastructure provides various services in the context of field reporting over transactions
[Diagram labels: upper level ontologies; “other” upper level ontology; Law governing US Customs; Advanced Trade Data; Economic Supply Chain Data; Findings ontology; Entities ontology; Gain/Risk ontology; sources of data; Location ontology; HTS Ontology; Later application areas]

[Diagram labels: Written reports; Structural Event Ontology Framework; Gain / Risk Ontology Framework]
In our work, human knowledge is captured separately in two computable forms: implicit (semantic extraction ontology) and explicit (OWL DL ontology).

event Structure Ontology Framework (e-SOF) **
Structural Event Ontology Framework: { who, where, what, how, why } x { past, present, future }
The six classical interrogatives, dating from Greek times, are partitioned into three parts: { people, places, things } + { event structure with causality } + time.
The frames give 18 questions: (past, who, how), (past, who, why), (present, who, how), (present, who, why), (future, who, how), (future, who, why), etc.
** e-SOF was “discovered” by Dr. Paul S. Prueitt while thinking about a US Customs ontology prototype in March 2005.

[Diagram labels: Ontology Framework; Ontology Reasoner; Scoped Ontology Individuals; Knowledge Management visualization; Knowledge Engineer visualization]
By internally adjusting the rules within any one of the commercially available semantic extraction (implicit) ontologies, we measure text, or structured data in a single record, using a three-element frame (y, x, z), where x is from the set { people, places, things }, y is from the set { past, present, future }, and z is from the set { how, why }.
There are 3*3*2 = 18 of these three-element frames, each of which can be seen to ask a question. The measurement uses linguistic and structural knowledge to answer those questions that can be answered. Those that are not answered are left blank.
Other semantic extraction tools can be similarly manipulated to produce an alignment between the internal ontology (not often OWL) and the external OWL DL ontology (which is our standard).
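The three-element frame can be sketched in a few lines of Python. This is only an illustration of the 18-cell structure with unanswered cells left blank; the function names and the sample answer text are assumptions made for the example and are not part of any semantic extraction product.

```python
from itertools import product

# Dimension values are taken from the slide; everything else is illustrative.
TIMES = ("past", "present", "future")       # y
ENTITIES = ("people", "places", "things")   # x
MODES = ("how", "why")                      # z


def empty_frames():
    """Return every (y, x, z) frame mapped to None, meaning 'left blank'."""
    return {frame: None for frame in product(TIMES, ENTITIES, MODES)}


def record_answer(frames, y, x, z, answer):
    """Fill one cell; reject frames that are not part of the framework."""
    if (y, x, z) not in frames:
        raise KeyError(f"not an e-SOF frame: {(y, x, z)}")
    frames[(y, x, z)] = answer


frames = empty_frames()
# Hypothetical answer text, used only to show one filled cell.
record_answer(frames, "future", "people", "how", "passengers embark in San Diego")
answered = {f: a for f, a in frames.items() if a is not None}
print(len(frames), len(answered))  # prints: 18 1
```

Keeping the blank cells in the dictionary, rather than storing only the answered ones, makes it easy to see which of the 18 questions a given record could not answer.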

Ontology Framework with Differential Ontology Expressions
[Diagram labels: High Risk Ontology Expression; Bio-systems; Weapon-systems; Commodity history analysis; Entry Reports and Findings; { concepts }; informs; aligns; Shipping manifests; Entity histories]
Ontology expression about the risks measured from historical analysis of commodities.
US Customs cultural viewpoints expressed as sets of concepts.

Rapid knowledge acquisition and reporting about a transaction
[Diagram labels: High Risk Ontology Expression; Bio-systems; Weapon-systems; Commodity history analysis; Entry Reports and Findings; { concepts }; Semi-automated generation of Reports]
Ontology expression about the risks measured from historical analysis of commodities.
US Customs cultural viewpoints expressed as sets of concepts.
A transaction: Nautilus Explorer (“Nautilus”) owns and operates the M/V NAUTILUS EXPLORER, a 116-foot Canadian-flagged long-range dive boat. Nautilus would like to embark passengers in San Diego, California, on two separate occasions, for three days of diving in Mexican waters before returning to San Diego. The passengers would be embarked and disembarked at the same location in San Diego.

gain/risk Ontology Framework (gf-OF) **
We take the first two dimensions of the framework to be { Anomaly, Trend } and { Gain, Risk }, and the other dimension to be: { measurement, assessment, name, group, event, context, rule, policy, component, function/behavior }.
Then, in the cross product, we have four sets of ten concepts. In fact the ten concepts are five sets of two concepts, each with an interesting “oppositional scale type” relationship.
** This Gain/Risk Ontology Framework was “discovered” by Dr Prueitt in March 2005 while thinking about possible US Customs Selectivity and Targeting enhancements. Dr Peter Stephenson and Dr Prueitt are extending this in the context of Cyber Security ontology mediation data analysis.
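The count of four sets of ten concepts can be checked with a short Python sketch. The grouping of the ten concepts into five pairs below is an assumption read off the order in which the slide lists them; the slide does not spell out the pairing.

```python
from itertools import product

SIGNALS = ("anomaly", "trend")
VALENCES = ("gain", "risk")

# Assumed pairing, taken from the order of the slide's list of ten concepts.
PAIRS = (
    ("measurement", "assessment"),
    ("name", "group"),
    ("event", "context"),
    ("rule", "policy"),
    ("component", "function/behavior"),
)
CONCEPTS = tuple(concept for pair in PAIRS for concept in pair)

# Every cell of the framework is one (signal, valence, concept) triple.
cells = list(product(SIGNALS, VALENCES, CONCEPTS))

# Group the cells into the four (signal, valence) sets of ten concepts.
by_quadrant = {}
for signal, valence, concept in cells:
    by_quadrant.setdefault((signal, valence), []).append(concept)

print(len(cells), len(by_quadrant))  # prints: 40 4
```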

Detailed work with tools over available data
[Diagram labels: Semantic Extraction; Link Analysis; Pattern recognition; Ontology Tools; Statistics; Advanced Trade Data; Harmonized Tariff Schedule]
Practical problem: provide the three Cs (clarity, consistency, and completeness) in EACH judicial review of a commodity in passage across national borders.
Integrated collection of reified ontologies with some specific inferences and some information organization and retrieval.
Possible deployment as a U.S. Customs Total Information Awareness (TIA) capability.

Data Transfer Object (SOI): Scoped Ontology Individual
[Diagram labels: An event; Transactions; Findings; Entry; Entry Summary; databases; Script pulls information; Script; SOI pushes information; Portal pulls information; Ontology Framework; Ontology Reasoner; Scoped Ontology Individuals; Human machine interface; Knowledge Management visualization; Knowledge Engineer visualization; client visualization]
Ontology Individuals have a subsumption relationship to upper abstract ontologies.

SOI design by-passes the critical “visualization” choke point
[Diagram labels: Scoped Ontology Individuals; Human machine interface; SOI; Stack of SOIs supporting analysis of analysis; Ontology Framework; Ontology reasoning]
The mental event is the model for the Scoped Ontology Individual (SOI). The SOI is a minimal formal ontology (defined in OWL DL) that binds together the concepts and data about a single event.
The Framework’s small number of concepts organizes everything that is known about the data elements that occur in a Harmonized Tariff Schedule administrative ruling.
Once the data elements have been used as the initial conditions for SOI formation, additional SQL queries or additional ontology subsetting may be performed, so as to bring new information, or information that was not initially known, “into the visualized frame”.
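The idea of forming an SOI by using an event's data elements to subset a larger ontology can be sketched as follows. This is a toy model in plain Python, not OWL DL: the concept names, the `ScopedOntologyIndividual` class, and the `build_soi` helper are all hypothetical and stand in for whatever ontology tooling would be used in practice.

```python
from dataclasses import dataclass, field

# Toy upper ontology: concept -> parent concept (None at the root).
# All concept names here are hypothetical.
UPPER = {
    "Thing": None,
    "Commodity": "Thing",
    "Vessel": "Thing",
    "DiveBoat": "Vessel",
    "Location": "Thing",
    "Port": "Location",
    "TariffCode": "Thing",
}


@dataclass
class ScopedOntologyIndividual:
    """Minimal bundle of concepts and data about one event."""
    event_id: str
    concepts: set = field(default_factory=set)
    data: dict = field(default_factory=dict)


def ancestors(concept):
    """Yield the concept and every concept that subsumes it."""
    while concept is not None:
        yield concept
        concept = UPPER[concept]


def build_soi(event_id, elements):
    """Build an SOI; elements maps a data value to its classifying concept."""
    soi = ScopedOntologyIndividual(event_id)
    for value, concept in elements.items():
        soi.data[value] = concept
        soi.concepts.update(ancestors(concept))
    return soi


soi = build_soi(
    "ruling-001",  # hypothetical event identifier
    {"M/V NAUTILUS EXPLORER": "DiveBoat", "San Diego": "Port"},
)
print(sorted(soi.concepts))
```

Only the concepts needed to frame the two data elements are kept; `Commodity` and `TariffCode` stay out of the SOI until a later query or subsetting step brings them in.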

Visualization of ontology
[Diagram labels: Scoped Ontology Individuals; Human machine interface; SOI; Stack of SOIs supporting analysis of analysis; Ontology Framework; Ontology reasoning]
The concept of a Scoped Ontology Individual (SOI) opens up a visualization paradigm that has never been exposed before (it is an original concept that is based on decades of work in cognitive neuroscience).
SOI design by-passes the critical “visualization” choke point that occurs when Ontology Systems are built on the relational database model (as is done in our ontology augmentation of rule engines). This by-pass is created when the data elements in a report are used to subset upper ontologies and domain ontologies to produce the minimal set of “concepts” needed to frame the data.
If Framework Ontology is being used, then this subsetting process has an expansion / contraction cycle that produces very small SOI objects (see previous slide).

[Diagram labels: Ontology Framework; Ontology Reasoner; Scoped Ontology Individuals; Knowledge Management visualization; Customs analyst]
Readware MITi Inc and InOrb Technologies have teamed to develop a demonstration capability based on the use of the Readware internal ontology API to create text elements that populate the 18 cells of the e-SOF.
We use the triple (y, x, z), where x is from the set { people, places, things }, y is from the set { past, present, future }, and z is from the set { how, why }.
This involves three steps:
1) Coding eight probes that use the internal Readware stem-based text understanding computations to find information and classify this information as answers to people, places, things, past, present, future, how or why questions.
2) There are some options, but the one we are investigating first is to run the People, Places and Things probes first. This is a well-known “named entity extraction” approach.
3) When one of these three probes “finds” something, the local neighborhood (in the Readware stem structure) is examined to see whether one or more of the 18 questions can be answered.
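The three steps above can be sketched with a toy stand-in. The code below does not use the Readware API or its stem structure; it assumes simple cue-word sets for the eight probes and a fixed token window as the "local neighborhood", purely to show the control flow of running the entity probes first and then examining the neighborhood.

```python
import re

# Hypothetical cue words standing in for the eight probes.
ENTITY_CUES = {
    "people": {"passengers", "crew"},
    "places": {"diego", "mexico"},
    "things": {"boat", "vessel"},
}
TIME_CUES = {
    "past": {"was", "had"},
    "present": {"owns", "operates"},
    "future": {"would", "will"},
}
MODE_CUES = {
    "how": {"by", "using"},
    "why": {"for", "because"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify(window, cues):
    """Return the labels whose cue words appear in the window."""
    return [label for label, words in cues.items() if words & set(window)]


def probe(text, radius=4):
    """Run entity probes first, then inspect each hit's neighborhood."""
    tokens = tokenize(text)
    frames = {}
    for i, token in enumerate(tokens):
        for entity, words in ENTITY_CUES.items():
            if token not in words:
                continue
            window = tokens[max(0, i - radius): i + radius + 1]
            for y in classify(window, TIME_CUES):
                for z in classify(window, MODE_CUES):
                    frames.setdefault((y, entity, z), " ".join(window))
    return frames


SAMPLE = ("Nautilus would like to embark passengers in San Diego "
          "for three days of diving.")
frames = probe(SAMPLE)
print(frames)
```

With these cue words, only the `passengers` hit has both a time cue and a mode cue inside its window, so a single frame is filled and the remaining 17 stay blank.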

The other choke point is dependency on a relational database
[Diagram labels: An event; Data Transfer Object (SOI): Scoped Ontology Individual; Transactions; Findings; Entry; Entry Summary; databases; Script pulls information; Script; ILOG Rulebase Reasoner]
Ontology Augmentation of a rule-based engine

On the relational database dependency
For complex reasons, demonstrations of how to use ontology have often used a fixed data set with doctored data to pretend that scalability issues have been solved or are not relevant. These demonstrations fall far short of correctness and hide specific known weaknesses of classical IT architecture.
The scalability issue comes from the need to extend ontology or XML: to add, delete or modify concepts. These extension requirements come from many different origins and different communities of practice, and arise as circumstances change. Extensibility is the key contribution that XML has brought.
For example, without a common data encoding paradigm, the scalability issue creates a second choke point. The relational database must have a fixed data schema.
The work on such a solution is under the XML MetaData Repository standards process: XMDR, RDF or OWL DL may, or may not, solve this problem.
Modular ontology helps, but the principles developed in differential ontology, formative ontology and Framework Ontology seem essential to solving the whole problem as completely as possible. With these approaches, we find by-passes to technology problems that are seen now by the XMDR standards committee as being unsolvable. The definition of an event-specific Scoped Ontology Individual is one of those by-passes.

There are some existing software products (Convera, AeroText, MITi, Semagix, Autonomy, and others) where a common data encoding solution exists. A data encoding solution is generally protected by patents, and is used to provide computational efficiency; one of the best examples is PriMentia's Hilbert engine, where a key-less hash table type data encoding allows contextual search in the most natural fashion. Autonomy also has the technology that Michael Lynch developed in the Autonomy spin-off N-Corp. Semagix, Applied Technical Systems, and 15 or 20 others have excellent data encoding solutions.
If a government agency selected the two or three best technologies, communication between their internal representations would be required. This may or may not be easy, depending on the specific technologies.
In Summary: These software products create an integration of classically understood methods using a common data encoding. Each COTS product uses a different internal data representation, and so the use of more than one COTS product will create binding issues.
A modular ontology management architecture can be used to integrate technologies like:
1) semantic extraction and related knowledge discovery in data technology (implicit ontology)
2) ontology development and editing (explicit ontology)
3) advanced algorithms related to risk definition and decision support
4) visualization technology

High level view of integration architecture
So government agencies really have two solution paths:
1) Choose one or two vendors after actually understanding what each vendor provides, and create a complete solution with that tool set. This requires an integration architecture.
2) Learn from a Trade Study process what the methods are that make COTS semantic extraction work, move around the patents and other IP, and develop a unique application that is specific to that government agency.
In either case, the greater challenge is the technology transition challenge. If the technology is not a LOT better than the current beta sites and doctored demonstrations, then the transition effort will fail.
But, leaving transition issues aside, let us look more closely at these two options.

So we have two solution paths:
1) Choose one vendor after actually understanding what each vendor provides, and create a complete solution with that tool set.
But how to select?
[Diagram labels: Current generation best of breed technology; Current generation may not solve all problems in an optimal fashion; Next generation tools are not yet ready to produce systems]
CoreSystem CoreOntology first takes on the underlying stability issue by moving forward a design-time Iconic language that may revolutionize how society uses computers.
The list of possible qualified candidates for offering a complete solution might be less than 20 companies. In many cases, these companies are highly capitalized and would provide stability for some period of time. However, the underlying XML and ontology standards are not stable. One would expect that better global solutions will exist within five years. So one needs to know that the sets of concepts can be exported and transformed as the market matures.

So we have two solution paths:
2) Learn from a Trade Study process what the methods are that make COTS semantic extraction work, move around the patents and other IP, and develop a unique application that is specific to Customs.
These two diagrams are from OntologyStream Inc. There is no suggestion that this non-capitalized small company has the management skills required to build out an application specifically designed from the principles discussed by Prueitt and his colleagues. So we have sought the support and guidance of SAIC or IBM to bring a small team together to develop a government-owned system based on these principles and at the smallest possible cost.

Summary
Current contractors almost always treat ontology and XML technology as if they were the same as relational database technology.
Current contractors are gaming the contracts so that maximum Time and Materials resources can be expended.
Ontology and XML standards committees struggle with the issues of private intellectual property and hidden agendas.
Ontology visualization by users is required to find optimal solutions consistent with cultural expectations.
Ontology and XML standards have not been able to address ontology visualization or process models that place Ontology and XML into complex work flow.
A single payer entity is needed to bind together the best technology and to resolve IP and philosophical differences.