Co-champions: Mike Bennett, Andrea Westerinen

Slides:



Advertisements
Similar presentations
Dr. Leo Obrst MITRE Information Semantics Information Discovery & Understanding Command & Control Center February 6, 2014February 6, 2014February 6, 2014.
Advertisements

Learning Semantic Information Extraction Rules from News The Dutch-Belgian Database Day 2013 (DBDBD 2013) Frederik Hogenboom Erasmus.
Financial Industry Semantics and Ontologies The Universal Strategy: Knowledge Driven Finance Financial Times, London 30 October 2014.
Building and Analyzing Social Networks Web Data and Semantics in Social Network Applications Dr. Bhavani Thuraisingham February 15, 2013.
Using the Semantic Web to Construct an Ontology- Based Repository for Software Patterns Scott Henninger Computer Science and Engineering University of.
How to survive the document & data tsunami? Lambda Verdonckt Business Analyst TenForce.
Data warehouse example
Information and Business Work
Research topics Semantic Web - Spring 2007 Computer Engineering Department Sharif University of Technology.
DATA WAREHOUSING.
Text Mining: Finding Nuggets in Mountains of Textual Data Jochen Dijrre, Peter Gerstl, Roland Seiffert Presented by Drew DeHaas.
Ontology Learning and Population from Text: Algorithms, Evaluation and Applications Chapters Presented by Sole.
Artificial Intelligence Research Centre Program Systems Institute Russian Academy of Science Pereslavl-Zalessky Russia.
Redefining Perspectives A thought leadership forum for technologists interested in defining a new future June COPYRIGHT ©2015 SAPIENT CORPORATION.
Semantic Web Technologies Lecture # 2 Faculty of Computer Science, IBA.
Chapter 10 Architectural Design
Mining the Semantic Web: Requirements for Machine Learning Fabio Ciravegna, Sam Chapman Presented by Steve Hookway 10/20/05.
Knowledge representation
Ontology Summit 2015 Track C Report-back Summit Synthesis Session 1, 19 Feb 2015.
Oracle Database 11g Semantics Overview Xavier Lopez, Ph.D., Dir. Of Product Mgt., Spatial & Semantic Technologies Souripriya Das, Ph.D., Consultant Member.
Andreas Abecker Knowledge Management Research Group From Hypermedia Information Retrieval to Knowledge Management in Enterprises Andreas Abecker, Michael.
1 Technology in Action Chapter 11 Behind the Scenes: Databases and Information Systems Copyright © 2010 Pearson Education, Inc. Publishing as Prentice.
Tool for Ontology Paraphrasing, Querying and Visualization on the Semantic Web Project By Senthil Kumar K III MCA (SS)‏
1 Software Engineering: A Practitioner’s Approach, 6/e Chapter 10a: Architectural Design Software Engineering: A Practitioner’s Approach, 6/e Chapter 10a:
Trustworthy Semantic Webs Dr. Bhavani Thuraisingham The University of Texas at Dallas Lecture #4 Vision for Semantic Web.
Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.
Personalized Recommendation of Related Content Based on Automatic Metadata Extraction Andreas Nauerz 1, Fedor Bakalov 2, Birgitta.
1 Open Ontology Repository initiative - Planning Meeting - Thu Co-conveners: PeterYim, LeoObrst & MikeDean ref.:
Overview of Statistical NLP IR Group Meeting March 7, 2006.
International Workshop 28 Jan – 2 Feb 2011 Phoenix, AZ, USA Ontology in Model-Based Systems Engineering Henson Graves 29 January 2011.
Chapter 9 Architectural Design. Why Architecture? The architecture is not the operational software. Rather, it is a representation that enables a software.
Semantic Web Technologies Readings discussion Research presentations Projects & Papers discussions.
Jonatas Wehrmann, Willian Becker, Henry E. L. Cagnini, and Rodrigo C
4/19/ :02 AM © Microsoft Corporation. All rights reserved. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN.
THE SP SYSTEM AS AN AID TO CRIME PREVENTION AND DETECTION (CPD)
The Semantic Web By: Maulik Parikh.
Based on four case studies and a follow-up survey, we have identified the key success factors for realizing value from DDS (digital data stream) investments.
Sharing lessons through effective modelling
Track B Summary Co-champions: Mike Bennett, Andrea Westerinen
IC Conceptual Data Model (CDM)
“Intelligent User Interfaces” by Hefley and Murray A 1993 Perspective
Guangbing Yang Presentation for Xerox Docushare Symposium in 2011
Knowledge Representation and Reasoning into Machine and Deep Learning
Kenneth Baclawski et. al. PSB /11/7 Sa-Im Shin
Lecture #11: Ontology Engineering Dr. Bhavani Thuraisingham
Datamining : Refers to extracting or mining knowledge from large amounts of data Applications : Market Analysis Fraud Detection Customer Retention Production.
CHAPTER 2 CREATING AN ARCHITECTURAL DESIGN.
Ontology Summit 2016 – Track B
Accelerating intelligent automation for competitive advantage
Chapter 1 The Systems Development Environment
Analyzing and Securing Social Networks
European Network of e-Lexicography
SmaRT Visualization of Legal Rules for Compliance
COMP62342: Ontology Engineering for the Semantic Web
FIBO-aligned Semantic Triples
Design Model Like a Pyramid Component Level Design i n t e r f a c d s
2. An overview of SDMX (What is SDMX? Part I)
Ontology Summit Synthesis
2. An overview of SDMX (What is SDMX? Part I)
CSE 635 Multimedia Information Retrieval
Chapter 9 Architectural Design.
Semantic Information Modeling for Federation
KNOWLEDGE MANAGEMENT (KM) Session # 37
Upper Ontologies for Specifying Context
Co-champions: Mike Bennett, Andrea Westerinen
Improving Machine Learning using Background Knowledge
Chaitali Gupta, Madhusudhan Govindaraju
Jonathan Griffin, Managing Director, IFIS Publishing &
Indegene’s AI/NLP Powered Pharmacovigilance/Safety Solution
Presentation transcript:

Co-champions: Mike Bennett, Andrea Westerinen Track B Summary Co-champions: Mike Bennett, Andrea Westerinen Synthesis 2: 26 April 2017

Track B : Using Background Knowledge to Improve Machine Learning Results Track Co-champions: Mike Bennett Andrea Westerinen Blog Page http://ontologforum.org/index.php/Blog:Improving_Machine_Learning_using_Background_K nowledge

Motivations Machine Learning (ML) is based on defining and using mathematical models to perform tasks, predict outcomes, make recommendations, etc. Initial models can be specified by a data scientist, and/or constructed through combinations of supervised and unsupervised learning and pattern analysis Challenges with this: If no background knowledge is employed, the ML results may not be understandable There is a bewildering array of model choices and combinations Background knowledge could improve the quality of ML results by using reasoning techniques to select learning models and prepare the training and examined data (reducing large, noisy data sets to manageable, focused ones)

Objective The objective of this Ontology Summit 2017 track is to understand: Challenges in using different kinds of background knowledge in machine learning Role of ontologies, vocabularies and other resources to improve machine learning results Design/construction/content/evolution/... requirements for an ontology to support machine learning

Summit Graphical Overview

Track B Positioning

Track B: The Story So Far… In the first Track B session on March 15, there were two presentations on combining ontologies with machine learning and natural language processing technologies in order to improve results. In the first case, ontologies were combined with ML to improve decision support. The benefits included improving the quality of decisions, making decisions more understandable, and adapting the decision making processes in response to changing conditions. Regarding combining ontologies with NLP processing, this was in support of digital forensics and situational awareness. Concept extraction from natural language text was improved by using an ontology to isolate the meanings/semantics of the concepts and provide “artificial intuition” into the text. In the second Track B session, we aimed to continue discussing the use of ontologies to improve machine learning understandability and natural language processing. We had presentations on : Meaning-Based Machine Learning (MBML), discussing how to get meaningful output from existing machine learning techniques, The use of FIBO and corporate taxonomies to extract and integrate information from data warehouses, operational stores and natural language communications, Driving knowledge extraction via the use of a semantic model/ontology.

Track B Session 2 Content (12 April) Dr Courtney Falk, Infinite Machines Machine-based Machine Learning Bryan Bell, Expert System Leveraging FIBO with Semantic Analysis to Perform On-Boarding, KYC and CDD Tatiana Erekhinskaya, Lymba Corporation Converting Text into FIBO-Aligned Semantic Triples We learned about phishing, bears and cheese…

Dr Courtney Falk, Infinite Machines Title: Machine-based Machine Learning Abstract: Meaning-based machine learning (MBML) is a project to investigate how to get meaningful output from existing machine learning techniques. MBML builds from the Ontological Semantics Technology (OST) where natural language resources map to an ontology. An application of MBML to detecting phishing emails provides some initial experimental results. Finally, future research directions are explored.

Courtney Falk: Highlights: “Ontological Semantics” No logical formalism Outputs Text Meaning Representation Word senses Latent semantics analysis Vectors Meaning based machine learning – starts with human acquirers then learns Phishing application Results (performance etc.) Decision trees Status Early days – need larger data sets to prove the approach

Bryan Bell, Expert System Title: Leveraging FIBO with Semantic Analysis to Perform On-Boarding, KYC and CDD Abstract: The Financial Industry Business Ontology (FIBO) provides a common ontology and taxonomy for financial instruments, legal entities, and related knowledge. It provides regulatory and compliance value by ensuring that a common language can be used for data harmonization and reporting purposes. This session discusses using the formal structure of FIBO and other corporate taxonomies on unstructured information in data warehouses, operational stores and natural language communications (such as news articles, research reports, customer interactions, emails, and product descriptions), in order to create new value and aid in onboarding new customers, establishing a dependable know-your-customer process and complete on-going customer due diligence processes. Financial Industry Business Ontology (FIBO) provides the common language for bridging interoperability gaps and organizing content in a consistent way.

Bryan Bell Highlights 4 stages of analysis Morphological Grammatical Logical Semantic Context: identify context to return most appropriate word sense Live demo – using today’s news Examples Stock Bears Financial services portal based on Financial Industry Business Ontology (FIBO) Regulatory: Know Your Customer (KYC), due diligence etc. based on free-form text

Tatiana Erekhinskaya, Lymba Corporation Title: Converting Text into FIBO-Aligned Semantic Triples Abstract: Ontologies are playing a major role in federating multiple sources of structured data within enterprises. However, the unstructured documents remain mostly untouched or require manual labor to be included into consolidated knowledge management process. At Lymba, we are developing a knowledge extraction tool that automatically identifies instances of concepts/classes and relations between them in the text. The extraction is driven by a semantic model or ontology. For example, using FIBO terminology, the system recognizes time/duration constraints in contracts, money values, and their meaning - transaction value, penalty, fee, etc. and links them to the parties in the contract. The extracted knowledge is represented in a form of semantic triples, which can be persisted in an RDF storage to allow integration with other sources, inference, and querying. One more useful add-on is natural language querying capability, when a query like “Find clauses with time constraints for payor” is automatically converted into semantic triples, and then into SPARQL. This talk provides an overview of Lymba’s knowledge extraction pipeline and knowledge representation framework. Semantic parsing and triple-based representation provide a bridge between semantic technologies and NLP, leveraging inference techniques and existing ontologies. We show how Lymba’s Semantic Calculus framework allows easy customization of the solution to different domains.

Tatiana Erekhinskaya: Highlights Text mining (text to triples) FIBO triples Federation, visualization, natural language querying, extension Contract Processing Sensitive to roles in the ontology Uses nuanced FIBO structures of Party in Role etc. Also identifies e.g. Fee as a monetary amount playing a role in a context Knowledge extraction – 11 steps Named entity / layered extraction Applications Automated ontology extension Financial domain: Contract / compliance / risk / reporting (BCBS239 etc.) Cheese

Chat Transcript: Key Points Automated learning more contexts Identifying the context of a term Linguistic – what part of speech Semantic – what subject area Visualization Neologisms: how handled? Cheese Street name for heroin Attempts to disguise context – illicit activity Bear What about a bear market? (sentiment) Different tokenizations in different languages Use of a language ontology Can we analyze these chat logs with those tools?

Questions?