BAA - Big Mechanism using SIRA Technology
Chuck Rehberg, CTO at Trigent Software and Chief Scientist at Semantic Insights™


The Big Mechanism Vision
Big Mechanisms are causal, explanatory models of complicated systems in which interactions have important causal effects. The collection of Big Data is increasingly automated, but the creation of Big Mechanisms remains a human endeavor made increasingly difficult by the fragmentation and distribution of knowledge. To the extent that we can automate the construction of Big Mechanisms, we can change how science is done. The Big Mechanism program will develop technology to read research abstracts and papers to extract fragments of causal mechanisms, assemble these fragments into more complete causal models, and reason over these models to produce explanations. The domain of the program will be cancer biology with an emphasis on signaling pathways.
- Broad Agency Announcement - Big Mechanism, DARPA-BAA-14-14, January 30, 2014
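The read, assemble, and explain steps named in the announcement can be pictured with a toy sketch. The Python below is only an illustration under stated assumptions: the `Fragment` type, the `assemble` and `explain` functions, and the placeholder entities `A` to `D` are hypothetical and come from neither the BAA nor the SIRA Technology. It treats each extracted fragment as a directed causal edge, merges fragments into one graph while keeping their sources, and uses breadth-first search to return one causal chain as an "explanation".

```python
from collections import defaultdict, deque
from typing import NamedTuple, Optional


class Fragment(NamedTuple):
    """One causal claim extracted from a paper (hypothetical structure)."""
    cause: str
    effect: str
    source: str  # identifier of the document the claim came from


def assemble(fragments):
    """Merge fragments into an adjacency map: cause -> {effect: [sources]}."""
    graph = defaultdict(dict)
    for f in fragments:
        graph[f.cause].setdefault(f.effect, []).append(f.source)
    return graph


def explain(graph, start, goal) -> Optional[list]:
    """Return the shortest causal chain from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], {}):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Placeholder fragments, invented purely for illustration.
fragments = [
    Fragment("A", "B", "paper-1"),
    Fragment("B", "C", "paper-2"),
    Fragment("A", "B", "paper-3"),
    Fragment("C", "D", "paper-2"),
]
model = assemble(fragments)
chain = explain(model, "A", "D")
print(" -> ".join(chain))  # prints "A -> B -> C -> D"
```

Keeping the source list on every edge means an assembled chain can always be traced back to the documents that support each step.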

The Big Challenge
– Valuable information is locked up in research abstracts and papers
– The rate of change of information can be high
– The information is fragmented
– Keyword search and statistical methods for locating the information of interest in documents still leave the human to determine and extract the useful information
– Effectively automating the identification and extraction of useful information requires some level of natural language understanding
– Automating the extraction of useful information from natural language texts (i.e. without human interpreters) requires the application of both domain and world knowledge

About Semantic Insights™
– Who are we? “Semantic Insights” is the R&D division of Trigent Software, Inc.
– What is our Mission? Automate research tasks (faster, better, cheaper)
– Why Semantics? Semantics allows us to operate at the “meaning level”, to “separate the know from the show”*
– Why us? Bright People, Proprietary Technology (IP), Passion for Excellence and Track Record of Delivery
*

The Semantic Insights Research Assistant (SIRA) Project
Mission:
– Research: The SIRA Technology was developed to automate research tasks requiring natural language, domain knowledge, understanding and reasoning.
– Development: All SIRA-based products must be easy to use, requiring little or no training beyond what the user already understands.
Mission Status:
– Trigent Software has spent nearly 10 years doing fundamental self-funded research, resulting in 6 patents for a fast, scalable inference engine as well as numerous aspects of natural language processing and natural language understanding
– Our current clients are “early adopters” with a time-critical need for specific detailed information that cannot be met using conventional technology
– We have begun testing products based on this research
– More fundamental research needs to be done to realize its full potential
Today the SIRA Technology can:
1. Semantically understand a statement of your interest expressed in Natural Language (i.e. your research statements)
2. Use that understanding to Read through a vast number of documents
3. Quickly identify the semantically relevant information of interest in a large corpus of Natural Language text
4. Restructure and Report the findings in useful ways including Natural Language text

Goal of the SIRA Technology
Goal: Automate the capture and application of domain knowledge, world knowledge and experience to automate the extraction and reporting of useful information from natural language texts.
1. Automate
   – To the extent possible, remove the requirement for human guidance or interpretation
2. Knowledge
   – Semantic Quanta in relation, representing a model of what is, or is possible (e.g. a robust kind of Ontology)
   – Linguistic representation of Semantic Quanta (e.g. a rich dictionary)
   – Inferences and Implications
   – Experience
3. Extract
   – Recognize and map natural language into semantic item clusters
4. Useful information
   – Information that is sought, already known, or directly related (skip over the rest)
5. Report
   – Translate “report requests” into various aggregations and representations of only the useful information
6. Natural language texts
   – Natural language prose formatted as text, pdf, doc, docx, html…

What is required to automate extraction and presentation of useful information from natural language text?
1. Natural Language Understanding
   – Natural Language Processing (NLP)
   – Determine possible meanings in linguistic and semantic context
   – Requires enough domain and world knowledge
2. Adding to the Understanding
   – Using the system adds to the same domain and world knowledge already used in Natural Language Understanding
3. Reasoning, Querying and Reporting
   – Finding only the required information in domain and world knowledge
   – Applying reasoning algorithms to add to the domain and world knowledge

Requirements for Natural Language Understanding
1. The various relationships expressed in each sentence need to be identified and understood in context.
2. The relationships expressed across sentences need to be identified and understood in context.
3. The possibly valid senses of each term in a sentence need to be identified and evaluated in context.
4. Multiple ways of expressing semantically equivalent relationships need to be recognized.
5. Multiple ways of expressing semantically equivalent terms need to be recognized.
6. The mapping of natural language expressions to ontological expressions needs to be identified and processed in context.
7. Evidentiary and implication relationships need to be identified and evaluated in context.
8. Context in natural language text needs to be identified and exploited at various levels of scoping.
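Requirements 4 and 5 can be made concrete with a deliberately small sketch. The Python below is an assumption-laden toy, not a description of how SIRA works: the lexicons `TERM_SYNONYMS` and `RELATION_PHRASES`, the function `normalize`, and the example sentences are all hypothetical. It maps different surface wordings onto one canonical (subject, relation, object) triple by plain lookup, whereas the requirements above call for sense resolution in context.

```python
# Hypothetical lexicons; a real system would draw on a rich dictionary and
# resolve senses in context rather than by simple lookup.
TERM_SYNONYMS = {
    "car": "automobile",
    "auto": "automobile",
    "automobile": "automobile",
    "firm": "company",
    "company": "company",
}
RELATION_PHRASES = {
    "is made by": "manufactured_by",
    "is built by": "manufactured_by",
    "is manufactured by": "manufactured_by",
}


def normalize(sentence: str):
    """Map 'X <relation phrase> Y' to a canonical (subject, relation, object)."""
    text = sentence.lower().rstrip(".")
    for phrase, relation in RELATION_PHRASES.items():
        left, sep, right = text.partition(f" {phrase} ")
        if sep:
            subj = left.split()[-1]   # head word just before the phrase
            obj = right.split()[-1]   # last word after the phrase
            return (
                TERM_SYNONYMS.get(subj, subj),
                relation,
                TERM_SYNONYMS.get(obj, obj),
            )
    return None  # no known relation phrase found


first = normalize("The car is made by the firm.")
second = normalize("The auto is built by the company.")
print(first == second)  # prints "True"
```

Both sentences reduce to the same triple, which is the property requirements 4 and 5 ask for; everything the sketch ignores (ambiguity, scope, cross-sentence reference) is what the remaining requirements address.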

Requirements for Adding to the Understanding
1. An adequate representation of the semantic items, including concepts, relationships, instances, properties and units of measure, needs to be identified and populated.
2. Representations of time and identity need to be defined and populated.
3. An adequate representation of the definition of semantic items, including senses, synonyms, and rich linguistic metadata, needs to be identified and populated.
4. The mapping of natural language expressions to ontological expressions resulting from “machine reading” needs to be exploited to update the ontology (concepts, relationships, instances, properties and units of measure) and dictionary.
5. Evidentiary and implication relationships need to be updated based on analysis of the updated ontology.
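One possible shape for the semantic items listed above is sketched below in Python. This is a minimal illustration, assuming a record-per-item design; every class, field, and value (`Sense`, `Concept`, `Property`, `Instance`, `Relationship`, `Ontology`, `add_reading`, the `widget` data) is hypothetical and not taken from SIRA. It covers concepts with senses and synonyms, instances with properties and explicit units, and relationships that accumulate evidence as new readings arrive; it omits the representations of time and identity.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    """One meaning of a term, with its synonyms (dictionary side)."""
    gloss: str
    synonyms: list = field(default_factory=list)


@dataclass
class Concept:
    name: str
    senses: list = field(default_factory=list)


@dataclass
class Property:
    name: str
    value: float
    unit: str  # unit of measure, kept explicit rather than implied


@dataclass
class Instance:
    name: str
    concept: Concept
    properties: list = field(default_factory=list)


@dataclass
class Relationship:
    subject: str
    predicate: str
    obj: str
    evidence: list = field(default_factory=list)  # supporting source sentences


@dataclass
class Ontology:
    instances: dict = field(default_factory=dict)
    relationships: dict = field(default_factory=dict)

    def add_reading(self, rel: Relationship) -> None:
        """Merge a machine-read relationship, accumulating its evidence."""
        key = (rel.subject, rel.predicate, rel.obj)
        if key in self.relationships:
            self.relationships[key].evidence.extend(rel.evidence)
        else:
            self.relationships[key] = rel


# Placeholder data, invented purely for illustration.
widget = Concept("widget", [Sense("a small manufactured part", ["gadget"])])
onto = Ontology()
onto.instances["w1"] = Instance("w1", widget, [Property("mass", 2.5, "g")])
onto.add_reading(Relationship("w1", "part_of", "m1", ["sentence 1"]))
onto.add_reading(Relationship("w1", "part_of", "m1", ["sentence 2"]))
print(onto.relationships[("w1", "part_of", "m1")].evidence)
```

Reading the same relationship twice leaves a single entry with both supporting sentences attached, which is one simple way to keep evidentiary links current as the ontology is updated.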

Requirements for Reasoning, Querying and Reporting
1. The resulting ontology needs to be made widely available for ad hoc access (e.g. as a linked data endpoint queryable using SPARQL).
2. Researchers also need to be able to query the natural language corpus directly using natural language to discover new relationships not yet known in the ontology.
3. Various reasoning algorithms, including pattern discovery, analogy, and implication, need to be employed over the ontology to discover non-explicit knowledge.
4. Reports in various representations (including fresh and quoted natural language prose) need to be generated from the ontology.
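Ad hoc querying and implication-style reasoning over an ontology can be sketched with plain triples. The Python below is a stand-in under stated assumptions: it uses an in-memory set of tuples rather than a real SPARQL endpoint, and the functions `match` and `infer_transitive`, the predicate `part_of`, and the entities `A`, `B`, `C` are hypothetical. `match` plays the role of a single triple pattern (roughly what a SPARQL pattern such as `?s :part_of :C` would express), and `infer_transitive` adds the non-explicit facts implied by a transitive relation.

```python
def match(triples, s=None, p=None, o=None):
    """Return triples fitting the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def infer_transitive(triples, predicate):
    """Add implied triples, treating `predicate` as transitive."""
    known = set(triples)
    changed = True
    while changed:  # repeat until no new triple can be derived
        changed = False
        edges = [(s, o) for s, p, o in known if p == predicate]
        for s1, o1 in edges:
            for s2, o2 in edges:
                if o1 == s2 and (s1, predicate, o2) not in known:
                    known.add((s1, predicate, o2))
                    changed = True
    return known


# Placeholder ontology content, invented purely for illustration.
explicit = {
    ("A", "part_of", "B"),
    ("B", "part_of", "C"),
    ("A", "type", "Component"),
}
inferred = infer_transitive(explicit, "part_of")
print(match(explicit, p="part_of", o="C"))  # only the stated fact
print(match(inferred, p="part_of", o="C"))  # stated plus implied facts
```

The second query returns a fact that no source stated directly, which is the kind of non-explicit knowledge requirement 3 refers to.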

SIRA Approach to Natural Language Understanding

SIRA Approach to adding to Understanding

SIRA Approach to Reasoning, Querying and Reporting

BAA – Big Mechanism

SIRA Technology to Big Mechanism “Read”

SIRA Technology to Big Mechanism “Assembly”

SIRA Technology to Big Mechanism “Explanation”