SWG Strategy (C) Copyright IBM Corp. 2006, 2012. All Rights Reserved. P4 Task 2 Fact Extraction using a CNL Current Status David Mott, Dave Braines, ETS,


P4 Task 2 Fact Extraction using a CNL
Current Status
David Mott, Dave Braines, ETS, Hursley, IBM UK
Steve Poteet, Ping Xue, Anne Kao, Boeing

Project 4 Task 2 Research Objectives

Improve extraction of facts (in CE) from documents (in Natural Language)
– unambiguous "semantics" of the document
– machine can assist analyst in inference of new conclusions
Provide rationale for linguistic and analytic processing
– allow the human to be part of the NL processing
– reasoning, argumentation about ambiguities, incomplete parsing...
Define a model of linguistics, grammar, semantics
– facilitate configuration of NLP tools in a CNL
– human analyst can better understand the processing
Improve expressibility of CE
– interest in CE, but needs a more "stylistic" grammar
How is the Natural Language Processing related to the "Analyst's Conceptual Model" (ACM)?

Example processing (1)

Input report:
BCT patrol in East Rashid discover a bomb-making facility on Abu Tajara Street //MGRSCOORD: 38S MB //

Extracted CE:
the patrol unit '|BCT patrol|' finds the facility '|p6|' and is contained in the place '|East Rashid|' and is located in the place '|East Rashid|' and is a NATO military unit....

ISSUES:
– names are a bit strange
– unnecessary "contained in"
– missed the bomb-making and the "on..."
– ignored the MGRS information
– "a NATO military unit" is unnecessary?
BUT:
– this is CE, fully conformant to the ACM
– this is machine-processable
– this has a defined semantics
– rationale for processing is available
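As an illustration of the surface form of the CE shown above, the following is a minimal sketch that assembles such a sentence from a subject and a list of clauses. The names `Clause`, `quote` and `render_ce` are hypothetical and are not part of any ITA CE tooling; the sketch only reproduces the string pattern visible on the slide and makes no claim about how the actual extractor generates CE.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Clause:
    """One 'and'-joined part of a CE sentence (hypothetical helper type)."""
    relation: str                  # e.g. "finds", "is located in"
    concept: Optional[str] = None  # concept of the object, e.g. "place"
    name: Optional[str] = None     # name of the object, e.g. "East Rashid"


def quote(name: str) -> str:
    # Names appear on the slide wrapped as '|name|'.
    return f"'|{name}|'"


def render_ce(concept: str, name: str, clauses: List[Clause]) -> str:
    """Join a subject and its clauses into one CE-style sentence."""
    parts = []
    for clause in clauses:
        if clause.name is None:
            # Clause with no named object, e.g. "is a NATO military unit".
            parts.append(clause.relation)
        else:
            parts.append(f"{clause.relation} the {clause.concept} {quote(clause.name)}")
    return f"the {concept} {quote(name)} " + " and ".join(parts) + "."
```

Dropping the "is contained in" clause from the list is then enough to address the "unnecessary contained in" issue at the rendering level, although the slide suggests the real fix belongs in the extraction step.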

Current NL Processing

[Diagram; labels as extracted: Stanford Parser; Entity Extractor; Situation Extractor; Names; CE Aggregator; CEStore; SYNCOIN Reports; Message PreProcessor; "Stylistic" CE; Conceptual Model (concepts, logical rules, linguistic expression); Proper Nouns (places, units); For Analysis]

Just exploratory steps
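The components named on this slide can be read as a chain of stages, each taking a message and enriching it. The sketch below shows that shape only. The stage order, the function names and the `PROPER_NOUNS` table are assumptions made for illustration, and the `parse` stage is a whitespace tokeniser standing in for a real syntactic parser; it does not call the Stanford Parser.

```python
import re
from typing import Callable, Dict, List

Message = Dict[str, object]
Stage = Callable[[Message], Message]

# Pattern for the coordinate suffix seen in the example report.
MGRS = re.compile(r"//\s*MGRSCOORD:\s*(.*?)\s*//")

# Hypothetical proper-noun lexicon: name -> concept.
PROPER_NOUNS = {"BCT patrol": "patrol unit", "East Rashid": "place"}


def preprocess(msg: Message) -> Message:
    """Message pre-processing: pull the MGRS field out of the free text."""
    text = str(msg["text"])
    match = MGRS.search(text)
    out = dict(msg)
    out["mgrs"] = match.group(1) if match else None
    out["text"] = MGRS.sub("", text).strip()
    return out


def parse(msg: Message) -> Message:
    """Stand-in for the parser stage: whitespace tokens only."""
    out = dict(msg)
    out["tokens"] = str(msg["text"]).split()
    return out


def extract_entities(msg: Message) -> Message:
    """Entity extraction by lookup of known proper nouns in the text."""
    text = str(msg["text"])
    out = dict(msg)
    out["entities"] = [(n, c) for n, c in PROPER_NOUNS.items() if n in text]
    return out


def run_pipeline(msg: Message, stages: List[Stage]) -> Message:
    """Apply each stage in turn, passing the enriched message along."""
    for stage in stages:
        msg = stage(msg)
    return msg
```

Keeping each stage as a plain function over a message makes it easy to swap one stage for another, which is relevant to the later question of replacing the parser with something simpler.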

Conceptual Model(s)

Meta Model
– concepts: Concept, Entity Concept, Relation Concept, Conceptual Model
– relations: belongs to, has as domain
Semiotic Triangle
– concepts: Thing, Meaning, Symbol
– relations: stands for, expresses
General
– concepts: Agent, Spatial Entity, Temporal Entity, Situation, Container
– relations: has as agent role, is contained in
Linguistic
– concepts: Sentence, Phrase, Word, Noun, Linguistic Category, Linguistic Frame
– relations: has as modifier, is parsed from
ACM
– concepts: Place, Church, Person, Village, IED, Facility, ....
– relations: is located in

[Diagram: "Our" Semiotic Triangle, based on the original [Ogden, C. K. and Richards, I. A. (1923). ]; nodes: meaning, symbol, thing; edges: conceptualises, stands for, expresses]
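The layered model on this slide can be held as a simple lookup structure. The sketch below is one possible encoding, assuming each layer is just a set of concept names and a set of relation names taken from the slide; the names `Layer`, `MODEL` and `layer_of` are hypothetical, and the elided ACM concepts ("....") are left out rather than guessed.

```python
from typing import Dict, List, NamedTuple, Optional


class Layer(NamedTuple):
    concepts: List[str]
    relations: List[str]


# Concepts and relations as listed on the slide, one entry per model layer.
MODEL: Dict[str, Layer] = {
    "Meta Model": Layer(
        ["Concept", "Entity Concept", "Relation Concept", "Conceptual Model"],
        ["belongs to", "has as domain"]),
    "Semiotic Triangle": Layer(
        ["Thing", "Meaning", "Symbol"],
        ["stands for", "expresses"]),
    "General": Layer(
        ["Agent", "Spatial Entity", "Temporal Entity", "Situation", "Container"],
        ["has as agent role", "is contained in"]),
    "Linguistic": Layer(
        ["Sentence", "Phrase", "Word", "Noun", "Linguistic Category",
         "Linguistic Frame"],
        ["has as modifier", "is parsed from"]),
    "ACM": Layer(
        ["Place", "Church", "Person", "Village", "IED", "Facility"],
        ["is located in"]),
}


def layer_of(term: str) -> Optional[str]:
    """Return the layer that defines a concept or relation, if any."""
    for layer_name, layer in MODEL.items():
        if term in layer.concepts or term in layer.relations:
            return layer_name
    return None
```

A lookup of this kind would let a rationale trace say which layer a term in the extracted CE comes from, for example that "is contained in" is a General relation while "is located in" belongs to the ACM.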

Lots of stuff we didn't talk about!

Extending ITA Controlled English

Strong need to improve stylistic expressiveness

Allow "common name" identity handling
– the person John...
Prepositional phrases
– in, at, on
Adjectives
Reduce need to state the type explicitly
– John...
Collections
– the group of...
Tense and aspect inflection in verbs
...
Example: John met the group of US soldiers in East Rashid

Parallel NL and CNL parsers

[Diagram; labels as extracted: NL Parser; CNL Parser; lexicon; conceptual model; Reference English Grammar; Semantic Theory; stylistically expressive CE; basic CE or predicate logic or CE-in-Java; stylistically expressive CE; NLP]

Increase stylistic expressibility of CE
Better understanding of linguistics

Discussion

Is SYNCOIN representative?
– similar to style of intelligence reports
– use of CE from reports to allow analysis was a key requirement in Pathfinder
– but we should check again with Gavin Pearson
Should we be analysing "chat"?
– it was felt that a possible application for NL processing was in extracting information from "chat"
– there are other US groups that are analysing chats
– is chat more or less complex than fuller NL?
– should we swap out the Stanford parser with something simpler?
What about slang and acronyms?
– is this just a question of using the same techniques with a different mapping of language to concept?
– or looking for predefined patterns at a pre-parsing stage?
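The second option raised for slang and acronyms, looking for predefined patterns before parsing, could be sketched as a table-driven rewrite of the raw text. This is an illustration only: the `EXPANSIONS` table and the `expand_acronyms` function are hypothetical, and the two entries are standard expansions chosen as examples rather than anything taken from the project.

```python
import re
from typing import Dict

# Hypothetical pre-parsing table: acronym -> expansion.
EXPANSIONS: Dict[str, str] = {
    "IED": "improvised explosive device",
    "BCT": "brigade combat team",
}


def expand_acronyms(text: str, table: Dict[str, str] = EXPANSIONS) -> str:
    """Replace whole-word acronyms with their expansions before parsing."""
    if not table:
        return text
    # Longest keys first so a longer acronym wins over one that prefixes it.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.sub(lambda m: table[m.group(1)], text)
```

The alternative on the slide, a different mapping of language to concept, would instead leave the text untouched and add the acronym to the lexicon as another name for the concept. That keeps names such as "BCT patrol" intact, which a text rewrite like the one above does not.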

Conclusions

We should continue with SYNCOIN
– analysis of intelligence reports is our chosen path
– fundamentally the same problem (try to communicate concepts via language) so we expect principles to be relevant to chat
We should position chat and reports on a "space"
– we will find examples of chat, and review whether it is similar or fundamentally different to more formal reporting, eg is it the degree to which information is explicit in the text, or the degree of grammaticality?
– maybe analysis of chat is a separate transition?
We should compare our work with that of other US groups
– we believe that the use of CE to facilitate the linguistic processing is different to other work