OntoSoar: Feeding a Growing Ontology. CS 652 Information Extraction and Integration, Fall 2012. Peter Lindes, 12/4/2012.



Project Goals
Use linguistic technologies to:
– Find more facts
– Learn new categories and relations
Technologies to be used:
– OntoES
– Link Grammar Parser
– LG-Soar
– Soar
– Discourse Representation Theory

What is OntoES?
A system for building OSM models
Capable of representing extraction ontologies
Processes text to extract facts


What is Soar?
A cognitive architecture
A system that implements that architecture
Major elements:
– Short- and long-term memories
– Decision procedure
– Perception and action modules
– Various kinds of learning
Example applications:
– TacAirSoar
– BOLT Project


OntoSoar

Raw OCR’d Text Example

Segmented Text
Charles Christopher Lathrop, N. Y. City, b. 1817, d. 1865, son of Mary Ely and Gerard Lathrop ; m. 1856, Mary Augusta Andruss, 992 Broad St., Newark, N. J., who was b. 1825, dau. of Judge Caleb Halstead Andruss and Emma Sutherland Goble. Mrs. Lathrop died at her home, 992 Broad St., Newark, N. J., Friday morning, Nov. 4, The funeral services were held at her residence on Monday, Nov. 7, 1898, at half-past two o'clock P. M. Their children: 1. Charles Halstead, b. 1857, d William Gerard, b. 1858, d Theodore Andruss, b. i Emma Goble, b

Parsing and Semantics
Charles Halstead, b. 1857, d
Charles Halstead was born in 1857 and died in
[Link Grammar parse diagram. Link labels: Xp, Ss, Wd, VJlsi, MVp, G, Pa, IN, VJrsi. Tagged words: LEFT-WALL Charles.b Halstead was.v-d born.a in.r 1857 and.j-v died.v-d in.r]
Predicates:
person(P1)
named(P1, "Charles Halstead")
born(P1, "1857")
died(P1, "1861")
Extracted facts:
Person(P1)
Person_Name(P1, "Charles Halstead")
Person_BirthDate(P1, "1857")
Person_DeathDate(P1, "1861")
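The rewriting of a terse fragment into a sentence the parser can handle might look roughly like the sketch below. It is only an illustration, not the OntoSoar implementation: the function name `expand_fragment` and the abbreviation table are assumptions, and the death year 1861 in the sample input is taken from the predicates on this slide.

```python
import re

# Hypothetical abbreviation table; only "b." and "d." are covered here.
ABBREVIATIONS = {"b": "was born in", "d": "died in"}

def expand_fragment(fragment: str) -> str:
    """Rewrite 'Name, b. YYYY, d. YYYY' as a plain English sentence."""
    name, *events = [part.strip() for part in fragment.split(",")]
    clauses = []
    for event in events:
        match = re.fullmatch(r"([bd])\. (\d{4})", event)
        if match is None:
            raise ValueError(f"unrecognized event: {event!r}")
        abbreviation, year = match.groups()
        clauses.append(f"{ABBREVIATIONS[abbreviation]} {year}")
    # Join the event clauses with "and" after the name.
    return f"{name} " + " and ".join(clauses)

print(expand_fragment("Charles Halstead, b. 1857, d. 1861"))
```

A real system would need a much larger table (m., dau., son of, and so on) and would have to cope with OCR noise.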

More Complex Parsing
Charles Christopher Lathrop, N. Y. City, was born in 1817 and died in 1865 and was the son of Mary Ely and Gerard Lathrop ;
[Link Grammar parse diagram, split across three rows. Link labels: Ss, MXs, Xd, Xc, VJlsi, VJrsi, G, Pv, MVp, IN, Ost, Ju, Ds, Mp, SJls, SJrs. Tagged words by row:
Charles.b Christopher.b Lathrop, N. Y. City, was.v-d born.v in.r
and.j-v died.v-d in.r 1865 and.j-v was.v-d the son.n of Mary.b Ely.m and.j-n
Gerard.m Lathrop [;]]

More Complex Semantics
Predicates:
person(P2)
named(P2, "Charles Christopher Lathrop")
place(GE1)
named(GE1, "N. Y. City")
livedIn(P2, GE1)
born(P2, "1817")
died(P2, "1865")
person(P3)
named(P3, "Mary Ely")
son(P2, P3)
person(P4)
named(P4, "Gerard Lathrop")
son(P2, P4)
couple(P3, P4)
Extracted facts:
Person(P2)
Person(P3)
Person(P4)
Person_Name(P2, "Charles Christopher Lathrop")
Person_Name(P3, "Mary Ely")
Person_Name(P4, "Gerard Lathrop")
Person_BirthDate(P2, "1817")
Person_DeathDate(P2, "1865")
Parent_has_Child(P3, P2)
Parent_has_Child(P4, P2)
Male(P2)
Parent_with_Parent(P3, P4)
GeoEntity(GE1)
GeoEntity_Name(GE1, "N. Y. City")
Person_livedIn_GeoEntity(P2, GE1)
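The step from predicates to facts on this slide can be sketched as a small rule table. This is an illustrative assumption about how such a mapping could be written, not the actual OntoSoar rules: the tuple representation, the names `ENTITY_FACTS`, `RENAMED`, and `predicates_to_facts`, and the choice to derive `Male` from `son` are all hypothetical. Only the predicate and fact names come from the slide.

```python
# Predicates and facts are modelled as (name, args) tuples.
ENTITY_FACTS = {"person": "Person", "place": "GeoEntity"}
RENAMED = {
    "born": "Person_BirthDate",
    "died": "Person_DeathDate",
    "livedIn": "Person_livedIn_GeoEntity",
    "couple": "Parent_with_Parent",
}

def predicates_to_facts(predicates):
    # Record each entity's type so named(...) can pick Person_Name or GeoEntity_Name.
    kinds = {args[0]: ENTITY_FACTS[name]
             for name, args in predicates if name in ENTITY_FACTS}
    facts = []

    def add(fact):
        if fact not in facts:  # avoid duplicates such as a second Male(P2)
            facts.append(fact)

    for name, args in predicates:
        if name in ENTITY_FACTS:
            add((ENTITY_FACTS[name], args))
        elif name == "named":
            add((f"{kinds[args[0]]}_Name", args))
        elif name == "son":
            child, parent = args
            add(("Parent_has_Child", (parent, child)))  # argument order flips
            add(("Male", (child,)))                     # "son" implies male
        elif name in RENAMED:
            add((RENAMED[name], args))
    return facts

example = [
    ("person", ("P2",)), ("named", ("P2", "Charles Christopher Lathrop")),
    ("person", ("P3",)), ("named", ("P3", "Mary Ely")),
    ("son", ("P2", "P3")),
]
for fact in predicates_to_facts(example):
    print(fact)
```

Note that `son(P2, P3)` yields two facts: the parent-child relation with its arguments reversed, and the gender of the child.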

More Learning
He graduated B. A. from Rensselaer Polytechnic College, Troy, N. Y.
[Link Grammar parse diagram. Link labels: MXs, Os, Js, Xd, Ss, G, Mp, Xca. Tagged words: he graduated.v-d B. A. from Rensselaer Polytechnic College, Troy.b, N. Y.]
Predicates:
pro3SingMasc(X1)
institution(I1)
named(I1, "Rensselaer Polytechnic College")
graduatedFrom(X1, I1)
person(X1)
place(GE2)
named(GE2, "Troy, N. Y.")
locatedIn(I1, GE2)
person(P5)
named(P5, "Gardner Bullard")
sameAs(X1, P5)
Extracted facts:
Person(P5)
GeoEntity(GE2)
Person_Name(P5, "Gardner Bullard")
GeoEntity_Name(GE2, "Troy, N. Y.")
Male(P5)
Institution(I1)
Institution_Name(I1, "Rensselaer Polytechnic College")
Person_graduatedFrom_Institution(P5, I1)
Institution_locatedIn_GeoEntity(I1, GE2)
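Once a `sameAs(X1, P5)` link exists, folding the pronoun referent into the named person is a plain substitution, sketched below. The slide does not show how the antecedent is chosen, so this sketch only applies a link that is already given; the function name `apply_same_as` and the tuple representation are assumptions.

```python
def apply_same_as(predicates):
    """Replace pronoun referents using sameAs(pronoun, antecedent) links."""
    substitutions = {args[0]: args[1]
                     for name, args in predicates if name == "sameAs"}
    resolved = []
    for name, args in predicates:
        if name == "sameAs":
            continue  # the link itself is consumed by the substitution
        new_args = tuple(substitutions.get(arg, arg) for arg in args)
        if (name, new_args) not in resolved:  # person(X1) collapses into person(P5)
            resolved.append((name, new_args))
    return resolved

example = [
    ("pro3SingMasc", ("X1",)),
    ("graduatedFrom", ("X1", "I1")),
    ("person", ("X1",)),
    ("person", ("P5",)),
    ("named", ("P5", "Gardner Bullard")),
    ("sameAs", ("X1", "P5")),
]
for predicate in apply_same_as(example):
    print(predicate)
```

After substitution, `graduatedFrom(P5, I1)` lines up with the extracted fact `Person_graduatedFrom_Institution(P5, I1)` listed on the slide.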

Processing Steps
Gather section of text
Segment into sentence fragments
Parse with the LG-Parser
Build predicates with LG-Soar
Resolve pronouns using DRT
Convert predicates to facts
Match extracted facts against conceptual model
Record facts that match
Learn from partial matches
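The last three steps above (match, record, learn) can be pictured with the following sketch. The slides do not define what counts as a partial match, so the rule used here is an assumption: a fact is a partial match when its name is unknown to the model but mentions at least one known object set. The function name `classify` and the toy model contents are hypothetical; the fact names are taken from the earlier slides.

```python
def classify(fact_name, object_sets, relationship_sets):
    """Return 'match', 'partial', or 'no match' for a fact name."""
    if fact_name in object_sets or fact_name in relationship_sets:
        return "match"
    # Assumed rule: partial if any component names a known object set.
    if any(part in object_sets for part in fact_name.split("_")):
        return "partial"
    return "no match"

# Toy conceptual model (assumed contents).
object_sets = {"Person", "GeoEntity"}
relationship_sets = {"Person_Name", "Person_BirthDate", "Person_DeathDate"}

for name in ["Person_Name", "Person_graduatedFrom_Institution", "Institution"]:
    print(name, "->", classify(name, object_sets, relationship_sets))
```

Under this rule, `Person_graduatedFrom_Institution` is a candidate for learning: the model already knows `Person`, so the new relation and the new `Institution` category could be proposed as additions to the ontology.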