Soar and Construction Grammar. Peter Lindes, Deryle Lonsdale, David Embley, Brigham Young University. 2014 Soar Workshop, 6/19/2014. © 2014 Peter Lindes.

Presentation transcript:

Soar and Construction Grammar
Peter Lindes, Deryle Lonsdale, David Embley
Brigham Young University
2014 Soar Workshop, 6/19/2014
© 2014 Peter Lindes

Goals
Long term goals
– Build computational models of human language processing
– Apply these models to real-world applications
OntoSoar project goals
– Extract genealogy facts from family history books
– Project extracted information onto a conceptual model to populate a searchable database

The Problem: Example 1

A Simple Ontology (diagram; labels: Charles Christopher Lathrop, “has”, “born on”, “died on”)

The Problem: Example 2

A More Complex Ontology (diagram; labels: Myra Harwood, Jonathan Squires, J. Wilbur Squires, Feb. 13, 1874)

Related Work
– OntoES: Embley et al.
– Link Grammar: Sleator & Temperley
– Soar: Newell, Laird
– LG-Soar: Lonsdale et al.
– Construction Grammar: Bergen & Chang, Bryant, Feldman
– Theory: Melby, Lakoff, Johnson, Tomasello
– OntoSoar

The Solution
“Thus, intelligence is the ability to bring to bear all the knowledge that one has in service of one’s goals.” Newell (1990), p. 90
– Page Layout *
– Text Analysis
– Syntax
– Semantics
– Pragmatics
– World Knowledge
– Conceptual Models

OntoSoar Architecture

Applying Construction Grammar
Charles Christopher Lathrop, N. Y. City, b. 1817, d. 1865, son of Mary Ely and Gerard Lathrop ;
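The idea of a construction as a pairing of a surface form with a meaning schema can be illustrated on the example sentence with a very small sketch. This is not the OntoSoar analyzer, which runs inside Soar; it is a regular-expression stand-in, and the `CONSTRUCTIONS` table, the `analyze` function, and the `SonOf` schema name are hypothetical names chosen for illustration. `BirthDate` and `DeathDate` are borrowed from the output shown later in the talk.

```python
import re

# Toy "constructions": each pairs a form (a regex, as a stand-in for a real
# grammar rule) with the name of a meaning schema. All names are assumptions.
CONSTRUCTIONS = [
    ("BirthDate", re.compile(r"\bb\. (\d{4})")),
    ("DeathDate", re.compile(r"\bd\. (\d{4})")),
    ("SonOf", re.compile(r"\bson of (.+?) and (.+?)\s*;")),
]


def analyze(text):
    """Return (schema name, captured fillers) for every form that matches."""
    meanings = []
    for schema, form in CONSTRUCTIONS:
        for match in form.finditer(text):
            meanings.append((schema, match.groups()))
    return meanings


sentence = ("Charles Christopher Lathrop, N. Y. City, b. 1817, d. 1865, "
            "son of Mary Ely and Gerard Lathrop ;")
for schema, fillers in analyze(sentence):
    print(schema, fillers)
```

A real construction grammar composes constructions hierarchically and unifies their meaning schemas; the flat pattern list here only shows the form-to-meaning mapping.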

Building Meaning
Charles Christopher Lathrop, N. Y. City, b. 1817, d. 1865, son of Mary Ely and Gerard Lathrop ;

Meaning Structures Compared
… his widow married JONATHAN SQUIRES, who was born in Ohio, July 25, 1823, by whom she had one son, J. Wilbur, born June 16, 1865,
Charles Christopher Lathrop, N. Y. City, b. 1817, d. 1865, son of Mary Ely and Gerard Lathrop ;

Output Example
2: Charles Christopher Lathrop, N. Y. City, born 1817, died 1865, son of Mary Ely and Gerard Lathrop ; ';'
Facts extracted:
Reporting 8 objects:
X2: Name(osmx327, "Charles Christopher Lathrop")
X1: Son(osmx331)
X1: Person(osmx331)
X4: Name(osmx336, "Mary Ely")
X3: Person(osmx339)
X6: Name(osmx342, "Gerard Lathrop")
X5: Person(osmx345)
X7: Date(osmx349, "1817")
X7: BirthDate(osmx349, "1817")
X8: Date(osmx354, "1865")
X8: DeathDate(osmx354, "1865")
Reporting 7 relations:
Y1(osmx359): Person(osmx331) identified by Name(osmx327)
Y2(osmx362): Person(osmx339) identified by Name(osmx336)
Y3(osmx365): Person(osmx345) identified by Name(osmx342)
Y4(osmx368): Person(osmx331) born on BirthDate(osmx349)
Y5(osmx371): Person(osmx331) died on DeathDate(osmx354)
Y7(osmx374): Son(osmx331) of Person(osmx345)
Y6(osmx377): Son(osmx331) of Person(osmx339)
3: GP married 1856, Mary Augusta Andruss, 992 Broad St., Newark, N. J. ','
Facts extracted:
Reporting 3 objects:
X10: Name(osmx380, "Mary Augusta Andruss")
X9: Spouse(osmx384)
X9: Person(osmx384)
X11: Date(osmx388, "1856")
X11: MarriageDate(osmx388, "1856")
Reporting 2 relations:
Y8(osmx393): Person(osmx384) identified by Name(osmx380)
Y9(osmx396): Person(osmx331) married Spouse(osmx384) MarriageDate(osmx388)
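The object and relation lines in this report follow a regular shape, so they could be loaded into simple records for further processing. The sketch below is one possible reader, assuming each fact sits on its own line; `ObjectFact`, `RelationFact`, and `parse_line` are hypothetical names for illustration and are not part of OntoSoar.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ObjectFact:
    label: str            # report label, e.g. "X2"
    concept: str          # object set, e.g. "Name"
    obj_id: str           # identifier, e.g. "osmx327"
    value: Optional[str]  # quoted lexical value, or None if absent


@dataclass
class RelationFact:
    label: str                           # report label, e.g. "Y4"
    rel_id: str                          # identifier, e.g. "osmx368"
    predicate: str                       # connecting words, e.g. "born on"
    participants: List[Tuple[str, str]]  # (concept, object id) pairs


OBJECT_RE = re.compile(r'^(X\d+): (\w+)\((\w+)(?:, "([^"]*)")?\)$')
RELATION_RE = re.compile(r'^(Y\d+)\((\w+)\): (.+)$')
PARTICIPANT_RE = re.compile(r'(\w+)\((\w+)\)')


def parse_line(line):
    """Parse one report line into a fact record, or None for other lines."""
    line = line.strip()
    m = OBJECT_RE.match(line)
    if m:
        return ObjectFact(*m.groups())
    m = RELATION_RE.match(line)
    if m:
        label, rel_id, body = m.groups()
        participants = PARTICIPANT_RE.findall(body)
        # Whatever remains after removing the participants is the predicate.
        predicate = " ".join(PARTICIPANT_RE.sub(" ", body).split())
        return RelationFact(label, rel_id, predicate, participants)
    return None


print(parse_line('X7: BirthDate(osmx349, "1817")'))
print(parse_line('Y4(osmx368): Person(osmx331) born on BirthDate(osmx349)'))
```

Treating the predicate as the leftover text keeps the reader agnostic about arity, so the three-place marriage relation parses the same way as the binary ones.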

Results on Examples

Results on Other Data

Possible Interactive Learning

Conclusions
Nuggets
– Segmenter
– Adapted LG Parser
– A construction grammar meaning analyzer
– Mapped meaning schemas to user-defined ontologies
– A single integrated processing pipeline
– Evaluation on 12 randomly selected pages
Coal
– More constructions and world knowledge
– Parser problems
– Performance depends on document style
– Named entity recognizer
– Need an integrated incremental architecture
– How to make the system learn its knowledge
It works! … and, it could work a lot better.

Extras

Linguistic Analysis – Case 1B
Charles Christopher Lathrop, …, son of Mary Ely and Gerard Lathrop ;