QUIRK: Project Progress Report, December 3-5, 2002. Cycorp, IBM.


Notable Progress Query decomposition extensions Argument-structure approximation Syntactic analysis of textual sources Reflexive justifications

Single Literal Query Decomposition Rule: (Q & R & Z) → P. Query: P? Subqueries: Q?, R?, Z?, asked one literal at a time: Q? / R? / Z?

Multi Literal Query Decomposition Rule: (Q & R & Z) → P. Query: P? Subqueries: Q?, R?, Z?, asked as the groups {Q?, R?} and {Z?}

Multi Literal Query Decomposition Rule: ((isa ?X French) & (isa ?X Movie) & (likes Amy ?X)) → (likes Bob ?X). Query: (likes Bob ?X) Subqueries: (isa ?X French), (isa ?X Movie), (likes Amy ?X), asked as the groups {(likes Amy ?X)} and {(isa ?X French), (isa ?X Movie)}
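
The grouping step in the example above can be sketched in Python. This is only an illustration, not Cyc's actual removal-module machinery: the `MODULES` registry, the module names, and the `decompose` function are all hypothetical, and the sketch assumes that a module is chosen purely by the predicate of each literal.

```python
from typing import Dict, FrozenSet, List, Tuple

Literal = Tuple[str, ...]  # e.g. ("isa", "?X", "French")

# Hypothetical registry: module name -> predicates it can answer jointly.
MODULES: Dict[str, FrozenSet[str]] = {
    "type-lookup": frozenset({"isa"}),
    "preference-db": frozenset({"likes"}),
}


def decompose(antecedents: List[Literal]) -> List[Tuple[str, List[Literal]]]:
    """Group the antecedent literals of a rule into one sub-query per module."""
    groups: Dict[str, List[Literal]] = {}
    for lit in antecedents:
        for name, preds in MODULES.items():
            if lit[0] in preds:
                groups.setdefault(name, []).append(lit)
                break
        else:
            # No module claims the predicate: fall back to a single-literal sub-query.
            groups.setdefault(f"single:{lit}", []).append(lit)
    return list(groups.items())


# Antecedents of the rule concluding (likes Bob ?X).
rule_body = [("isa", "?X", "French"), ("isa", "?X", "Movie"), ("likes", "Amy", "?X")]
plan = decompose(rule_body)
for module, literals in plan:
    print(module, literals)
```

Under these assumptions the two `isa` literals end up in one sub-query and `(likes Amy ?X)` in another, which mirrors the grouping shown on the slide.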

Examples Joins in external DBs (NIMA, USGS… ) –Airports in Travis County, TX –Hospitals located in port cities –... Web services, e.g. IMDB –Actors from the ‘50s As a bridge between KR formats

Davidsonian KR bridge Wellington defeated Napoleon in Waterloo. (thereExists ?EV (and (isa ?EV DefeatingAnOpponent) (performedBy ?EV Wellington) (objectActedOn ?EV Napoleon) (eventOccursAt ?EV Waterloo)))
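
As a rough illustration of the Davidsonian form above, the following Python sketch builds the same CycL string from an event type and a list of role/filler pairs. The helper name `to_davidsonian` and its input shape are assumptions made for this example and are not part of Cyc.

```python
from typing import List, Tuple


def to_davidsonian(event_type: str, roles: List[Tuple[str, str]]) -> str:
    """Render an event as an existentially quantified conjunction of role literals."""
    literals = [f"(isa ?EV {event_type})"]
    literals += [f"({role} ?EV {filler})" for role, filler in roles]
    return f"(thereExists ?EV (and {' '.join(literals)}))"


# Role names and fillers are taken from the slide's Waterloo example.
formula = to_davidsonian(
    "DefeatingAnOpponent",
    [
        ("performedBy", "Wellington"),
        ("objectActedOn", "Napoleon"),
        ("eventOccursAt", "Waterloo"),
    ],
)
print(formula)
```

Each role becomes its own literal about the event variable `?EV`, which is why a query over such a representation is naturally a multi-literal query.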

Argument-Type bridge John lives in a French village (thereExists ?V (and (isa ?V Village) (geographicalSubRegions France ?V) (residesInRegion John ?V)))

Registration of multi-literal removal modules At the moment, sufficiently few such modules exist that they can be defined in code. Plans: declarative registration of such modules in Cyc’s KB, even with run-time KB edits.

Arg based query generation (thereExists ?EV (and (isa ?EV AttackOnObject) (maleficiary ?EV Djibouti) (performedBy ?EV ?WHO))) [SUBJ [VERB PERSON$ attack *Djibouti)

Secretary Input: –A CycL query such as (president France ?WHO) –A textual paragraph Output: a ranked list of CycL terms that –represent entities mentioned in the paragraph –are type-appropriate as substitutions for the free variables in the query (?WHO:Person) Three types of Secretary

Secretary 1 Use IBM’s Talent system to learn new lexical entries Tag paragraph with lexical mappings Select type-appropriate CycL tags Rank them by proximity to query focus, as determined by recorded position in the paragraph of all the ground terms in the CycL query
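
The selection and ranking steps of Secretary 1 might look roughly like the Python sketch below. It is a minimal sketch under stated assumptions: the `Tag` record, the type labels, and the choice of the mean position of the query's ground terms as the "query focus" are all hypothetical, since the slides do not say how the recorded positions are combined. The sample tags are invented placeholders.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Tag:
    position: int  # token offset of the mention in the paragraph
    term: str      # CycL term the mention was mapped to
    type_: str     # coarse type label (hypothetical)


def rank_by_proximity(tags: List[Tag], ground_terms: Set[str], wanted_type: str) -> List[str]:
    """Rank type-appropriate terms by distance to the query's ground terms."""
    anchors = [t.position for t in tags if t.term in ground_terms]
    if not anchors:
        return []  # the paragraph mentions none of the query's ground terms
    focus = sum(anchors) / len(anchors)  # assumed definition of the query focus
    candidates = [t for t in tags if t.type_ == wanted_type and t.term not in ground_terms]
    candidates.sort(key=lambda t: abs(t.position - focus))
    ranked: List[str] = []
    for t in candidates:
        if t.term not in ranked:  # keep only the closest mention of each term
            ranked.append(t.term)
    return ranked


# Invented tags for a query such as (president France ?WHO).
tags = [Tag(2, "PersonA", "Person"), Tag(5, "France", "Country"), Tag(14, "PersonB", "Person")]
ranking = rank_by_proximity(tags, {"France"}, "Person")
print(ranking)
```

Here `PersonA` is ranked first because its mention lies closer to the mention of `France` than that of `PersonB`.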

Secretary 2 Use IBM’s Talent system to learn new lexical entries Use output of UPenn’s dependency parser to generate a set of CycL interpretations of the paragraph Select “best” interpretation Return CycL entity in the appropriate relationship to the query’s predicate.

Secretary 3 Use IBM’s Talent system to learn new lexical entries Use output of UPenn’s dependency parser to generate a set of CycL interpretations of the paragraph Select “best” interpretation and turn it into a virtual assertion in Cyc’s KB Ask the original query in the KB so obtained Return all answers.

General observations Secretary 2 and 3 have better precision than Secretary 1, but much lower recall –possibly due to the non-verb-like nature of many Cyc predicates; need to check if the same holds true of multi-literal events Linear proximity of Secretary 1 is almost as good as the argument based analysis of Secretary 2 and 3

Introspective Justifications

Dialog evaluation Basic knowledge representation performed for each of the topics Used KRAKEN GUI for interpretation of questions Used KRAKEN NL generation for reporting answers to analyst

KRAKEN GUI Which paintings about war did Picasso create?

Contextual vs Keyhole approach Several questions asked simultaneously: –“Need background data on the Cuban dissident Elizardo Sanchez to include birth data, education, work ethics, organization affiliations to name a few.” Analyst happy with a summary of all facts known about an entity of interest

Lessons learned Analysts like to ask/see questions/answers in context Single question/single answer approach could be extended to: –dossier about entity X –preliminary dialog on desired properties of dossier inferred from properties of entity X Justifications become interesting only if answers are sufficiently surprising.

Definitional Questions Evaluation Expectations: –large answer set –both redundancy AND irrelevance –opportunity for structuring answer set by salient features of question focus Actual experience –limited answer set –mostly redundancy

Original Plan Use appositives to learn type of entity “Massimo Cacciari” → “Venice Mayor” Use Cyc to –understand type (a kind of elected official) –generate list of questions salient for the type when was he elected? what is his party affiliation? … Answer salient questions from textual sources

Revised Plan Use syntactic analysis to extract appositives and relevant VPs Cluster strings so extracted Return one string from each cluster, ranked by the size of the cluster.
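
The cluster-and-rank step of the revised plan can be sketched as follows. The slides do not name the clustering method, so this Python sketch assumes a simple greedy clustering on token-set Jaccard similarity. The function names, the 0.5 threshold, and the sample strings are all invented for illustration.

```python
from typing import List, Set


def tokens(s: str) -> Set[str]:
    """Lowercased whitespace tokens; function words are deliberately kept."""
    return set(s.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_and_rank(strings: List[str], threshold: float = 0.5) -> List[str]:
    """Greedily cluster strings, then return one string per cluster, largest cluster first."""
    clusters: List[List[str]] = []
    for s in strings:
        for cluster in clusters:
            # Compare against the first member, which represents the cluster.
            if jaccard(tokens(s), tokens(cluster[0])) >= threshold:
                cluster.append(s)
                break
        else:
            clusters.append([s])
    clusters.sort(key=len, reverse=True)  # stable: ties keep extraction order
    return [cluster[0] for cluster in clusters]


# Invented extractions for a single entity.
extracted = ["Venice Mayor", "mayor of Venice", "the Venice mayor", "elected in Venice"]
answers = cluster_and_rank(extracted)
print(answers)
```

The three mayor variants collapse into one cluster and are returned once, ahead of the singleton cluster, so redundancy in the sources is turned into a ranking signal.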

Lessons learned Punctuation and function words are crucial Textual sources don’t always support an analysis by “salient features” Semantic analysis not necessarily useful if the end result is expected to be a string that could be easily interpreted by the analyst.