A Tripartite Question Answering Architecture for Integrating Diverse Knowledge Resources Boris Katz, Gary Borchardt, Sue Felshin and Jimmy Lin MIT Computer.

Presentation transcript:

A Tripartite Question Answering Architecture for Integrating Diverse Knowledge Resources Boris Katz, Gary Borchardt, Sue Felshin and Jimmy Lin MIT Computer Science and Artificial Intelligence Laboratory October 8, 2004

Moving Forward
Question answering today:
- Mostly focused on simple questions
- Driven by IR and named-entity detection
- One-shot interactions: “context free”
- Focused on textual documents
Future directions:
- More complex questions
- Deeper semantic processing
- Knowledge from multiple resources
- Extended user interactions: “scenario-based QA”
- Multimodal QA: retrieving audio and video
MIT AQUAINT Phase 2 Focus

Project Goals
- Develop advanced QA capabilities
- Push the envelope in NLP technology
- Create natural user-system interactions
- Provide seamless access to heterogeneous data
- Fuse knowledge from multiple resources
- Integrate linguistic, statistical, and knowledge-based strategies
- Build a comprehensive end-to-end QA system
- Focus on deployment in real-world environments
- Contribute to theories of knowledge representation and language comprehension

MIT Tripartite QA Architecture

Top Layer: Understanding Language
Coordinate natural language interactions with users.
Primary responsibilities:
- Analyze natural language sentences
- Disambiguate user information needs interactively
- Manage discourse and dialog

Bottom Layer: Accessing Resources
Complex questions require multiple heterogeneous resources to answer.
Our solution: OmniStore, a uniform knowledge repository based on ternary expressions.
Sources of knowledge:
- Structured and semi-structured databases
- Syntactic and semantic relations automatically extracted from free text
- Natural language annotations attached to opaque knowledge segments
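As a rough illustration of what a uniform repository of ternary expressions might look like, the sketch below stores (subject, relation, object) triples and answers pattern queries in which `None` acts as a wildcard. The class name `TernaryStore`, its methods, and the relation names are hypothetical; the slides do not describe OmniStore's actual interface. The two sample facts are the ones given later in the deck.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


class TernaryStore:
    """Toy in-memory store of (subject, relation, object) expressions (hypothetical)."""

    def __init__(self) -> None:
        self._triples: List[Triple] = []

    def add(self, subject: str, relation: str, obj: str) -> None:
        """Record one ternary expression."""
        self._triples.append((subject, relation, obj))

    def match(self, subject: Optional[str] = None,
              relation: Optional[str] = None,
              obj: Optional[str] = None) -> List[Triple]:
        """Return every stored triple consistent with the pattern; None matches anything."""
        pattern = (subject, relation, obj)
        return [t for t in self._triples
                if all(p is None or p == v for p, v in zip(pattern, t))]


store = TernaryStore()
store.add("Almaty", "is-capital-of", "Kazakhstan")
store.add("Almaty", "has-population", "1.2 million")

# Which subject stands in the "is-capital-of" relation to Kazakhstan?
capitals = store.match(relation="is-capital-of", obj="Kazakhstan")
```

The point of the uniform shape is that facts from databases, extracted text relations, and annotations could all be queried with the same pattern-matching call.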

MIT OmniStore

Middle Layer: Connecting the Pieces
Bridge the gap between questions and the knowledge required to answer those questions.
Knowledge fusion and complex reasoning:
- Decompose complex questions into combinations of simpler questions
- Efficiently access resources required to answer individual questions
- Combine smaller “nuggets of knowledge” into a coherent response

In This Presentation
- Explicit, syntactically-based decomposition of questions
  - Using syntactic cues to decompose questions into combinations of simpler questions
  - Answering simpler questions with different resources
- Implicit, semantically-based decomposition of questions
  - Applying domain rules to decompose questions into combinations of simpler questions
  - Answering simpler questions using the CNS WMD Terrorism Database
- Managing extended user interactions
  - Creating more natural dialog by handling ellipsis
The beginnings of...

MIT AQUAINT QA Server
[Diagram labels: START+, IMPACT+, Omnibase+; WMD Terrorism database, Infoplease, Biography.com, WorldBook]

Answering Complex Questions
Syntactically decomposing questions:
How many people live in the capital of the third largest Asian country?
- What is the third largest Asian country?
- What is its capital?
- How many people live there?
Semantically decomposing questions:
Could HAMAS carry out an attack in the United States with biological agents?
- Does HAMAS have the expertise to carry out an attack using biological agents?
- Does HAMAS have the motivation to carry out an attack in the United States?

Syntactic Decomposition
- Parse questions into nested ternary expressions
- Successively resolve groups of ternary expressions containing unbound variables
- Answer sub-questions by replacing variables with values
How many people live in the capital of the 3rd largest Asian country?
1. What is the 3rd largest Asian country? ANSWER = Kazakhstan
2. What is the capital of Kazakhstan? ANSWER = Almaty
3. How many people live in Almaty? ANSWER = 1.2 million
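The resolution order above can be sketched as a small recursive procedure: resolve the innermost sub-question first, then substitute its answer into the enclosing one. This is only an illustration under stated assumptions, not START's parser. The nested-tuple encoding, the relation names, and the `FACTS` lookup table are hypothetical stand-ins for real ternary expressions and resources; the answer values are the ones shown on the slide.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup table standing in for the underlying resources;
# the values are the answers shown on the slide.
FACTS: Dict[Tuple[str, str], str] = {
    ("3rd-largest-country-in", "Asia"): "Kazakhstan",
    ("capital-of", "Kazakhstan"): "Almaty",
    ("population-of", "Almaty"): "1.2 million",
}


def resolve(expr: object, trace: List[str]) -> str:
    """Resolve the innermost sub-question first, then substitute its answer outward."""
    if isinstance(expr, str):
        return expr  # a constant: nothing left to resolve
    relation, argument = expr  # a (relation, nested-argument) pair
    value = resolve(argument, trace)  # bind the inner variable to a value
    answer = FACTS[(relation, value)]
    trace.append(f"{relation}({value}) = {answer}")
    return answer


# "How many people live in the capital of the 3rd largest Asian country?"
question = ("population-of", ("capital-of", ("3rd-largest-country-in", "Asia")))
steps: List[str] = []
result = resolve(question, steps)
```

After the call, `steps` holds the three sub-answers in the order they were resolved, which mirrors the numbered sub-questions.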

A Complete Example
How many people live in the capital of the 3rd largest Asian country?
in capital+9815 > in capital+9815 > in Almaty >
country+9813 = Kazakhstan: The third largest Asian country is Kazakhstan.
capital+9815 = Almaty: The capital of Kazakhstan is Almaty.
*numeral* = 1.2 million: The population of Almaty is 1.2 million.


Ellipsis
"What country in Africa has the largest population?"
"How about area?"
There are three NPs in the previous query. Which one should be replaced?
Possible antecedents for "area": "country" (X), "Africa" (X), "population"
START employs linguistic and ontological knowledge to resolve ambiguities:
- Lexical semantic properties of English nouns
- Reasoning over relevant domain knowledge
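One simple way to picture the antecedent choice is to compare coarse semantic classes: the fragment replaces the noun phrase that belongs to the same class. The sketch below assumes a tiny hand-written class table (`SEMANTIC_CLASS`) and a hypothetical `resolve_ellipsis` function; START's actual lexical and ontological reasoning is not described in the slides and is certainly richer than this.

```python
from typing import Dict, List, Optional

# Hypothetical semantic classes; a real system would draw on a lexicon and an ontology.
SEMANTIC_CLASS: Dict[str, str] = {
    "country": "entity-type",
    "Africa": "region",
    "population": "attribute",
    "area": "attribute",
}


def resolve_ellipsis(previous_query: str,
                     antecedents: List[str],
                     fragment: str) -> Optional[str]:
    """Replace the one antecedent that shares the fragment's semantic class."""
    target = SEMANTIC_CLASS.get(fragment)
    candidates = [np for np in antecedents if SEMANTIC_CLASS.get(np) == target]
    if target is None or len(candidates) != 1:
        return None  # unknown or ambiguous: a dialog system would ask the user
    return previous_query.replace(candidates[0], fragment)


rewritten = resolve_ellipsis(
    "What country in Africa has the largest population?",
    ["country", "Africa", "population"],
    "area",
)
```

Returning `None` for the ambiguous case leaves room for the interactive disambiguation that the top layer is responsible for.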


[Figure: A Visualization. Labels: Natural Language Questions; Syntactic and Semantic Decomposition using Domain Knowledge; Symbolic Queries; Individual Resources (Resource 1, Resource 2, …, Resource n)]

MIT The WMD Terrorism Database

Knowledge Templates
Stylized natural language “wrappers” around selected database fields:
In [1995], [religious cult] [Aum Supreme Truth] carried out a [use of agent] in [Japan], involving [chemical agent] [sarin].

Query Arguments
In [1995], [religious cult] [Aum Supreme Truth] carried out a [use of agent] in [Japan], involving [chemical agent] [sarin].
- constants (e.g., "1995")
- unnamed variables (e.g., "something")
- named variables (e.g., "some year")
- restricted variables (e.g., "some year (> 1990)")
- reported variables (e.g., "what year")
- ...
Each field can be similarly treated…
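To make the argument types concrete, the following sketch treats a filled template as a record and each query argument as an optional constant, an optional restriction, and a flag saying whether the matched value should be reported. The `Arg` class, the `run_query` function, and the field names are all hypothetical; the single record is the example sentence from the slide, and nothing here reflects the database's real schema.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]


@dataclass
class Arg:
    """One query argument (hypothetical encoding of the slide's argument types)."""
    equals: Optional[object] = None                     # constant, e.g. "1995"
    restrict: Optional[Callable[[object], bool]] = None  # restricted variable
    report: bool = False                                 # reported variable

    def accepts(self, value: object) -> bool:
        if self.equals is not None and value != self.equals:
            return False
        return self.restrict is None or self.restrict(value)


def run_query(records: List[Record], query: Dict[str, Arg]) -> List[Record]:
    """Return the reported fields of every record accepted by all arguments."""
    results: List[Record] = []
    for rec in records:
        if all(arg.accepts(rec[name]) for name, arg in query.items()):
            results.append({n: rec[n] for n, a in query.items() if a.report})
    return results


# Single record taken from the slide's example sentence; field names are invented.
INCIDENTS: List[Record] = [{
    "year": 1995, "group_type": "religious cult", "group": "Aum Supreme Truth",
    "event_type": "use of agent", "country": "Japan",
    "agent_type": "chemical agent", "agent": "sarin",
}]

# "In some year (> 1990), what group carried out something in Japan?"
answers = run_query(INCIDENTS, {
    "year": Arg(restrict=lambda y: y > 1990),
    "group": Arg(report=True),
    "country": Arg(equals="Japan"),
})
```

Fields left out of the query behave like unnamed variables ("something"): they match any value and are not reported.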

From Language to Queries
Many natural language questions can be represented by the same knowledge template:
[ ] could carry out an attack in [ ] using a [ ].
- “Could the KKK be involved in an attack using biological weapons?”
- “Could an attack be carried out in Italy involving chemical weapons?”
- “Are any groups trying to conduct an attack in the United States?”
- “What groups will be able to carry out an attack in the US?”
- “In what countries could Hizballah execute an attack?”
- “Aum Shinrikyo could carry out an attack with what agent types?”

Two Domain Rules
Rule 1:
IF [some group] has the expertise to carry out an attack using a [some agent type].
AND [some group] has the motivation to carry out an attack in [some country].
THEN [some group] could carry out an attack in [some country] using a [some agent type].
(A group could carry out an attack if the group has the expertise and the motivation to do so.)
Rule 2:
IF In [something], [something] [some group] carried out a [something (<>attempted acquisition) (<>hoax/prank/threat) (<>plot only)] in [something], involving [some agent type] [something].
THEN [some group] has the expertise to carry out an attack using a [some agent type].
(A group has the expertise to carry out an attack if the group has been involved in a WMD terrorism incident other than an attempted acquisition, hoax, etc., or unexecuted plot.)
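The two rules chain together: the "could carry out" question reduces to an expertise check and a motivation check, and the expertise check in turn reduces to a lookup over past incidents. The sketch below shows that chaining with placeholder data. The group and country names, the tuple layout, and the `MOTIVATED` fact set are invented for illustration; the slides do not say how motivation is established, so it is simply assumed as given here.

```python
from typing import List, Set, Tuple

# Placeholder incident tuples: (group, event type, country, agent type).
INCIDENTS: List[Tuple[str, str, str, str]] = [
    ("Group A", "use of agent", "Country X", "chemical agent"),
    ("Group B", "hoax/prank/threat", "Country X", "biological agent"),
]

# Event types that Rule 2 excludes as evidence of expertise.
EXCLUDED_EVENTS: Set[str] = {"attempted acquisition", "hoax/prank/threat", "plot only"}

# Assumed motivation facts (group, country); their derivation is not covered by the slides.
MOTIVATED: Set[Tuple[str, str]] = {("Group A", "Country Y"), ("Group B", "Country Y")}


def has_expertise(group: str, agent_type: str) -> bool:
    """Rule 2: expertise follows from a past incident that is not an excluded event type."""
    return any(
        g == group and agent == agent_type and event not in EXCLUDED_EVENTS
        for g, event, _country, agent in INCIDENTS
    )


def has_motivation(group: str, country: str) -> bool:
    """Look up an assumed motivation fact."""
    return (group, country) in MOTIVATED


def could_attack(group: str, country: str, agent_type: str) -> bool:
    """Rule 1: a group could attack if it has both the expertise and the motivation."""
    return has_expertise(group, agent_type) and has_motivation(group, country)


verdict_a = could_attack("Group A", "Country Y", "chemical agent")
verdict_b = could_attack("Group B", "Country Y", "biological agent")
```

Group B fails the check even though it is motivated, because its only recorded incident is a hoax, which Rule 2 does not count as evidence of expertise.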


Terrorist Activities
- Which groups have been involved in attacks in the United States?
- Has Aum Shinrikyo carried out an attack in Japan with a biological agent?
- In what countries have organizations executed attacks with radiological weapons?
- Did the Japanese Red Army carry out a threat in Japan?
- Has the KKK been engaged in an attack in the US?
- What groups have put on a hoax in the United States?
- Did Aum Supreme Truth plot to use a chemical agent in the United States?
- Which groups have acquired a chemical weapon?
- What groups have issued a threat?
- Did the Animal Liberation Front issue a threat?

Database Contents
- What group types are there?
- What groups are in the WMD DB?
- Is Aum Shinrikyo portrayed in the WMD Terrorism Database?
- Is Turkey in the WMD Terrorism DB?
- Is PFLP in the WMD Terrorism DB?
- Is the Japanese Red Army specified in the WMD Terrorism Database?
- Is the KKK included?
- What event types are specified in the WMD DB?
- What countries are in the WMD DB?
- Is the Netherlands in the WMD DB?
- What agent types are in the WMD Terrorism Database?

Relationships
- What group type is Dark Harvest?
- What groups are right-wing organizations?
- What event types are left-wing groups associated with?
- What group types are in Mexico?
- What countries have criminal organizations?
- Are criminal organizations in Lithuania?
- Religious cults have a presence in what countries?
- Are nationalist groups associated with radiological agents?
- What groups are associated with use of agents?
- Is Aum Shinrikyo associated with use of agents?
- What groups does Canada have?
- The Red Army Faction is in what countries?
- What groups have a presence in Turkey?
- What groups are associated with nuclear agents?
- Has use of agents occurred in Germany?

Capabilities and Motivations
- Does Hizballah want to carry out an attack in Lebanon?
- Which groups have the motivation to carry out an attack in France?
- Does Hizballah have the expertise to carry out an attack with a chemical agent?
- What groups have the expertise to carry out an attack with a chemical agent?
- Could Hizballah conduct an attack in Turkey using a biological agent?
- Using what agent types could Hizballah execute an attack in Lebanon?
- Are the Chechen rebels able to carry out an attack in Georgia using a chemical agent?
- In what countries could Hizballah carry out an attack using biological agents?

MIT Patterns of Global Terrorism -- Profiles

[Figure: A Visualization. Labels: Natural Language Questions; Syntactic and Semantic Decomposition using Domain Knowledge; Symbolic Queries; Individual Resources (Resource 1, Resource 2, …, Resource n)]

Scenario-Guided Interaction
User: "Does Hizballah have the know-how to carry out an attack with chemical weapons?"
System: "Are you interested in determining whether Hizballah can carry out an attack in some specific country using chemical agents?"

Summary
- Initial realization of the tripartite QA architecture completed
- New capabilities:
  - Explicit, syntactically-based decomposition of questions
  - Augmented handling of elliptic questions
  - Implicit, semantically-based decomposition of questions
- Incorporated resources:
  - CNS WMD Terrorism database
  - A range of web-based resources (e.g., Infoplease)