AQUAINT Mid-Year PI Meeting – June 2002 Integrating Robust Semantics, Event Detection, Information Fusion, and Summarization for Multimedia Question Answering.


Vasileios Hatzivassiloglou, Kathleen R. McKeown, Columbia University
Dan Jurafsky, Wayne H. Ward, James H. Martin, University of Colorado

Our focus – Question type (I)
Distinguish between questions answerable with
  – Unique facts (TREC-like)
  – Facts that are not absolute: they depend on source, perspective, or time
  – Opinions / subjective answers
When was Mullah Omar born? vs. Who controls Jalalabad? vs. Will the King’s return be good for Afghanistan?

Our focus – Question type (II)
Questions with multiple answers
Questions with long answers
  – Definitions
  – Descriptive answers (different levels of detail)
  – Lists of related facts

Our focus – Multiple sources
Integrate answers from multiple sources
Use similarities across sources to locate the core part of the answer
Highlight important differences between sources

Our focus – Answer form
Answer contains
  – Core part where sources agree
  – Differences in perspective
  – Trends in time
Text is not copied verbatim
Text generation allows for concise combination of materials from multiple sources

Our focus – Q&A Environment
Spoken and written questions
Specialized language model for accepting questions in realistic, noisy environments
Context management system allows for
  – clarifications
  – follow-up questions

Technology Innovations
Specialized speech recognition and dialog management
Semantic parsing of questions and source text
Event recognition
Information fusion

Progress in the first six months
Revised architecture
Software support for integrated system
System modules prototyped
  – Baseline Q&A system
  – Question hierarchy
  – Semantic parser
  – Event detection
Questions of different types collected

Revised Architecture
[Architecture diagram. Component labels: Spoken question; Typed question; Speech recognition; Specialized language model; Recognition feedback; Recognized question; Question classification; Semantic parser; Answer planning; Learned answer plans; Answer strategy selector; Query manager; Web; Google; MG; Local collections, TREC; Event detection; Information fusion; Answer extraction and combination; Context/dialog manager; Short answers; Long answers]

Integrated System Support
Defined APIs for communication between system modules
Added ability to communicate via data structures in memory (interprocess calls)
Ability to read/write XML
Implemented common system foundation and module manager
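
The XML read/write ability mentioned above could look roughly like the following. This is a minimal sketch, not the project's actual interface: the `<message>`/`<field>` schema, the function names `to_xml` and `from_xml`, and the module name in the usage note are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple


def to_xml(module: str, payload: Dict[str, str]) -> str:
    """Serialize a flat dict as a <message> element (hypothetical schema)."""
    root = ET.Element("message", {"module": module})
    for key, value in payload.items():
        field = ET.SubElement(root, "field", {"name": key})
        field.text = str(value)
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> Tuple[str, Dict[str, str]]:
    """Parse a <message> element back into (module name, payload dict)."""
    root = ET.fromstring(text)
    fields = {f.get("name"): f.text for f in root.findall("field")}
    return root.get("module"), fields
```

For example, `from_xml(to_xml("question_classifier", {"question": "Who controls Jalalabad?"}))` returns the module name and the original dictionary, so two modules could exchange the same structure either in memory or as XML text.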

Baseline Q&A System
Question is transformed to a search engine query
Answers are retrieved from the web or a fixed collection
Sentences or paragraphs containing the answer are extracted
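
The three baseline steps can be sketched as a toy pipeline over a fixed in-memory collection. This is only an illustration under stated assumptions: the stopword list, the term-overlap scoring, and the function names are hypothetical and stand in for whatever query transformation and extraction the baseline system actually used.

```python
import re

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "were", "of", "in", "on", "to",
    "do", "does", "did", "what", "who", "when", "where", "which", "how",
    "have", "has",
}


def words(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def to_query(question):
    """Step 1: turn the question into a keyword query."""
    return [t for t in words(question) if t not in STOPWORDS]


def split_sentences(text):
    """Naive sentence splitter on ., ! and ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def answer(question, collection, top_k=1):
    """Steps 2 and 3: scan a fixed collection and extract the best sentences."""
    query = set(to_query(question))
    scored = []
    for doc in collection:
        for sent in split_sentences(doc):
            overlap = len(query & set(words(sent)))
            if overlap:
                scored.append((overlap, sent))
    # Stable sort: ties keep their order of appearance in the collection.
    scored.sort(key=lambda pair: -pair[0])
    return [sent for _, sent in scored[:top_k]]
```

Sentences that share no query term are dropped, so a question with no matching material yields an empty list rather than a weak guess.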

Prototype System
[Screenshot of the prototype system; legible text: "metal company bus"]

Question Analysis
Tokenization
Part-of-speech assignment
Named entity extraction
Syntactic parsing
Recognition of key and target phrases
  – How many cities in the US have a stadium?

Semantic Parser
What effect does a prism have on light?
A prism has ____ effect on light
[cause prism] has [result ____] on [theme light]
[agent Newton] split [theme white light] [result into its spectrum of colours] [cause by beaming it through a prism]
Implemented domain-independent version of semantic parser (40% recall, 60% precision)
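
To illustrate how role labels let a question gap be filled from source text, the sketch below reads the bracket notation from the slide and looks up the missing role in a labeled source sentence. It assumes the parser output is already available as bracketed text; `parse_roles` and `fill_gap` are hypothetical helper names, and matching by role name alone is a simplification, not the project's method.

```python
import re

# Matches "[role filler]" with optional spaces inside the brackets.
ROLE = re.compile(r"\[\s*(\w+)\s+([^\]]*?)\s*\]")


def parse_roles(labeled):
    """Map each role name to its filler text."""
    return {role: filler for role, filler in ROLE.findall(labeled)}


def fill_gap(question_roles, source_roles, gap="____"):
    """For every role the question leaves open, take the source's filler."""
    return {
        role: source_roles[role]
        for role, filler in question_roles.items()
        if filler == gap and role in source_roles
    }


question = parse_roles("[ cause prism] has [ result ____ ] on [ theme light]")
source = parse_roles(
    "[ agent Newton] split [ theme white light] "
    "[ result into its spectrum of colours] "
    "[ cause by beaming it through a prism]"
)
answer = fill_gap(question, source)
```

Here `answer` maps the open `result` role to "into its spectrum of colours". A fuller version would also check that the other roles (`cause`, `theme`) are compatible before accepting the filler.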

Question Classification
By question type (detailed hierarchy has been built)
Distinguish between descriptive and non-descriptive answers
  – Who is the President of the United States?
  – How do I become President?

Question Mining
Collection of questions and answers
  – From FAQs (mostly descriptions)
  – From trivia sites (mostly non-descriptive facts)
Implemented classifier between these types
  – Features: words, length, part-of-speech
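
A classifier over such features might be sketched as below. The slides do not say which learning method was used, so the multinomial Naive Bayes with add-one smoothing here is a swapped-in stand-in; the length threshold is arbitrary, the class and function names are hypothetical, and part-of-speech features are left out because they would require a tagger.

```python
import math
import re
from collections import Counter, defaultdict


def features(question):
    """Word features plus a coarse length feature (POS omitted: needs a tagger)."""
    words = re.findall(r"[a-z']+", question.lower())
    feats = ["w=" + w for w in words]
    feats.append("len=" + ("short" if len(words) <= 6 else "long"))
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.label_counts = Counter()
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, examples):
        for question, label in examples:
            self.label_counts[label] += 1
            for f in features(question):
                self.feat_counts[label][f] += 1
                self.vocab.add(f)
        return self

    def predict(self, question):
        total = sum(self.label_counts.values())
        best, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            score = math.log(n / total)
            denom = sum(self.feat_counts[label].values()) + len(self.vocab)
            for f in features(question):
                score += math.log((self.feat_counts[label][f] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best


# Tiny made-up training set; the last question is invented for illustration.
TRAIN = [
    ("Who is the President of the United States?", "non-descriptive"),
    ("When was Mullah Omar born?", "non-descriptive"),
    ("How do I become President?", "descriptive"),
    ("How do I change a tire?", "descriptive"),
]
model = NaiveBayes().fit(TRAIN)
```

With this toy data, `model.predict("How do I become a pilot?")` comes out as descriptive. Four examples are far too few for a meaningful classifier; the point is only the shape of the feature extraction and training loop.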

Answer Generation
Bottom-up, from the data
  – Clustering organizes similar answers together
  – Fusion matches common parts
  – Generation combines answer fragments in a concise response
Top-down, using question-specific plans
  – Appropriate for lists of facts
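
The bottom-up clustering and fusion steps can be illustrated with a very small word-overlap sketch. It assumes Jaccard similarity over word sets and a greedy single-pass clustering, neither of which is stated in the slides; the threshold, the function names, and the sample answers are made up for illustration, and real fusion would align phrases rather than bags of words.

```python
import re


def tokens(text):
    """Lowercased word set of an answer candidate."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Share of words two sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(answers, threshold=0.5):
    """Greedy single pass: join the first cluster whose seed is similar enough."""
    clusters = []
    for ans in answers:
        for group in clusters:
            if jaccard(tokens(ans), tokens(group[0])) >= threshold:
                group.append(ans)
                break
        else:
            clusters.append([ans])
    return clusters


def common_part(group):
    """Crude stand-in for fusion: words shared by every answer in a cluster."""
    return set.intersection(*(tokens(ans) for ans in group))


# Made-up candidate answers from different sources.
candidates = [
    "The rebels control the city",
    "Rebels control the city centre",
    "Government troops hold the airport",
]
groups = cluster(candidates)
```

The first two candidates end up in one cluster whose common part is the shared core of the answer, while the third stays separate and would be reported as a difference between sources.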

Event Recognition
Atomic events vs. TDT events vs. topics
Events as a basis for segmenting documents and classifying document fragments as matching a question
Event algebra will allow
  – grouping sub-events
  – linking related events
  – detecting updates

Detecting an event
Events can be detected on
  – participants (named entities, semantic roles)
  – time
  – location
  – limited verbal features
Collected data and human annotations
Implemented event detection system (80% accuracy)
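
One way to picture detection on participants, time, and location is a simple matching rule between two event mentions. This is a hedged sketch, not the implemented system: the `Mention` record, the one-day window, the shared-participant threshold, and the sample data are all assumptions, and verbal features are ignored.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Mention:
    """Hypothetical record for one event mention in a document fragment."""
    participants: frozenset
    when: date
    where: str


def same_event(a, b, max_days=1, min_shared=1):
    """Treat two mentions as one event if who, when, and where all line up."""
    shared = len(a.participants & b.participants)
    close_in_time = abs((a.when - b.when).days) <= max_days
    same_place = a.where.lower() == b.where.lower()
    return shared >= min_shared and close_in_time and same_place


# Made-up mentions for illustration.
m1 = Mention(frozenset({"rebels", "government troops"}), date(2002, 6, 1), "Jalalabad")
m2 = Mention(frozenset({"rebels"}), date(2002, 6, 2), "jalalabad")
m3 = Mention(frozenset({"rebels"}), date(2002, 6, 20), "Jalalabad")
```

Under these settings `same_event(m1, m2)` holds, while `same_event(m1, m3)` does not because the dates are too far apart. A later mention of the same participants and place outside the time window is the kind of case the event algebra's update detection would need to handle.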

Goals for the first six months
Initial FrameNet parser (limited coverage)
Identification of participants, time, location
Identifying paraphrases from comparable news reports on the same event
Adapting information fusion from summarization to question-answering
Building prototype Q&A system

Progress on Other Items
Initial syntactic question paraphrasing
Hierarchy of question types
Tools for searching a specific collection
Event recognition prototype built
Started analysis of data sources for questions with multiple or long answers

Goals for the next six months
Full syntactic and lexical question paraphrasing
Classifier for choosing appropriate question type
Integrate semantic labels into question analysis
Answer strategy selector
Initial context management module
Participation in TREC
Data collection and classification of questions with multiple or long answers