Scott Wen-tau Yih Joint work with Ming-Wei Chang, Chris Meek, Andrzej Pastusiak Microsoft Research The 51st Annual Meeting of the Association for Computational Linguistics (ACL-2013)

Given a factoid question, find the sentence that:
- Contains the answer
- Can sufficiently support the answer

Q: Who won the best actor Oscar in 1973?
S1: Jack Lemmon was awarded the Best Actor Oscar for Save the Tiger (1973).
S2: Academy award winner Kevin Spacey said that Jack Lemmon is remembered as always making time for others.

Who won the best actor Oscar in 1973?

"Lemmon was awarded the Best Supporting Actor Oscar in 1956 for Mister Roberts (1955) and the Best Actor Oscar for Save the Tiger (1973), becoming the first actor to achieve this rare double…"
Source: Jack Lemmon -- Wikipedia

Tree edit-distance [Punyakanok, Roth & Yih, 2004]
- Represent question and sentence using their dependency trees
- Measure their distance by the minimal number of edit operations: change, delete & insert

Quasi-synchronous grammar [Wang et al., 2007]
Tree-edit CRF [Wang & Manning, 2010]
Discriminative learning on tree-edit features [Heilman & Smith, 2010; Yao et al., 2013]
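The edit-operation idea can be sketched with a flat dynamic program over word sequences. This is a simplification for illustration only: the cited work operates on dependency trees rather than flat sequences, and `edit_distance` with uniform unit costs is not the actual model.

```python
def edit_distance(q_words, s_words):
    """Minimal number of change/delete/insert operations turning one
    word sequence into the other (a flat analogue of tree edit distance)."""
    m, n = len(q_words), len(s_words)
    # dp[i][j] = cost of editing q_words[:i] into s_words[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i              # delete all remaining question words
    for j in range(n + 1):
        dp[0][j] = j              # insert all remaining sentence words
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            change = 0 if q_words[i - 1] == s_words[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j - 1] + change,  # change (or exact match)
                           dp[i - 1][j] + 1,           # delete
                           dp[i][j - 1] + 1)           # insert
    return dp[m][n]
```

Identical sequences get distance 0; every substitution, deletion, or insertion adds one.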

Q: Who won the best actor Oscar in 1973?
S: Jack Lemmon was awarded the Best Actor Oscar.

Can matching Q & S directly perform comparably?

Q: Who won the best actor Oscar in 1973?
S: Jack Lemmon was awarded the Best Actor Oscar.

Using a simple word alignment setting:
- Link words in Q that are related to words in S
- Determine whether two words can be semantically associated using recently developed lexical semantic models

Investigate unstructured and structured models that incorporate rich lexical semantic information.
- Enhanced lexical semantic models (beyond WordNet) are crucial in improving performance
- Simple unstructured BoW models become very competitive
- Outperform previous tree-matching approaches

- Introduction
- Problem definition
- Lexical semantic models
- QA matching models
- Experiments
- Conclusions

What is the fastest car in the world?
The Jaguar XJ220 is the dearest, fastest and most sought after car on the planet.

Words that are semantically related [Harabagiu & Moldovan, 2001]

- Introduction
- Problem definition
- Lexical semantic models
  - Synonymy/Antonymy
  - Hypernymy/Hyponymy (the Is-A relation)
  - Semantic word similarity
- QA matching models
- Experiments
- Conclusions

Synonyms can be easily found in a thesaurus, but the degree of synonymy provides more information (e.g., ship vs. boat).

Polarity Inducing LSA (PILSA) [Yih, Zweig & Platt, EMNLP-CoNLL-12]
- A vector space model that encodes polarity information
- Synonyms cluster together in this space
- Antonyms lie at the opposite ends of a unit sphere (e.g., hot/burning vs. cold/freezing)

Thesaurus entries (synonyms; antonyms):
- Acrimony: rancor, conflict, bitterness; goodwill, affection
- Affection: goodwill, tenderness, fondness; acrimony, rancor

Inducing polarity: build a thesaurus-group-by-word matrix (Group 1 "acrimony", Group 2 "affection"; columns acrimony, rancor, goodwill, affection, …) in which a group's antonyms receive negative entries.
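A minimal sketch of the polarity-inducing step on the two groups above. For simplicity the raw signed columns are used directly as word vectors here; PILSA additionally applies an LSA projection on top of this signed matrix.

```python
from math import sqrt

words = ["acrimony", "rancor", "goodwill", "affection"]

# Thesaurus-group-by-word matrix: +1 when the word is a synonym of the
# group's head word, -1 when it is an antonym (the polarity-inducing step).
M = [
    [ 1,  1, -1, -1],   # group "acrimony"
    [-1, -1,  1,  1],   # group "affection"
]

def word_vector(word):
    j = words.index(word)
    return [row[j] for row in M]       # the word's column in the matrix

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# Synonyms get a positive cosine, antonyms a negative one.
syn = cosine(word_vector("acrimony"), word_vector("rancor"))    # positive
ant = cosine(word_vector("acrimony"), word_vector("goodwill"))  # negative
```

The signed matrix is what makes antonyms land at opposite ends of the space instead of merely being "related".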

Q: What color is Saturn?
S: Saturn is a giant gas planet with brown and beige clouds.

Q: Who wrote Moonlight Sonata?
S: Ludwig van Beethoven composed the Moonlight Sonata in 1801.

A KB that contains 2.7 million concepts
- Relations discovered by Hearst patterns from 1.68 billion Web pages
- Degree of relations based on frequency of term co-occurrences
- Evaluated on SemEval-12 Relational Similarity [Zhila et al., NAACL-HLT-2013]

"Y is a kind of X" – What is the most illustrative example word pair?
X → Y: automobile → van, wheat → bread, weather → rain, politician → senator
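The pattern-based extraction can be sketched with two classic Hearst patterns. These regexes and `extract_isa` are illustrative toys, not the KB's actual pattern inventory, which is far richer and applied at Web scale.

```python
import re

# Two classic Hearst patterns (illustrative subset).
SUCH_AS = re.compile(r"(\w+)\s+such as\s+(\w+)")          # "X such as Y"
KIND_OF = re.compile(r"(\w+)\s+is a kind of\s+(\w+)")     # "Y is a kind of X"

def extract_isa(text):
    """Return (hypernym, hyponym) pairs found by the patterns above."""
    pairs = []
    for m in SUCH_AS.finditer(text):
        pairs.append((m.group(1), m.group(2)))   # X such as Y  -> (X, Y)
    for m in KIND_OF.finditer(text):
        pairs.append((m.group(2), m.group(1)))   # Y is a kind of X -> (X, Y)
    return pairs
```

Counting how often each pair is extracted across a corpus would give the frequency-based degree of relation mentioned above.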

A “back-off” solution when the exact lexical relation is unclear: measuring semantic word similarity.
- Vector space model (VSM): similarity score is derived by cosine
- Heterogeneous VSMs [Yih & Qazvinian, HLT-NAACL-2012]
  - Wikipedia context vectors
  - RNN language model word embedding [Mikolov et al., 2010]
  - Clickthrough-based latent semantic model [Gao et al., SIGIR-2011]
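A sketch of the heterogeneous-VSM idea, assuming tiny hand-built spaces in place of the real models; averaging the per-space cosines is one simple way to combine them, not necessarily the cited paper's exact combination.

```python
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def hetero_similarity(w1, w2, spaces):
    """Average the cosine over several vector spaces (stand-ins for
    Wikipedia context vectors, RNN embeddings, a clickthrough model)."""
    return sum(cosine(sp[w1], sp[w2]) for sp in spaces) / len(spaces)

# Toy spaces; real ones would come from the models listed above.
space_a = {"car": [1.0, 0.0], "automobile": [0.9, 0.1], "banana": [0.0, 1.0]}
space_b = {"car": [0.0, 1.0], "automobile": [0.1, 0.9], "banana": [1.0, 0.0]}

related = hetero_similarity("car", "automobile", [space_a, space_b])
unrelated = hetero_similarity("car", "banana", [space_a, space_b])
```

Combining several spaces hedges against any single model's blind spots, which is exactly the point of a back-off similarity signal.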

- Introduction
- Problem definition
- Lexical semantic models
- QA matching models
  - Bag-of-words model
  - Learning latent structures
- Experiments
- Conclusions

Word alignment – complete bipartite matching: every word in the question maps to every word in the sentence.

What is the fastest car in the world?
The Jaguar XJ220 is the dearest, fastest and most sought after car on the planet.
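The complete bipartite view can be sketched as a score that compares every question word with every sentence word and keeps the best match per question word; `toy_sim` is a hand-built stand-in for the lexical semantic models, and averaging is one simple aggregation choice.

```python
def bow_match_score(q_words, s_words, sim):
    """Complete bipartite matching: for each question word, take its
    best similarity against all sentence words, then average."""
    return sum(max(sim(q, s) for s in s_words) for q in q_words) / len(q_words)

# Toy word-pair similarity: exact match scores 1.0, plus one
# hand-coded semantically related pair (world ~ planet).
RELATED = {("world", "planet"): 0.8}

def toy_sim(q, s):
    return RELATED.get((q, s), 1.0 if q == s else 0.0)

score = bow_match_score(["fastest", "car", "world"],
                        ["fastest", "car", "planet"], toy_sim)
```

Here "world" still contributes through its related word "planet", which is what the enhanced lexical semantic models buy over exact matching.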

Issue of the bag-of-words models: unrelated parts of the sentence will be paired with words in the question.

Q: Which was the first movie that James Dean was in?
S: James Dean, who began as an actor on TV dramas, didn’t make his screen debut until 1951’s “Fixed Bayonet.”

The latent structure: word alignment with many-to-one constraints.
- Each word in Q needs to be linked to a word in S.
- Each word in S can be linked to zero or more words in Q.

What is the fastest car in the world?
The Jaguar XJ220 is the dearest, fastest and most sought after car on the planet.
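When the alignment score decomposes over individual links, inference for this many-to-one alignment reduces to a per-question-word argmax, as sketched below; `toy_score` is a hand-built stand-in for the learned pair scores (in the actual model the alignment is a latent variable inside a discriminative learner).

```python
def best_alignment(q_words, s_words, score):
    """Many-to-one alignment: every question word links to exactly one
    sentence word; a sentence word may receive zero or more links.
    With a link-decomposable score, per-word argmax is optimal."""
    return {q: max(s_words, key=lambda s: score(q, s)) for q in q_words}

def toy_score(q, s):
    # Stand-in for a learned lexical-semantic pair score.
    table = {("won", "awarded"): 0.9, ("oscar", "oscar"): 1.0}
    return table.get((q.lower(), s.lower()), 1.0 if q.lower() == s.lower() else 0.0)

align = best_alignment(["won", "Oscar"], ["awarded", "Best", "Oscar"], toy_score)
```

Because sentence words may go unlinked, extraneous sentence material no longer drags down the match, addressing the bag-of-words issue above.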

- Introduction
- Problem definition
- Lexical semantic models
- QA matching models
- Experiments
  - Dataset
  - Evaluation metrics
  - Results
- Conclusions

Created based on TREC QA data, with a manual judgment for each question/answer-sentence pair.
- Training: Q/A pairs from TREC 8–12; Clean: 5,919 manually judged Q/A pairs (100 questions)
- Development and Test: Q/A pairs from TREC 13
  - Dev: 1,374 Q/A pairs (84 questions)
  - Test: 1,866 Q/A pairs (100 questions)
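MAP and MRR, the metrics commonly reported for this answer-selection setup, can be computed from each question's relevance labels in ranked order; this sketch assumes binary labels.

```python
def average_precision(labels):
    """labels: 1/0 relevance of a question's sentences in ranked order."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            total += hits / rank      # precision at each relevant hit
    return total / hits if hits else 0.0

def mean_average_precision(ranked):
    """ranked: one label list per question."""
    return sum(average_precision(l) for l in ranked) / len(ranked)

def mean_reciprocal_rank(ranked):
    """Reciprocal rank of the first relevant sentence, averaged."""
    total = 0.0
    for labels in ranked:
        total += next((1.0 / r for r, rel in enumerate(labels, 1) if rel), 0.0)
    return total / len(ranked)
```

MAP rewards ranking all supporting sentences high, while MRR only cares where the first one appears.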

I&L: Identical Word & Lemma Match

WN: WordNet Syn, Ant, Hyper/Hypo

LS: Enhanced Lexical Semantics

NER&AnsType: Named Entity & Answer Type Checking

*Updated numbers; different from the version in the proceedings

Three reasons/sources of errors:
- Uncovered or inaccurate entity relations
- Lack of robust question analysis
- Need for high-level semantic representation and inference

Q: In what film is Gordon Gekko the main character?
S: He received a best actor Oscar in 1987 for this role as Gordon Gekko in “Wall Street”.

Answer sentence selection using word alignment, leveraging enhanced lexical semantic models to find semantically related words.

Key findings:
- Rich lexical semantic information improves both unstructured (BoW) and structured (LCLR) models
- Outperforms the dependency tree matching approaches

Future work:
- Applications in community QA, paraphrasing, textual entailment
- High-level semantic representations