1
Probabilistic Lexical Models for Textual Inference
Eyal Shnarch, Ido Dagan, Jacob Goldberger
Bar Ilan University @ IBM, July 2012 (34 slides)
2
The entire talk in a single sentence: we address lexical textual inference with a principled probabilistic model which improves the state of the art.
3
Outline
1. we address lexical textual inference
2. with a principled probabilistic model
3. which improves the state of the art
4
Part 1: lexical textual inference
5
Textual inference – useful in many NLP apps
T: In the Battle of Waterloo, 18 Jun 1815, the French army, led by Napoleon, was crushed.
H: Napoleon was defeated in Belgium.
Other candidate texts: Napoleon was Emperor of the French from 1804 to 1815. / Napoleon was not tall enough to win the Battle of Waterloo. / At Waterloo Napoleon did surrender... Waterloo - finally facing my Waterloo. / Napoleon engaged in a series of wars, and won many.
6
BIU NLP lab (Chaya Liebeskind)
7
Lexical textual inference
- Complex systems use a parser
- Lexical inference rules link terms from T to H (1st- or 2nd-order co-occurrence)
- Lexical rules come from lexical resources
- H is inferred from T iff all its terms are inferred
Example: T: In the Battle of Waterloo, 18 Jun 1815, the French army, led by Napoleon, was crushed. H: Napoleon was defeated in Belgium.
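The coverage criterion above (H is inferred from T iff all its terms are inferred) can be sketched as a simple check. The term sets and rules below are toy illustrations, not the system's actual resources:

```python
def is_inferred(text_terms, hyp_terms, rules):
    """H is inferred from T iff every hypothesis term is covered:
    it appears in T directly, or a lexical rule links a T term to it.
    `rules` is a set of (lhs, rhs) pairs drawn from lexical resources."""
    def covered(h):
        return h in text_terms or any((t, h) in rules for t in text_terms)
    return all(covered(h) for h in hyp_terms)

# hypothetical rules in the spirit of the Waterloo example
rules = {("crushed", "defeated"), ("Waterloo", "Belgium")}
T = {"Waterloo", "French", "army", "Napoleon", "crushed"}
H = {"Napoleon", "defeated", "Belgium"}
print(is_inferred(T, H, rules))  # True: every H term is covered
```

Dropping the ("Waterloo", "Belgium") rule leaves "Belgium" uncovered, so the same call would return False.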
8
Textual inference for ranking
Q: In which battle was Napoleon defeated?
Candidate answers:
1. In the Battle of Waterloo, 18 Jun 1815, the French army, led by Napoleon, was crushed.
2. Napoleon was Emperor of the French from 1804 to 1815.
3. Napoleon was not tall enough to win the Battle of Waterloo.
4. At Waterloo Napoleon did surrender... Waterloo - finally facing my Waterloo.
5. Napoleon engaged in a series of wars, and won many.
9
Ranking textual inference – prior work
Syntactic-based methods: transform T's parsed tree into H's parsed tree; based on principled ML models (Wang et al. 07, Heilman and Smith 10, Wang and Manning 10).
Heuristic lexical methods: fast, easy to implement, highly competitive; practical across genres and languages (MacKinlay and Baldwin 09, Clark and Harrison 10, Majumdar and Bhattacharyya 10).
10
Lexical entailment scores – current practice
- Count covered/uncovered terms (Majumdar and Bhattacharyya, 2010; Clark and Harrison, 2010)
- Similarity estimation (Corley and Mihalcea, 2005; Zanzotto and Moschitti, 2006)
- Vector space (MacKinlay and Baldwin, 2009)
Mostly heuristic.
11
Part 2: a principled probabilistic model
12
Probabilistic model – overview
H: which battle was Napoleon defeated (terms h1...h3)
T: Battle of Waterloo ... French army led by Napoleon was crushed (terms t1...t6)
Three levels: knowledge integration, term level (variables x1...x3), sentence level.
Annotations are available at the sentence level only.
13
Knowledge integration
- Distinguish resources' reliability levels: WordNet >> similarity-based thesauri (Lin, 1998; Pantel and Lin, 2002)
- Consider transitive chain length: the longer a chain of rules is, the lower its probability
- Consider multiple pieces of evidence: more evidence means higher probability
(A rule r links two terms; rules compose into transitive chains from a text term t to a hypothesis term; several chains give multiple evidence.)
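The chain-length intuition above can be sketched as follows. The reliability values are invented for illustration; as on the slide, there is one parameter per lexical resource, and a chain's probability is the product over its rules, so longer chains score lower:

```python
# hypothetical reliability parameters, one per lexical resource
THETA = {"WordNet": 0.9, "Lin-thesaurus": 0.6, "Wikipedia": 0.8}

def chain_prob(chain):
    """Probability that a transitive chain of rules is valid.
    `chain` lists the resource that suggested each rule in the chain."""
    p = 1.0
    for resource in chain:
        p *= THETA[resource]
    return p

# a one-rule WordNet chain beats a two-rule chain through a noisier thesaurus
print(chain_prob(["WordNet"]))                   # 0.9
print(chain_prob(["WordNet", "Lin-thesaurus"]))  # 0.54
```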
14
Probabilistic model – term level
Chains of rules link text terms (t1...t6), possibly via intermediate terms t', to each hypothesis term (h1...h3); multiple chains to the same term are combined with an OR gate (multiple evidence).
Each rule r carries the reliability level of the resource which suggested it; this level's parameters: one per input lexical resource (ACL 11 short paper).
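One natural reading of the OR gate over multiple chains is a noisy-OR: the term stays uninferred only if every chain fails. A minimal sketch, assuming each chain's validity probability has already been computed (e.g. by the chain product above):

```python
def term_prob(chain_probs):
    """P(x_t = 1): noisy-OR over the chains linking hypothesis term h_t
    to the text. The term is uninferred only if every chain fails."""
    p_fail = 1.0
    for p in chain_probs:
        p_fail *= 1.0 - p
    return 1.0 - p_fail

# two independent pieces of evidence beat either one alone
print(term_prob([0.5]))       # 0.5
print(term_prob([0.5, 0.5]))  # 0.75
```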
15
Probabilistic model – overview (recap): knowledge integration, term level, sentence level.
16
Probabilistic model – sentence level
We define hidden binary random variables: x_t = 1 iff h_t is inferred from T (zero otherwise).
Modeling the final sentence-level decision y with an AND gate is the most intuitive choice; however, it is too strict and does not model dependency between terms.
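The AND-gate decision the slide calls too strict amounts to a plain product of term probabilities (terms treated as independent). A sketch with made-up term probabilities:

```python
def and_gate_prob(x_probs):
    """AND-gate sentence decision: H is inferred iff every term is
    inferred, so P(y = 1) is the product of the term probabilities."""
    p = 1.0
    for p_x in x_probs:
        p *= p_x
    return p

# one weakly-covered term drives the whole sentence score to near zero,
# which is why the AND gate is too strict for ranking
print(and_gate_prob([0.9, 0.9, 0.05]))  # 0.0405
```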
17
Probabilistic model – sentence level (M-PLM)
We define another binary random variable: y_t, the inference decision for the prefix h_1 ... h_t (x_t = 1 iff h_t is inferred from T, zero otherwise).
P(y_t = 1) depends on y_{t-1} and x_t; the transition probabilities are this level's parameters, and the final sentence-level decision is the last y_t.
18
M-PLM – inference
The quantities q_ij(k) can be computed efficiently with a forward algorithm.
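The forward computation over the prefix decisions y_1 ... y_n can be sketched as below. The transition table and its values are illustrative only, not the paper's learned parameters, and the slide's q_ij(k) quantities are not reproduced; the point is that marginalizing over x_t at each step keeps inference linear in the hypothesis length:

```python
# hypothetical transition table: P(y_t = 1 | y_{t-1}, x_t)
TRANS = {(1, 1): 0.95, (1, 0): 0.30, (0, 1): 0.20, (0, 0): 0.05}

def forward_sentence_prob(x_probs):
    """Forward pass over prefix decisions y_1..y_n.
    x_probs[t] = P(x_t = 1), the term-level probability that hypothesis
    term t is inferred from the text."""
    p_y = 1.0  # the empty prefix is trivially inferred
    for p_x in x_probs:
        p_y = (p_y * (p_x * TRANS[(1, 1)] + (1 - p_x) * TRANS[(1, 0)])
               + (1 - p_y) * (p_x * TRANS[(0, 1)] + (1 - p_x) * TRANS[(0, 0)]))
    return p_y

# a mostly-covered hypothesis survives one weak term better than an AND gate
print(forward_sentence_prob([0.9, 0.9, 0.1]))  # ~0.306, vs 0.081 for a plain AND
```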
19
M-PLM – summary
Observed: the lexical rules which link terms. Learning: we developed an EM scheme to jointly learn all parameters.
20
So how does our model do?
Q: In which battle was Napoleon defeated?
1. In the Battle of Waterloo, 18 Jun 1815, the French army, led by Napoleon, was crushed.
2. Napoleon was Emperor of the French from 1804 to 1815.
3. Napoleon was not tall enough to win the Battle of Waterloo.
4. At Waterloo Napoleon did surrender... Waterloo - finally facing my Waterloo.
5. Napoleon engaged in a series of wars, and won many.
21
Part 3: improving the state of the art
22
Evaluations – data sets
Ranking in passage retrieval for QA (Wang et al. 07): 5700/1500 question-candidate answer pairs from TREC 8-13, manually annotated. Notable line of work from recent years: Punyakanok et al. 04, Cui et al. 05, Wang et al. 07, Heilman and Smith 10, Wang and Manning 10.
Recognizing textual entailment within a corpus: 20,000 text-hypothesis pairs in each of RTE-5 and RTE-6; originally constructed for classification.
23
Evaluations – baselines
Syntactic generative models: require parsing; apply sophisticated machine learning methods (Punyakanok et al. 04, Cui et al. 05, Wang et al. 07, Heilman and Smith 10, Wang and Manning 10).
Lexical model – Heuristically Normalized PLM (HN-PLM): an AND gate at the sentence level, with heuristic normalizations added to address its disadvantages (TextInfer workshop 11); performance in line with the best RTE systems.
24
QA results – syntactic baselines
25
QA results – syntactic baselines + HN-PLM: gains of +0.7% and +1%
26
QA results – baselines + M-PLM: gains of +3.2% and +3.5%
27
RTE results – M-PLM vs. HN-PLM: gains of +7.3%, +1.9%, +6.0%, +3.6%
28
First approach - summary
- Clean probabilistic lexical model, usable as a lexical component or as a stand-alone inference system
- Superiority of principled methods over heuristic ones
- Attractive passage-retrieval ranking method; code available at BIU NLP downloads
M-PLM limits:
- Processing is term-order dependent
- Lower performance on classification vs. HN-PLM; does not normalize well across hypothesis lengths
29
Part 4 - a (very) new second approach: resources as observers
30
Each resource is a witness: its rules, possibly via intermediate terms t', testify whether a hypothesis term (h1...h3) is inferred from the text terms (t1...t6).
31
Bottom-up witnesses model: the term-level variables x1...x3, one per hypothesis term h1...h3, feed an AND gate that produces the sentence decision y; the model is trained by maximizing the likelihood.
32
Advantages of the second approach
33
(Near) future plans
- Context model
- There are other languages than English: deploy the new version of a Wikipedia-based lexical resource with the Italian dump; test the probabilistic lexical models for other languages; cross-language textual entailment
34
Cross Language Textual Entailment
H (Italian): quale battaglia fu sconfitto Napoleone ("in which battle was Napoleon defeated")
T (English): Battle of Waterloo ... French army led by Napoleon was crushed
Resources: Italian monolingual, English-Italian phrase table, English monolingual.
Thank You
36
Demo examples:
[Bap,WN] no transitivity
Jack and Jill go_up the hill to fetch a pail of water
Jack and Jill climbed a mountain to get a bucket of fluid

[WN,Wiki]
Barak Obama's Buick got stuck in Dublin in a large Irish crowd
United_States_President's car got stuck in Ireland, surrounded by many people
(Barak Obama: WN is out of date, need a new version of Wikipedia)
Bill_Clinton's Buick got stuck in Dublin in a large Irish crowd
United_States_President's car got stuck in Ireland, surrounded by many people

[Bap,WN] this time with transitivity
Jack and Jill go_up the hill to fetch a pail of water
Jack and Jill climbed a mountain to get a bucket of fluid

[VO,WN,Wiki]
in the Battle_of_Waterloo the French army led by Napoleon was crushed
in which battle Napoleon was defeated?

[all] ranking:
1. in the Battle_of_Waterloo the French army led by Napoleon was crushed (72%)
2. Napoleon was not tall enough to win the Battle_of_Waterloo (47%)
3. at Waterloo Napoleon did surrender... Waterloo - finally facing my Waterloo (34%)
4. Napoleon engaged in a series of wars, and won many (47%)
5. Napoleon was Emperor of the French from 1804 to 1815 (9%)
[a bit long run]