Published by Rebecca Day. Modified over 9 years ago.
1
Do Supervised Distributional Methods Really Learn Lexical Inference Relations?
Omer Levy, Ido Dagan (Bar-Ilan University, Israel)
Steffen Remus, Chris Biemann (Technische Universität Darmstadt, Germany)
2
Lexical Inference
3
Lexical Inference: Task Definition
4
Distributional Methods of Lexical Inference
5
Unsupervised Distributional Methods
6
Supervised Distributional Methods
7
Main Questions
8
Experiment Setup
9
Word Representations
3 Representation Methods: PPMI, SVD (over PPMI), word2vec (SGNS)
3 Context Types:
  Bag-of-Words (5 words to each side)
  Positional (2 words to each side + position)
  Dependency (all syntactically-connected words + dependency)
Trained on English Wikipedia
5 Lexical-Inference Datasets:
  Kotlerman et al., 2010
  Baroni and Lenci, 2011 (BLESS)
  Baroni et al., 2012
  Turney and Mohammad, 2014
  Levy et al., 2014
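To make the PPMI representation with bag-of-words contexts more concrete, here is a minimal sketch. It assumes the standard definition of positive pointwise mutual information, PPMI(w, c) = max(0, log(P(w, c) / (P(w) P(c)))), estimated from raw co-occurrence counts. The function names `bow_pairs` and `ppmi` are hypothetical, and the sketch is not the authors' actual pipeline, which the slides do not describe in detail.

```python
import math
from collections import Counter


def bow_pairs(tokens, window=5):
    """Collect (word, context) pairs from a symmetric bag-of-words window."""
    pairs = []
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((word, tokens[j]))
    return pairs


def ppmi(pairs):
    """Return {(word, context): PPMI score} estimated from co-occurrence pairs."""
    pair_counts = Counter(pairs)
    word_counts = Counter(w for w, _ in pairs)
    ctx_counts = Counter(c for _, c in pairs)
    total = len(pairs)
    scores = {}
    for (w, c), n in pair_counts.items():
        # PMI = log( P(w, c) / (P(w) * P(c)) ), with probabilities from counts
        pmi = math.log((n * total) / (word_counts[w] * ctx_counts[c]))
        scores[(w, c)] = max(0.0, pmi)  # clip negative associations to zero
    return scores
```

Each word's vector is then the row of PPMI scores over all contexts. The window of 5 matches the bag-of-words setting on the slide; a positional variant would use a window of 2 and attach the relative offset to each context.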
10
Supervised Methods
11
Are current supervised DMs better than unsupervised DMs?
12
Previously Reported Success
Prior Art:
  Supervised DMs better than unsupervised DMs
  Accuracy >95% (in some datasets)
Our Findings:
  High accuracy of supervised DMs stems from lexical memorization
13
Lexical Memorization
14
Avoid lexical memorization with lexical train/test splits
  If “animal” appears in train, it cannot appear in test
Lexical splits applied to all our experiments
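One way to obtain such a split is sketched below. It assumes the vocabulary is partitioned at random and that any pair mixing a train word with a test word is discarded; the slide does not state how mixed pairs are handled, so that choice, along with the function name `lexical_split` and its parameters, is an assumption for illustration.

```python
import random


def lexical_split(pairs, test_fraction=0.3, seed=0):
    """Split (x, y, label) pairs so that no word occurs in both train and test."""
    vocab = sorted({w for x, y, _ in pairs for w in (x, y)})
    rng = random.Random(seed)
    rng.shuffle(vocab)
    test_vocab = set(vocab[: int(len(vocab) * test_fraction)])

    train, test = [], []
    for x, y, label in pairs:
        in_test = (x in test_vocab, y in test_vocab)
        if all(in_test):
            test.append((x, y, label))
        elif not any(in_test):
            train.append((x, y, label))
        # Assumption: pairs with one train word and one test word are dropped.
    return train, test
```

Dropping mixed pairs shrinks the dataset, which is the price of guaranteeing that a classifier cannot score well on test simply by remembering that a word such as “animal” was usually a hypernym in train.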
15
Experiments without Lexical Memorization
4 supervised methods vs. 1 unsupervised method (cosine similarity)
Cosine similarity outperforms all supervised DMs in 2/5 datasets
Conclusion: supervised DMs are not necessarily better
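The unsupervised baseline can be sketched in a few lines. Cosine similarity itself is standard; turning it into a yes/no decision needs a cutoff, and the `threshold` parameter below (and its default value) is a hypothetical placeholder, since the slide does not say how the cutoff was chosen.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def predict_inference(x_vec, y_vec, threshold=0.5):
    """Classify a pair as positive when its similarity reaches the threshold.

    Assumption: the threshold value is illustrative, not taken from the slides.
    """
    return cosine(x_vec, y_vec) >= threshold
```

Note that cosine similarity is symmetric, so this baseline cannot tell "x entails y" from "y entails x"; that it still beats the supervised methods on some datasets is what makes the comparison on the slide notable.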
18
In practice:
  Almost as well as Concat & Diff
  Best method in 1/5 datasets
21
Prototypical Hypernyms
23
Recall: portion of real positive examples (✔) classified true
Match Error: portion of artificial examples (✘) classified true
Bottom-right: prefer ✔ over ✘ (good classifiers)
Top-left: prefer ✘ over ✔ (worse than random)
Diagonal: cannot distinguish ✔ from ✘ (predicted by hypothesis)
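The two quantities defined above share one computation: the fraction of a set of pairs that the classifier labels true. The sketch below assumes the classifier is any callable taking a word pair; the name `recall_and_match_error` and the toy pairs in the usage note are made up, and how the artificial examples are constructed is not specified in this transcript.

```python
def positive_rate(predictions):
    """Fraction of predictions that are true."""
    return sum(1 for p in predictions if p) / len(predictions)


def recall_and_match_error(classify, real_positives, artificial_pairs):
    """Return (recall, match_error) for a pair classifier.

    recall: share of real positive pairs classified true.
    match_error: share of artificial pairs classified true.
    """
    recall = positive_rate([classify(x, y) for x, y in real_positives])
    match_error = positive_rate([classify(x, y) for x, y in artificial_pairs])
    return recall, match_error
```

As a usage example, a hypothetical classifier that looks only at the second word, `lambda x, y: y == "animal"`, scores recall 2/3 on `[("cat", "animal"), ("dog", "animal"), ("rose", "flower")]` and match error 1/2 on `[("rose", "animal"), ("cat", "flower")]`. A classifier of that kind accepts artificial pairs about as readily as real ones, which places it near the diagonal.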
24
Prototypical Hypernyms
25
Prototypical Hypernyms: Analysis
26
Conclusions
28
What if the necessary relational information does not exist in contextual features?
29
The Limitations of Contextual Features
31
Also in the Paper…
Theoretical Analysis
  Explains our empirical findings
Sim Kernel: A new supervised method
  Partially addresses the issue of prototypical hypernyms
32
Theoretical Analysis
33
Lexical Inference: Motivation
34
Lexical Inference