Computing Word-Pair Antonymy. *Saif Mohammad, *Bonnie Dorr, φGraeme Hirst. *Univ. of Maryland, φUniv. of Toronto. EMNLP 2008.

Introduction
Antonymy: a pair of semantically contrasting words. Examples:
– Strongly antonymous: hot / cold
– Semantically contrasting: enemy / fan
– Not antonymous: penguin / clown

Usage
– Detecting contradictions
– Detecting humor
– Automatic creation of thesauri

Problem Definition
– Given a thesaurus, identify the pairs of categories that are antonymous.
– Assign a degree of antonymy to each pair of antonymous categories.

Hypothesis (1)
The Co-occurrence Hypothesis of Antonyms
– Antonymous word pairs occur together much more often than other word pairs.

Hypothesis (1)
Empirical support:
– 1,000 antonymous pairs from WordNet
– 1,000 randomly generated word pairs
– Corpus: BNC, window size 5
– Calculate the mutual information (MI) for each word pair and average it (a sketch of this computation follows).
(Table: average and standard deviation of MI, antonymous pairs vs. random pairs.)
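The computation behind this comparison is straightforward. Below is a minimal sketch, not the authors' code, of estimating windowed PMI from a tokenized corpus and averaging it over a list of word pairs; the function names, window handling, and base-2 logarithm are assumptions.

```python
import math
from collections import Counter

def pmi_table(sentences, window=5):
    """Estimate pointwise mutual information for word pairs that co-occur
    within `window` tokens of each other in a tokenized, lowercased corpus."""
    word_count = Counter()
    pair_count = Counter()
    total_pairs = 0
    for tokens in sentences:
        word_count.update(tokens)
        for i, w in enumerate(tokens):
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                pair_count[tuple(sorted((w, tokens[j])))] += 1
                total_pairs += 1
    total_words = sum(word_count.values())
    pmi = {}
    for (w1, w2), n in pair_count.items():
        p_pair = n / total_pairs
        p1 = word_count[w1] / total_words
        p2 = word_count[w2] / total_words
        pmi[(w1, w2)] = math.log2(p_pair / (p1 * p2))
    return pmi

def average_pmi(word_pairs, pmi):
    """Average PMI over a set of word pairs, skipping pairs never observed together."""
    scores = [pmi[tuple(sorted(p))] for p in word_pairs if tuple(sorted(p)) in pmi]
    return sum(scores) / len(scores) if scores else float("nan")
```

Comparing `average_pmi(antonymous_pairs, pmi)` with `average_pmi(random_pairs, pmi)` reproduces the kind of comparison the slide reports.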

Hypothesis (2)
The Distributional Hypothesis of Antonyms
– Antonyms occur in similar contexts more often than non-antonymous words.
– Example: work: activity of doing a job; play: activity of relaxation.

Hypothesis (2)
Empirical support:
– Use the same sets of word pairs as in Hypothesis (1).
– Calculate the distributional distance between their thesaurus categories.
(Table: average and standard deviation of distributional distance, antonymous pairs vs. random pairs.)

Distributional Distance between Two Thesaurus Categories
– c1, c2: thesaurus categories
– I(x, y): pointwise mutual information between x and y
– T(c): the set of all words w such that I(c, w) > 0
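The distance formula itself is not reproduced on the slide, only its ingredients. As an illustration only, a Lin-style similarity over the two categories' PMI profiles is one natural way to combine I(c, w) and T(c); the sketch below is an assumption, not necessarily the measure used in the paper.

```python
def lin_distributional_similarity(pmi_c1, pmi_c2):
    """Lin-style similarity between two thesaurus categories c1 and c2.

    pmi_c1, pmi_c2: dicts mapping word w -> I(c, w); only positive entries
    are kept, so their key sets play the role of T(c1) and T(c2).
    """
    t1 = {w: v for w, v in pmi_c1.items() if v > 0}
    t2 = {w: v for w, v in pmi_c2.items() if v > 0}
    shared = set(t1) & set(t2)
    numerator = sum(t1[w] + t2[w] for w in shared)
    denominator = sum(t1.values()) + sum(t2.values())
    return numerator / denominator if denominator else 0.0
```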

Method
– Determine pairs of thesaurus categories that are contrasting in meaning.
– Use the co-occurrence and distributional hypotheses to determine the degree of antonymy of word pairs.

Method
– 16 affix rules were applied to the Macquarie Thesaurus; 2,734 word pairs were generated as a seed set.
– Exceptions (rule-generated pairs that are not actually contrasting, e.g., sect / insect) are relatively few.
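The 16 affix rules are not listed on this slide, so the rules below are a small illustrative subset, and the function is a sketch of the idea rather than the paper's procedure.

```python
# Illustrative affix patterns (examples only; the paper's inventory has 16 rules).
AFFIX_PREFIXES = ["un", "in", "dis", "im", "non", "anti"]

def generate_seed_pairs(vocabulary, prefixes=AFFIX_PREFIXES):
    """Pair each word with the form obtained by adding an antonym-generating
    prefix, when both forms are in the vocabulary (e.g., clear / unclear).

    Spurious pairs such as sect / insect can slip through; per the slide,
    such exceptions are relatively few."""
    vocab = set(vocabulary)
    seeds = set()
    for word in vocab:
        for prefix in prefixes:
            candidate = prefix + word
            if candidate in vocab:
                seeds.add((word, candidate))
    return seeds
```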

Method
– 10,807 semantically contrasting word pairs from WordNet serve as a second seed set.

Method
– If any word in thesaurus category C1 is antonymous to any word in category C2 according to a seed antonym pair, the two categories are marked as contrasting.
– If no word in C1 is antonymous to any word in C2, the categories are considered not contrasting.
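A sketch of this marking step under assumed data structures (a dict of category memberships and a set of seed antonym pairs):

```python
def contrasting_category_pairs(categories, seed_pairs):
    """Mark two thesaurus categories as contrasting if some seed antonym
    pair has one member in each category.

    categories: dict mapping category id -> set of member words.
    seed_pairs: iterable of (word_a, word_b) seed antonym pairs.
    """
    word_to_cats = {}
    for cat, words in categories.items():
        for w in words:
            word_to_cats.setdefault(w, set()).add(cat)
    contrasting = set()
    for a, b in seed_pairs:
        for ca in word_to_cats.get(a, ()):
            for cb in word_to_cats.get(b, ()):
                if ca != cb:
                    contrasting.add(frozenset((ca, cb)))
    return contrasting
```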

Method
– Degree of antonymy, category level: by the distributional hypothesis of antonyms, the degree of antonymy between two contrasting thesaurus categories is directly proportional to the distributional closeness of the two concepts.

Method
Degree of antonymy, word level (see the sketch after this list):
– Target words belong to the same thesaurus paragraphs as one of the seed antonym pairs linking the two contrasting categories → highly antonymous.
– Target words do not both belong to such paragraphs, but occur in contrasting categories → medium antonymy.
– Target words with a low tendency to co-occur → weakly antonymous.
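A minimal sketch of these three levels. The data structures are assumptions, and the co-occurrence criterion from the last bullet is only noted in a comment rather than modeled.

```python
def word_level_antonymy(w1, w2, paragraphs_of, contrasting_categories,
                        seed_paragraph_pairs):
    """Rough three-way degree of antonymy for a target word pair.

    paragraphs_of: dict word -> set of (category_id, paragraph_id) it belongs to.
    contrasting_categories: set of frozensets of contrasting category ids.
    seed_paragraph_pairs: set of frozensets of (category_id, paragraph_id)
        pairs that hold the two members of some seed antonym pair.
    """
    paras1 = paragraphs_of.get(w1, set())
    paras2 = paragraphs_of.get(w2, set())
    # High: the target words share thesaurus paragraphs with a linking seed pair.
    for p1 in paras1:
        for p2 in paras2:
            if frozenset((p1, p2)) in seed_paragraph_pairs:
                return "high"
    # Medium: the target words merely occur in contrasting categories.
    cats1 = {cat for cat, _ in paras1}
    cats2 = {cat for cat, _ in paras2}
    if any(frozenset((c1, c2)) in contrasting_categories
           for c1 in cats1 for c2 in cats2):
        return "medium"
    # Otherwise weak; the real system also uses co-occurrence evidence here.
    return "weak"
```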

Method
– Adjacency heuristic: most thesauri are ordered such that contrasting categories tend to be adjacent.
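One way the heuristic could be operationalized, assuming the thesaurus provides its categories in order; this is a guess at the mechanics, not the paper's exact use of adjacency.

```python
def adjacent_category_pairs(ordered_category_ids):
    """Adjacency heuristic: neighbouring categories in the thesaurus ordering
    are treated as candidate contrasting pairs."""
    return {frozenset((a, b))
            for a, b in zip(ordered_category_ids, ordered_category_ids[1:])}
```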

Evaluation
– 1,112 closest-opposite questions designed to prepare students for the GRE (Graduate Record Examination)
– 162 questions as the development set
– 950 questions as the test set

Evaluation
– Closest-opposite question, example: adulterate: (a) renounce (b) forbid (c) purify (d) criticize (e) correct. The closest opposite is (c) purify.
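A hypothetical way to apply the antonymy scores to such a question: score every option against the target and return the highest-scoring one. `antonymy_score` is a placeholder for a scorer built from the category-level and word-level steps above.

```python
def answer_closest_opposite(target, options, antonymy_score):
    """Pick the option with the highest antonymy score against the target."""
    return max(options, key=lambda option: antonymy_score(target, option))

# Hypothetical usage with the example from the slide (expected answer: "purify"):
# answer_closest_opposite("adulterate",
#                         ["renounce", "forbid", "purify", "criticize", "correct"],
#                         antonymy_score)
```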

Evaluation

Discussion
– The automatic approach does indeed mimic human intuitions of antonymy.
– In languages without a wordnet, substantial accuracy can still be achieved.
– The WordNet-derived and affix-generated seed sets are complementary.

Conclusion
– Proposed an empirical approach to antonymy that combines corpus co-occurrence statistics with the structure of a thesaurus.
– The system can identify the degree of antonymy between word pairs.
– Provided empirical evidence that antonym pairs tend to be used in similar contexts.

Thanks