Aiding WSD by exploiting hypo/hypernymy relations in a restricted framework MEANING project Experiment 6.H(d) Luis Villarejo and Lluís Màrquez.


Aiding WSD by exploiting hypo/hypernymy relations in a restricted framework MEANING project Experiment 6.H(d) Luis Villarejo and Lluís Màrquez

Preface This document is the first draft of the description of Experiment 6.H(d): “Aiding WSD by exploiting hypo/hypernymy relations in a restricted framework”, to be included in the Working Paper describing Experiment 6.H: Bootstrapping.

Outline
Introduction
Our approach vs Mihalcea’s
Technical details
Experiments
Results
Conclusions
Future work

Introduction
“The Role of Non-Ambiguous Words in Natural Language Disambiguation”, Rada Mihalcea, University of North Texas.
Task: automatic resolution of ambiguity in natural language.
Problem: the lack of large amounts of annotated data.
Proposal: inducing knowledge from non-ambiguous words via equivalence classes to automatically build an annotated corpus.

Introduction
In this experiment we explore whether training example sets can be enlarged with automatically extracted examples associated to each sense. Some work has recently been done on extracting examples associated with word senses in an unsupervised manner, such as that presented by Mihalcea, R. in “The Role of Non-Ambiguous Words in Natural Language Disambiguation”, where POS tagging, named entity tagging and WSD were approached. Here we tackle only WSD. We did our best to reproduce the experimental conditions described in Mihalcea’s paper, and we also explored new possibilities not taken into account there. Our scenario is easier, however, since our only source for the extra training examples is SemCor, and therefore we do not need to perform any kind of disambiguation. The semantic relations used to acquire the target words for the extra examples were taken from the MCR.

Introduction
“Equivalence classes consist of words semantically related.”
Focuses on: part-of-speech tagging, named entity tagging, word sense disambiguation.
WordNet 1.6 relations: synonyms? hyperonyms? hyponyms? holonyms? meronyms?

Target word | Meaning             | Monosemous equivalent
plant       | living_organism     | flora
plant       | manufacturing_plant | industrial_plant

What happens with the other two meanings of plant (actor and trick)?
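The substitution idea behind the table above can be sketched on a toy taxonomy. The `TAXONOMY` and `SENSE_COUNT` tables below are invented for illustration; the experiment itself queried WordNet 1.6 relations through the MCR:

```python
# Toy sketch: find monosemous equivalents of a target sense through
# semantic relations. The data tables are invented for illustration;
# a real system would query WordNet 1.6 / the MCR instead.

# sense -> directly related words (synonyms, hypernyms, hyponyms)
TAXONOMY = {
    "plant#living_organism": ["flora", "organism", "shrub"],
    "plant#factory": ["industrial_plant", "building_complex", "mill"],
}

# word -> number of senses it has (monosemous = exactly one)
SENSE_COUNT = {
    "flora": 1, "organism": 3, "shrub": 1,
    "industrial_plant": 1, "building_complex": 1, "mill": 4,
}

def monosemous_equivalents(sense):
    """Keep only related words with a single sense: their corpus
    occurrences can be labelled with `sense` without disambiguation."""
    return [w for w in TAXONOMY[sense] if SENSE_COUNT.get(w, 0) == 1]

print(monosemous_equivalents("plant#living_organism"))  # ['flora', 'shrub']
```

Every occurrence of a surviving equivalent (e.g. flora) can then be harvested as a training example for the corresponding sense of the ambiguous target.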

Our approach vs Mihalcea’s
Manually selected word set:
  Ours: child, day, find, keep, live, material, play and serve (5 verbs + 3 nouns).
  Rada’s: bass, crane, motion, palm, plant and tank (6 nouns?).
Only two specially selected, clearly differentiated senses per word:
  {play}: play games, play sports (“Tom plays tennis”)
  {play}: play a role or part (“Gielgud played Hamlet”)
Source for examples on the equivalent words:
  Ours: SemCor.
  Mihalcea’s: raw corpus (monosemous words, no need for annotation).
Relation of equivalence between words:
  Ours: synonymy, hyperonymy, hyponymy and mixes (levels 1, 2 and both).
  Mihalcea’s: synonymy (?).
Equivalent corpus sizes.

Our approach vs Mihalcea’s
Features used:
  Ours: two left words, two right words; two left POS, two right POS; one left word, one right word; one left POS, one right POS; bag of words (WSD task in the MEANING project).
  Rada’s: two left words, two right words; nouns before and after; verbs before and after; sense-specific keywords??
Technique used to learn:
  Ours: SVM.
  Rada’s: TiMBL memory-based learner.
Use of the examples coming from the equivalent words:
  Ours: added to the original word set examples, training a classifier on the union.
  Rada’s: training a classifier on them alone.
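A minimal sketch of the local-context features listed above for our approach (words and POS tags to each side of the target, plus a bag of words); the sentence, tag names and feature keys are invented for illustration:

```python
def extract_features(tokens, tags, i, window=2):
    """Local-context features around the target token at position i:
    words and POS tags up to `window` positions left and right,
    plus a bag-of-words set over the rest of the sentence."""
    feats = {}
    for d in range(1, window + 1):
        feats[f"w-{d}"] = tokens[i - d] if i - d >= 0 else "<s>"
        feats[f"w+{d}"] = tokens[i + d] if i + d < len(tokens) else "</s>"
        feats[f"p-{d}"] = tags[i - d] if i - d >= 0 else "<s>"
        feats[f"p+{d}"] = tags[i + d] if i + d < len(tokens) else "</s>"
    feats["bow"] = set(tokens[:i] + tokens[i + 1:])
    return feats

toks = ["Tom", "plays", "tennis", "every", "week"]
tags = ["NNP", "VBZ", "NN", "DT", "NN"]
f = extract_features(toks, tags, 1)  # target = "plays"
# f["w-1"] == "Tom", f["w+1"] == "tennis", f["p+2"] == "DT"
```

For SVM_light each such feature would then be mapped to a binary dimension in a sparse vector.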

Technical details
SVM_light (Joachims).
Each word is a binary classification problem which has to decide between two possible labels (senses); positive examples of one sense are negative examples for the other.
10-fold cross-validation: testing on a random fold of the original examples, training on the rest of the originals plus the equivalents.
C parameter tuned by margin (5 pieces and 2 rounds) for each classification problem.
Linear kernel.
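The evaluation scheme described above (test folds drawn only from the original examples, equivalent-derived examples always added to the training side) can be sketched as follows; the example lists are placeholders:

```python
def folds(originals, equivalents, k=10):
    """Yield (train, test) splits for k-fold cross-validation in which
    the test fold contains only original examples, while every
    equivalent-derived example always goes to the training side."""
    for j in range(k):
        test = [x for i, x in enumerate(originals) if i % k == j]
        train = [x for i, x in enumerate(originals) if i % k != j]
        yield train + equivalents, test

origs = list(range(20))        # placeholder original examples
equivs = ["e1", "e2", "e3"]    # placeholder equivalent examples
train, test = next(folds(origs, equivs))
# the test fold never contains equivalent examples
```

Accuracy is then averaged over the k folds, so the equivalents influence training in every split but are never evaluated on.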

Experiments
Baselines:
  Examples from the original word set codified with all features.
  Examples from the original word set codified only with the BOW feature.
Experiments over each baseline:
  Examples from equivalents codified with all features.
  Examples from equivalents codified only with the BOW feature.
  Examples from equivalents added in equal proportions.
Relations explored in each experiment:
  Hyponymy levels 1, 2 and both.
  Hyperonymy levels 1, 2 and both.
  Synonymy.
  Mixes: SHypo1, SHypo2, SHypo12, SHype1, SHype2, SHype12.
Total: 8 words * 2 baselines * 3 experiments * 13 relations = 624 results
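The total of 624 runs follows directly from enumerating the experimental grid; a quick check (word and relation labels copied from the slides, baseline and experiment names paraphrased):

```python
from itertools import product

words = ["child", "day", "find", "keep", "live", "material", "play", "serve"]
baselines = ["all-features", "bow-only"]
experiments = ["equiv-all-features", "equiv-bow-only", "equiv-proportional"]
relations = ["Hypo1", "Hypo2", "Hypo12", "Hype1", "Hype2", "Hype12", "Synon",
             "SHypo1", "SHypo2", "SHypo12", "SHype1", "SHype2", "SHype12"]

# one run per combination of word, baseline, experiment and relation
runs = list(product(words, baselines, experiments, relations))
print(len(runs))  # 8 * 2 * 3 * 13 = 624
```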

Results – Originals BOW, Added BOW (I)
Baseline 1 (originals, only BOW feature); MFS vs accuracy.
In detail (#origs, MFS, baseline, #added, best accuracy & relation), best relation per word:
Child: BagSHype1
Day: BagSinon
Find: BagSHype2
Keep: BagSHype1
Live: BagHypo2
Material: BagHype12
Play: BagSinon
Serve: BagHypo1

Results – Originals BOW, Added BOW (II)
Baseline 1 (originals, only BOW feature); MFS vs accuracy.
Best global results (#original exs, #exs added, accuracy):
BagSHype
BagSinon
BagSHypo
BagHypo

Results – Originals All Feats, Added Both (I)
Baseline 2 (originals, all features); MFS vs accuracy.
In detail (#origs, MFS, baseline, #added, best accuracy & relation), best relation per word:
Child: BagSHypo1
Day: Hypo1
Find: Hypo1
Keep: BagSHype1
Live: BagHype1
Material: SHype12
Play: SHypo1
Serve: Hypo1

Results – Originals All Feats, Added Both (II)
Baseline 2 (originals, all features); MFS vs accuracy.
Best global results (#original exs, #exs added, accuracy):
BagSHype
BagSHype
BagSinon
SHype

Results – O-all, A-both, keeping proportions (I)
Baseline 2 (originals, all features); MFS vs accuracy.
In detail (#origs, MFS, baseline, #added, best accuracy & relation), best relation per word:
Child: BagSHypo1
Day: Sinon
Find: Hype1
Keep: SHype1
Live:
Material: BagHype2
Play: BagSHypo1
Serve: Hypo1

Results – O-all, A-both, keeping proportions (II)
Baseline 2 (originals, all features); MFS vs accuracy.
Best global results (#original exs, #exs added, accuracy):
BagSHypo
SHype
SHype
Hype
Note: examples not randomly added.

Results
Other results:
Accuracy improves slightly when the equivalents to take into account are chosen manually.
Experiments with a set of 41 words (nouns and verbs) with all senses per word (varying from 2 to 20) gave worse results: accuracies for all mixes of relations and features are below the baseline.
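The baseline that all of these runs are compared against is the MFS (most frequent sense) heuristic; a minimal sketch of how such a baseline is computed, on hypothetical sense-tagged occurrences of “play”:

```python
from collections import Counter

def mfs_accuracy(train_senses, test_senses):
    """Accuracy of always predicting the sense that is most
    frequent in the training data (the MFS baseline)."""
    mfs, _ = Counter(train_senses).most_common(1)[0]
    return sum(s == mfs for s in test_senses) / len(test_senses)

# hypothetical sense-tagged occurrences of "play"
train = ["game"] * 7 + ["role"] * 3
test = ["game"] * 6 + ["role"] * 4
print(mfs_accuracy(train, test))  # 0.6
```

A trained classifier only adds value when it beats this heuristic, which is why the result tables report MFS alongside accuracy.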

Conclusions
The work presented by R. Mihalcea leaves some points unclear: the criteria used to select the words involved in the experiment, the criteria to select word senses, the restriction on the number of senses per word, the semantic relations used to get the monosemous equivalents, and the features used to learn are not satisfactorily described. Despite this, Mihalcea’s results showed that equivalents carry useful information for WSD (better than MFS: 76.60% against 70.61%). But is this information useful to improve the state of the art in WSD? The experiments carried out here showed that adding examples coming from the equivalents seems to improve the results in a restricted framework, meaning a small word set, only two senses per word, and clearly differentiated senses. When we moved to an open framework (a bigger word set of 41 words, no special selection of words, no special selection of senses, and no restriction on the number of senses per word, varying from 2 to 20), results proved to be worse.
Chart legend: MFS; classifier trained on the automatically generated corpora; classifier trained on the manually generated corpora; accuracy.

Future Work
Did the differences between the feature set used by Mihalcea and the one we used critically affect the results on the 41-word experiment?
Exploit the feature extractor over SemCor to enrich the feature set used. Ideas are welcome.
Study the correlation between the number of examples added and the accuracy obtained.
Restrict the addition of examples coming from the equivalent words (second-class examples).
Randomly select which examples to add when keeping proportions or restricting the addition.